From: Eric DeVolder <eric.devolder@oracle.com>
To: linux-kernel@vger.kernel.org, x86@kernel.org,
	kexec@lists.infradead.org, ebiederm@xmission.com,
	dyoung@redhat.com, bhe@redhat.com, vgoyal@redhat.com
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, hpa@zytor.com,
	nramas@linux.microsoft.com, thomas.lendacky@amd.com,
	robh@kernel.org, efault@gmx.de, rppt@kernel.org,
	konrad.wilk@oracle.com, boris.ostrovsky@oracle.com,
	eric.devolder@oracle.com
Subject: [PATCH v4 00/10] crash: Kernel handling of CPU and memory hot un/plug
Date: Wed,  9 Feb 2022 14:56:56 -0500
Message-ID: <20220209195706.51522-1-eric.devolder@oracle.com>

When the kdump service is loaded, if a CPU or memory is hot
un/plugged, the crash elfcorehdr (for x86), which describes the CPUs
and memory in the system, must also be updated, else the resulting
vmcore is inaccurate (eg. missing either CPU context or memory
regions).

The current solution utilizes udev to initiate an unload-then-reload
of the kdump image (ie. kernel, initrd, boot_params, purgatory and
elfcorehdr) by the userspace kexec utility. In previous posts I have
outlined the significant performance problems related to offloading
this activity to userspace.

This patchset introduces a generic crash hot un/plug handler that
registers with the CPU and memory notifiers. Upon CPU or memory
changes, this generic handler is invoked and performs important
housekeeping, for example obtaining the appropriate lock, and then
invokes an architecture specific handler to do the appropriate
updates.
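
As a rough illustration of that flow, here is a minimal userspace model
in C. It is not the kernel code from this series: a pthread mutex stands
in for the kernel lock, a direct function call stands in for the CPU and
memory notifier chains, and every identifier in it is hypothetical.

```c
/* Userspace model only; all names here are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum hp_action { HP_ADD_CPU, HP_REMOVE_CPU, HP_ADD_MEMORY, HP_REMOVE_MEMORY };

/* Architecture-specific hook, e.g. "regenerate the elfcorehdr". */
typedef void (*arch_hp_handler_t)(enum hp_action action, unsigned int id);

static pthread_mutex_t crash_image_lock = PTHREAD_MUTEX_INITIALIZER;
static arch_hp_handler_t arch_handler;
static int crash_image_loaded;

/* Example arch hook that only counts how often it was asked to update. */
unsigned int arch_update_count;
void counting_arch_handler(enum hp_action action, unsigned int id)
{
	(void)action;
	(void)id;
	arch_update_count++;
}

void set_arch_handler(arch_hp_handler_t fn)
{
	assert(fn != NULL);
	arch_handler = fn;
}

void set_crash_image_loaded(int loaded)
{
	crash_image_loaded = loaded;
}

/*
 * Generic handler: do the housekeeping (take the lock, check that a
 * crash image is loaded), then delegate to the arch-specific hook.
 * Returns 1 if the arch handler ran, 0 otherwise.
 */
int generic_crash_hotplug_event(enum hp_action action, unsigned int id)
{
	int handled = 0;

	pthread_mutex_lock(&crash_image_lock);
	if (crash_image_loaded && arch_handler != NULL) {
		arch_handler(action, id);
		handled = 1;
	}
	pthread_mutex_unlock(&crash_image_lock);
	return handled;
}
```

In this model, an event that arrives while no crash image is loaded is
simply ignored, since there is no elfcorehdr to update.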

In the case of x86_64, the arch specific handler generates a new
elfcorehdr and overwrites the old one in memory. No userspace
involvement is needed.
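
To make "generates a new elfcorehdr and overwrites the old one" more
concrete, the following userspace sketch rebuilds an ELF64 core header
into a fixed-size buffer: one PT_NOTE program header per CPU and one
PT_LOAD per memory range. It only illustrates the idea. The function
name, struct mem_range, and the choice to leave p_vaddr at zero are
assumptions of this sketch, not the code in the patches.

```c
/* Userspace sketch only; names and layout choices are assumptions. */
#include <assert.h>
#include <elf.h>
#include <stdint.h>
#include <string.h>

struct mem_range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/*
 * Rebuild the header in buf. Returns the number of bytes used, or 0
 * (leaving the old header untouched) if the new one does not fit.
 */
size_t rebuild_elfcorehdr(void *buf, size_t buf_size,
			  const uint64_t *cpu_note_paddr, uint64_t note_size,
			  unsigned int nr_cpus,
			  const struct mem_range *ranges, unsigned int nr_ranges)
{
	size_t total = (size_t)nr_cpus + nr_ranges;
	size_t need = sizeof(Elf64_Ehdr) + total * sizeof(Elf64_Phdr);
	Elf64_Ehdr *ehdr = buf;
	Elf64_Phdr *phdr;
	unsigned int i;

	assert(buf != NULL);
	/* e_phnum is 16 bits wide and 0xffff is reserved (PN_XNUM). */
	if (total >= 0xffff || need > buf_size)
		return 0;

	memset(buf, 0, buf_size);
	memcpy(ehdr->e_ident, ELFMAG, SELFMAG);
	ehdr->e_ident[EI_CLASS] = ELFCLASS64;
	ehdr->e_ident[EI_DATA] = ELFDATA2LSB;
	ehdr->e_ident[EI_VERSION] = EV_CURRENT;
	ehdr->e_type = ET_CORE;
	ehdr->e_machine = EM_X86_64;
	ehdr->e_version = EV_CURRENT;
	ehdr->e_phoff = sizeof(Elf64_Ehdr);
	ehdr->e_ehsize = sizeof(Elf64_Ehdr);
	ehdr->e_phentsize = sizeof(Elf64_Phdr);
	ehdr->e_phnum = (Elf64_Half)total;

	phdr = (Elf64_Phdr *)(ehdr + 1);
	/* One PT_NOTE per CPU, pointing at that CPU's crash notes. */
	for (i = 0; i < nr_cpus; i++, phdr++) {
		phdr->p_type = PT_NOTE;
		phdr->p_offset = phdr->p_paddr = cpu_note_paddr[i];
		phdr->p_filesz = phdr->p_memsz = note_size;
	}
	/* One PT_LOAD per memory range. */
	for (i = 0; i < nr_ranges; i++, phdr++) {
		phdr->p_type = PT_LOAD;
		phdr->p_flags = PF_R | PF_W | PF_X;
		phdr->p_offset = phdr->p_paddr = ranges[i].start;
		phdr->p_filesz = phdr->p_memsz =
			ranges[i].end - ranges[i].start + 1;
	}
	return need;
}
```

Because the header is rewritten in place, the buffer must have been
sized up front for the largest CPU and memory configuration expected.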

To test this patchset and realize its benefits, one must make a couple
of minor changes to userspace:

 - Disable the udev rule for updating kdump on hot un/plug changes
   Eg. on RHEL: rm -f /usr/lib/udev/rules.d/98-kexec.rules
   or other technique to neuter the rule.

 - Change to the kexec_file_load syscall for loading the kdump kernel:
   Eg. on RHEL: in /usr/bin/kdumpctl, change to:
    standard_kexec_args="-p -d -s"
   which adds -s to select the kexec_file_load syscall.

This patchset also supports kexec_load, but that requires a modified
kexec userspace utility. A working changeset to the kexec utility is
provided below. To use it, change standard_kexec_args as above but
append, for example, --hotplug-size=262144 instead of -s.

 diff --git a/kexec/arch/i386/crashdump-x86.c b/kexec/arch/i386/crashdump-x86.c
 index 9826f6d..06adb7e 100644
 --- a/kexec/arch/i386/crashdump-x86.c
 +++ b/kexec/arch/i386/crashdump-x86.c
 @@ -48,6 +48,7 @@
  #include <x86/x86-linux.h>
  
  extern struct arch_options_t arch_options;
 +extern unsigned long long hotplug_size;
  
  static int get_kernel_page_offset(struct kexec_info *UNUSED(info),
  				  struct crash_elf_info *elf_info)
 @@ -975,6 +976,13 @@ int load_crashdump_segments(struct kexec_info *info, char* mod_cmdline,
  	} else {
  		memsz = bufsz;
  	}
 +
 +    /* If hotplug support enabled, use that size */
 +    if (hotplug_size) {
 +        memsz = hotplug_size;
 +    }
 +
 +    info->elfcorehdr =
  	elfcorehdr = add_buffer(info, tmp, bufsz, memsz, align, min_base,
  							max_addr, -1);
  	dbgprintf("Created elf header segment at 0x%lx\n", elfcorehdr);
 diff --git a/kexec/kexec.c b/kexec/kexec.c
 index f63b36b..9569d9a 100644
 --- a/kexec/kexec.c
 +++ b/kexec/kexec.c
 @@ -58,6 +58,7 @@
  
  unsigned long long mem_min = 0;
  unsigned long long mem_max = ULONG_MAX;
 +unsigned long long hotplug_size = 0;
  static unsigned long kexec_flags = 0;
  /* Flags for kexec file (fd) based syscall */
  static unsigned long kexec_file_flags = 0;
 @@ -672,6 +673,12 @@ static void update_purgatory(struct kexec_info *info)
  		if (info->segment[i].mem == (void *)info->rhdr.rel_addr) {
  			continue;
  		}
 +        /* Don't include elfcorehdr in the checksum, if hotplug
 +         * support enabled.
 +         */
 +        if (hotplug_size && (info->segment[i].mem == (void *)info->elfcorehdr)) {
 +			continue;
 +		}
  		sha256_update(&ctx, info->segment[i].buf,
  			      info->segment[i].bufsz);
  		nullsz = info->segment[i].memsz - info->segment[i].bufsz;
 @@ -1504,6 +1511,17 @@ int main(int argc, char *argv[])
  		case OPT_PRINT_CKR_SIZE:
  			print_crashkernel_region_size();
  			return 0;
 +		case OPT_HOTPLUG_SIZE:
 +			/* Reserve the specified size for hotplug growth */
 +			hotplug_size = strtoul(optarg, &endptr, 0);
 +			if (*endptr) {
 +				fprintf(stderr,
 +					"Bad option value in --hotplug-size=%s\n",
 +					optarg);
 +				usage();
 +				return 1;
 +			}
 +			break;
  		default:
  			break;
  		}
 diff --git a/kexec/kexec.h b/kexec/kexec.h
 index 595dd68..b30dda4 100644
 --- a/kexec/kexec.h
 +++ b/kexec/kexec.h
 @@ -169,6 +169,7 @@ struct kexec_info {
  	int command_line_len;
  
  	int skip_checks;
 +    unsigned long elfcorehdr;
   };
  
  struct arch_map_entry {
 @@ -231,7 +232,8 @@ extern int file_types;
  #define OPT_PRINT_CKR_SIZE	262
  #define OPT_LOAD_LIVE_UPDATE	263
  #define OPT_EXEC_LIVE_UPDATE	264
 -#define OPT_MAX			265
 +#define OPT_HOTPLUG_SIZE	265
 +#define OPT_MAX			266
  #define KEXEC_OPTIONS \
  	{ "help",		0, 0, OPT_HELP }, \
  	{ "version",		0, 0, OPT_VERSION }, \
 @@ -258,6 +260,7 @@ extern int file_types;
  	{ "debug",		0, 0, OPT_DEBUG }, \
  	{ "status",		0, 0, OPT_STATUS }, \
  	{ "print-ckr-size",     0, 0, OPT_PRINT_CKR_SIZE }, \
 +	{ "hotplug-size",     2, 0, OPT_HOTPLUG_SIZE }, \
  
  #define KEXEC_OPT_STR "h?vdfixyluet:pscaS"
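
For a feel of what a --hotplug-size value such as 262144 buys, the
sketch below estimates how many program headers fit in an elfcorehdr
buffer of a given size. It assumes the buffer holds only an ELF64 header
followed by the program header table, one entry per CPU and per memory
range. The real header may carry a few extra entries, and the helper
names are made up for this example.

```c
/* Sizing sketch; assumes the buffer is Elf64_Ehdr + Elf64_Phdr table. */
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Number of program headers that fit in a buffer of buf_size bytes. */
size_t elfcorehdr_phdr_capacity(size_t buf_size)
{
	if (buf_size < sizeof(Elf64_Ehdr))
		return 0;
	return (buf_size - sizeof(Elf64_Ehdr)) / sizeof(Elf64_Phdr);
}

/* Nonzero if buf_size can describe nr_cpus CPUs and nr_mem_ranges ranges. */
int elfcorehdr_fits(size_t buf_size, size_t nr_cpus, size_t nr_mem_ranges)
{
	/* Guard the sum below against overflow. */
	assert(nr_cpus <= SIZE_MAX - nr_mem_ranges);
	return nr_cpus + nr_mem_ranges <= elfcorehdr_phdr_capacity(buf_size);
}
```

With a 64-byte Elf64_Ehdr and 56-byte Elf64_Phdr, 262144 bytes leave
room for (262144 - 64) / 56 = 4680 program headers under these
assumptions.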
 

Regards,
eric
---
v4: 9feb2022
 - Refactored patches per Baoquan suggestions.
 - A few corrections, per Baoquan.

v3: 10jan2022
 https://lkml.org/lkml/2022/1/10/1212
 - Rebasing per Baoquan He request.
 - Changed memory notifier per David Hildenbrand.
 - Providing example kexec userspace change in cover letter.

RFC v2: 7dec2021
 https://lkml.org/lkml/2021/12/7/1088
 - Acting upon Baoquan He suggestion of removing elfcorehdr from
   the purgatory list of segments, removed purgatory code from
   patchset, and it is significantly simpler now.

RFC v1: 18nov2021
 https://lkml.org/lkml/2021/11/18/845
 - working patchset demonstrating kernel handling of hotplug
   updates to x86 elfcorehdr for kexec_file_load

RFC: 14dec2020
 https://lkml.org/lkml/2020/12/14/532
 - proposed concept of allowing kernel to handle hotplug update
   of elfcorehdr
---

Eric DeVolder (10):
  crash: fix minor typo/bug in debug message
  crash hp: Introduce CRASH_HOTPLUG configuration options
  crash hp: definitions and prototype changes
  crash hp: prototype change for crash_prepare_elf64_headers
  crash hp: introduce helper functions un/map_crash_pages
  crash hp: generic crash hotplug support infrastructure
  crash hp: exclude elfcorehdr from the segment digest
  crash hp: exclude hot remove cpu from elfcorehdr notes
  crash hp: Add x86 crash hotplug support for kexec_file_load
  crash hp: Add x86 crash hotplug support for kexec_load

 arch/arm64/kernel/machine_kexec_file.c |   6 +-
 arch/powerpc/kexec/file_load_64.c      |   2 +-
 arch/x86/Kconfig                       |  26 +++++
 arch/x86/kernel/crash.c                | 123 ++++++++++++++++++++-
 include/linux/kexec.h                  |  23 +++-
 kernel/crash_core.c                    | 146 +++++++++++++++++++++++++
 kernel/kexec_file.c                    |  15 ++-
 7 files changed, 331 insertions(+), 10 deletions(-)

-- 
2.27.0


