From: Sourabh Jain <sourabhjain@linux.ibm.com>
To: Hari Bathini <hbathini@linux.ibm.com>, linuxppc-dev@ozlabs.org
Cc: "Aneesh Kumar K . V" <aneesh.kumar@kernel.org>,
	Aditya Gupta <adityag@linux.ibm.com>,
	Mahesh Salgaonkar <mahesh@linux.ibm.com>,
	Naveen N Rao <naveen@kernel.org>
Subject: Re: [PATCH v7 1/3] powerpc: make fadump resilient with memory add/remove events
Date: Mon, 19 Feb 2024 16:02:24 +0530
Message-ID: <425f4366-cb7d-4783-bbf8-53c55f2a0430@linux.ibm.com>
In-Reply-To: <41a0647e-8a8a-40ca-9a07-3e97f02cc369@linux.ibm.com>

Hello Hari,

On 23/01/24 15:39, Hari Bathini wrote:
>
>
> On 11/01/24 7:39 pm, Sourabh Jain wrote:
>> Due to changes in memory resources caused by either memory hotplug or
>> online/offline events, the elfcorehdr, which describes the CPUs and
>> memory of the crashed kernel to the kernel that collects the dump (known
>> as second/fadump kernel), becomes outdated. Consequently, attempting
>> dump collection with an outdated elfcorehdr can lead to failed or
>> inaccurate dump collection.
>>
>> Memory hotplug and online/offline events are referred to as memory
>> add/remove events in the rest of the commit message.
>>
>> The current solution to address the aforementioned issue is as follows:
>> Monitor memory add/remove events in userspace using udev rules, and
>> re-register fadump whenever there are changes in memory resources. This
>> leads to the creation of a new elfcorehdr with updated system memory
>> information.
>>
>> There are several notable issues associated with re-registering fadump
>> for every memory add/remove event.
>>
>> 1. Bulk memory add/remove events with udev-based fadump re-registration
>>     can lead to race conditions and, more importantly, it creates a wide
>>     window during which fadump is inactive until all memory add/remove
>>     events are settled.
>> 2. Re-registering fadump for every memory add/remove event is
>>     inefficient.
>> 3. The memory for elfcorehdr is allocated based on the memblock regions
>>     available during early boot and remains fixed thereafter. However, if
>>     elfcorehdr is later recreated with additional memblock regions, its
>>     size will increase, potentially leading to memory corruption.
>>
>> Address the aforementioned challenges by shifting the creation of
>> elfcorehdr from the first kernel (also referred to as the crashed kernel),
>> where it was created and frequently recreated for every memory
>> add/remove event, to the fadump kernel. As a result, the elfcorehdr only
>> needs to be created once, thus eliminating the necessity to re-register
>> fadump during memory add/remove events.
>>
>> At present, the first kernel prepares the fadump header and stores it in
>> the fadump reserved area. The fadump header contains start address of
>> the elfcorehd, crashing CPU details, etc.  In the event of first kernel
>
> "elfcorehd" used instead of "elfcorehdr" at a couple of places..

Fixed it now. Thanks.

>
>> crash, the second/fadump kernel boots, accesses the fadump header prepared
>> by the first kernel, and does the following in a platform-specific function
>> [rtas|opal]_fadump_process:
>>
>> At present, the first kernel is responsible for preparing the fadump
>> header and storing it in the fadump reserved area. The fadump header
>> includes the start address of the elfcorehd, crashing CPU details, and
>> other relevant information. In the event of a crash in the first kernel,
>> the second/fadump boots and accesses the fadump header prepared by the
>> first kernel. It then performs the following steps in a
>> platform-specific function [rtas|opal]_fadump_process:
>>
>> 1. Sanity check for fadump header
>> 2. Update CPU notes in elfcorehdr
>> 3. Set the global variable elfcorehdr_addr to the address of the
>>     fadump header's elfcorehdr, for the vmcore module to process later on.
>>
>> Along with the above, update the setup_fadump()/fadump.c to create
>> elfcorehdr in second/fadump kernel.
>>
>> Section below outlines the information required to create the elfcorehdr
>> and the changes made to make it available to the fadump kernel if it's
>> not already.
>>
>> To create elfcorehdr, the following crashed kernel information is
>> required: CPU notes, vmcoreinfo, and memory ranges.
>>
>> At present, the CPU notes are already prepared in the fadump kernel, so
>> no changes are needed in that regard. The fadump kernel has access to
>> all crashed kernel memory regions, including boot memory regions that
>> are relocated by firmware to fadump reserved areas, so no changes for
>> that either. However, it is necessary to add new members to the fadump
>> header, i.e., the 'fadump_crash_info_header' structure, in order to pass
>> the crashed kernel's vmcoreinfo address and its size to fadump kernel.
>>
>> In addition to the vmcoreinfo address and size, there are a few other
>> attributes also added to the fadump_crash_info_header structure.
>>
>> 1. version:
>>     It stores the fadump header version, which is currently set to 1.
>>     This provides flexibility to update the fadump crash info header in
>>     the future without changing the magic number. For each change in the
>>     fadump header, the version will be increased. This will help the
>>     updated kernel determine how to handle kernel dumps from older
>>     kernels. The magic number remains relevant for checking fadump header
>>     corruption.
>>
>> 2. elfcorehdr_size:
>>     since elfcorehdr is now prepared in the fadump/second kernel and
>>     it is not part of the reserved area, this attribute is needed to
>>     track the memory allocated for elfcorehdr to do the deallocation
>>     properly.
>>
>> 3. pt_regs_sz/cpu_mask_sz:
>>     Store the size of the pt_regs and cpu_mask structures in the first
>>     kernel. These attributes are used to avoid processing the dump if the
>>     sizes of pt_regs and cpu_mask are not the same across the crashed and
>>     fadump kernel.
>>
>> Note: if either the first/crashed kernel or the second/fadump kernel does
>> not have the changes introduced here, the kernel fails to collect the dump
>> and prints a relevant error message on the console.
>>
>> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>> Cc: Aditya Gupta <adityag@linux.ibm.com>
>> Cc: Aneesh Kumar K.V <aneesh.kumar@kernel.org>
>> Cc: Hari Bathini <hbathini@linux.ibm.com>
>> Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: Naveen N Rao <naveen@kernel.org>
>> ---
>>   arch/powerpc/include/asm/fadump-internal.h   |  31 +-
>>   arch/powerpc/kernel/fadump.c                 | 355 +++++++++++--------
>>   arch/powerpc/platforms/powernv/opal-fadump.c |  18 +-
>>   arch/powerpc/platforms/pseries/rtas-fadump.c |  23 +-
>>   4 files changed, 242 insertions(+), 185 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/fadump-internal.h b/arch/powerpc/include/asm/fadump-internal.h
>> index 27f9e11eda28..a632e9708610 100644
>> --- a/arch/powerpc/include/asm/fadump-internal.h
>> +++ b/arch/powerpc/include/asm/fadump-internal.h
>> @@ -42,13 +42,40 @@ static inline u64 fadump_str_to_u64(const char *str)
>>   #define FADUMP_CPU_UNKNOWN        (~((u32)0))
>>
>> -#define FADUMP_CRASH_INFO_MAGIC fadump_str_to_u64("FADMPINF")
>> +/*
>> + * The introduction of new fields in the fadump crash info header has
>> + * led to a change in the magic key from `FADMPINF` to `FADMPSIG` for
>> + * identifying a kernel crash from an old kernel.
>> + *
>> + * To prevent the need for further changes to the magic number in the
>> + * event of future modifications to the fadump crash info header, a
>> + * version field has been introduced to track the fadump crash info
>> + * header version.
>> + *
>> + * Consider a few points before adding new members to the fadump crash info
>> + * header structure:
>> + *
>> + *  - Append new members; avoid adding them in between.
>> + *  - Non-primitive members should have a size member as well.
>> + *  - For every change in the fadump header, increment the
>> + *    fadump header version. This helps the updated kernel decide how to
>> + *    handle kernel dumps from older kernels.
>> + */
>> +#define FADUMP_CRASH_INFO_MAGIC_OLD fadump_str_to_u64("FADMPINF")
>> +#define FADUMP_CRASH_INFO_MAGIC fadump_str_to_u64("FADMPSIG")
>> +#define FADUMP_HEADER_VERSION        1
>>
>>   /* fadump crash info structure */
>>   struct fadump_crash_info_header {
>>       u64        magic_number;
>> -    u64        elfcorehdr_addr;
>> +    u32        version;
>>       u32        crashing_cpu;
>
>> +    u64        elfcorehdr_addr;
>> +    u64        elfcorehdr_size;
>
> fadump_crash_info_header structure is to share info across reboots.
> Now that elfcorehdr is prepared in second kernel and also dump capture
> of older kernel is not supported, get rid of elfcorehdr_addr &
> elfcorehdr_size from fadump_crash_info_header structure and put them
> in fw_dump structure instead..

Including elfcorehdr_addr and elfcorehdr_size in the fw_dump structure
removes the dependency on address translation from physical to virtual.

I have included the above suggestion in v8.
https://lore.kernel.org/all/20240217072004.148293-1-sourabhjain@linux.ibm.com/
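
For readers following the thread, here is a minimal userspace sketch of the
split discussed above, not the code from v8. It assumes the header shared
across the reboot keeps only the magic, version, crashing CPU, vmcoreinfo
location, and the two size fields, while the elfcorehdr address and size live
in a structure private to the fadump kernel. Every sketch_* name, the
vmcoreinfo field names, the status enum, and the strict "version must equal 1"
policy are hypothetical and chosen for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_HEADER_VERSION 1

/* Pack up to eight ASCII characters into a u64, first character in the
 * most significant byte. Written for this sketch; it stands in for the
 * kernel's fadump_str_to_u64() helper. */
static uint64_t sketch_str_to_u64(const char *str)
{
	uint64_t val = 0;
	size_t i;

	for (i = 0; i < sizeof(val); i++)
		val = (val << 8) | (*str ? (uint8_t)*str++ : 0);
	return val;
}

/* Shared across the reboot: written by the first kernel, read by the
 * fadump kernel. Field names are assumptions based on the commit message. */
struct sketch_crash_info_header {
	uint64_t magic_number;
	uint32_t version;
	uint32_t crashing_cpu;
	uint64_t vmcoreinfo_addr;	/* hypothetical name */
	uint64_t vmcoreinfo_size;	/* hypothetical name */
	uint32_t pt_regs_sz;
	uint32_t cpu_mask_sz;
};

/* Private to the fadump kernel: nothing here has to survive a reboot. */
struct sketch_fw_dump {
	uint64_t elfcorehdr_addr;
	uint64_t elfcorehdr_size;
};

enum sketch_hdr_status {
	SKETCH_HDR_OK,
	SKETCH_HDR_OLD_FORMAT,		/* written by a kernel without this change */
	SKETCH_HDR_CORRUPT,		/* magic matches neither known value */
	SKETCH_HDR_BAD_VERSION,
	SKETCH_HDR_SIZE_MISMATCH,	/* pt_regs or cpu_mask size differs */
};

/* Decide whether the fadump kernel may process a header left behind by
 * the crashed kernel. The local sizes are passed in by the caller. */
static enum sketch_hdr_status
sketch_check_header(const struct sketch_crash_info_header *hdr,
		    uint32_t local_pt_regs_sz, uint32_t local_cpu_mask_sz)
{
	if (hdr->magic_number == sketch_str_to_u64("FADMPINF"))
		return SKETCH_HDR_OLD_FORMAT;
	if (hdr->magic_number != sketch_str_to_u64("FADMPSIG"))
		return SKETCH_HDR_CORRUPT;
	if (hdr->version != SKETCH_HEADER_VERSION)
		return SKETCH_HDR_BAD_VERSION;
	if (hdr->pt_regs_sz != local_pt_regs_sz ||
	    hdr->cpu_mask_sz != local_cpu_mask_sz)
		return SKETCH_HDR_SIZE_MISMATCH;
	return SKETCH_HDR_OK;
}
```

A caller in the fadump kernel would run sketch_check_header() once on the
header found in the reserved area and only build the elfcorehdr, recording its
address and size in its own sketch_fw_dump instance, when the result is
SKETCH_HDR_OK.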

Thanks for the suggestion.

- Sourabh

