From: Hari Bathini <hbathini@linux.ibm.com>
To: Michael Ellerman <mpe@ellerman.id.au>,
"Oliver O'Halloran" <oohall@gmail.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.ibm.com>,
Mahesh J Salgaonkar <mahesh@linux.ibm.com>,
Nicholas Piggin <npiggin@gmail.com>,
linuxppc-dev <linuxppc-dev@ozlabs.org>,
Vasant Hegde <hegdevasant@linux.ibm.com>,
Daniel Axtens <dja@axtens.net>
Subject: Re: [PATCH v5 21/31] powernv/fadump: process architected register state data provided by firmware
Date: Tue, 10 Sep 2019 21:40:44 +0530
Message-ID: <8b26238d-d8f6-c9a5-714e-b4682b94f8a5@linux.ibm.com>
In-Reply-To: <87sgp4z1ov.fsf@mpe.ellerman.id.au>
On 10/09/19 7:35 PM, Michael Ellerman wrote:
> Hari Bathini <hbathini@linux.ibm.com> writes:
>> On 09/09/19 9:03 PM, Oliver O'Halloran wrote:
>>> On Mon, Sep 9, 2019 at 11:23 PM Hari Bathini <hbathini@linux.ibm.com> wrote:
>>>> On 04/09/19 5:50 PM, Michael Ellerman wrote:
>>>>> Hari Bathini <hbathini@linux.ibm.com> writes:
>>>> [...]
>>>>
>>>>>> +/*
>>>>>> + * CPU state data is provided by f/w. Below are the definitions
>>>>>> + * provided in HDAT spec. Refer to latest HDAT specification for
>>>>>> + * any update to this format.
>>>>>> + */
>>>>>
>>>>> How is this meant to work? If HDAT ever changes the format they will
>>>>> break all existing kernels in the field.
>>>>>
>>>>>> +#define HDAT_FADUMP_CPU_DATA_VERSION 1
>>>>
>>>> Changes are not expected here. But this is just to cover for such scenario,
>>>> if that ever happens.
>>>
>>> The HDAT spec doesn't define the SPR numbers for NIA, MSR and the CR.
>>> As far as I can tell the values you've assumed here are chip-specific,
>>> non-architected SPR numbers that come from an array buried somewhere
>>> in the SBE codebase. I don't believe you for a second when you say
>>> that this will never change.
>>
>> At least, the understanding is that this numbers not change across processor
>> generations. If something changes, it is supposed to be handled in SBE. Also,
>> I am told this numbers would be listed in the HDAT Spec. Not sure if that
>> happened yet though. Vasant, you have anything to add?
>
> That doesn't help much because the HDAT spec is not public.
>
> The point is with the code written the way it is, these values *must
> not* change, or else all existing kernels will be broken, which is not
> acceptable.
Yeah. It is absurd to error out just by looking at the version number...
>
>>>> Also, I think it is a bit far-fetched to error out if versions mismatch.
>>>> Warning and proceeding sounds worthier because the changes are usually
>>>> backward compatible, if and when there are any. Will update accordingly...
>>>
>>> Literally the only reason I didn't drop the CPU DATA parts of the OPAL
>>> MPIPL series was because I assumed the kernel would do the sensible
>>> thing and reject or ignore the structure if it did not know how to
>>> parse the data.
>>
>> I think, the changes if any, would have to be backward compatible for the sake
>> of sanity.
>
> People need to understand that this is an ABI between firmware and
> in-the-field distribution kernels which are only updated at customer
> discretion, or possibly never.
>
> Any changes *must be* backward compatible.
>
> Looking at the header struct:
>
> +struct hdat_fadump_thread_hdr {
> + __be32 pir;
> + /* 0x00 - 0x0F - The corresponding stop state of the core */
> + u8 core_state;
> + u8 reserved[3];
>
> You have those 3 reserved bytes, so a future revision could repurpose
> one of those as a flag to indicate a new format. And/or the hdr could be
> made bigger and new kernels could be taught to look for new things in
> the space after the hdr but before the reg entries.
>
> So I think there is a reasonable mechanism for extending the format in
> future, but my point is people must understand that this is an ABI and
> changes must be made accordingly.
True. The folks who make changes to this format should be aware that
breaking the kernel ABI is not going to be pretty, and I think they are :)
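
A minimal userspace sketch of the extension mechanism described above,
where a reserved byte doubles as a format flag. Everything prefixed
demo_/DEMO_ is hypothetical, and the choice of reserved[0] as the flag
(with current firmware leaving it zero) is an assumption made only for
illustration; nothing quoted here specifies it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace mirror of the quoted header. uint32_t/uint8_t stand in for
 * the kernel's __be32/u8; the byte order of pir is ignored in this sketch.
 */
struct demo_thread_hdr {
	uint32_t pir;
	uint8_t  core_state;	/* 0x00 - 0x0F: stop state of the core */
	uint8_t  reserved[3];
};

/* The layout is an ABI, so pin its size at compile time. */
static_assert(sizeof(struct demo_thread_hdr) == 8, "hdr layout is ABI");

/*
 * ASSUMPTION: a future revision repurposes reserved[0] as a format flag,
 * and the current format always has it set to zero.
 */
#define DEMO_HDR_FORMAT_V1	0

/* Return true if this header uses the only format the parser knows. */
static bool demo_hdr_format_known(const struct demo_thread_hdr *hdr)
{
	return hdr->reserved[0] == DEMO_HDR_FORMAT_V1;
}
```

A parser built this way keeps working with old firmware, and can detect
(rather than misread) a header written in a format it does not know.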
>
>> Even if they are not, we are better off exporting the /proc/vmcore
>> with a warning and some crazy CPU register data (if parsing goes alright) than
>> no dump at all?
>
> If it's just a case of reg entries that we don't recognise then yes I
> think it would be OK to just skip them and continue exporting. But if
> there's any deeper misunderstanding of the format then we should bail
> out.
Sure. I will fix that by first sanity-checking the fields that are needed
for parsing the data. If nothing weird is detected, proceed with a warning;
if anything weird is observed, fall back to appending just the crashing CPU,
as done in patch 16/31. That should hopefully take care of all cases in the
best possible way.
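
The check-then-fall-back flow could look roughly like the sketch below.
It is not the patch code: the descriptor struct, its field names, the
16-byte entry size, and the three outcomes are all assumptions chosen to
illustrate the idea in plain userspace C.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_CPU_DATA_VERSION	1	/* version this parser was written for */
#define DEMO_THREAD_HDR_SIZE	8	/* pir + core_state + reserved[3] */
#define DEMO_REG_ENTRY_SIZE	16	/* ASSUMED: type, number, 64-bit value */

/* Hypothetical summary of what firmware reports about the CPU state data. */
struct demo_cpu_data_desc {
	uint32_t version;
	uint32_t reg_entry_size;	/* bytes per register entry */
	uint32_t regs_per_cpu;		/* register entries per thread */
	uint64_t per_cpu_size;		/* bytes of data per thread */
	uint64_t total_size;		/* bytes of data for all threads */
};

enum demo_action {
	DEMO_PARSE,			/* known version, layout looks sane */
	DEMO_PARSE_WITH_WARNING,	/* unknown version, layout looks sane */
	DEMO_FALLBACK,			/* layout is off: crashing CPU only */
};

static enum demo_action demo_check_cpu_data(const struct demo_cpu_data_desc *d)
{
	uint64_t need;

	/* Entries smaller than what we read, or none at all, cannot be parsed. */
	if (d->reg_entry_size < DEMO_REG_ENTRY_SIZE || d->regs_per_cpu == 0)
		return DEMO_FALLBACK;

	/* Each thread's blob must hold the header plus all register entries. */
	need = DEMO_THREAD_HDR_SIZE +
	       (uint64_t)d->regs_per_cpu * d->reg_entry_size;
	if (d->per_cpu_size < need || d->total_size == 0 ||
	    d->total_size % d->per_cpu_size != 0)
		return DEMO_FALLBACK;

	/* Layout is consistent; a version mismatch alone only earns a warning. */
	return d->version == DEMO_CPU_DATA_VERSION ?
	       DEMO_PARSE : DEMO_PARSE_WITH_WARNING;
}
```

The point of the split is that the version number on its own never
rejects a dump; only fields that make the data unparseable do.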
>
> I notice now that you don't do anything in opal_fadump_set_regval_regnum()
> if you are passed a register we don't understand, so that probably needs
> fixing.
F/w provides about 100-odd registers in the CPU state data, most of which
pt_regs doesn't care about. So opal_fadump_set_regval_regnum() is happy as
long as it finds the registers listed in it. Unless pt_regs changes, can we
stick to this and ignore the rest of them?
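
To make the "take what pt_regs needs, ignore the rest" behaviour
concrete, here is a small userspace sketch. It is not
opal_fadump_set_regval_regnum() itself: struct demo_regs is a cut-down
stand-in for pt_regs, and the register type and ID values are
placeholders, not the ones from the HDAT spec or the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder register types and IDs, for illustration only. */
enum { DEMO_REG_TYPE_GPR = 1, DEMO_REG_TYPE_SPR = 2 };
enum { DEMO_REG_ID_LR = 8, DEMO_REG_ID_CTR = 9 };

/* Cut-down stand-in for the kernel's struct pt_regs. */
struct demo_regs {
	uint64_t gpr[32];
	uint64_t link;
	uint64_t ctr;
};

/* One register entry as handed over by firmware (assumed layout). */
struct demo_reg_entry {
	uint32_t type;
	uint32_t num;
	uint64_t val;
};
static_assert(sizeof(struct demo_reg_entry) == 16, "entry layout is ABI");

/* Store one register; return false if it is not one we keep track of. */
static bool demo_set_regval(struct demo_regs *regs, uint32_t type,
			    uint32_t num, uint64_t val)
{
	if (type == DEMO_REG_TYPE_GPR) {
		if (num >= 32)
			return false;
		regs->gpr[num] = val;
		return true;
	}
	if (type != DEMO_REG_TYPE_SPR)
		return false;

	switch (num) {
	case DEMO_REG_ID_LR:
		regs->link = val;
		return true;
	case DEMO_REG_ID_CTR:
		regs->ctr = val;
		return true;
	default:
		return false;	/* not needed by demo_regs: skip it */
	}
}

/* Walk all entries, keep the known ones, and count the ones skipped. */
static unsigned int demo_fill_regs(struct demo_regs *regs,
				   const struct demo_reg_entry *e,
				   unsigned int n)
{
	unsigned int skipped = 0;

	for (unsigned int i = 0; i < n; i++)
		if (!demo_set_regval(regs, e[i].type, e[i].num, e[i].val))
			skipped++;
	return skipped;
}
```

Returning a status from the setter lets the caller tell "skipped because
we don't need it" apart from silence, for example to log a skipped count
once per dump.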
- Hari