From: tanxiaofei <tanxiaofei@huawei.com>
To: <linux-kernel@vger.kernel.org>
Cc: <linux-acpi@vger.kernel.org>, <linux-efi@vger.kernel.org>,
<rjw@rjwysocki.net>, <lenb@kernel.org>, <tony.luck@intel.com>,
<bp@alien8.de>, <ying.huang@intel.com>,
<ross.lagerwall@citrix.com>, <ard.biesheuvel@linaro.org>,
<james.morse@arm.com>
Subject: Re: [PATCH v2 1/1] efi: cper: print AER info of PCIe fatal error
Date: Mon, 12 Aug 2019 14:51:50 +0800
Message-ID: <5D510C86.5040000@huawei.com>
In-Reply-To: <1564105417-232048-1-git-send-email-tanxiaofei@huawei.com>
ping...
On 2019/7/26 9:43, Xiaofei Tan wrote:
> AER info of a PCIe fatal error is not printed by the current driver,
> because the APEI driver panics directly on a fatal error and never
> reaches the code that prints the AER info.
>
> An example log follows:
> {763}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 11
> {763}[Hardware Error]: event severity: fatal
> {763}[Hardware Error]: Error 0, type: fatal
> {763}[Hardware Error]: section_type: PCIe error
> {763}[Hardware Error]: port_type: 0, PCIe end point
> {763}[Hardware Error]: version: 4.0
> {763}[Hardware Error]: command: 0x0000, status: 0x0010
> {763}[Hardware Error]: device_id: 0000:82:00.0
> {763}[Hardware Error]: slot: 0
> {763}[Hardware Error]: secondary_bus: 0x00
> {763}[Hardware Error]: vendor_id: 0x8086, device_id: 0x10fb
> {763}[Hardware Error]: class_code: 000002
> Kernel panic - not syncing: Fatal hardware error!
>
> This issue was introduced by commit 37448adfc7ce ("aerdrv: Move
> cper_print_aer() call out of interrupt context"). To fix it, print
> the AER info in cper_print_pcie() for fatal errors.
>
> Here is the example log after this patch is applied:
> {24}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 10
> {24}[Hardware Error]: event severity: fatal
> {24}[Hardware Error]: Error 0, type: fatal
> {24}[Hardware Error]: section_type: PCIe error
> {24}[Hardware Error]: port_type: 0, PCIe end point
> {24}[Hardware Error]: version: 4.0
> {24}[Hardware Error]: command: 0x0546, status: 0x4010
> {24}[Hardware Error]: device_id: 0000:01:00.0
> {24}[Hardware Error]: slot: 0
> {24}[Hardware Error]: secondary_bus: 0x00
> {24}[Hardware Error]: vendor_id: 0x15b3, device_id: 0x1019
> {24}[Hardware Error]: class_code: 000002
> {24}[Hardware Error]: aer_uncor_status: 0x00040000, aer_uncor_mask: 0x00000000
> {24}[Hardware Error]: aer_uncor_severity: 0x00062010
> {24}[Hardware Error]: TLP Header: 000000c0 01010000 00000001 00000000
> Kernel panic - not syncing: Fatal hardware error!
>
> Fixes: 37448adfc7ce ("aerdrv: Move cper_print_aer() call out of interrupt context")
> Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
> Reviewed-by: James Morse <james.morse@arm.com>
> ---
> drivers/firmware/efi/cper.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
> index 8fa977c..78b8922 100644
> --- a/drivers/firmware/efi/cper.c
> +++ b/drivers/firmware/efi/cper.c
> @@ -390,6 +390,21 @@ static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie,
> printk(
> "%s""bridge: secondary_status: 0x%04x, control: 0x%04x\n",
> pfx, pcie->bridge.secondary_status, pcie->bridge.control);
> +
> + /* Fatal errors call __ghes_panic() before AER handler prints this */
> + if (pcie->validation_bits & CPER_PCIE_VALID_AER_INFO &&
> + gdata->error_severity & CPER_SEV_FATAL) {
> + struct aer_capability_regs *aer;
> +
> + aer = (struct aer_capability_regs *)pcie->aer_info;
> + printk("%saer_uncor_status: 0x%08x, aer_uncor_mask: 0x%08x\n",
> + pfx, aer->uncor_status, aer->uncor_mask);
> + printk("%saer_uncor_severity: 0x%08x\n",
> + pfx, aer->uncor_severity);
> + printk("%sTLP Header: %08x %08x %08x %08x\n", pfx,
> + aer->header_log.dw0, aer->header_log.dw1,
> + aer->header_log.dw2, aer->header_log.dw3);
> + }
> }
>
> static void cper_print_tstamp(const char *pfx,
>
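
For anyone who wants to try the gating logic outside the kernel, here is a
minimal userspace sketch of the check the hunk above adds. It is not the
kernel code: the sketch_* names, the struct layout, and the constant values
are stand-ins assumed for illustration, and snprintf() into a buffer replaces
printk(). Only the condition (AER info marked valid, severity fatal) and the
three output lines follow the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-ins for the kernel definitions; the values are assumptions. */
#define SKETCH_PCIE_VALID_AER_INFO 0x0080u
enum { SKETCH_SEV_RECOVERABLE, SKETCH_SEV_FATAL };

/* Hypothetical, reduced view of the AER capability registers. */
struct sketch_aer_regs {
	unsigned int uncor_status;
	unsigned int uncor_mask;
	unsigned int uncor_severity;
	unsigned int header_log[4];
};

/*
 * Format the AER lines into buf when the AER info is marked valid and the
 * severity is fatal. Returns the number of characters written, or 0 when
 * the condition is not met or the buffer is too small.
 */
static int sketch_format_aer(char *buf, size_t len, const char *pfx,
			     unsigned long long validation_bits,
			     unsigned int severity,
			     const struct sketch_aer_regs *aer)
{
	int n;

	assert(buf && len > 0 && pfx && aer);
	buf[0] = '\0';

	/* Same shape as the test in the patch: valid bit AND fatal severity. */
	if (!(validation_bits & SKETCH_PCIE_VALID_AER_INFO) ||
	    !(severity & SKETCH_SEV_FATAL))
		return 0;

	n = snprintf(buf, len,
		     "%saer_uncor_status: 0x%08x, aer_uncor_mask: 0x%08x\n"
		     "%saer_uncor_severity: 0x%08x\n"
		     "%sTLP Header: %08x %08x %08x %08x\n",
		     pfx, aer->uncor_status, aer->uncor_mask,
		     pfx, aer->uncor_severity,
		     pfx, aer->header_log[0], aer->header_log[1],
		     aer->header_log[2], aer->header_log[3]);

	return (n < 0 || (size_t)n >= len) ? 0 : n;
}
```

Feeding it the register values from the "after" log in the commit message
(with a fatal severity and the AER valid bit set) should produce the same
three lines, minus the "{24}[Hardware Error]: " prefix unless that is passed
as pfx.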
--
thanks
tanxiaofei