From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Xiaofei Tan <tanxiaofei@huawei.com>,
	James Morse <james.morse@arm.com>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Sasha Levin <sashal@kernel.org>,
	linux-efi@vger.kernel.org
Subject: [PATCH AUTOSEL 4.4 19/44] efi: cper: print AER info of PCIe fatal error
Date: Sun, 22 Sep 2019 15:00:37 -0400
Message-ID: <20190922190103.4906-19-sashal@kernel.org>
In-Reply-To: <20190922190103.4906-1-sashal@kernel.org>

From: Xiaofei Tan <tanxiaofei@huawei.com>

[ Upstream commit b194a77fcc4001dc40aecdd15d249648e8a436d1 ]

The AER info of a PCIe fatal error is not printed by the current driver,
because the APEI driver panics immediately on a fatal error and never
reaches the code that prints the AER info.

An example log follows:
{763}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 11
{763}[Hardware Error]: event severity: fatal
{763}[Hardware Error]:  Error 0, type: fatal
{763}[Hardware Error]:   section_type: PCIe error
{763}[Hardware Error]:   port_type: 0, PCIe end point
{763}[Hardware Error]:   version: 4.0
{763}[Hardware Error]:   command: 0x0000, status: 0x0010
{763}[Hardware Error]:   device_id: 0000:82:00.0
{763}[Hardware Error]:   slot: 0
{763}[Hardware Error]:   secondary_bus: 0x00
{763}[Hardware Error]:   vendor_id: 0x8086, device_id: 0x10fb
{763}[Hardware Error]:   class_code: 000002
Kernel panic - not syncing: Fatal hardware error!

This issue was introduced by commit 37448adfc7ce ("aerdrv: Move
cper_print_aer() call out of interrupt context"). To fix it, this
patch prints the AER info in cper_print_pcie() for fatal errors.

Here is an example log after this patch is applied:
{24}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 10
{24}[Hardware Error]: event severity: fatal
{24}[Hardware Error]:  Error 0, type: fatal
{24}[Hardware Error]:   section_type: PCIe error
{24}[Hardware Error]:   port_type: 0, PCIe end point
{24}[Hardware Error]:   version: 4.0
{24}[Hardware Error]:   command: 0x0546, status: 0x4010
{24}[Hardware Error]:   device_id: 0000:01:00.0
{24}[Hardware Error]:   slot: 0
{24}[Hardware Error]:   secondary_bus: 0x00
{24}[Hardware Error]:   vendor_id: 0x15b3, device_id: 0x1019
{24}[Hardware Error]:   class_code: 000002
{24}[Hardware Error]:   aer_uncor_status: 0x00040000, aer_uncor_mask: 0x00000000
{24}[Hardware Error]:   aer_uncor_severity: 0x00062010
{24}[Hardware Error]:   TLP Header: 000000c0 01010000 00000001 00000000
Kernel panic - not syncing: Fatal hardware error!
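
The new aer_uncor_* lines can be read by hand. As a rough illustration
(not part of this patch), the user-space sketch below decodes the three
register values using the bit positions of the AER Uncorrectable Error
Status register from the PCIe specification. The helper names
aer_uncor_name() and decode_aer_uncor() are hypothetical and do not
exist in the kernel; only a subset of the defined bits is listed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: map a bit position of the AER Uncorrectable
 * Error Status register to a name (subset of the defined bits).
 */
static const char *aer_uncor_name(unsigned int bit)
{
	assert(bit < 32);
	switch (bit) {
	case 4:  return "Data Link Protocol Error";
	case 5:  return "Surprise Down Error";
	case 12: return "Poisoned TLP";
	case 13: return "Flow Control Protocol Error";
	case 14: return "Completion Timeout";
	case 15: return "Completer Abort";
	case 16: return "Unexpected Completion";
	case 17: return "Receiver Overflow";
	case 18: return "Malformed TLP";
	case 19: return "ECRC Error";
	case 20: return "Unsupported Request";
	default: return "unknown/reserved";
	}
}

/*
 * Hypothetical helper: print every unmasked uncorrectable error and
 * whether the severity register classifies it as fatal.
 * Returns the number of unmasked error bits found.
 */
static int decode_aer_uncor(uint32_t status, uint32_t mask, uint32_t severity)
{
	uint32_t active = status & ~mask;
	int count = 0;

	for (unsigned int bit = 0; bit < 32; bit++) {
		uint32_t flag = UINT32_C(1) << bit;

		if (!(active & flag))
			continue;
		printf("bit %u: %s (%s)\n", bit, aer_uncor_name(bit),
		       (severity & flag) ? "fatal" : "non-fatal");
		count++;
	}
	return count;
}
```

Called with the values from the log above,
decode_aer_uncor(0x00040000, 0x00000000, 0x00062010) reports a single
unmasked error, bit 18 (Malformed TLP), which the severity register
marks as fatal.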

Fixes: 37448adfc7ce ("aerdrv: Move cper_print_aer() call out of interrupt context")
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Reviewed-by: James Morse <james.morse@arm.com>
[ardb: put parens around terms of && operator]
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/firmware/efi/cper.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index d425374254384..f40f7df4b7344 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -384,6 +384,21 @@ static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie,
 		printk(
 	"%s""bridge: secondary_status: 0x%04x, control: 0x%04x\n",
 	pfx, pcie->bridge.secondary_status, pcie->bridge.control);
+
+	/* Fatal errors call __ghes_panic() before AER handler prints this */
+	if ((pcie->validation_bits & CPER_PCIE_VALID_AER_INFO) &&
+	    (gdata->error_severity & CPER_SEV_FATAL)) {
+		struct aer_capability_regs *aer;
+
+		aer = (struct aer_capability_regs *)pcie->aer_info;
+		printk("%saer_uncor_status: 0x%08x, aer_uncor_mask: 0x%08x\n",
+		       pfx, aer->uncor_status, aer->uncor_mask);
+		printk("%saer_uncor_severity: 0x%08x\n",
+		       pfx, aer->uncor_severity);
+		printk("%sTLP Header: %08x %08x %08x %08x\n", pfx,
+		       aer->header_log.dw0, aer->header_log.dw1,
+		       aer->header_log.dw2, aer->header_log.dw3);
+	}
 }
 
 static void cper_estatus_print_section(
-- 
2.20.1

