From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Jake Lawrence <lawja@fb.com>,
Jakub Kicinski <kuba@kernel.org>,
Saeed Mahameed <saeedm@mellanox.com>,
"David S. Miller" <davem@davemloft.net>,
Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.19 34/56] mlx4: disable device on shutdown
Date: Mon, 3 Aug 2020 14:19:49 +0200
Message-ID: <20200803121851.999149414@linuxfoundation.org>
In-Reply-To: <20200803121850.306734207@linuxfoundation.org>

From: Jakub Kicinski <kuba@kernel.org>

[ Upstream commit 3cab8c65525920f00d8f4997b3e9bb73aecb3a8e ]

It appears that not disabling a PCI device on .shutdown may lead to
a Hardware Error with particular (perhaps buggy) BIOS versions:

    mlx4_en: eth0: Close port called
    mlx4_en 0000:04:00.0: removed PHC
    reboot: Restarting system
    {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
    {1}[Hardware Error]: event severity: fatal
    {1}[Hardware Error]:  Error 0, type: fatal
    {1}[Hardware Error]:   section_type: PCIe error
    {1}[Hardware Error]:   port_type: 4, root port
    {1}[Hardware Error]:   version: 1.16
    {1}[Hardware Error]:   command: 0x4010, status: 0x0143
    {1}[Hardware Error]:   device_id: 0000:00:02.2
    {1}[Hardware Error]:   slot: 0
    {1}[Hardware Error]:   secondary_bus: 0x04
    {1}[Hardware Error]:   vendor_id: 0x8086, device_id: 0x2f06
    {1}[Hardware Error]:   class_code: 000604
    {1}[Hardware Error]:   bridge: secondary_status: 0x2000, control: 0x0003
    {1}[Hardware Error]:   aer_uncor_status: 0x00100000, aer_uncor_mask: 0x00000000
    {1}[Hardware Error]:   aer_uncor_severity: 0x00062030
    {1}[Hardware Error]:   TLP Header: 40000018 040000ff 791f4080 00000000
    [hw error repeats]
    Kernel panic - not syncing: Fatal hardware error!
    CPU: 0 PID: 2189 Comm: reboot Kdump: loaded Not tainted 5.6.x-blabla #1
    Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 05/05/2017

Fix the mlx4 driver by disabling the PCI device at the end of
mlx4_shutdown().

This is a very similar problem to what had been fixed in:
commit 0d98ba8d70b0 ("scsi: hpsa: disable device during shutdown")
to address https://bugzilla.kernel.org/show_bug.cgi?id=199779.

Fixes: 2ba5fbd62b25 ("net/mlx4_core: Handle AER flow properly")
Reported-by: Jake Lawrence <lawja@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/mellanox/mlx4/main.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index f7825c7b92fe3..8d7bb9a889677 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -4311,12 +4311,14 @@ end:
 static void mlx4_shutdown(struct pci_dev *pdev)
 {
 	struct mlx4_dev_persistent *persist = pci_get_drvdata(pdev);
+	struct mlx4_dev *dev = persist->dev;
 
 	mlx4_info(persist->dev, "mlx4_shutdown was called\n");
 	mutex_lock(&persist->interface_state_mutex);
 	if (persist->interface_state & MLX4_INTERFACE_STATE_UP)
 		mlx4_unload_one(pdev);
 	mutex_unlock(&persist->interface_state_mutex);
+	mlx4_pci_disable_device(dev);
 }
 
 static const struct pci_error_handlers mlx4_err_handler = {
--
2.25.1