From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Jake Lawrence <lawja@fb.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Saeed Mahameed <saeedm@mellanox.com>,
	"David S. Miller" <davem@davemloft.net>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.19 34/56] mlx4: disable device on shutdown
Date: Mon,  3 Aug 2020 14:19:49 +0200
Message-ID: <20200803121851.999149414@linuxfoundation.org>
In-Reply-To: <20200803121850.306734207@linuxfoundation.org>

From: Jakub Kicinski <kuba@kernel.org>

[ Upstream commit 3cab8c65525920f00d8f4997b3e9bb73aecb3a8e ]

It appears that not disabling a PCI device on .shutdown may lead to
a Hardware Error with particular (perhaps buggy) BIOS versions:

    mlx4_en: eth0: Close port called
    mlx4_en 0000:04:00.0: removed PHC
    reboot: Restarting system
    {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
    {1}[Hardware Error]: event severity: fatal
    {1}[Hardware Error]:  Error 0, type: fatal
    {1}[Hardware Error]:   section_type: PCIe error
    {1}[Hardware Error]:   port_type: 4, root port
    {1}[Hardware Error]:   version: 1.16
    {1}[Hardware Error]:   command: 0x4010, status: 0x0143
    {1}[Hardware Error]:   device_id: 0000:00:02.2
    {1}[Hardware Error]:   slot: 0
    {1}[Hardware Error]:   secondary_bus: 0x04
    {1}[Hardware Error]:   vendor_id: 0x8086, device_id: 0x2f06
    {1}[Hardware Error]:   class_code: 000604
    {1}[Hardware Error]:   bridge: secondary_status: 0x2000, control: 0x0003
    {1}[Hardware Error]:   aer_uncor_status: 0x00100000, aer_uncor_mask: 0x00000000
    {1}[Hardware Error]:   aer_uncor_severity: 0x00062030
    {1}[Hardware Error]:   TLP Header: 40000018 040000ff 791f4080 00000000
[hw error repeats]
    Kernel panic - not syncing: Fatal hardware error!
    CPU: 0 PID: 2189 Comm: reboot Kdump: loaded Not tainted 5.6.x-blabla #1
    Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 05/05/2017
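
The aer_uncor_status and TLP Header fields in the log above can be decoded
by hand. The following is a minimal userspace sketch, not kernel code: the
names tlp_summary, tlp_decode and aer_is_unsupported_request are
hypothetical helpers made up for illustration, and the sketch assumes the
header log is printed as four dwords with TLP byte 0 in the top byte of the
first dword. The only constant taken from the kernel is the bit value of
PCI_ERR_UNC_UNSUP (0x00100000, Unsupported Request).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 20 of the AER Uncorrectable Error Status register: Unsupported
 * Request (same value as PCI_ERR_UNC_UNSUP in the kernel's pci_regs.h). */
#define AER_UNC_UNSUP_REQ 0x00100000u

/* Hypothetical summary of the fields we care about in a 3DW/4DW
 * memory-request TLP header. */
struct tlp_summary {
	unsigned int fmt;        /* DW0 bits 31:29 */
	unsigned int type;       /* DW0 bits 28:24 */
	unsigned int length_dw;  /* DW0 bits 9:0, payload length in dwords */
	unsigned int req_bus;    /* DW1 bits 31:24 */
	unsigned int req_dev;    /* DW1 bits 23:19 */
	unsigned int req_fn;     /* DW1 bits 18:16 */
	uint32_t addr;           /* DW2 with the two low bits masked off */
};

/* Returns nonzero if the Unsupported Request bit is set. */
static int aer_is_unsupported_request(uint32_t uncor_status)
{
	return (uncor_status & AER_UNC_UNSUP_REQ) != 0;
}

/* Decode the first three dwords of a logged TLP header, assuming a
 * 32-bit-address memory request layout. */
static struct tlp_summary tlp_decode(const uint32_t hdr[4])
{
	struct tlp_summary s;
	uint16_t rid;

	assert(hdr != NULL);

	s.fmt = (hdr[0] >> 29) & 0x7;
	s.type = (hdr[0] >> 24) & 0x1f;
	s.length_dw = hdr[0] & 0x3ff;

	rid = (uint16_t)(hdr[1] >> 16);  /* Requester ID: bus/dev/fn */
	s.req_bus = rid >> 8;
	s.req_dev = (rid >> 3) & 0x1f;
	s.req_fn = rid & 0x7;

	s.addr = hdr[2] & ~0x3u;
	return s;
}
```

Under those assumptions, status 0x00100000 has the Unsupported Request bit
set, and the header 40000018 040000ff 791f4080 00000000 decodes to Fmt 2,
Type 0 (a 32-bit memory write with data), length 24 dwords, requester
04:00.0, address 0x791f4080. The requester matches the mlx4 device at
0000:04:00.0, which would be consistent with the device still issuing
writes while the system restarts.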

Fix the mlx4 driver by disabling the PCI device at the end of its
.shutdown handler.

This is a very similar problem to what had been fixed in:
commit 0d98ba8d70b0 ("scsi: hpsa: disable device during shutdown")
to address https://bugzilla.kernel.org/show_bug.cgi?id=199779.

Fixes: 2ba5fbd62b25 ("net/mlx4_core: Handle AER flow properly")
Reported-by: Jake Lawrence <lawja@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/mellanox/mlx4/main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index f7825c7b92fe3..8d7bb9a889677 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -4311,12 +4311,14 @@ end:
 static void mlx4_shutdown(struct pci_dev *pdev)
 {
 	struct mlx4_dev_persistent *persist = pci_get_drvdata(pdev);
+	struct mlx4_dev *dev = persist->dev;
 
 	mlx4_info(persist->dev, "mlx4_shutdown was called\n");
 	mutex_lock(&persist->interface_state_mutex);
 	if (persist->interface_state & MLX4_INTERFACE_STATE_UP)
 		mlx4_unload_one(pdev);
 	mutex_unlock(&persist->interface_state_mutex);
+	mlx4_pci_disable_device(dev);
 }
 
 static const struct pci_error_handlers mlx4_err_handler = {
-- 
2.25.1




