netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net 0/5] bnxt_en: Bug fixes.
@ 2020-10-26  4:18 Michael Chan
  2020-10-26  4:18 ` [PATCH net 1/5] bnxt_en: Fix regression in workqueue cleanup logic in bnxt_remove_one() Michael Chan
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Michael Chan @ 2020-10-26  4:18 UTC (permalink / raw)
  To: kuba; +Cc: netdev, gospo

[-- Attachment #1: Type: text/plain, Size: 1006 bytes --]

These 5 bug fixes are all related to the firmware reset or AER recovery.
2 patches fix the cleanup logic for the workqueue used to handle firmware
reset and recovery. 1 patch ensures that the chip will have the proper
BAR addresses latched after fatal AER recovery.  1 patch fixes the
open path to check for firmware reset abort error.  The last one
sends the fw reset command unconditionally to fix the AER reset logic.

Please queue these for -stable as well.  Thanks.

Michael Chan (1):
  bnxt_en: Check abort error state in bnxt_open_nic().

Vasundhara Volam (4):
  bnxt_en: Fix regression in workqueue cleanup logic in
    bnxt_remove_one().
  bnxt_en: Invoke cancel_delayed_work_sync() for PFs also.
  bnxt_en: Re-write PCI BARs after PCI fatal error.
  bnxt_en: Send HWRM_FUNC_RESET fw command unconditionally.

 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 49 ++++++++++++++---------
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  1 +
 2 files changed, 32 insertions(+), 18 deletions(-)

-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4166 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH net 1/5] bnxt_en: Fix regression in workqueue cleanup logic in bnxt_remove_one().
  2020-10-26  4:18 [PATCH net 0/5] bnxt_en: Bug fixes Michael Chan
@ 2020-10-26  4:18 ` Michael Chan
  2020-10-26  4:18 ` [PATCH net 2/5] bnxt_en: Invoke cancel_delayed_work_sync() for PFs also Michael Chan
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Michael Chan @ 2020-10-26  4:18 UTC (permalink / raw)
  To: kuba; +Cc: netdev, gospo, Vasundhara Volam

[-- Attachment #1: Type: text/plain, Size: 1739 bytes --]

From: Vasundhara Volam <vasundhara-v.volam@broadcom.com>

A recent patch has moved the workqueue cleanup logic before
calling unregister_netdev() in bnxt_remove_one().  This caused a
regression because the workqueue can be restarted if the device is
still open.  Workqueue cleanup must be done after unregister_netdev().
The workqueue will not restart itself after the device is closed.

Call bnxt_cancel_sp_work() after unregister_netdev() and
call bnxt_dl_fw_reporters_destroy() after that.  This fixes the
regession and the original NULL ptr dereference issue.

Fixes: b16939b59cc0 ("bnxt_en: Fix NULL ptr dereference crash in bnxt_fw_reset_task()")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index fa147865e33f..e4e5ea080391 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -12108,15 +12108,16 @@ static void bnxt_remove_one(struct pci_dev *pdev)
 	if (BNXT_PF(bp))
 		bnxt_sriov_disable(bp);
 
+	if (BNXT_PF(bp))
+		devlink_port_type_clear(&bp->dl_port);
+	pci_disable_pcie_error_reporting(pdev);
+	unregister_netdev(dev);
 	clear_bit(BNXT_STATE_IN_FW_RESET, &bp->state);
+	/* Flush any pending tasks */
 	bnxt_cancel_sp_work(bp);
 	bp->sp_event = 0;
 
 	bnxt_dl_fw_reporters_destroy(bp, true);
-	if (BNXT_PF(bp))
-		devlink_port_type_clear(&bp->dl_port);
-	pci_disable_pcie_error_reporting(pdev);
-	unregister_netdev(dev);
 	bnxt_dl_unregister(bp);
 	bnxt_shutdown_tc(bp);
 
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4166 bytes --]

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH net 2/5] bnxt_en: Invoke cancel_delayed_work_sync() for PFs also.
  2020-10-26  4:18 [PATCH net 0/5] bnxt_en: Bug fixes Michael Chan
  2020-10-26  4:18 ` [PATCH net 1/5] bnxt_en: Fix regression in workqueue cleanup logic in bnxt_remove_one() Michael Chan
@ 2020-10-26  4:18 ` Michael Chan
  2020-10-26  4:18 ` [PATCH net 3/5] bnxt_en: Re-write PCI BARs after PCI fatal error Michael Chan
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Michael Chan @ 2020-10-26  4:18 UTC (permalink / raw)
  To: kuba; +Cc: netdev, gospo, Vasundhara Volam

[-- Attachment #1: Type: text/plain, Size: 2195 bytes --]

From: Vasundhara Volam <vasundhara-v.volam@broadcom.com>

As part of the commit b148bb238c02
("bnxt_en: Fix possible crash in bnxt_fw_reset_task()."),
cancel_delayed_work_sync() is called only for VFs to fix a possible
crash by cancelling any pending delayed work items. It was assumed
by mistake that the flush_workqueue() call on the PF would flush
delayed work items as well.

As flush_workqueue() does not cancel the delayed workqueue, extend
the fix for PFs. This fix will avoid the system crash, if there are
any pending delayed work items in fw_reset_task() during driver's
.remove() call.

Unify the workqueue cleanup logic for both PF and VF by calling
cancel_work_sync() and cancel_delayed_work_sync() directly in
bnxt_remove_one().

Fixes: b148bb238c02 ("bnxt_en: Fix possible crash in bnxt_fw_reset_task().")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 13 ++-----------
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index e4e5ea080391..7be232018015 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1160,16 +1160,6 @@ static void bnxt_queue_sp_work(struct bnxt *bp)
 		schedule_work(&bp->sp_task);
 }
 
-static void bnxt_cancel_sp_work(struct bnxt *bp)
-{
-	if (BNXT_PF(bp)) {
-		flush_workqueue(bnxt_pf_wq);
-	} else {
-		cancel_work_sync(&bp->sp_task);
-		cancel_delayed_work_sync(&bp->fw_reset_task);
-	}
-}
-
 static void bnxt_sched_reset(struct bnxt *bp, struct bnxt_rx_ring_info *rxr)
 {
 	if (!rxr->bnapi->in_reset) {
@@ -12114,7 +12104,8 @@ static void bnxt_remove_one(struct pci_dev *pdev)
 	unregister_netdev(dev);
 	clear_bit(BNXT_STATE_IN_FW_RESET, &bp->state);
 	/* Flush any pending tasks */
-	bnxt_cancel_sp_work(bp);
+	cancel_work_sync(&bp->sp_task);
+	cancel_delayed_work_sync(&bp->fw_reset_task);
 	bp->sp_event = 0;
 
 	bnxt_dl_fw_reporters_destroy(bp, true);
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4166 bytes --]

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH net 3/5] bnxt_en: Re-write PCI BARs after PCI fatal error.
  2020-10-26  4:18 [PATCH net 0/5] bnxt_en: Bug fixes Michael Chan
  2020-10-26  4:18 ` [PATCH net 1/5] bnxt_en: Fix regression in workqueue cleanup logic in bnxt_remove_one() Michael Chan
  2020-10-26  4:18 ` [PATCH net 2/5] bnxt_en: Invoke cancel_delayed_work_sync() for PFs also Michael Chan
@ 2020-10-26  4:18 ` Michael Chan
  2020-10-26  4:18 ` [PATCH net 4/5] bnxt_en: Check abort error state in bnxt_open_nic() Michael Chan
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Michael Chan @ 2020-10-26  4:18 UTC (permalink / raw)
  To: kuba; +Cc: netdev, gospo, Vasundhara Volam

[-- Attachment #1: Type: text/plain, Size: 3115 bytes --]

From: Vasundhara Volam <vasundhara-v.volam@broadcom.com>

When a PCIe fatal error occurs, the internal latched BAR addresses
in the chip get reset even though the BAR register values in config
space are retained.

pci_restore_state() will not rewrite the BAR addresses if the
BAR address values are valid, causing the chip's internal BAR addresses
to stay invalid.  So we need to zero the BAR registers during PCIe fatal
error to force pci_restore_state() to restore the BAR addresses.  These
write cycles to the BAR registers will cause the proper BAR addresses to
latch internally.

Fixes: 6316ea6db93d ("bnxt_en: Enable AER support.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 19 ++++++++++++++++++-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  1 +
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 7be232018015..8012386b4a0f 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -12852,6 +12852,9 @@ static pci_ers_result_t bnxt_io_error_detected(struct pci_dev *pdev,
 		return PCI_ERS_RESULT_DISCONNECT;
 	}
 
+	if (state == pci_channel_io_frozen)
+		set_bit(BNXT_STATE_PCI_CHANNEL_IO_FROZEN, &bp->state);
+
 	if (netif_running(netdev))
 		bnxt_close(netdev);
 
@@ -12878,7 +12881,7 @@ static pci_ers_result_t bnxt_io_slot_reset(struct pci_dev *pdev)
 {
 	struct net_device *netdev = pci_get_drvdata(pdev);
 	struct bnxt *bp = netdev_priv(netdev);
-	int err = 0;
+	int err = 0, off;
 	pci_ers_result_t result = PCI_ERS_RESULT_DISCONNECT;
 
 	netdev_info(bp->dev, "PCI Slot Reset\n");
@@ -12890,6 +12893,20 @@ static pci_ers_result_t bnxt_io_slot_reset(struct pci_dev *pdev)
 			"Cannot re-enable PCI device after reset.\n");
 	} else {
 		pci_set_master(pdev);
+		/* Upon fatal error, our device internal logic that latches to
+		 * BAR value is getting reset and will restore only upon
+		 * rewritting the BARs.
+		 *
+		 * As pci_restore_state() does not re-write the BARs if the
+		 * value is same as saved value earlier, driver needs to
+		 * write the BARs to 0 to force restore, in case of fatal error.
+		 */
+		if (test_and_clear_bit(BNXT_STATE_PCI_CHANNEL_IO_FROZEN,
+				       &bp->state)) {
+			for (off = PCI_BASE_ADDRESS_0;
+			     off <= PCI_BASE_ADDRESS_5; off += 4)
+				pci_write_config_dword(bp->pdev, off, 0);
+		}
 		pci_restore_state(pdev);
 		pci_save_state(pdev);
 
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 21ef1c21f602..47b3c3127879 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1781,6 +1781,7 @@ struct bnxt {
 #define BNXT_STATE_ABORT_ERR	5
 #define BNXT_STATE_FW_FATAL_COND	6
 #define BNXT_STATE_DRV_REGISTERED	7
+#define BNXT_STATE_PCI_CHANNEL_IO_FROZEN	8
 
 #define BNXT_NO_FW_ACCESS(bp)					\
 	(test_bit(BNXT_STATE_FW_FATAL_COND, &(bp)->state) ||	\
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4166 bytes --]

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH net 4/5] bnxt_en: Check abort error state in bnxt_open_nic().
  2020-10-26  4:18 [PATCH net 0/5] bnxt_en: Bug fixes Michael Chan
                   ` (2 preceding siblings ...)
  2020-10-26  4:18 ` [PATCH net 3/5] bnxt_en: Re-write PCI BARs after PCI fatal error Michael Chan
@ 2020-10-26  4:18 ` Michael Chan
  2020-10-26  4:18 ` [PATCH net 5/5] bnxt_en: Send HWRM_FUNC_RESET fw command unconditionally Michael Chan
  2020-10-27  1:36 ` [PATCH net 0/5] bnxt_en: Bug fixes Jakub Kicinski
  5 siblings, 0 replies; 7+ messages in thread
From: Michael Chan @ 2020-10-26  4:18 UTC (permalink / raw)
  To: kuba; +Cc: netdev, gospo

[-- Attachment #1: Type: text/plain, Size: 5318 bytes --]

bnxt_open_nic() is called during configuration changes that require
the NIC to be closed and then opened.  This call is protected by
rtnl_lock.  Firmware reset can be happening at the same time.  Only
critical portions of the entire firmware reset sequence are protected
by the rtnl_lock.  It is possible that bnxt_open_nic() can be called
when the firmware reset sequence is aborting.  In that case,
bnxt_open_nic() needs to check if the ABORT_ERR flag is set and
abort if it is.  The configuration change that resulted in the
bnxt_open_nic() call will fail but the NIC will be brought to a
consistent IF_DOWN state.

Without this patch, if bnxt_open_nic() were to continue in this error
state, it may crash like this:

[ 1648.659736] BUG: unable to handle kernel NULL pointer dereference at           (null)
[ 1648.659768] IP: [<ffffffffc01e9b3a>] bnxt_alloc_mem+0x50a/0x1140 [bnxt_en]
[ 1648.659796] PGD 101e1b3067 PUD 101e1b2067 PMD 0
[ 1648.659813] Oops: 0000 [#1] SMP
[ 1648.659825] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sunrpc dell_smbios dell_wmi_descriptor dcdbas amd64_edac_mod edac_mce_amd kvm_amd kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper vfat cryptd fat pcspkr ipmi_ssif sg k10temp i2c_piix4 wmi ipmi_si ipmi_devintf ipmi_msghandler tpm_crb acpi_power_meter sch_fq_codel ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci drm libahci megaraid_sas crct10dif_pclmul crct10dif_common
[ 1648.660063]  tg3 libata crc32c_intel bnxt_en(OE) drm_panel_orientation_quirks devlink ptp pps_core dm_mirror dm_region_hash dm_log dm_mod fuse
[ 1648.660105] CPU: 13 PID: 3867 Comm: ethtool Kdump: loaded Tainted: G           OE  ------------   3.10.0-1152.el7.x86_64 #1
[ 1648.660911] Hardware name: Dell Inc. PowerEdge R7515/0R4CNN, BIOS 1.2.14 01/28/2020
[ 1648.661662] task: ffff94e64cbc9080 ti: ffff94f55df1c000 task.ti: ffff94f55df1c000
[ 1648.662409] RIP: 0010:[<ffffffffc01e9b3a>]  [<ffffffffc01e9b3a>] bnxt_alloc_mem+0x50a/0x1140 [bnxt_en]
[ 1648.663171] RSP: 0018:ffff94f55df1fba8  EFLAGS: 00010202
[ 1648.663927] RAX: 0000000000000000 RBX: ffff94e6827e0000 RCX: 0000000000000000
[ 1648.664684] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff94e6827e08c0
[ 1648.665433] RBP: ffff94f55df1fc20 R08: 00000000000001ff R09: 0000000000000008
[ 1648.666184] R10: 0000000000000d53 R11: ffff94f55df1f7ce R12: ffff94e6827e08c0
[ 1648.666940] R13: ffff94e6827e08c0 R14: ffff94e6827e08c0 R15: ffffffffb9115e40
[ 1648.667695] FS:  00007f8aadba5740(0000) GS:ffff94f57eb40000(0000) knlGS:0000000000000000
[ 1648.668447] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1648.669202] CR2: 0000000000000000 CR3: 0000001022772000 CR4: 0000000000340fe0
[ 1648.669966] Call Trace:
[ 1648.670730]  [<ffffffffc01f1d5d>] ? bnxt_need_reserve_rings+0x9d/0x170 [bnxt_en]
[ 1648.671496]  [<ffffffffc01fa7ea>] __bnxt_open_nic+0x8a/0x9a0 [bnxt_en]
[ 1648.672263]  [<ffffffffc01f7479>] ? bnxt_close_nic+0x59/0x1b0 [bnxt_en]
[ 1648.673031]  [<ffffffffc01fb11b>] bnxt_open_nic+0x1b/0x50 [bnxt_en]
[ 1648.673793]  [<ffffffffc020037c>] bnxt_set_ringparam+0x6c/0xa0 [bnxt_en]
[ 1648.674550]  [<ffffffffb8a5f564>] dev_ethtool+0x1334/0x21a0
[ 1648.675306]  [<ffffffffb8a719ff>] dev_ioctl+0x1ef/0x5f0
[ 1648.676061]  [<ffffffffb8a324bd>] sock_do_ioctl+0x4d/0x60
[ 1648.676810]  [<ffffffffb8a326bb>] sock_ioctl+0x1eb/0x2d0
[ 1648.677548]  [<ffffffffb8663230>] do_vfs_ioctl+0x3a0/0x5b0
[ 1648.678282]  [<ffffffffb8b8e678>] ? __do_page_fault+0x238/0x500
[ 1648.679016]  [<ffffffffb86634e1>] SyS_ioctl+0xa1/0xc0
[ 1648.679745]  [<ffffffffb8b93f92>] system_call_fastpath+0x25/0x2a
[ 1648.680461] Code: 9e 60 01 00 00 0f 1f 40 00 45 8b 8e 48 01 00 00 31 c9 45 85 c9 0f 8e 73 01 00 00 66 0f 1f 44 00 00 49 8b 86 a8 00 00 00 48 63 d1 <48> 8b 14 d0 48 85 d2 0f 84 46 01 00 00 41 8b 86 44 01 00 00 c7
[ 1648.681986] RIP  [<ffffffffc01e9b3a>] bnxt_alloc_mem+0x50a/0x1140 [bnxt_en]
[ 1648.682724]  RSP <ffff94f55df1fba8>
[ 1648.683451] CR2: 0000000000000000

Fixes: ec5d31e3c15d ("bnxt_en: Handle firmware reset status during IF_UP.")
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 8012386b4a0f..0165f70dba74 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -9779,7 +9779,10 @@ int bnxt_open_nic(struct bnxt *bp, bool irq_re_init, bool link_re_init)
 {
 	int rc = 0;
 
-	rc = __bnxt_open_nic(bp, irq_re_init, link_re_init);
+	if (test_bit(BNXT_STATE_ABORT_ERR, &bp->state))
+		rc = -EIO;
+	if (!rc)
+		rc = __bnxt_open_nic(bp, irq_re_init, link_re_init);
 	if (rc) {
 		netdev_err(bp->dev, "nic open fail (rc: %x)\n", rc);
 		dev_close(bp->dev);
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4166 bytes --]

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH net 5/5] bnxt_en: Send HWRM_FUNC_RESET fw command unconditionally.
  2020-10-26  4:18 [PATCH net 0/5] bnxt_en: Bug fixes Michael Chan
                   ` (3 preceding siblings ...)
  2020-10-26  4:18 ` [PATCH net 4/5] bnxt_en: Check abort error state in bnxt_open_nic() Michael Chan
@ 2020-10-26  4:18 ` Michael Chan
  2020-10-27  1:36 ` [PATCH net 0/5] bnxt_en: Bug fixes Jakub Kicinski
  5 siblings, 0 replies; 7+ messages in thread
From: Michael Chan @ 2020-10-26  4:18 UTC (permalink / raw)
  To: kuba; +Cc: netdev, gospo, Vasundhara Volam

[-- Attachment #1: Type: text/plain, Size: 1552 bytes --]

From: Vasundhara Volam <vasundhara-v.volam@broadcom.com>

In the AER or firmware reset flow, if we are in fatal error state or
if pci_channel_offline() is true, we don't send any commands to the
firmware because the commands will likely not reach the firmware and
most commands don't matter much because the firmware is likely to be
reset imminently.

However, the HWRM_FUNC_RESET command is different and we should always
attempt to send it.  In the AER flow for example, the .slot_reset()
call will trigger this fw command and we need to try to send it to
effect the proper reset.

Fixes: b340dc680ed4 ("bnxt_en: Avoid sending firmware messages when AER error is detected.")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 0165f70dba74..7975f59735d6 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4352,7 +4352,8 @@ static int bnxt_hwrm_do_send_msg(struct bnxt *bp, void *msg, u32 msg_len,
 	u32 bar_offset = BNXT_GRCPF_REG_CHIMP_COMM;
 	u16 dst = BNXT_HWRM_CHNL_CHIMP;
 
-	if (BNXT_NO_FW_ACCESS(bp))
+	if (BNXT_NO_FW_ACCESS(bp) &&
+	    le16_to_cpu(req->req_type) != HWRM_FUNC_RESET)
 		return -EBUSY;
 
 	if (msg_len > BNXT_HWRM_MAX_REQ_LEN) {
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4166 bytes --]

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH net 0/5] bnxt_en: Bug fixes.
  2020-10-26  4:18 [PATCH net 0/5] bnxt_en: Bug fixes Michael Chan
                   ` (4 preceding siblings ...)
  2020-10-26  4:18 ` [PATCH net 5/5] bnxt_en: Send HWRM_FUNC_RESET fw command unconditionally Michael Chan
@ 2020-10-27  1:36 ` Jakub Kicinski
  5 siblings, 0 replies; 7+ messages in thread
From: Jakub Kicinski @ 2020-10-27  1:36 UTC (permalink / raw)
  To: Michael Chan; +Cc: netdev, gospo

On Mon, 26 Oct 2020 00:18:16 -0400 Michael Chan wrote:
> These 5 bug fixes are all related to the firmware reset or AER recovery.
> 2 patches fix the cleanup logic for the workqueue used to handle firmware
> reset and recovery. 1 patch ensures that the chip will have the proper
> BAR addresses latched after fatal AER recovery.  1 patch fixes the
> open path to check for firmware reset abort error.  The last one
> sends the fw reset command unconditionally to fix the AER reset logic.
> 
> Please queue these for -stable as well.  Thanks.

Applied, thanks!

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-10-27  1:36 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-26  4:18 [PATCH net 0/5] bnxt_en: Bug fixes Michael Chan
2020-10-26  4:18 ` [PATCH net 1/5] bnxt_en: Fix regression in workqueue cleanup logic in bnxt_remove_one() Michael Chan
2020-10-26  4:18 ` [PATCH net 2/5] bnxt_en: Invoke cancel_delayed_work_sync() for PFs also Michael Chan
2020-10-26  4:18 ` [PATCH net 3/5] bnxt_en: Re-write PCI BARs after PCI fatal error Michael Chan
2020-10-26  4:18 ` [PATCH net 4/5] bnxt_en: Check abort error state in bnxt_open_nic() Michael Chan
2020-10-26  4:18 ` [PATCH net 5/5] bnxt_en: Send HWRM_FUNC_RESET fw command unconditionally Michael Chan
2020-10-27  1:36 ` [PATCH net 0/5] bnxt_en: Bug fixes Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).