netdev.vger.kernel.org archive mirror
* [PATCH net 0/4] bnxt_en: Bug fixes.
@ 2020-02-02  7:41 Michael Chan
  2020-02-02  7:41 ` [PATCH net 1/4] bnxt_en: Refactor logic to re-enable SRIOV after firmware reset detected Michael Chan
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Michael Chan @ 2020-02-02  7:41 UTC
  To: davem; +Cc: netdev

3 patches that fix some issues in the firmware reset logic, starting
with a small patch to refactor the code that re-enables SRIOV.  The
last patch fixes a TC queue mapping issue.

Michael Chan (3):
  bnxt_en: Refactor logic to re-enable SRIOV after firmware reset
    detected.
  bnxt_en: Fix RDMA driver failure with SRIOV after firmware reset.
  bnxt_en: Fix TC queue mapping.

Vasundhara Volam (1):
  bnxt_en: Fix logic that disables Bus Master during firmware reset.

 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 37 ++++++++++++++++++++-----------
 1 file changed, 24 insertions(+), 13 deletions(-)

-- 
2.5.1



* [PATCH net 1/4] bnxt_en: Refactor logic to re-enable SRIOV after firmware reset detected.
  2020-02-02  7:41 [PATCH net 0/4] bnxt_en: Bug fixes Michael Chan
@ 2020-02-02  7:41 ` Michael Chan
  2020-02-02  7:41 ` [PATCH net 2/4] bnxt_en: Fix RDMA driver failure with SRIOV after firmware reset Michael Chan
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Michael Chan @ 2020-02-02  7:41 UTC
  To: davem; +Cc: netdev

Move the logic in bnxt_open() that re-enables SRIOV after a firmware
reset is detected into a new function, bnxt_reenable_sriov().  The next
patch also needs to call this function from the firmware reset path.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 483935b..0253a8f 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -9241,6 +9241,17 @@ void bnxt_half_close_nic(struct bnxt *bp)
 	bnxt_free_mem(bp, false);
 }
 
+static void bnxt_reenable_sriov(struct bnxt *bp)
+{
+	if (BNXT_PF(bp)) {
+		struct bnxt_pf_info *pf = &bp->pf;
+		int n = pf->active_vfs;
+
+		if (n)
+			bnxt_cfg_hw_sriov(bp, &n, true);
+	}
+}
+
 static int bnxt_open(struct net_device *dev)
 {
 	struct bnxt *bp = netdev_priv(dev);
@@ -9259,13 +9270,7 @@ static int bnxt_open(struct net_device *dev)
 		bnxt_hwrm_if_change(bp, false);
 	} else {
 		if (test_and_clear_bit(BNXT_STATE_FW_RESET_DET, &bp->state)) {
-			if (BNXT_PF(bp)) {
-				struct bnxt_pf_info *pf = &bp->pf;
-				int n = pf->active_vfs;
-
-				if (n)
-					bnxt_cfg_hw_sriov(bp, &n, true);
-			}
+			bnxt_reenable_sriov(bp);
 			if (!test_bit(BNXT_STATE_IN_FW_RESET, &bp->state))
 				bnxt_ulp_start(bp, 0);
 		}
-- 
2.5.1



* [PATCH net 2/4] bnxt_en: Fix RDMA driver failure with SRIOV after firmware reset.
  2020-02-02  7:41 [PATCH net 0/4] bnxt_en: Bug fixes Michael Chan
  2020-02-02  7:41 ` [PATCH net 1/4] bnxt_en: Refactor logic to re-enable SRIOV after firmware reset detected Michael Chan
@ 2020-02-02  7:41 ` Michael Chan
  2020-02-02  7:41 ` [PATCH net 3/4] bnxt_en: Fix logic that disables Bus Master during " Michael Chan
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Michael Chan @ 2020-02-02  7:41 UTC
  To: davem; +Cc: netdev

bnxt_ulp_start() needs to be called before SRIOV is re-enabled after
firmware reset.  Re-enabling SRIOV may consume all the resources and
may cause the RDMA driver to fail to get MSIX and other resources.
Fix it by calling bnxt_ulp_start() before calling
bnxt_reenable_sriov().

We re-arrange the logic so that we call bnxt_ulp_start() and
bnxt_reenable_sriov() in the proper sequence in bnxt_fw_reset_task() and
bnxt_open().  The former is the normal coordinated firmware reset sequence
and the latter is firmware reset while the function is down.  The new
logic is more straightforward and fixes both scenarios.
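
To illustrate why the order matters, here is a toy model of the
resource contention described above.  It is not driver code: struct
toy_dev and the toy_* helpers are hypothetical, and the assumption that
re-enabling SRIOV takes every remaining MSIX vector is a worst case
chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: one pool of MSIX vectors shared by RDMA and the VFs. */
struct toy_dev {
	int free_msix;	/* vectors not yet claimed */
	int rdma_msix;	/* vectors held by the RDMA (ULP) driver */
	int vf_msix;	/* vectors handed out to VFs */
};

/* The RDMA driver needs `need` vectors and fails if the pool is short. */
static bool toy_ulp_start(struct toy_dev *d, int need)
{
	assert(need >= 0);
	if (d->free_msix < need)
		return false;
	d->free_msix -= need;
	d->rdma_msix = need;
	return true;
}

/* Assumed worst case: re-enabling SRIOV consumes everything left. */
static void toy_reenable_sriov(struct toy_dev *d)
{
	d->vf_msix += d->free_msix;
	d->free_msix = 0;
}

/* Order before this patch: SRIOV first, then the RDMA driver. */
static bool toy_recover_old(struct toy_dev *d, int rdma_need)
{
	toy_reenable_sriov(d);
	return toy_ulp_start(d, rdma_need);
}

/* Order after this patch: the RDMA driver first, then SRIOV. */
static bool toy_recover_new(struct toy_dev *d, int rdma_need)
{
	bool ok = toy_ulp_start(d, rdma_need);

	toy_reenable_sriov(d);
	return ok;
}
```

With a pool of 8 vectors and an RDMA driver that needs 4,
toy_recover_old() fails to start RDMA while toy_recover_new() succeeds
and still leaves 4 vectors for the VFs.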

Fixes: f3a6d206c25a ("bnxt_en: Call bnxt_ulp_stop()/bnxt_ulp_start() during error recovery.")
Reported-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 0253a8f..a69e4662 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -9270,9 +9270,10 @@ static int bnxt_open(struct net_device *dev)
 		bnxt_hwrm_if_change(bp, false);
 	} else {
 		if (test_and_clear_bit(BNXT_STATE_FW_RESET_DET, &bp->state)) {
-			bnxt_reenable_sriov(bp);
-			if (!test_bit(BNXT_STATE_IN_FW_RESET, &bp->state))
+			if (!test_bit(BNXT_STATE_IN_FW_RESET, &bp->state)) {
 				bnxt_ulp_start(bp, 0);
+				bnxt_reenable_sriov(bp);
+			}
 		}
 		bnxt_hwmon_open(bp);
 	}
@@ -10836,6 +10837,8 @@ static void bnxt_fw_reset_task(struct work_struct *work)
 		smp_mb__before_atomic();
 		clear_bit(BNXT_STATE_IN_FW_RESET, &bp->state);
 		bnxt_ulp_start(bp, rc);
+		if (!rc)
+			bnxt_reenable_sriov(bp);
 		bnxt_dl_health_recovery_done(bp);
 		bnxt_dl_health_status_update(bp, true);
 		rtnl_unlock();
-- 
2.5.1



* [PATCH net 3/4] bnxt_en: Fix logic that disables Bus Master during firmware reset.
  2020-02-02  7:41 [PATCH net 0/4] bnxt_en: Bug fixes Michael Chan
  2020-02-02  7:41 ` [PATCH net 1/4] bnxt_en: Refactor logic to re-enable SRIOV after firmware reset detected Michael Chan
  2020-02-02  7:41 ` [PATCH net 2/4] bnxt_en: Fix RDMA driver failure with SRIOV after firmware reset Michael Chan
@ 2020-02-02  7:41 ` Michael Chan
  2020-02-02  7:41 ` [PATCH net 4/4] bnxt_en: Fix TC queue mapping Michael Chan
  2020-02-03 23:22 ` [PATCH net 0/4] bnxt_en: Bug fixes Jakub Kicinski
  4 siblings, 0 replies; 6+ messages in thread
From: Michael Chan @ 2020-02-02  7:41 UTC
  To: davem; +Cc: netdev, Vasundhara Volam

From: Vasundhara Volam <vasundhara-v.volam@broadcom.com>

The current logic that calls pci_disable_device() in __bnxt_close_nic()
during firmware reset is flawed.  If firmware is still alive, we're
disabling the device too early, causing some firmware commands to
not reach the firmware.

Fix it by moving the logic to bnxt_fw_reset_close().  If firmware is
in a fatal condition, we call pci_disable_device() before we free
any of the rings to prevent DMA corruption of the freed rings.  If
firmware is still alive, we call pci_disable_device() after the
last firmware message has been sent.
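
The two orderings can be sketched with a toy model that only records
the sequence of steps.  It is not driver code: struct toy_log, enum
toy_step and the toy_* helpers are hypothetical names, and freeing the
rings and sending the last firmware message are each reduced to a
single recorded event.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_step { TOY_PCI_DISABLE, TOY_FREE_RINGS, TOY_LAST_FW_MSG };

/* Toy model only: an ordered log of what the close path did. */
struct toy_log {
	enum toy_step steps[4];
	int n;
	bool pci_enabled;
};

static void toy_record(struct toy_log *l, enum toy_step s)
{
	assert(l->n < 4);
	l->steps[l->n++] = s;
}

static void toy_pci_disable(struct toy_log *l)
{
	l->pci_enabled = false;
	toy_record(l, TOY_PCI_DISABLE);
}

/* Mirrors the shape of the patched close path, heavily simplified. */
static void toy_fw_reset_close(struct toy_log *l, bool fw_fatal)
{
	/* Fatal firmware: stop DMA before any ring memory is freed. */
	if (fw_fatal)
		toy_pci_disable(l);
	toy_record(l, TOY_FREE_RINGS);
	/* The last firmware message is attempted in both cases. */
	toy_record(l, TOY_LAST_FW_MSG);
	/* Live firmware: disable only after the last message is sent. */
	if (l->pci_enabled)
		toy_pci_disable(l);
}
```

In the fatal case the log starts with TOY_PCI_DISABLE; in the live case
it ends with it, and the device is disabled exactly once either way.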

Fixes: 3bc7d4a352ef ("bnxt_en: Add BNXT_STATE_IN_FW_RESET state.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index a69e4662..cea6033 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -9313,10 +9313,6 @@ static void __bnxt_close_nic(struct bnxt *bp, bool irq_re_init,
 	bnxt_debug_dev_exit(bp);
 	bnxt_disable_napi(bp);
 	del_timer_sync(&bp->timer);
-	if (test_bit(BNXT_STATE_IN_FW_RESET, &bp->state) &&
-	    pci_is_enabled(bp->pdev))
-		pci_disable_device(bp->pdev);
-
 	bnxt_free_skbs(bp);
 
 	/* Save ring stats before shutdown */
@@ -10102,9 +10098,16 @@ static void bnxt_reset(struct bnxt *bp, bool silent)
 static void bnxt_fw_reset_close(struct bnxt *bp)
 {
 	bnxt_ulp_stop(bp);
+	/* When firmware is fatal state, disable PCI device to prevent
+	 * any potential bad DMAs before freeing kernel memory.
+	 */
+	if (test_bit(BNXT_STATE_FW_FATAL_COND, &bp->state))
+		pci_disable_device(bp->pdev);
 	__bnxt_close_nic(bp, true, false);
 	bnxt_clear_int_mode(bp);
 	bnxt_hwrm_func_drv_unrgtr(bp);
+	if (pci_is_enabled(bp->pdev))
+		pci_disable_device(bp->pdev);
 	bnxt_free_ctx_mem(bp);
 	kfree(bp->ctx);
 	bp->ctx = NULL;
-- 
2.5.1



* [PATCH net 4/4] bnxt_en: Fix TC queue mapping.
  2020-02-02  7:41 [PATCH net 0/4] bnxt_en: Bug fixes Michael Chan
                   ` (2 preceding siblings ...)
  2020-02-02  7:41 ` [PATCH net 3/4] bnxt_en: Fix logic that disables Bus Master during " Michael Chan
@ 2020-02-02  7:41 ` Michael Chan
  2020-02-03 23:22 ` [PATCH net 0/4] bnxt_en: Bug fixes Jakub Kicinski
  4 siblings, 0 replies; 6+ messages in thread
From: Michael Chan @ 2020-02-02  7:41 UTC
  To: davem; +Cc: netdev

The driver currently calls netdev_set_tc_queue() only when the number of
TCs is greater than 1.  Instead, the comparison should be greater than
or equal to 1.  Even with 1 TC, we need to set the queue mapping.

This bug can cause warnings when the number of TCs is changed back to 1.
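
A toy model of the stale mapping, as a sketch only.  struct toy_netdev
and toy_setup_tc_map() are hypothetical, the even split of rings across
TCs is an assumption, and the old_check flag exists purely to compare
the two conditions side by side.

```c
#include <assert.h>

#define TOY_MAX_TC 8

/* Toy model only: per-TC queue count and offset, as a netdev tracks. */
struct toy_netdev {
	int num_tc;
	int tc_count[TOY_MAX_TC];
	int tc_offset[TOY_MAX_TC];
};

/*
 * Split tx_rings evenly across the configured TCs.
 * old_check != 0 uses the pre-patch condition (tcs > 1),
 * old_check == 0 uses the patched condition (tcs != 0).
 */
static void toy_setup_tc_map(struct toy_netdev *dev, int tx_rings,
			     int old_check)
{
	int tcs = dev->num_tc;
	int apply = old_check ? (tcs > 1) : (tcs != 0);

	assert(tcs >= 0 && tcs <= TOY_MAX_TC);
	if (apply) {
		int per_tc = tx_rings / tcs;
		int i;

		for (i = 0; i < tcs; i++) {
			dev->tc_count[i] = per_tc;
			dev->tc_offset[i] = i * per_tc;
		}
	}
}
```

Going from 2 TCs back to 1 with 8 rings, the old condition skips the
update and TC 0 keeps its earlier count of 4.  The patched condition
remaps TC 0 to all 8 rings at offset 0.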

Fixes: 7809592d3e2e ("bnxt_en: Enable MSIX early in bnxt_init_one().")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index cea6033..597e6fd 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -7893,7 +7893,7 @@ static void bnxt_setup_msix(struct bnxt *bp)
 	int tcs, i;
 
 	tcs = netdev_get_num_tc(dev);
-	if (tcs > 1) {
+	if (tcs) {
 		int i, off, count;
 
 		for (i = 0; i < tcs; i++) {
-- 
2.5.1



* Re: [PATCH net 0/4] bnxt_en: Bug fixes.
  2020-02-02  7:41 [PATCH net 0/4] bnxt_en: Bug fixes Michael Chan
                   ` (3 preceding siblings ...)
  2020-02-02  7:41 ` [PATCH net 4/4] bnxt_en: Fix TC queue mapping Michael Chan
@ 2020-02-03 23:22 ` Jakub Kicinski
  4 siblings, 0 replies; 6+ messages in thread
From: Jakub Kicinski @ 2020-02-03 23:22 UTC
  To: Michael Chan; +Cc: davem, netdev

On Sun,  2 Feb 2020 02:41:34 -0500, Michael Chan wrote:
> 3 patches that fix some issues in the firmware reset logic, starting
> with a small patch to refactor the code that re-enables SRIOV.  The
> last patch fixes a TC queue mapping issue.
> 
> Michael Chan (3):
>   bnxt_en: Refactor logic to re-enable SRIOV after firmware reset
>     detected.
>   bnxt_en: Fix RDMA driver failure with SRIOV after firmware reset.
>   bnxt_en: Fix TC queue mapping.
> 
> Vasundhara Volam (1):
>   bnxt_en: Fix logic that disables Bus Master during firmware reset.

Applied and queued for stable, thank you!

