netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net 0/3] bnxt_en: Bug fixes.
@ 2020-05-25 21:41 Michael Chan
  2020-05-25 21:41 ` [PATCH net 1/3] bnxt_en: Fix accumulation of bp->net_stats_prev Michael Chan
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Michael Chan @ 2020-05-25 21:41 UTC (permalink / raw)
  To: davem; +Cc: netdev

3 bnxt_en driver fixes, covering a bug in preserving the counters during
some resets, proper error code when flashing NVRAM fails, and an
endian bug when extracting the firmware response message length.

Please also queue these for -stable.  Thanks.

Edwin Peer (1):
  bnxt_en: fix firmware message length endianness

Michael Chan (1):
  bnxt_en: Fix accumulation of bp->net_stats_prev.

Vasundhara Volam (1):
  bnxt_en: Fix return code to "flash_device".

 drivers/net/ethernet/broadcom/bnxt/bnxt.c         | 16 +++++-----------
 drivers/net/ethernet/broadcom/bnxt/bnxt.h         |  5 -----
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c |  9 +++++----
 3 files changed, 10 insertions(+), 20 deletions(-)

-- 
2.5.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH net 1/3] bnxt_en: Fix accumulation of bp->net_stats_prev.
  2020-05-25 21:41 [PATCH net 0/3] bnxt_en: Bug fixes Michael Chan
@ 2020-05-25 21:41 ` Michael Chan
  2020-05-26 22:04   ` Jakub Kicinski
  2020-05-25 21:41 ` [PATCH net 2/3] bnxt_en: Fix return code to "flash_device" Michael Chan
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Michael Chan @ 2020-05-25 21:41 UTC (permalink / raw)
  To: davem; +Cc: netdev

We have logic to maintain network counters across resets by storing
the counters in bp->net_stats_prev before reset.  But not all resets
will clear the counters.  Certain resets that don't need to change
the number of rings do not clear the counters.  The current logic
accumulates the counters before all resets, causing big jumps in
the counters after some resets, such as ethtool -G.

Fix it by only accumulating the counters during reset if the irq_re_init
parameter is set.  The parameter signifies that all rings and interrupts
will be reset and that means that the counters will also be reset.

Reported-by: Vijayendra Suman <vijayendra.suman@oracle.com>
Fixes: b8875ca356f1 ("bnxt_en: Save ring statistics before reset.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index d1a8371..abb203c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -9310,7 +9310,7 @@ static void __bnxt_close_nic(struct bnxt *bp, bool irq_re_init,
 	bnxt_free_skbs(bp);
 
 	/* Save ring stats before shutdown */
-	if (bp->bnapi)
+	if (bp->bnapi && irq_re_init)
 		bnxt_get_ring_stats(bp, &bp->net_stats_prev);
 	if (irq_re_init) {
 		bnxt_free_irq(bp);
-- 
2.5.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH net 2/3] bnxt_en: Fix return code to "flash_device".
  2020-05-25 21:41 [PATCH net 0/3] bnxt_en: Bug fixes Michael Chan
  2020-05-25 21:41 ` [PATCH net 1/3] bnxt_en: Fix accumulation of bp->net_stats_prev Michael Chan
@ 2020-05-25 21:41 ` Michael Chan
  2020-05-25 21:41 ` [PATCH net 3/3] bnxt_en: fix firmware message length endianness Michael Chan
  2020-05-27  3:31 ` [PATCH net 0/3] bnxt_en: Bug fixes David Miller
  3 siblings, 0 replies; 8+ messages in thread
From: Michael Chan @ 2020-05-25 21:41 UTC (permalink / raw)
  To: davem; +Cc: netdev

From: Vasundhara Volam <vasundhara-v.volam@broadcom.com>

When NVRAM directory is not found, return the error code
properly as per firmware command failure instead of the hardcode
-ENOBUFS.

Fixes: 3a707bed13b7 ("bnxt_en: Return -EAGAIN if fw command returns BUSY")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 34046a6..360f9a9 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -2012,11 +2012,12 @@ int bnxt_flash_package_from_file(struct net_device *dev, const char *filename,
 
 	bnxt_hwrm_fw_set_time(bp);
 
-	if (bnxt_find_nvram_item(dev, BNX_DIR_TYPE_UPDATE,
-				 BNX_DIR_ORDINAL_FIRST, BNX_DIR_EXT_NONE,
-				 &index, &item_len, NULL) != 0) {
+	rc = bnxt_find_nvram_item(dev, BNX_DIR_TYPE_UPDATE,
+				  BNX_DIR_ORDINAL_FIRST, BNX_DIR_EXT_NONE,
+				  &index, &item_len, NULL);
+	if (rc) {
 		netdev_err(dev, "PKG update area not created in nvram\n");
-		return -ENOBUFS;
+		return rc;
 	}
 
 	rc = request_firmware(&fw, filename, &dev->dev);
-- 
2.5.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH net 3/3] bnxt_en: fix firmware message length endianness
  2020-05-25 21:41 [PATCH net 0/3] bnxt_en: Bug fixes Michael Chan
  2020-05-25 21:41 ` [PATCH net 1/3] bnxt_en: Fix accumulation of bp->net_stats_prev Michael Chan
  2020-05-25 21:41 ` [PATCH net 2/3] bnxt_en: Fix return code to "flash_device" Michael Chan
@ 2020-05-25 21:41 ` Michael Chan
  2020-05-27  3:31 ` [PATCH net 0/3] bnxt_en: Bug fixes David Miller
  3 siblings, 0 replies; 8+ messages in thread
From: Michael Chan @ 2020-05-25 21:41 UTC (permalink / raw)
  To: davem; +Cc: netdev

From: Edwin Peer <edwin.peer@broadcom.com>

The explicit mask and shift is not the appropriate way to parse fields
out of a little endian struct. The length field is internally __le16
and the strategy employed only happens to work on little endian machines
because the offset used is actually incorrect (length is at offset 6).

Also remove the related and no longer used definitions from bnxt.h.

Fixes: 845adfe40c2a ("bnxt_en: Improve valid bit checking in firmware response message.")
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 14 ++++----------
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  5 -----
 2 files changed, 4 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index abb203c..58e0d9a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4176,14 +4176,12 @@ static int bnxt_hwrm_do_send_msg(struct bnxt *bp, void *msg, u32 msg_len,
 	int i, intr_process, rc, tmo_count;
 	struct input *req = msg;
 	u32 *data = msg;
-	__le32 *resp_len;
 	u8 *valid;
 	u16 cp_ring_id, len = 0;
 	struct hwrm_err_output *resp = bp->hwrm_cmd_resp_addr;
 	u16 max_req_len = BNXT_HWRM_MAX_REQ_LEN;
 	struct hwrm_short_input short_input = {0};
 	u32 doorbell_offset = BNXT_GRCPF_REG_CHIMP_COMM_TRIGGER;
-	u8 *resp_addr = (u8 *)bp->hwrm_cmd_resp_addr;
 	u32 bar_offset = BNXT_GRCPF_REG_CHIMP_COMM;
 	u16 dst = BNXT_HWRM_CHNL_CHIMP;
 
@@ -4201,7 +4199,6 @@ static int bnxt_hwrm_do_send_msg(struct bnxt *bp, void *msg, u32 msg_len,
 		bar_offset = BNXT_GRCPF_REG_KONG_COMM;
 		doorbell_offset = BNXT_GRCPF_REG_KONG_COMM_TRIGGER;
 		resp = bp->hwrm_cmd_kong_resp_addr;
-		resp_addr = (u8 *)bp->hwrm_cmd_kong_resp_addr;
 	}
 
 	memset(resp, 0, PAGE_SIZE);
@@ -4270,7 +4267,6 @@ static int bnxt_hwrm_do_send_msg(struct bnxt *bp, void *msg, u32 msg_len,
 	tmo_count = HWRM_SHORT_TIMEOUT_COUNTER;
 	timeout = timeout - HWRM_SHORT_MIN_TIMEOUT * HWRM_SHORT_TIMEOUT_COUNTER;
 	tmo_count += DIV_ROUND_UP(timeout, HWRM_MIN_TIMEOUT);
-	resp_len = (__le32 *)(resp_addr + HWRM_RESP_LEN_OFFSET);
 
 	if (intr_process) {
 		u16 seq_id = bp->hwrm_intr_seq_id;
@@ -4298,9 +4294,8 @@ static int bnxt_hwrm_do_send_msg(struct bnxt *bp, void *msg, u32 msg_len,
 					   le16_to_cpu(req->req_type));
 			return -EBUSY;
 		}
-		len = (le32_to_cpu(*resp_len) & HWRM_RESP_LEN_MASK) >>
-		      HWRM_RESP_LEN_SFT;
-		valid = resp_addr + len - 1;
+		len = le16_to_cpu(resp->resp_len);
+		valid = ((u8 *)resp) + len - 1;
 	} else {
 		int j;
 
@@ -4311,8 +4306,7 @@ static int bnxt_hwrm_do_send_msg(struct bnxt *bp, void *msg, u32 msg_len,
 			 */
 			if (test_bit(BNXT_STATE_FW_FATAL_COND, &bp->state))
 				return -EBUSY;
-			len = (le32_to_cpu(*resp_len) & HWRM_RESP_LEN_MASK) >>
-			      HWRM_RESP_LEN_SFT;
+			len = le16_to_cpu(resp->resp_len);
 			if (len)
 				break;
 			/* on first few passes, just barely sleep */
@@ -4334,7 +4328,7 @@ static int bnxt_hwrm_do_send_msg(struct bnxt *bp, void *msg, u32 msg_len,
 		}
 
 		/* Last byte of resp contains valid bit */
-		valid = resp_addr + len - 1;
+		valid = ((u8 *)resp) + len - 1;
 		for (j = 0; j < HWRM_VALID_BIT_DELAY_USEC; j++) {
 			/* make sure we read from updated DMA memory */
 			dma_rmb();
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index f6a3250..3d39638 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -656,11 +656,6 @@ struct nqe_cn {
 #define HWRM_CMD_TIMEOUT		(bp->hwrm_cmd_timeout)
 #define HWRM_RESET_TIMEOUT		((HWRM_CMD_TIMEOUT) * 4)
 #define HWRM_COREDUMP_TIMEOUT		((HWRM_CMD_TIMEOUT) * 12)
-#define HWRM_RESP_ERR_CODE_MASK		0xffff
-#define HWRM_RESP_LEN_OFFSET		4
-#define HWRM_RESP_LEN_MASK		0xffff0000
-#define HWRM_RESP_LEN_SFT		16
-#define HWRM_RESP_VALID_MASK		0xff000000
 #define BNXT_HWRM_REQ_MAX_SIZE		128
 #define BNXT_HWRM_REQS_PER_PAGE		(BNXT_PAGE_SIZE /	\
 					 BNXT_HWRM_REQ_MAX_SIZE)
-- 
2.5.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH net 1/3] bnxt_en: Fix accumulation of bp->net_stats_prev.
  2020-05-25 21:41 ` [PATCH net 1/3] bnxt_en: Fix accumulation of bp->net_stats_prev Michael Chan
@ 2020-05-26 22:04   ` Jakub Kicinski
  2020-05-26 22:43     ` Michael Chan
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2020-05-26 22:04 UTC (permalink / raw)
  To: Michael Chan; +Cc: davem, netdev

On Mon, 25 May 2020 17:41:17 -0400 Michael Chan wrote:
> We have logic to maintain network counters across resets by storing
> the counters in bp->net_stats_prev before reset.  But not all resets
> will clear the counters.  Certain resets that don't need to change
> the number of rings do not clear the counters.  The current logic
> accumulates the counters before all resets, causing big jumps in
> the counters after some resets, such as ethtool -G.
> 
> Fix it by only accumulating the counters during reset if the irq_re_init
> parameter is set.  The parameter signifies that all rings and interrupts
> will be reset and that means that the counters will also be reset.
> 
> Reported-by: Vijayendra Suman <vijayendra.suman@oracle.com>
> Fixes: b8875ca356f1 ("bnxt_en: Save ring statistics before reset.")
> Signed-off-by: Michael Chan <michael.chan@broadcom.com>

Hi Michael! 

Could you explain why accumulating counters causes a jump?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net 1/3] bnxt_en: Fix accumulation of bp->net_stats_prev.
  2020-05-26 22:04   ` Jakub Kicinski
@ 2020-05-26 22:43     ` Michael Chan
  2020-05-26 22:47       ` Jakub Kicinski
  0 siblings, 1 reply; 8+ messages in thread
From: Michael Chan @ 2020-05-26 22:43 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: David Miller, Netdev

On Tue, May 26, 2020 at 3:04 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Mon, 25 May 2020 17:41:17 -0400 Michael Chan wrote:
> > We have logic to maintain network counters across resets by storing
> > the counters in bp->net_stats_prev before reset.  But not all resets
> > will clear the counters.  Certain resets that don't need to change
> > the number of rings do not clear the counters.  The current logic
> > accumulates the counters before all resets, causing big jumps in
> > the counters after some resets, such as ethtool -G.
> >
> > Fix it by only accumulating the counters during reset if the irq_re_init
> > parameter is set.  The parameter signifies that all rings and interrupts
> > will be reset and that means that the counters will also be reset.
> >
> > Reported-by: Vijayendra Suman <vijayendra.suman@oracle.com>
> > Fixes: b8875ca356f1 ("bnxt_en: Save ring statistics before reset.")
> > Signed-off-by: Michael Chan <michael.chan@broadcom.com>
>
> Hi Michael!
>
> Could you explain why accumulating counters causes a jump?

Yes, during chip reset, we free most hardware resources including
possibly hardware counter resources.  After freeing the hardware
counters, the counters will go to zero.  To preserve the counters, we
take a snapshot of the hardware counters and add them to the
bp->net_stats_prev.  The counters in bp->net_stats_prev are always
added to the current hardware counters to provide the true counters.

The problem is that not all resets will free the hardware counters.
The old code is unconditionally taking the snapshot during reset.  So
when we don't free the hardware counters, the snapshot will cause us
to effectively double the hardware counters after the reset.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net 1/3] bnxt_en: Fix accumulation of bp->net_stats_prev.
  2020-05-26 22:43     ` Michael Chan
@ 2020-05-26 22:47       ` Jakub Kicinski
  0 siblings, 0 replies; 8+ messages in thread
From: Jakub Kicinski @ 2020-05-26 22:47 UTC (permalink / raw)
  To: Michael Chan; +Cc: David Miller, Netdev

On Tue, 26 May 2020 15:43:52 -0700 Michael Chan wrote:
> On Tue, May 26, 2020 at 3:04 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > On Mon, 25 May 2020 17:41:17 -0400 Michael Chan wrote:  
> > > We have logic to maintain network counters across resets by storing
> > > the counters in bp->net_stats_prev before reset.  But not all resets
> > > will clear the counters.  Certain resets that don't need to change
> > > the number of rings do not clear the counters.  The current logic
> > > accumulates the counters before all resets, causing big jumps in
> > > the counters after some resets, such as ethtool -G.
> > >
> > > Fix it by only accumulating the counters during reset if the irq_re_init
> > > parameter is set.  The parameter signifies that all rings and interrupts
> > > will be reset and that means that the counters will also be reset.
> > >
> > > Reported-by: Vijayendra Suman <vijayendra.suman@oracle.com>
> > > Fixes: b8875ca356f1 ("bnxt_en: Save ring statistics before reset.")
> > > Signed-off-by: Michael Chan <michael.chan@broadcom.com>  
> >
> > Hi Michael!
> >
> > Could you explain why accumulating counters causes a jump?  
> 
> Yes, during chip reset, we free most hardware resources including
> possibly hardware counter resources.  After freeing the hardware
> counters, the counters will go to zero.  To preserve the counters, we
> take a snapshot of the hardware counters and add them to the
> bp->net_stats_prev.  The counters in bp->net_stats_prev are always
> added to the current hardware counters to provide the true counters.
> 
> The problem is that not all resets will free the hardware counters.
> The old code is unconditionally taking the snapshot during reset.  So
> when we don't free the hardware counters, the snapshot will cause us
> to effectively double the hardware counters after the reset.

Aw! I see what you mean now, thanks for the explanation!

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net 0/3] bnxt_en: Bug fixes.
  2020-05-25 21:41 [PATCH net 0/3] bnxt_en: Bug fixes Michael Chan
                   ` (2 preceding siblings ...)
  2020-05-25 21:41 ` [PATCH net 3/3] bnxt_en: fix firmware message length endianness Michael Chan
@ 2020-05-27  3:31 ` David Miller
  3 siblings, 0 replies; 8+ messages in thread
From: David Miller @ 2020-05-27  3:31 UTC (permalink / raw)
  To: michael.chan; +Cc: netdev

From: Michael Chan <michael.chan@broadcom.com>
Date: Mon, 25 May 2020 17:41:16 -0400

> 3 bnxt_en driver fixes, covering a bug in preserving the counters during
> some resets, proper error code when flashing NVRAM fails, and an
> endian bug when extracting the firmware response message length.

Series applied.

> Please also queue these for -stable.  Thanks.

Queued up.

Thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-05-27  3:31 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-25 21:41 [PATCH net 0/3] bnxt_en: Bug fixes Michael Chan
2020-05-25 21:41 ` [PATCH net 1/3] bnxt_en: Fix accumulation of bp->net_stats_prev Michael Chan
2020-05-26 22:04   ` Jakub Kicinski
2020-05-26 22:43     ` Michael Chan
2020-05-26 22:47       ` Jakub Kicinski
2020-05-25 21:41 ` [PATCH net 2/3] bnxt_en: Fix return code to "flash_device" Michael Chan
2020-05-25 21:41 ` [PATCH net 3/3] bnxt_en: fix firmware message length endianness Michael Chan
2020-05-27  3:31 ` [PATCH net 0/3] bnxt_en: Bug fixes David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).