netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/9] bnxt_en: Updates.
@ 2022-03-05  8:54 Michael Chan
  2022-03-05  8:54 ` [PATCH net-next 1/9] bnxt_en: refactor error handling of HWRM_NVM_INSTALL_UPDATE Michael Chan
                   ` (10 more replies)
  0 siblings, 11 replies; 16+ messages in thread
From: Michael Chan @ 2022-03-05  8:54 UTC (permalink / raw)
  To: davem; +Cc: netdev, kuba, gospo

[-- Attachment #1: Type: text/plain, Size: 1471 bytes --]

This patch series contains mainly NVRAM related features.  More
NVRAM error checking and logging are added when installing firmware
packages.  A new devlink hw health report is now added to report
and diagnose NVRAM issues.  Other miscellaneous patches include
reporting correctly cards that don't support link pause, adding
an internal unknown link state, and avoiding unnecessary link
toggle during firmware reset.

Edwin Peer (2):
  bnxt_en: introduce initial link state of unknown
  bnxt_en: Do not destroy health reporters during reset

Kalesh AP (4):
  bnxt_en: refactor error handling of HWRM_NVM_INSTALL_UPDATE
  bnxt_en: add more error checks to HWRM_NVM_INSTALL_UPDATE
  bnxt_en: parse result field when NVRAM package install fails
  bnxt_en: implement hw health reporter

Michael Chan (2):
  bnxt_en: Properly report no pause support on some cards
  bnxt_en: Eliminate unintended link toggle during FW reset

Vikas Gupta (1):
  bnxt_en: add an nvm test for hw diagnose

 drivers/net/ethernet/broadcom/bnxt/bnxt.c     |  61 ++++--
 drivers/net/ethernet/broadcom/bnxt/bnxt.h     |  57 +++++-
 drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c |   3 +-
 .../net/ethernet/broadcom/bnxt/bnxt_devlink.c | 191 ++++++++++++++++--
 .../net/ethernet/broadcom/bnxt/bnxt_devlink.h |   3 +-
 .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 119 ++++++++---
 .../net/ethernet/broadcom/bnxt/bnxt_ethtool.h |   7 +
 7 files changed, 367 insertions(+), 74 deletions(-)

-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH net-next 1/9] bnxt_en: refactor error handling of HWRM_NVM_INSTALL_UPDATE
  2022-03-05  8:54 [PATCH net-next 0/9] bnxt_en: Updates Michael Chan
@ 2022-03-05  8:54 ` Michael Chan
  2022-03-05  8:54 ` [PATCH net-next 2/9] bnxt_en: add more error checks to HWRM_NVM_INSTALL_UPDATE Michael Chan
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Michael Chan @ 2022-03-05  8:54 UTC (permalink / raw)
  To: davem; +Cc: netdev, kuba, gospo

[-- Attachment #1: Type: text/plain, Size: 2687 bytes --]

From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>

This is in anticipation of handling more "cmd_err" from FW in the next
patch.

Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 26 +++++++++++++------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index fecb03b49f01..59838a4f45fb 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -2512,6 +2512,7 @@ int bnxt_flash_package_from_fw_obj(struct net_device *dev, const struct firmware
 	u8 *kmem = NULL;
 	u32 modify_len;
 	u32 item_len;
+	u8 cmd_err;
 	u16 index;
 	int rc;
 
@@ -2595,6 +2596,8 @@ int bnxt_flash_package_from_fw_obj(struct net_device *dev, const struct firmware
 		}
 
 		rc = hwrm_req_send_silent(bp, install);
+		if (!rc)
+			break;
 
 		if (defrag_attempted) {
 			/* We have tried to defragment already in the previous
@@ -2603,15 +2606,20 @@ int bnxt_flash_package_from_fw_obj(struct net_device *dev, const struct firmware
 			break;
 		}
 
-		if (rc && ((struct hwrm_err_output *)resp)->cmd_err ==
-		    NVM_INSTALL_UPDATE_CMD_ERR_CODE_FRAG_ERR) {
+		cmd_err = ((struct hwrm_err_output *)resp)->cmd_err;
+
+		switch (cmd_err) {
+		case NVM_INSTALL_UPDATE_CMD_ERR_CODE_FRAG_ERR:
 			install->flags =
 				cpu_to_le16(NVM_INSTALL_UPDATE_REQ_FLAGS_ALLOWED_TO_DEFRAG);
 
 			rc = hwrm_req_send_silent(bp, install);
+			if (!rc)
+				break;
+
+			cmd_err = ((struct hwrm_err_output *)resp)->cmd_err;
 
-			if (rc && ((struct hwrm_err_output *)resp)->cmd_err ==
-			    NVM_INSTALL_UPDATE_CMD_ERR_CODE_NO_SPACE) {
+			if (cmd_err == NVM_INSTALL_UPDATE_CMD_ERR_CODE_NO_SPACE) {
 				/* FW has cleared NVM area, driver will create
 				 * UPDATE directory and try the flash again
 				 */
@@ -2621,11 +2629,13 @@ int bnxt_flash_package_from_fw_obj(struct net_device *dev, const struct firmware
 						      BNX_DIR_TYPE_UPDATE,
 						      BNX_DIR_ORDINAL_FIRST,
 						      0, 0, item_len, NULL, 0);
-			} else if (rc) {
-				netdev_err(dev, "HWRM_NVM_INSTALL_UPDATE failure rc :%x\n", rc);
+				if (!rc)
+					break;
 			}
-		} else if (rc) {
-			netdev_err(dev, "HWRM_NVM_INSTALL_UPDATE failure rc :%x\n", rc);
+			fallthrough;
+		default:
+			netdev_err(dev, "HWRM_NVM_INSTALL_UPDATE failure rc :%x cmd_err :%x\n",
+				   rc, cmd_err);
 		}
 	} while (defrag_attempted && !rc);
 
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 2/9] bnxt_en: add more error checks to HWRM_NVM_INSTALL_UPDATE
  2022-03-05  8:54 [PATCH net-next 0/9] bnxt_en: Updates Michael Chan
  2022-03-05  8:54 ` [PATCH net-next 1/9] bnxt_en: refactor error handling of HWRM_NVM_INSTALL_UPDATE Michael Chan
@ 2022-03-05  8:54 ` Michael Chan
  2022-03-05  8:54 ` [PATCH net-next 3/9] bnxt_en: parse result field when NVRAM package install fails Michael Chan
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Michael Chan @ 2022-03-05  8:54 UTC (permalink / raw)
  To: davem; +Cc: netdev, kuba, gospo

[-- Attachment #1: Type: text/plain, Size: 1388 bytes --]

From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>

FW returns error code "NVM_INSTALL_UPDATE_CMD_ERR_CODE_ANTI_ROLLBACK"
in the response to indicate that HWRM_NVM_INSTALL_UPDATE command has
failed due to Anti-rollback feature. Parse the error and return an
appropriate error code to the user.

Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 59838a4f45fb..a3151af9a279 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -2609,6 +2609,10 @@ int bnxt_flash_package_from_fw_obj(struct net_device *dev, const struct firmware
 		cmd_err = ((struct hwrm_err_output *)resp)->cmd_err;
 
 		switch (cmd_err) {
+		case NVM_INSTALL_UPDATE_CMD_ERR_CODE_ANTI_ROLLBACK:
+			netdev_err(dev, "HWRM_NVM_INSTALL_UPDATE failure Anti-rollback detected\n");
+			rc = -EALREADY;
+			break;
 		case NVM_INSTALL_UPDATE_CMD_ERR_CODE_FRAG_ERR:
 			install->flags =
 				cpu_to_le16(NVM_INSTALL_UPDATE_REQ_FLAGS_ALLOWED_TO_DEFRAG);
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 3/9] bnxt_en: parse result field when NVRAM package install fails
  2022-03-05  8:54 [PATCH net-next 0/9] bnxt_en: Updates Michael Chan
  2022-03-05  8:54 ` [PATCH net-next 1/9] bnxt_en: refactor error handling of HWRM_NVM_INSTALL_UPDATE Michael Chan
  2022-03-05  8:54 ` [PATCH net-next 2/9] bnxt_en: add more error checks to HWRM_NVM_INSTALL_UPDATE Michael Chan
@ 2022-03-05  8:54 ` Michael Chan
  2022-03-07 22:13   ` Jakub Kicinski
  2022-03-05  8:54 ` [PATCH net-next 4/9] bnxt_en: introduce initial link state of unknown Michael Chan
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 16+ messages in thread
From: Michael Chan @ 2022-03-05  8:54 UTC (permalink / raw)
  To: davem; +Cc: netdev, kuba, gospo

[-- Attachment #1: Type: text/plain, Size: 3548 bytes --]

From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>

Instead of always returning -ENOPKG, decode the firmware error
code further when the HWRM_NVM_INSTALL_UPDATE firmware call fails.
Return a more suitable error code to userspace and log an error
in dmesg.

Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 44 ++++++++++++++++++-
 1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index a3151af9a279..3927ceb581da 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -2496,6 +2496,48 @@ static int bnxt_flash_firmware_from_file(struct net_device *dev,
 	return rc;
 }
 
+static int nvm_update_err_to_stderr(struct net_device *dev, u8 result)
+{
+	switch (result) {
+	case NVM_INSTALL_UPDATE_RESP_RESULT_INVALID_TYPE_PARAMETER:
+	case NVM_INSTALL_UPDATE_RESP_RESULT_INVALID_INDEX_PARAMETER:
+	case NVM_INSTALL_UPDATE_RESP_RESULT_INSTALL_DATA_ERROR:
+	case NVM_INSTALL_UPDATE_RESP_RESULT_INSTALL_CHECKSUM_ERROR:
+	case NVM_INSTALL_UPDATE_RESP_RESULT_ITEM_NOT_FOUND:
+	case NVM_INSTALL_UPDATE_RESP_RESULT_ITEM_LOCKED:
+		netdev_err(dev, "PKG install error : Data integrity on NVM\n");
+		return -EINVAL;
+	case NVM_INSTALL_UPDATE_RESP_RESULT_INVALID_PREREQUISITE:
+	case NVM_INSTALL_UPDATE_RESP_RESULT_INVALID_FILE_HEADER:
+	case NVM_INSTALL_UPDATE_RESP_RESULT_INVALID_SIGNATURE:
+	case NVM_INSTALL_UPDATE_RESP_RESULT_INVALID_PROP_STREAM:
+	case NVM_INSTALL_UPDATE_RESP_RESULT_INVALID_PROP_LENGTH:
+	case NVM_INSTALL_UPDATE_RESP_RESULT_INVALID_MANIFEST:
+	case NVM_INSTALL_UPDATE_RESP_RESULT_INVALID_TRAILER:
+	case NVM_INSTALL_UPDATE_RESP_RESULT_INVALID_CHECKSUM:
+	case NVM_INSTALL_UPDATE_RESP_RESULT_INVALID_ITEM_CHECKSUM:
+	case NVM_INSTALL_UPDATE_RESP_RESULT_INVALID_DATA_LENGTH:
+	case NVM_INSTALL_UPDATE_RESP_RESULT_INVALID_DIRECTIVE:
+	case NVM_INSTALL_UPDATE_RESP_RESULT_DUPLICATE_ITEM:
+	case NVM_INSTALL_UPDATE_RESP_RESULT_ZERO_LENGTH_ITEM:
+		netdev_err(dev, "PKG install error : Invalid package\n");
+		return -ENOPKG;
+	case NVM_INSTALL_UPDATE_RESP_RESULT_INSTALL_AUTHENTICATION_ERROR:
+		netdev_err(dev, "PKG install error : Authentication error\n");
+		return -EPERM;
+	case NVM_INSTALL_UPDATE_RESP_RESULT_UNSUPPORTED_CHIP_REV:
+	case NVM_INSTALL_UPDATE_RESP_RESULT_UNSUPPORTED_DEVICE_ID:
+	case NVM_INSTALL_UPDATE_RESP_RESULT_UNSUPPORTED_SUBSYS_VENDOR:
+	case NVM_INSTALL_UPDATE_RESP_RESULT_UNSUPPORTED_SUBSYS_ID:
+	case NVM_INSTALL_UPDATE_RESP_RESULT_UNSUPPORTED_PLATFORM:
+		netdev_err(dev, "PKG install error : Invalid device\n");
+		return -EOPNOTSUPP;
+	default:
+		netdev_err(dev, "PKG install error : Internal error\n");
+		return -EIO;
+	}
+}
+
 #define BNXT_PKG_DMA_SIZE	0x40000
 #define BNXT_NVM_MORE_FLAG	(cpu_to_le16(NVM_MODIFY_REQ_FLAGS_BATCH_MODE))
 #define BNXT_NVM_LAST_FLAG	(cpu_to_le16(NVM_MODIFY_REQ_FLAGS_BATCH_LAST))
@@ -2650,7 +2692,7 @@ int bnxt_flash_package_from_fw_obj(struct net_device *dev, const struct firmware
 	if (resp->result) {
 		netdev_err(dev, "PKG install error = %d, problem_item = %d\n",
 			   (s8)resp->result, (int)resp->problem_item);
-		rc = -ENOPKG;
+		rc = nvm_update_err_to_stderr(dev, resp->result);
 	}
 	if (rc == -EACCES)
 		bnxt_print_admin_err(bp);
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 4/9] bnxt_en: introduce initial link state of unknown
  2022-03-05  8:54 [PATCH net-next 0/9] bnxt_en: Updates Michael Chan
                   ` (2 preceding siblings ...)
  2022-03-05  8:54 ` [PATCH net-next 3/9] bnxt_en: parse result field when NVRAM package install fails Michael Chan
@ 2022-03-05  8:54 ` Michael Chan
  2022-03-05  8:54 ` [PATCH net-next 5/9] bnxt_en: Properly report no pause support on some cards Michael Chan
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Michael Chan @ 2022-03-05  8:54 UTC (permalink / raw)
  To: davem; +Cc: netdev, kuba, gospo

[-- Attachment #1: Type: text/plain, Size: 5227 bytes --]

From: Edwin Peer <edwin.peer@broadcom.com>

This will force link state to always be logged for initial NIC open.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 33 ++++++++++++-------
 drivers/net/ethernet/broadcom/bnxt/bnxt.h     |  6 +++-
 .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c |  4 +--
 3 files changed, 29 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 37facef47846..1ed71b6fed8a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -9300,7 +9300,7 @@ void bnxt_tx_enable(struct bnxt *bp)
 	/* Make sure napi polls see @dev_state change */
 	synchronize_net();
 	netif_tx_wake_all_queues(bp->dev);
-	if (bp->link_info.link_up)
+	if (BNXT_LINK_IS_UP(bp))
 		netif_carrier_on(bp->dev);
 }
 
@@ -9330,7 +9330,7 @@ static char *bnxt_report_fec(struct bnxt_link_info *link_info)
 
 void bnxt_report_link(struct bnxt *bp)
 {
-	if (bp->link_info.link_up) {
+	if (BNXT_LINK_IS_UP(bp)) {
 		const char *signal = "";
 		const char *flow_ctrl;
 		const char *duplex;
@@ -9466,7 +9466,7 @@ int bnxt_update_link(struct bnxt *bp, bool chng_link_state)
 	struct bnxt_link_info *link_info = &bp->link_info;
 	struct hwrm_port_phy_qcfg_output *resp;
 	struct hwrm_port_phy_qcfg_input *req;
-	u8 link_up = link_info->link_up;
+	u8 link_state = link_info->link_state;
 	bool support_changed = false;
 	int rc;
 
@@ -9567,14 +9567,14 @@ int bnxt_update_link(struct bnxt *bp, bool chng_link_state)
 	/* TODO: need to add more logic to report VF link */
 	if (chng_link_state) {
 		if (link_info->phy_link_status == BNXT_LINK_LINK)
-			link_info->link_up = 1;
+			link_info->link_state = BNXT_LINK_STATE_UP;
 		else
-			link_info->link_up = 0;
-		if (link_up != link_info->link_up)
+			link_info->link_state = BNXT_LINK_STATE_DOWN;
+		if (link_state != link_info->link_state)
 			bnxt_report_link(bp);
 	} else {
-		/* alwasy link down if not require to update link state */
-		link_info->link_up = 0;
+		/* always link down if not require to update link state */
+		link_info->link_state = BNXT_LINK_STATE_DOWN;
 	}
 	hwrm_req_drop(bp, req);
 
@@ -9774,7 +9774,18 @@ static int bnxt_hwrm_shutdown_link(struct bnxt *bp)
 		return rc;
 
 	req->flags = cpu_to_le32(PORT_PHY_CFG_REQ_FLAGS_FORCE_LINK_DWN);
-	return hwrm_req_send(bp, req);
+	rc = hwrm_req_send(bp, req);
+	if (!rc) {
+		mutex_lock(&bp->link_lock);
+		/* Device is not obliged link down in certain scenarios, even
+		 * when forced. Setting the state unknown is consistent with
+		 * driver startup and will force link state to be reported
+		 * during subsequent open based on PORT_PHY_QCFG.
+		 */
+		bp->link_info.link_state = BNXT_LINK_STATE_UNKNOWN;
+		mutex_unlock(&bp->link_lock);
+	}
+	return rc;
 }
 
 static int bnxt_fw_reset_via_optee(struct bnxt *bp)
@@ -10205,7 +10216,7 @@ static int bnxt_update_phy_setting(struct bnxt *bp)
 	/* The last close may have shutdown the link, so need to call
 	 * PHY_CFG to bring it back up.
 	 */
-	if (!bp->link_info.link_up)
+	if (!BNXT_LINK_IS_UP(bp))
 		update_link = true;
 
 	if (!bnxt_eee_config_ok(bp))
@@ -11437,7 +11448,7 @@ static void bnxt_timer(struct timer_list *t)
 	if (bp->fw_cap & BNXT_FW_CAP_ERROR_RECOVERY)
 		bnxt_fw_health_check(bp);
 
-	if (bp->link_info.link_up && bp->stats_coal_ticks) {
+	if (BNXT_LINK_IS_UP(bp) && bp->stats_coal_ticks) {
 		set_bit(BNXT_PERIODIC_STATS_SP_EVENT, &bp->sp_event);
 		bnxt_queue_sp_work(bp);
 	}
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 802ec1e9956d..5d743976cb35 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1175,7 +1175,11 @@ struct bnxt_link_info {
 #define BNXT_PHY_STATE_ENABLED		0
 #define BNXT_PHY_STATE_DISABLED		1
 
-	u8			link_up;
+	u8			link_state;
+#define BNXT_LINK_STATE_UNKNOWN	0
+#define BNXT_LINK_STATE_DOWN	1
+#define BNXT_LINK_STATE_UP	2
+#define BNXT_LINK_IS_UP(bp)	((bp)->link_info.link_state == BNXT_LINK_STATE_UP)
 	u8			duplex;
 #define BNXT_LINK_DUPLEX_HALF	PORT_PHY_QCFG_RESP_DUPLEX_STATE_HALF
 #define BNXT_LINK_DUPLEX_FULL	PORT_PHY_QCFG_RESP_DUPLEX_STATE_FULL
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 3927ceb581da..519edad70f16 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -2135,7 +2135,7 @@ static u32 bnxt_get_link(struct net_device *dev)
 	struct bnxt *bp = netdev_priv(dev);
 
 	/* TODO: handle MF, VF, driver close case */
-	return bp->link_info.link_up;
+	return BNXT_LINK_IS_UP(bp);
 }
 
 int bnxt_hwrm_nvm_get_dev_info(struct bnxt *bp,
@@ -3383,7 +3383,7 @@ static int bnxt_disable_an_for_lpbk(struct bnxt *bp,
 		return rc;
 
 	fw_speed = PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_1GB;
-	if (bp->link_info.link_up)
+	if (BNXT_LINK_IS_UP(bp))
 		fw_speed = bp->link_info.link_speed;
 	else if (fw_advertising & BNXT_LINK_SPEED_MSK_10GB)
 		fw_speed = PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_10GB;
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 5/9] bnxt_en: Properly report no pause support on some cards
  2022-03-05  8:54 [PATCH net-next 0/9] bnxt_en: Updates Michael Chan
                   ` (3 preceding siblings ...)
  2022-03-05  8:54 ` [PATCH net-next 4/9] bnxt_en: introduce initial link state of unknown Michael Chan
@ 2022-03-05  8:54 ` Michael Chan
  2022-03-05  8:54 ` [PATCH net-next 6/9] bnxt_en: Eliminate unintended link toggle during FW reset Michael Chan
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Michael Chan @ 2022-03-05  8:54 UTC (permalink / raw)
  To: davem; +Cc: netdev, kuba, gospo

[-- Attachment #1: Type: text/plain, Size: 4969 bytes --]

Some cards are configured to never support link pause or PFC.  Discover
these cards and properly report no pause support to ethtool.  Disable
PFC settings from DCBNL if PFC is unsupported.

Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c         |  2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h         |  6 ++++--
 drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c     |  3 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 15 ++++++++++-----
 4 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 1ed71b6fed8a..2280b189f3d6 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -9416,7 +9416,7 @@ static int bnxt_hwrm_phy_qcaps(struct bnxt *bp)
 	if (rc)
 		goto hwrm_phy_qcaps_exit;
 
-	bp->phy_flags = resp->flags;
+	bp->phy_flags = resp->flags | (le16_to_cpu(resp->flags2) << 8);
 	if (resp->flags & PORT_PHY_QCAPS_RESP_FLAGS_EEE_SUPPORTED) {
 		struct ethtool_eee *eee = &bp->eee;
 		u16 fw_speeds = le16_to_cpu(resp->supported_speeds_eee_mode);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 5d743976cb35..447a9406b8a2 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -2104,8 +2104,8 @@ struct bnxt {
 	u32			lpi_tmr_lo;
 	u32			lpi_tmr_hi;
 
-	/* copied from flags in hwrm_port_phy_qcaps_output */
-	u8			phy_flags;
+	/* copied from flags and flags2 in hwrm_port_phy_qcaps_output */
+	u32			phy_flags;
 #define BNXT_PHY_FL_EEE_CAP		PORT_PHY_QCAPS_RESP_FLAGS_EEE_SUPPORTED
 #define BNXT_PHY_FL_EXT_LPBK		PORT_PHY_QCAPS_RESP_FLAGS_EXTERNAL_LPBK_SUPPORTED
 #define BNXT_PHY_FL_AN_PHY_LPBK		PORT_PHY_QCAPS_RESP_FLAGS_AUTONEG_LPBK_SUPPORTED
@@ -2114,6 +2114,8 @@ struct bnxt {
 #define BNXT_PHY_FL_NO_PHY_LPBK		PORT_PHY_QCAPS_RESP_FLAGS_LOCAL_LPBK_NOT_SUPPORTED
 #define BNXT_PHY_FL_FW_MANAGED_LKDN	PORT_PHY_QCAPS_RESP_FLAGS_FW_MANAGED_LINK_DOWN
 #define BNXT_PHY_FL_NO_FCS		PORT_PHY_QCAPS_RESP_FLAGS_NO_FCS
+#define BNXT_PHY_FL_NO_PAUSE		(PORT_PHY_QCAPS_RESP_FLAGS2_PAUSE_UNSUPPORTED << 8)
+#define BNXT_PHY_FL_NO_PFC		(PORT_PHY_QCAPS_RESP_FLAGS2_PFC_UNSUPPORTED << 8)
 
 	u8			num_tests;
 	struct bnxt_test_info	*test_info;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c
index 217ff597cdf2..caab3d626a2a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c
@@ -627,7 +627,8 @@ static int bnxt_dcbnl_ieee_setpfc(struct net_device *dev, struct ieee_pfc *pfc)
 	int rc;
 
 	if (!(bp->dcbx_cap & DCB_CAP_DCBX_VER_IEEE) ||
-	    !(bp->dcbx_cap & DCB_CAP_DCBX_HOST))
+	    !(bp->dcbx_cap & DCB_CAP_DCBX_HOST) ||
+	    (bp->phy_flags & BNXT_PHY_FL_NO_PAUSE))
 		return -EINVAL;
 
 	if (!my_pfc) {
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 519edad70f16..7cc69957e529 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -1662,15 +1662,19 @@ static void bnxt_fw_to_ethtool_support_fec(struct bnxt_link_info *link_info,
 static void bnxt_fw_to_ethtool_support_spds(struct bnxt_link_info *link_info,
 				struct ethtool_link_ksettings *lk_ksettings)
 {
+	struct bnxt *bp = container_of(link_info, struct bnxt, link_info);
 	u16 fw_speeds = link_info->support_speeds;
 
 	BNXT_FW_TO_ETHTOOL_SPDS(fw_speeds, 0, lk_ksettings, supported);
 	fw_speeds = link_info->support_pam4_speeds;
 	BNXT_FW_TO_ETHTOOL_PAM4_SPDS(fw_speeds, lk_ksettings, supported);
 
-	ethtool_link_ksettings_add_link_mode(lk_ksettings, supported, Pause);
-	ethtool_link_ksettings_add_link_mode(lk_ksettings, supported,
-					     Asym_Pause);
+	if (!(bp->phy_flags & BNXT_PHY_FL_NO_PAUSE)) {
+		ethtool_link_ksettings_add_link_mode(lk_ksettings, supported,
+						     Pause);
+		ethtool_link_ksettings_add_link_mode(lk_ksettings, supported,
+						     Asym_Pause);
+	}
 
 	if (link_info->support_auto_speeds ||
 	    link_info->support_pam4_auto_speeds)
@@ -1901,7 +1905,8 @@ static int bnxt_set_link_ksettings(struct net_device *dev,
 		/* any change to autoneg will cause link change, therefore the
 		 * driver should put back the original pause setting in autoneg
 		 */
-		set_pause = true;
+		if (!(bp->phy_flags & BNXT_PHY_FL_NO_PAUSE))
+			set_pause = true;
 	} else {
 		u8 phy_type = link_info->phy_type;
 
@@ -2093,7 +2098,7 @@ static int bnxt_set_pauseparam(struct net_device *dev,
 	struct bnxt *bp = netdev_priv(dev);
 	struct bnxt_link_info *link_info = &bp->link_info;
 
-	if (!BNXT_PHY_CFG_ABLE(bp))
+	if (!BNXT_PHY_CFG_ABLE(bp) || (bp->phy_flags & BNXT_PHY_FL_NO_PAUSE))
 		return -EOPNOTSUPP;
 
 	mutex_lock(&bp->link_lock);
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 6/9] bnxt_en: Eliminate unintended link toggle during FW reset
  2022-03-05  8:54 [PATCH net-next 0/9] bnxt_en: Updates Michael Chan
                   ` (4 preceding siblings ...)
  2022-03-05  8:54 ` [PATCH net-next 5/9] bnxt_en: Properly report no pause support on some cards Michael Chan
@ 2022-03-05  8:54 ` Michael Chan
  2022-03-05  8:54 ` [PATCH net-next 7/9] bnxt_en: Do not destroy health reporters during reset Michael Chan
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Michael Chan @ 2022-03-05  8:54 UTC (permalink / raw)
  To: davem; +Cc: netdev, kuba, gospo

[-- Attachment #1: Type: text/plain, Size: 1528 bytes --]

If the flow control settings have been changed, a subsequent FW reset
may cause the ethernet link to toggle unnecessarily.  This link toggle
will increase the down time by a few seconds.

The problem is caused by bnxt_update_phy_setting() detecting a false
mismatch in the flow control settings between the stored software
settings and the current FW settings after the FW reset.  This mismatch
is caused by the AUTONEG bit added to link_info->req_flow_ctrl in an
inconsistent way in bnxt_set_pauseparam() in autoneg mode.  The AUTONEG
bit should not be added to link_info->req_flow_ctrl.

Reviewed-by: Colin Winegarden <colin.winegarden@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 7cc69957e529..eadaca42ed96 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -2109,9 +2109,7 @@ static int bnxt_set_pauseparam(struct net_device *dev,
 		}
 
 		link_info->autoneg |= BNXT_AUTONEG_FLOW_CTRL;
-		if (bp->hwrm_spec_code >= 0x10201)
-			link_info->req_flow_ctrl =
-				PORT_PHY_CFG_REQ_AUTO_PAUSE_AUTONEG_PAUSE;
+		link_info->req_flow_ctrl = 0;
 	} else {
 		/* when transition from auto pause to force pause,
 		 * force a link change
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 7/9] bnxt_en: Do not destroy health reporters during reset
  2022-03-05  8:54 [PATCH net-next 0/9] bnxt_en: Updates Michael Chan
                   ` (5 preceding siblings ...)
  2022-03-05  8:54 ` [PATCH net-next 6/9] bnxt_en: Eliminate unintended link toggle during FW reset Michael Chan
@ 2022-03-05  8:54 ` Michael Chan
  2022-03-05  8:54 ` [PATCH net-next 8/9] bnxt_en: implement hw health reporter Michael Chan
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Michael Chan @ 2022-03-05  8:54 UTC (permalink / raw)
  To: davem; +Cc: netdev, kuba, gospo

[-- Attachment #1: Type: text/plain, Size: 4330 bytes --]

From: Edwin Peer <edwin.peer@broadcom.com>

Health reporter state should be maintained over resets. Previously
reporters were destroyed if the device capabilities changed, but
since none of the reporters depend on capabilities anymore, this
logic should be removed.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     |  7 +--
 .../net/ethernet/broadcom/bnxt/bnxt_devlink.c | 44 +++++++++----------
 .../net/ethernet/broadcom/bnxt/bnxt_devlink.h |  2 +-
 3 files changed, 24 insertions(+), 29 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 2280b189f3d6..2de02950086f 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -12149,11 +12149,6 @@ int bnxt_fw_init_one(struct bnxt *bp)
 	if (rc)
 		return rc;
 
-	/* In case fw capabilities have changed, destroy the unneeded
-	 * reporters and create newly capable ones.
-	 */
-	bnxt_dl_fw_reporters_destroy(bp, false);
-	bnxt_dl_fw_reporters_create(bp);
 	bnxt_fw_init_one_p3(bp);
 	return 0;
 }
@@ -12982,7 +12977,7 @@ static void bnxt_remove_one(struct pci_dev *pdev)
 	cancel_delayed_work_sync(&bp->fw_reset_task);
 	bp->sp_event = 0;
 
-	bnxt_dl_fw_reporters_destroy(bp, true);
+	bnxt_dl_fw_reporters_destroy(bp);
 	bnxt_dl_unregister(bp);
 	bnxt_shutdown_tc(bp);
 
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
index f6e21fac0e69..0c17f90d44a2 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
@@ -241,37 +241,37 @@ static const struct devlink_health_reporter_ops bnxt_dl_fw_reporter_ops = {
 	.recover = bnxt_fw_recover,
 };
 
-void bnxt_dl_fw_reporters_create(struct bnxt *bp)
+static struct devlink_health_reporter *
+__bnxt_dl_reporter_create(struct bnxt *bp,
+			  const struct devlink_health_reporter_ops *ops)
 {
-	struct bnxt_fw_health *health = bp->fw_health;
-
-	if (!health || health->fw_reporter)
-		return;
+	struct devlink_health_reporter *reporter;
 
-	health->fw_reporter =
-		devlink_health_reporter_create(bp->dl, &bnxt_dl_fw_reporter_ops,
-					       0, bp);
-	if (IS_ERR(health->fw_reporter)) {
-		netdev_warn(bp->dev, "Failed to create FW health reporter, rc = %ld\n",
-			    PTR_ERR(health->fw_reporter));
-		health->fw_reporter = NULL;
-		bp->fw_cap &= ~BNXT_FW_CAP_ERROR_RECOVERY;
+	reporter = devlink_health_reporter_create(bp->dl, ops, 0, bp);
+	if (IS_ERR(reporter)) {
+		netdev_warn(bp->dev, "Failed to create %s health reporter, rc = %ld\n",
+			    ops->name, PTR_ERR(reporter));
+		return NULL;
 	}
+
+	return reporter;
 }
 
-void bnxt_dl_fw_reporters_destroy(struct bnxt *bp, bool all)
+void bnxt_dl_fw_reporters_create(struct bnxt *bp)
 {
-	struct bnxt_fw_health *health = bp->fw_health;
+	struct bnxt_fw_health *fw_health = bp->fw_health;
 
-	if (!health)
-		return;
+	if (fw_health && !fw_health->fw_reporter)
+		fw_health->fw_reporter = __bnxt_dl_reporter_create(bp, &bnxt_dl_fw_reporter_ops);
+}
 
-	if ((bp->fw_cap & BNXT_FW_CAP_ERROR_RECOVERY) && !all)
-		return;
+void bnxt_dl_fw_reporters_destroy(struct bnxt *bp)
+{
+	struct bnxt_fw_health *fw_health = bp->fw_health;
 
-	if (health->fw_reporter) {
-		devlink_health_reporter_destroy(health->fw_reporter);
-		health->fw_reporter = NULL;
+	if (fw_health && fw_health->fw_reporter) {
+		devlink_health_reporter_destroy(fw_health->fw_reporter);
+		fw_health->fw_reporter = NULL;
 	}
 }
 
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h
index a715458abc30..b8105065367b 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h
@@ -75,7 +75,7 @@ void bnxt_devlink_health_fw_report(struct bnxt *bp);
 void bnxt_dl_health_fw_status_update(struct bnxt *bp, bool healthy);
 void bnxt_dl_health_fw_recovery_done(struct bnxt *bp);
 void bnxt_dl_fw_reporters_create(struct bnxt *bp);
-void bnxt_dl_fw_reporters_destroy(struct bnxt *bp, bool all);
+void bnxt_dl_fw_reporters_destroy(struct bnxt *bp);
 int bnxt_dl_register(struct bnxt *bp);
 void bnxt_dl_unregister(struct bnxt *bp);
 
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 8/9] bnxt_en: implement hw health reporter
  2022-03-05  8:54 [PATCH net-next 0/9] bnxt_en: Updates Michael Chan
                   ` (6 preceding siblings ...)
  2022-03-05  8:54 ` [PATCH net-next 7/9] bnxt_en: Do not destroy health reporters during reset Michael Chan
@ 2022-03-05  8:54 ` Michael Chan
  2022-03-07 22:21   ` Jakub Kicinski
  2022-03-05  8:54 ` [PATCH net-next 9/9] bnxt_en: add an nvm test for hw diagnose Michael Chan
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 16+ messages in thread
From: Michael Chan @ 2022-03-05  8:54 UTC (permalink / raw)
  To: davem; +Cc: netdev, kuba, gospo

[-- Attachment #1: Type: text/plain, Size: 8365 bytes --]

From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>

This reporter will report NVM errors which are non-fatal.
When we receive these NVM error events, we'll report it
through this new hw health reporter.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 19 +++++
 drivers/net/ethernet/broadcom/bnxt/bnxt.h     | 33 +++++++++
 .../net/ethernet/broadcom/bnxt/bnxt_devlink.c | 73 +++++++++++++++++++
 .../net/ethernet/broadcom/bnxt/bnxt_devlink.h |  1 +
 4 files changed, 126 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 2de02950086f..63b8fc4f9d42 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -2061,6 +2061,22 @@ static void bnxt_event_error_report(struct bnxt *bp, u32 data1, u32 data2)
 	case ASYNC_EVENT_CMPL_ERROR_REPORT_BASE_EVENT_DATA1_ERROR_TYPE_DOORBELL_DROP_THRESHOLD:
 		netdev_warn(bp->dev, "One or more MMIO doorbells dropped by the device!\n");
 		break;
+	case ASYNC_EVENT_CMPL_ERROR_REPORT_BASE_EVENT_DATA1_ERROR_TYPE_NVM: {
+		struct bnxt_hw_health *hw_health = &bp->hw_health;
+
+		hw_health->nvm_err_address = EVENT_DATA2_NVM_ERR_ADDR(data2);
+		if (EVENT_DATA1_NVM_ERR_TYPE_WRITE(data1)) {
+			hw_health->synd = BNXT_HW_STATUS_NVM_WRITE_ERR;
+			hw_health->nvm_write_errors++;
+		} else if (EVENT_DATA1_NVM_ERR_TYPE_ERASE(data1)) {
+			hw_health->synd = BNXT_HW_STATUS_NVM_ERASE_ERR;
+			hw_health->nvm_erase_errors++;
+		} else {
+			hw_health->synd = BNXT_HW_STATUS_NVM_UNKNOWN_ERR;
+		}
+		set_bit(BNXT_FW_NVM_ERR_SP_EVENT, &bp->sp_event);
+		break;
+	}
 	default:
 		netdev_err(bp->dev, "FW reported unknown error type %u\n",
 			   err_type);
@@ -11887,6 +11903,9 @@ static void bnxt_sp_task(struct work_struct *work)
 	if (test_and_clear_bit(BNXT_FW_ECHO_REQUEST_SP_EVENT, &bp->sp_event))
 		bnxt_fw_echo_reply(bp);
 
+	if (test_and_clear_bit(BNXT_FW_NVM_ERR_SP_EVENT, &bp->sp_event))
+		bnxt_devlink_health_hw_report(bp);
+
 	/* These functions below will clear BNXT_STATE_IN_SP_TASK.  They
 	 * must be the last functions to be called before exiting.
 	 */
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 447a9406b8a2..fa0df43ddc1a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -516,6 +516,21 @@ struct rx_tpa_end_cmp_ext {
 	  ASYNC_EVENT_CMPL_ERROR_REPORT_INVALID_SIGNAL_EVENT_DATA2_PIN_ID_MASK) >>\
 	 ASYNC_EVENT_CMPL_ERROR_REPORT_INVALID_SIGNAL_EVENT_DATA2_PIN_ID_SFT)
 
+#define EVENT_DATA2_NVM_ERR_ADDR(data2)					\
+	(((data2) &							\
+	  ASYNC_EVENT_CMPL_ERROR_REPORT_NVM_EVENT_DATA2_ERR_ADDR_MASK) >>\
+	 ASYNC_EVENT_CMPL_ERROR_REPORT_NVM_EVENT_DATA2_ERR_ADDR_SFT)
+
+#define EVENT_DATA1_NVM_ERR_TYPE_WRITE(data1)				\
+	(((data1) &							\
+	  ASYNC_EVENT_CMPL_ERROR_REPORT_NVM_EVENT_DATA1_NVM_ERR_TYPE_MASK) ==\
+	 ASYNC_EVENT_CMPL_ERROR_REPORT_NVM_EVENT_DATA1_NVM_ERR_TYPE_WRITE)
+
+#define EVENT_DATA1_NVM_ERR_TYPE_ERASE(data1)				\
+	(((data1) &							\
+	  ASYNC_EVENT_CMPL_ERROR_REPORT_NVM_EVENT_DATA1_NVM_ERR_TYPE_MASK) ==\
+	 ASYNC_EVENT_CMPL_ERROR_REPORT_NVM_EVENT_DATA1_NVM_ERR_TYPE_ERASE)
+
 struct nqe_cn {
 	__le16	type;
 	#define NQ_CN_TYPE_MASK           0x3fUL
@@ -1528,6 +1543,21 @@ struct bnxt_ctx_mem_info {
 	struct bnxt_mem_init	mem_init[BNXT_CTX_MEM_INIT_MAX];
 };
 
+enum bnxt_hw_err {
+	BNXT_HW_STATUS_HEALTHY		= 0x0,
+	BNXT_HW_STATUS_NVM_WRITE_ERR	= 0x1,
+	BNXT_HW_STATUS_NVM_ERASE_ERR	= 0x2,
+	BNXT_HW_STATUS_NVM_UNKNOWN_ERR	= 0x3,
+};
+
+struct bnxt_hw_health {
+	u32 nvm_err_address;
+	u32 nvm_write_errors;
+	u32 nvm_erase_errors;
+	u8 synd;
+	struct devlink_health_reporter *hw_reporter;
+};
+
 enum bnxt_health_severity {
 	SEVERITY_NORMAL = 0,
 	SEVERITY_WARNING,
@@ -2045,6 +2075,7 @@ struct bnxt {
 #define BNXT_FW_EXCEPTION_SP_EVENT	19
 #define BNXT_LINK_CFG_CHANGE_SP_EVENT	21
 #define BNXT_FW_ECHO_REQUEST_SP_EVENT	23
+#define BNXT_FW_NVM_ERR_SP_EVENT	25
 
 	struct delayed_work	fw_reset_task;
 	int			fw_reset_state;
@@ -2145,6 +2176,8 @@ struct bnxt {
 	struct dentry		*debugfs_pdev;
 	struct device		*hwmon_dev;
 	enum board_idx		board_idx;
+
+	struct bnxt_hw_health	hw_health;
 };
 
 #define BNXT_NUM_RX_RING_STATS			8
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
index 0c17f90d44a2..a802bbda1c27 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
@@ -241,6 +241,69 @@ static const struct devlink_health_reporter_ops bnxt_dl_fw_reporter_ops = {
 	.recover = bnxt_fw_recover,
 };
 
+static int bnxt_hw_recover(struct devlink_health_reporter *reporter,
+			   void *priv_ctx,
+			   struct netlink_ext_ack *extack)
+{
+	struct bnxt *bp = devlink_health_reporter_priv(reporter);
+	struct bnxt_hw_health *hw_health = &bp->hw_health;
+
+	hw_health->synd = BNXT_HW_STATUS_HEALTHY;
+	return 0;
+}
+
+static const char *hw_err_str(u8 synd)
+{
+	switch (synd) {
+	case BNXT_HW_STATUS_HEALTHY:
+		return "healthy";
+	case BNXT_HW_STATUS_NVM_WRITE_ERR:
+		return "nvm write error";
+	case BNXT_HW_STATUS_NVM_ERASE_ERR:
+		return "nvm erase error";
+	case BNXT_HW_STATUS_NVM_UNKNOWN_ERR:
+		return "unrecognized nvm error";
+	default:
+		return "unknown hw error";
+	}
+}
+
+static int bnxt_hw_diagnose(struct devlink_health_reporter *reporter,
+			    struct devlink_fmsg *fmsg,
+			    struct netlink_ext_ack *extack)
+{
+	struct bnxt *bp = devlink_health_reporter_priv(reporter);
+	struct bnxt_hw_health *h = &bp->hw_health;
+	int rc;
+
+	rc = devlink_fmsg_string_pair_put(fmsg, "Status", hw_err_str(h->synd));
+	if (rc)
+		return rc;
+	rc = devlink_fmsg_u32_pair_put(fmsg, "nvm_write_errors", h->nvm_write_errors);
+	if (rc)
+		return rc;
+	rc = devlink_fmsg_u32_pair_put(fmsg, "nvm_erase_errors", h->nvm_erase_errors);
+	if (rc)
+		return rc;
+	return 0;
+}
+
+void bnxt_devlink_health_hw_report(struct bnxt *bp)
+{
+	struct bnxt_hw_health *hw_health = &bp->hw_health;
+
+	netdev_warn(bp->dev, "%s reported at address 0x%x\n", hw_err_str(hw_health->synd),
+		    hw_health->nvm_err_address);
+
+	devlink_health_report(hw_health->hw_reporter, hw_err_str(hw_health->synd), NULL);
+}
+
+static const struct devlink_health_reporter_ops bnxt_dl_hw_reporter_ops = {
+	.name = "hw",
+	.diagnose = bnxt_hw_diagnose,
+	.recover = bnxt_hw_recover,
+};
+
 static struct devlink_health_reporter *
 __bnxt_dl_reporter_create(struct bnxt *bp,
 			  const struct devlink_health_reporter_ops *ops)
@@ -260,6 +323,10 @@ __bnxt_dl_reporter_create(struct bnxt *bp,
 void bnxt_dl_fw_reporters_create(struct bnxt *bp)
 {
 	struct bnxt_fw_health *fw_health = bp->fw_health;
+	struct bnxt_hw_health *hw_health = &bp->hw_health;
+
+	if (!hw_health->hw_reporter)
+		hw_health->hw_reporter = __bnxt_dl_reporter_create(bp, &bnxt_dl_hw_reporter_ops);
 
 	if (fw_health && !fw_health->fw_reporter)
 		fw_health->fw_reporter = __bnxt_dl_reporter_create(bp, &bnxt_dl_fw_reporter_ops);
@@ -268,6 +335,12 @@ void bnxt_dl_fw_reporters_create(struct bnxt *bp)
 void bnxt_dl_fw_reporters_destroy(struct bnxt *bp)
 {
 	struct bnxt_fw_health *fw_health = bp->fw_health;
+	struct bnxt_hw_health *hw_health = &bp->hw_health;
+
+	if (hw_health->hw_reporter) {
+		devlink_health_reporter_destroy(hw_health->hw_reporter);
+		hw_health->hw_reporter = NULL;
+	}
 
 	if (fw_health && fw_health->fw_reporter) {
 		devlink_health_reporter_destroy(fw_health->fw_reporter);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h
index b8105065367b..056962e4b177 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h
@@ -74,6 +74,7 @@ enum bnxt_dl_version_type {
 void bnxt_devlink_health_fw_report(struct bnxt *bp);
 void bnxt_dl_health_fw_status_update(struct bnxt *bp, bool healthy);
 void bnxt_dl_health_fw_recovery_done(struct bnxt *bp);
+void bnxt_devlink_health_hw_report(struct bnxt *bp);
 void bnxt_dl_fw_reporters_create(struct bnxt *bp);
 void bnxt_dl_fw_reporters_destroy(struct bnxt *bp);
 int bnxt_dl_register(struct bnxt *bp);
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 9/9] bnxt_en: add an nvm test for hw diagnose
  2022-03-05  8:54 [PATCH net-next 0/9] bnxt_en: Updates Michael Chan
                   ` (7 preceding siblings ...)
  2022-03-05  8:54 ` [PATCH net-next 8/9] bnxt_en: implement hw health reporter Michael Chan
@ 2022-03-05  8:54 ` Michael Chan
  2022-03-07 22:24   ` Jakub Kicinski
  2022-03-05 11:20 ` [PATCH net-next 0/9] bnxt_en: Updates patchwork-bot+netdevbpf
  2022-03-07 22:27 ` Jakub Kicinski
  10 siblings, 1 reply; 16+ messages in thread
From: Michael Chan @ 2022-03-05  8:54 UTC (permalink / raw)
  To: davem; +Cc: netdev, kuba, gospo

[-- Attachment #1: Type: text/plain, Size: 8643 bytes --]

From: Vikas Gupta <vikas.gupta@broadcom.com>

Add an NVM test function for devlink hw reporter.
In this function an NVM VPD area is read followed by
a write. Test result is cached and if it is successful then
the next test can be conducted only after HW_RETEST_MIN_TIME to
avoid frequent writes to the NVM.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h     | 20 ++++-
 .../net/ethernet/broadcom/bnxt/bnxt_devlink.c | 82 ++++++++++++++++++-
 .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 22 ++---
 .../net/ethernet/broadcom/bnxt/bnxt_ethtool.h |  7 ++
 4 files changed, 113 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index fa0df43ddc1a..9dd878def3c2 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1544,17 +1544,29 @@ struct bnxt_ctx_mem_info {
 };
 
 enum bnxt_hw_err {
-	BNXT_HW_STATUS_HEALTHY		= 0x0,
-	BNXT_HW_STATUS_NVM_WRITE_ERR	= 0x1,
-	BNXT_HW_STATUS_NVM_ERASE_ERR	= 0x2,
-	BNXT_HW_STATUS_NVM_UNKNOWN_ERR	= 0x3,
+	BNXT_HW_STATUS_HEALTHY			= 0x0,
+	BNXT_HW_STATUS_NVM_WRITE_ERR		= 0x1,
+	BNXT_HW_STATUS_NVM_ERASE_ERR		= 0x2,
+	BNXT_HW_STATUS_NVM_UNKNOWN_ERR		= 0x3,
+	BNXT_HW_STATUS_NVM_TEST_VPD_ENT_ERR	= 0x4,
+	BNXT_HW_STATUS_NVM_TEST_VPD_READ_ERR	= 0x5,
+	BNXT_HW_STATUS_NVM_TEST_VPD_WRITE_ERR	= 0x6,
+	BNXT_HW_STATUS_NVM_TEST_INCMPL_ERR	= 0x7,
 };
 
 struct bnxt_hw_health {
 	u32 nvm_err_address;
 	u32 nvm_write_errors;
 	u32 nvm_erase_errors;
+	u32 nvm_test_vpd_ent_errors;
+	u32 nvm_test_vpd_read_errors;
+	u32 nvm_test_vpd_write_errors;
+	u32 nvm_test_incmpl_errors;
 	u8 synd;
+	/* max a test in a day if previous test was successful */
+#define HW_RETEST_MIN_TIME	(1000 * 3600 * 24)
+	u8 nvm_test_result;
+	unsigned long nvm_test_timestamp;
 	struct devlink_health_reporter *hw_reporter;
 };
 
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
index a802bbda1c27..77e55105d645 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
@@ -20,6 +20,7 @@
 #include "bnxt_ulp.h"
 #include "bnxt_ptp.h"
 #include "bnxt_coredump.h"
+#include "bnxt_nvm_defs.h"	/* NVRAM content constant and structure defs */
 
 static void __bnxt_fw_recover(struct bnxt *bp)
 {
@@ -263,20 +264,82 @@ static const char *hw_err_str(u8 synd)
 		return "nvm erase error";
 	case BNXT_HW_STATUS_NVM_UNKNOWN_ERR:
 		return "unrecognized nvm error";
+	case BNXT_HW_STATUS_NVM_TEST_VPD_ENT_ERR:
+		return "nvm test vpd entry error";
+	case BNXT_HW_STATUS_NVM_TEST_VPD_READ_ERR:
+		return "nvm test vpd read error";
+	case BNXT_HW_STATUS_NVM_TEST_VPD_WRITE_ERR:
+		return "nvm test vpd write error";
+	case BNXT_HW_STATUS_NVM_TEST_INCMPL_ERR:
+		return "nvm test incomplete error";
 	default:
 		return "unknown hw error";
 	}
 }
 
+static void bnxt_nvm_test(struct bnxt *bp)
+{
+	struct bnxt_hw_health *h = &bp->hw_health;
+	u32 datalen;
+	u16 index;
+	u8 *buf;
+
+	if (!h->nvm_test_result) {
+		if (!h->nvm_test_timestamp ||
+		    time_after(jiffies, h->nvm_test_timestamp +
+					msecs_to_jiffies(HW_RETEST_MIN_TIME)))
+			h->nvm_test_timestamp = jiffies;
+		else
+			return;
+	}
+
+	if (bnxt_find_nvram_item(bp->dev, BNX_DIR_TYPE_VPD,
+				 BNX_DIR_ORDINAL_FIRST, BNX_DIR_EXT_NONE,
+				 &index, NULL, &datalen) || !datalen) {
+		h->nvm_test_result = BNXT_HW_STATUS_NVM_TEST_VPD_ENT_ERR;
+		h->nvm_test_vpd_ent_errors++;
+		return;
+	}
+
+	buf = kzalloc(datalen, GFP_KERNEL);
+	if (!buf) {
+		h->nvm_test_result = BNXT_HW_STATUS_NVM_TEST_INCMPL_ERR;
+		h->nvm_test_incmpl_errors++;
+		return;
+	}
+
+	if (bnxt_get_nvram_item(bp->dev, index, 0, datalen, buf)) {
+		h->nvm_test_result = BNXT_HW_STATUS_NVM_TEST_VPD_READ_ERR;
+		h->nvm_test_vpd_read_errors++;
+		goto err;
+	}
+
+	if (bnxt_flash_nvram(bp->dev, BNX_DIR_TYPE_VPD, BNX_DIR_ORDINAL_FIRST,
+			     BNX_DIR_EXT_NONE, 0, 0, buf, datalen)) {
+		h->nvm_test_result = BNXT_HW_STATUS_NVM_TEST_VPD_WRITE_ERR;
+		h->nvm_test_vpd_write_errors++;
+	}
+
+err:
+	kfree(buf);
+}
+
 static int bnxt_hw_diagnose(struct devlink_health_reporter *reporter,
 			    struct devlink_fmsg *fmsg,
 			    struct netlink_ext_ack *extack)
 {
 	struct bnxt *bp = devlink_health_reporter_priv(reporter);
 	struct bnxt_hw_health *h = &bp->hw_health;
+	u8 synd = h->synd;
 	int rc;
 
-	rc = devlink_fmsg_string_pair_put(fmsg, "Status", hw_err_str(h->synd));
+	bnxt_nvm_test(bp);
+	if (h->nvm_test_result) {
+		synd = h->nvm_test_result;
+		devlink_health_report(h->hw_reporter, hw_err_str(synd), NULL);
+	}
+
+	rc = devlink_fmsg_string_pair_put(fmsg, "Status", hw_err_str(synd));
 	if (rc)
 		return rc;
 	rc = devlink_fmsg_u32_pair_put(fmsg, "nvm_write_errors", h->nvm_write_errors);
@@ -285,6 +348,23 @@ static int bnxt_hw_diagnose(struct devlink_health_reporter *reporter,
 	rc = devlink_fmsg_u32_pair_put(fmsg, "nvm_erase_errors", h->nvm_erase_errors);
 	if (rc)
 		return rc;
+	rc = devlink_fmsg_u32_pair_put(fmsg, "nvm_test_vpd_ent_errors",
+				       h->nvm_test_vpd_ent_errors);
+	if (rc)
+		return rc;
+	rc = devlink_fmsg_u32_pair_put(fmsg, "nvm_test_vpd_read_errors",
+				       h->nvm_test_vpd_read_errors);
+	if (rc)
+		return rc;
+	rc = devlink_fmsg_u32_pair_put(fmsg, "nvm_test_vpd_write_errors",
+				       h->nvm_test_vpd_write_errors);
+	if (rc)
+		return rc;
+	rc = devlink_fmsg_u32_pair_put(fmsg, "nvm_test_incomplete_errors",
+				       h->nvm_test_incmpl_errors);
+	if (rc)
+		return rc;
+
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index eadaca42ed96..178074795b27 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -2168,14 +2168,10 @@ static void bnxt_print_admin_err(struct bnxt *bp)
 	netdev_info(bp->dev, "PF does not have admin privileges to flash or reset the device\n");
 }
 
-static int bnxt_find_nvram_item(struct net_device *dev, u16 type, u16 ordinal,
-				u16 ext, u16 *index, u32 *item_length,
-				u32 *data_length);
-
-static int bnxt_flash_nvram(struct net_device *dev, u16 dir_type,
-			    u16 dir_ordinal, u16 dir_ext, u16 dir_attr,
-			    u32 dir_item_len, const u8 *data,
-			    size_t data_len)
+int bnxt_flash_nvram(struct net_device *dev, u16 dir_type,
+		     u16 dir_ordinal, u16 dir_ext, u16 dir_attr,
+		     u32 dir_item_len, const u8 *data,
+		     size_t data_len)
 {
 	struct bnxt *bp = netdev_priv(dev);
 	struct hwrm_nvm_write_input *req;
@@ -2819,8 +2815,8 @@ static int bnxt_get_nvram_directory(struct net_device *dev, u32 len, u8 *data)
 	return rc;
 }
 
-static int bnxt_get_nvram_item(struct net_device *dev, u32 index, u32 offset,
-			       u32 length, u8 *data)
+int bnxt_get_nvram_item(struct net_device *dev, u32 index, u32 offset,
+			u32 length, u8 *data)
 {
 	struct bnxt *bp = netdev_priv(dev);
 	int rc;
@@ -2854,9 +2850,9 @@ static int bnxt_get_nvram_item(struct net_device *dev, u32 index, u32 offset,
 	return rc;
 }
 
-static int bnxt_find_nvram_item(struct net_device *dev, u16 type, u16 ordinal,
-				u16 ext, u16 *index, u32 *item_length,
-				u32 *data_length)
+int bnxt_find_nvram_item(struct net_device *dev, u16 type, u16 ordinal,
+			 u16 ext, u16 *index, u32 *item_length,
+			 u32 *data_length)
 {
 	struct hwrm_nvm_find_dir_entry_output *output;
 	struct hwrm_nvm_find_dir_entry_input *req;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h
index 6aa44840f13a..2593e0049582 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h
@@ -56,6 +56,13 @@ int bnxt_hwrm_firmware_reset(struct net_device *dev, u8 proc_type,
 int bnxt_flash_package_from_fw_obj(struct net_device *dev, const struct firmware *fw,
 				   u32 install_type);
 int bnxt_get_pkginfo(struct net_device *dev, char *ver, int size);
+int bnxt_find_nvram_item(struct net_device *dev, u16 type, u16 ordinal, u16 ext,
+			 u16 *index, u32 *item_length, u32 *data_length);
+int bnxt_get_nvram_item(struct net_device *dev, u32 index, u32 offset,
+			u32 length, u8 *data);
+int bnxt_flash_nvram(struct net_device *dev, u16 dir_type, u16 dir_ordinal,
+		     u16 dir_ext, u16 dir_attr, u32 dir_item_len,
+		     const u8 *data, size_t data_len);
 void bnxt_ethtool_init(struct bnxt *bp);
 void bnxt_ethtool_free(struct bnxt *bp);
 
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 0/9] bnxt_en: Updates.
  2022-03-05  8:54 [PATCH net-next 0/9] bnxt_en: Updates Michael Chan
                   ` (8 preceding siblings ...)
  2022-03-05  8:54 ` [PATCH net-next 9/9] bnxt_en: add an nvm test for hw diagnose Michael Chan
@ 2022-03-05 11:20 ` patchwork-bot+netdevbpf
  2022-03-07 22:27 ` Jakub Kicinski
  10 siblings, 0 replies; 16+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-03-05 11:20 UTC (permalink / raw)
  To: Michael Chan; +Cc: davem, netdev, kuba, gospo

Hello:

This series was applied to netdev/net-next.git (master)
by David S. Miller <davem@davemloft.net>:

On Sat,  5 Mar 2022 03:54:33 -0500 you wrote:
> This patch series contains mainly NVRAM related features.  More
> NVRAM error checking and logging are added when installing firmware
> packages.  A new devlink hw health report is now added to report
> and diagnose NVRAM issues.  Other miscellaneous patches include
> reporting correctly cards that don't support link pause, adding
> an internal unknown link state, and avoiding unnecessary link
> toggle during firmware reset.
> 
> [...]

Here is the summary with links:
  - [net-next,1/9] bnxt_en: refactor error handling of HWRM_NVM_INSTALL_UPDATE
    https://git.kernel.org/netdev/net-next/c/8e42aef0b730
  - [net-next,2/9] bnxt_en: add more error checks to HWRM_NVM_INSTALL_UPDATE
    https://git.kernel.org/netdev/net-next/c/54ff1e3e8fc3
  - [net-next,3/9] bnxt_en: parse result field when NVRAM package install fails
    https://git.kernel.org/netdev/net-next/c/02acd399533e
  - [net-next,4/9] bnxt_en: introduce initial link state of unknown
    https://git.kernel.org/netdev/net-next/c/0f5a4841f2ec
  - [net-next,5/9] bnxt_en: Properly report no pause support on some cards
    https://git.kernel.org/netdev/net-next/c/9a3bc77ec65e
  - [net-next,6/9] bnxt_en: Eliminate unintended link toggle during FW reset
    https://git.kernel.org/netdev/net-next/c/7c492a2530c1
  - [net-next,7/9] bnxt_en: Do not destroy health reporters during reset
    https://git.kernel.org/netdev/net-next/c/f16a91692866
  - [net-next,8/9] bnxt_en: implement hw health reporter
    https://git.kernel.org/netdev/net-next/c/bafed3f231f7
  - [net-next,9/9] bnxt_en: add an nvm test for hw diagnose
    https://git.kernel.org/netdev/net-next/c/22f5dba5065d

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 3/9] bnxt_en: parse result field when NVRAM package install fails
  2022-03-05  8:54 ` [PATCH net-next 3/9] bnxt_en: parse result field when NVRAM package install fails Michael Chan
@ 2022-03-07 22:13   ` Jakub Kicinski
  0 siblings, 0 replies; 16+ messages in thread
From: Jakub Kicinski @ 2022-03-07 22:13 UTC (permalink / raw)
  To: Michael Chan; +Cc: davem, netdev, gospo

On Sat,  5 Mar 2022 03:54:36 -0500 Michael Chan wrote:
> From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> 
> Instead of always returning -ENOPKG, decode the firmware error
> code further when the HWRM_NVM_INSTALL_UPDATE firmware call fails.
> Return a more suitable error code to userspace and log an error
> in dmesg.

devlink fw flashing allows for the msg to be reported directly 
to the user, that's more friendly than having to scan logs.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 8/9] bnxt_en: implement hw health reporter
  2022-03-05  8:54 ` [PATCH net-next 8/9] bnxt_en: implement hw health reporter Michael Chan
@ 2022-03-07 22:21   ` Jakub Kicinski
  0 siblings, 0 replies; 16+ messages in thread
From: Jakub Kicinski @ 2022-03-07 22:21 UTC (permalink / raw)
  To: Michael Chan; +Cc: davem, netdev, gospo

On Sat,  5 Mar 2022 03:54:41 -0500 Michael Chan wrote:
> +static int bnxt_hw_recover(struct devlink_health_reporter *reporter,
> +			   void *priv_ctx,
> +			   struct netlink_ext_ack *extack)
> +{
> +	struct bnxt *bp = devlink_health_reporter_priv(reporter);
> +	struct bnxt_hw_health *hw_health = &bp->hw_health;
> +
> +	hw_health->synd = BNXT_HW_STATUS_HEALTHY;
> +	return 0;
> +}

This seems like a very weird recovery function.

I guess the FW automatically does auto-recovery and the driver 
has no control over it?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 9/9] bnxt_en: add an nvm test for hw diagnose
  2022-03-05  8:54 ` [PATCH net-next 9/9] bnxt_en: add an nvm test for hw diagnose Michael Chan
@ 2022-03-07 22:24   ` Jakub Kicinski
  0 siblings, 0 replies; 16+ messages in thread
From: Jakub Kicinski @ 2022-03-07 22:24 UTC (permalink / raw)
  To: Michael Chan, Jiri Pirko; +Cc: davem, netdev, gospo, eranbe

On Sat,  5 Mar 2022 03:54:42 -0500 Michael Chan wrote:
> From: Vikas Gupta <vikas.gupta@broadcom.com>
> 
> Add an NVM test function for devlink hw reporter.
> In this function an NVM VPD area is read followed by
> a write. Test result is cached and if it is successful then
> the next test can be conducted only after HW_RETEST_MIN_TIME to
> avoid frequent writes to the NVM.

You seem to execute a self-test from the .diganose callback.
That really seems like an abuse of the API. It's not hard to
add a separate self-test callback.

Jiri, WDYT?

> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> index fa0df43ddc1a..9dd878def3c2 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> @@ -1544,17 +1544,29 @@ struct bnxt_ctx_mem_info {
>  };
>  
>  enum bnxt_hw_err {
> -	BNXT_HW_STATUS_HEALTHY		= 0x0,
> -	BNXT_HW_STATUS_NVM_WRITE_ERR	= 0x1,
> -	BNXT_HW_STATUS_NVM_ERASE_ERR	= 0x2,
> -	BNXT_HW_STATUS_NVM_UNKNOWN_ERR	= 0x3,
> +	BNXT_HW_STATUS_HEALTHY			= 0x0,
> +	BNXT_HW_STATUS_NVM_WRITE_ERR		= 0x1,
> +	BNXT_HW_STATUS_NVM_ERASE_ERR		= 0x2,
> +	BNXT_HW_STATUS_NVM_UNKNOWN_ERR		= 0x3,
> +	BNXT_HW_STATUS_NVM_TEST_VPD_ENT_ERR	= 0x4,
> +	BNXT_HW_STATUS_NVM_TEST_VPD_READ_ERR	= 0x5,
> +	BNXT_HW_STATUS_NVM_TEST_VPD_WRITE_ERR	= 0x6,
> +	BNXT_HW_STATUS_NVM_TEST_INCMPL_ERR	= 0x7,
>  };
>  
>  struct bnxt_hw_health {
>  	u32 nvm_err_address;
>  	u32 nvm_write_errors;
>  	u32 nvm_erase_errors;
> +	u32 nvm_test_vpd_ent_errors;
> +	u32 nvm_test_vpd_read_errors;
> +	u32 nvm_test_vpd_write_errors;
> +	u32 nvm_test_incmpl_errors;
>  	u8 synd;
> +	/* max a test in a day if previous test was successful */
> +#define HW_RETEST_MIN_TIME	(1000 * 3600 * 24)
> +	u8 nvm_test_result;
> +	unsigned long nvm_test_timestamp;
>  	struct devlink_health_reporter *hw_reporter;
>  };
>  
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
> index a802bbda1c27..77e55105d645 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
> @@ -20,6 +20,7 @@
>  #include "bnxt_ulp.h"
>  #include "bnxt_ptp.h"
>  #include "bnxt_coredump.h"
> +#include "bnxt_nvm_defs.h"	/* NVRAM content constant and structure defs */
>  
>  static void __bnxt_fw_recover(struct bnxt *bp)
>  {
> @@ -263,20 +264,82 @@ static const char *hw_err_str(u8 synd)
>  		return "nvm erase error";
>  	case BNXT_HW_STATUS_NVM_UNKNOWN_ERR:
>  		return "unrecognized nvm error";
> +	case BNXT_HW_STATUS_NVM_TEST_VPD_ENT_ERR:
> +		return "nvm test vpd entry error";
> +	case BNXT_HW_STATUS_NVM_TEST_VPD_READ_ERR:
> +		return "nvm test vpd read error";
> +	case BNXT_HW_STATUS_NVM_TEST_VPD_WRITE_ERR:
> +		return "nvm test vpd write error";
> +	case BNXT_HW_STATUS_NVM_TEST_INCMPL_ERR:
> +		return "nvm test incomplete error";
>  	default:
>  		return "unknown hw error";
>  	}
>  }
>  
> +static void bnxt_nvm_test(struct bnxt *bp)
> +{
> +	struct bnxt_hw_health *h = &bp->hw_health;
> +	u32 datalen;
> +	u16 index;
> +	u8 *buf;
> +
> +	if (!h->nvm_test_result) {
> +		if (!h->nvm_test_timestamp ||
> +		    time_after(jiffies, h->nvm_test_timestamp +
> +					msecs_to_jiffies(HW_RETEST_MIN_TIME)))
> +			h->nvm_test_timestamp = jiffies;
> +		else
> +			return;
> +	}
> +
> +	if (bnxt_find_nvram_item(bp->dev, BNX_DIR_TYPE_VPD,
> +				 BNX_DIR_ORDINAL_FIRST, BNX_DIR_EXT_NONE,
> +				 &index, NULL, &datalen) || !datalen) {
> +		h->nvm_test_result = BNXT_HW_STATUS_NVM_TEST_VPD_ENT_ERR;
> +		h->nvm_test_vpd_ent_errors++;
> +		return;
> +	}
> +
> +	buf = kzalloc(datalen, GFP_KERNEL);
> +	if (!buf) {
> +		h->nvm_test_result = BNXT_HW_STATUS_NVM_TEST_INCMPL_ERR;
> +		h->nvm_test_incmpl_errors++;
> +		return;
> +	}
> +
> +	if (bnxt_get_nvram_item(bp->dev, index, 0, datalen, buf)) {
> +		h->nvm_test_result = BNXT_HW_STATUS_NVM_TEST_VPD_READ_ERR;
> +		h->nvm_test_vpd_read_errors++;
> +		goto err;
> +	}
> +
> +	if (bnxt_flash_nvram(bp->dev, BNX_DIR_TYPE_VPD, BNX_DIR_ORDINAL_FIRST,
> +			     BNX_DIR_EXT_NONE, 0, 0, buf, datalen)) {
> +		h->nvm_test_result = BNXT_HW_STATUS_NVM_TEST_VPD_WRITE_ERR;
> +		h->nvm_test_vpd_write_errors++;
> +	}
> +
> +err:
> +	kfree(buf);
> +}
> +
>  static int bnxt_hw_diagnose(struct devlink_health_reporter *reporter,
>  			    struct devlink_fmsg *fmsg,
>  			    struct netlink_ext_ack *extack)
>  {
>  	struct bnxt *bp = devlink_health_reporter_priv(reporter);
>  	struct bnxt_hw_health *h = &bp->hw_health;
> +	u8 synd = h->synd;
>  	int rc;
>  
> -	rc = devlink_fmsg_string_pair_put(fmsg, "Status", hw_err_str(h->synd));
> +	bnxt_nvm_test(bp);
> +	if (h->nvm_test_result) {
> +		synd = h->nvm_test_result;
> +		devlink_health_report(h->hw_reporter, hw_err_str(synd), NULL);
> +	}
> +
> +	rc = devlink_fmsg_string_pair_put(fmsg, "Status", hw_err_str(synd));
>  	if (rc)
>  		return rc;
>  	rc = devlink_fmsg_u32_pair_put(fmsg, "nvm_write_errors", h->nvm_write_errors);
> @@ -285,6 +348,23 @@ static int bnxt_hw_diagnose(struct devlink_health_reporter *reporter,
>  	rc = devlink_fmsg_u32_pair_put(fmsg, "nvm_erase_errors", h->nvm_erase_errors);
>  	if (rc)
>  		return rc;
> +	rc = devlink_fmsg_u32_pair_put(fmsg, "nvm_test_vpd_ent_errors",
> +				       h->nvm_test_vpd_ent_errors);
> +	if (rc)
> +		return rc;
> +	rc = devlink_fmsg_u32_pair_put(fmsg, "nvm_test_vpd_read_errors",
> +				       h->nvm_test_vpd_read_errors);
> +	if (rc)
> +		return rc;
> +	rc = devlink_fmsg_u32_pair_put(fmsg, "nvm_test_vpd_write_errors",
> +				       h->nvm_test_vpd_write_errors);
> +	if (rc)
> +		return rc;
> +	rc = devlink_fmsg_u32_pair_put(fmsg, "nvm_test_incomplete_errors",
> +				       h->nvm_test_incmpl_errors);
> +	if (rc)
> +		return rc;
> +
>  	return 0;
>  }
>  
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
> index eadaca42ed96..178074795b27 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
> @@ -2168,14 +2168,10 @@ static void bnxt_print_admin_err(struct bnxt *bp)
>  	netdev_info(bp->dev, "PF does not have admin privileges to flash or reset the device\n");
>  }
>  
> -static int bnxt_find_nvram_item(struct net_device *dev, u16 type, u16 ordinal,
> -				u16 ext, u16 *index, u32 *item_length,
> -				u32 *data_length);
> -
> -static int bnxt_flash_nvram(struct net_device *dev, u16 dir_type,
> -			    u16 dir_ordinal, u16 dir_ext, u16 dir_attr,
> -			    u32 dir_item_len, const u8 *data,
> -			    size_t data_len)
> +int bnxt_flash_nvram(struct net_device *dev, u16 dir_type,
> +		     u16 dir_ordinal, u16 dir_ext, u16 dir_attr,
> +		     u32 dir_item_len, const u8 *data,
> +		     size_t data_len)
>  {
>  	struct bnxt *bp = netdev_priv(dev);
>  	struct hwrm_nvm_write_input *req;
> @@ -2819,8 +2815,8 @@ static int bnxt_get_nvram_directory(struct net_device *dev, u32 len, u8 *data)
>  	return rc;
>  }
>  
> -static int bnxt_get_nvram_item(struct net_device *dev, u32 index, u32 offset,
> -			       u32 length, u8 *data)
> +int bnxt_get_nvram_item(struct net_device *dev, u32 index, u32 offset,
> +			u32 length, u8 *data)
>  {
>  	struct bnxt *bp = netdev_priv(dev);
>  	int rc;
> @@ -2854,9 +2850,9 @@ static int bnxt_get_nvram_item(struct net_device *dev, u32 index, u32 offset,
>  	return rc;
>  }
>  
> -static int bnxt_find_nvram_item(struct net_device *dev, u16 type, u16 ordinal,
> -				u16 ext, u16 *index, u32 *item_length,
> -				u32 *data_length)
> +int bnxt_find_nvram_item(struct net_device *dev, u16 type, u16 ordinal,
> +			 u16 ext, u16 *index, u32 *item_length,
> +			 u32 *data_length)
>  {
>  	struct hwrm_nvm_find_dir_entry_output *output;
>  	struct hwrm_nvm_find_dir_entry_input *req;
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h
> index 6aa44840f13a..2593e0049582 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h
> @@ -56,6 +56,13 @@ int bnxt_hwrm_firmware_reset(struct net_device *dev, u8 proc_type,
>  int bnxt_flash_package_from_fw_obj(struct net_device *dev, const struct firmware *fw,
>  				   u32 install_type);
>  int bnxt_get_pkginfo(struct net_device *dev, char *ver, int size);
> +int bnxt_find_nvram_item(struct net_device *dev, u16 type, u16 ordinal, u16 ext,
> +			 u16 *index, u32 *item_length, u32 *data_length);
> +int bnxt_get_nvram_item(struct net_device *dev, u32 index, u32 offset,
> +			u32 length, u8 *data);
> +int bnxt_flash_nvram(struct net_device *dev, u16 dir_type, u16 dir_ordinal,
> +		     u16 dir_ext, u16 dir_attr, u32 dir_item_len,
> +		     const u8 *data, size_t data_len);
>  void bnxt_ethtool_init(struct bnxt *bp);
>  void bnxt_ethtool_free(struct bnxt *bp);
>  


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 0/9] bnxt_en: Updates.
  2022-03-05  8:54 [PATCH net-next 0/9] bnxt_en: Updates Michael Chan
                   ` (9 preceding siblings ...)
  2022-03-05 11:20 ` [PATCH net-next 0/9] bnxt_en: Updates patchwork-bot+netdevbpf
@ 2022-03-07 22:27 ` Jakub Kicinski
  2022-03-07 23:56   ` Michael Chan
  10 siblings, 1 reply; 16+ messages in thread
From: Jakub Kicinski @ 2022-03-07 22:27 UTC (permalink / raw)
  To: Michael Chan; +Cc: davem, netdev, gospo

On Sat,  5 Mar 2022 03:54:33 -0500 Michael Chan wrote:
> This patch series contains mainly NVRAM related features.  More
> NVRAM error checking and logging are added when installing firmware
> packages.  A new devlink hw health report is now added to report
> and diagnose NVRAM issues.  Other miscellaneous patches include
> reporting correctly cards that don't support link pause, adding
> an internal unknown link state, and avoiding unnecessary link
> toggle during firmware reset.

The devlink parts are missing documentation, I'm not sure how it's
supposed to operate and AFAICT it's not just a normal health reporter
implementation.

Please stop posting patches during the weekend.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 0/9] bnxt_en: Updates.
  2022-03-07 22:27 ` Jakub Kicinski
@ 2022-03-07 23:56   ` Michael Chan
  0 siblings, 0 replies; 16+ messages in thread
From: Michael Chan @ 2022-03-07 23:56 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: David Miller, Netdev, Andrew Gospodarek

[-- Attachment #1: Type: text/plain, Size: 1002 bytes --]

On Mon, Mar 7, 2022 at 2:27 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Sat,  5 Mar 2022 03:54:33 -0500 Michael Chan wrote:
> > This patch series contains mainly NVRAM related features.  More
> > NVRAM error checking and logging are added when installing firmware
> > packages.  A new devlink hw health report is now added to report
> > and diagnose NVRAM issues.  Other miscellaneous patches include
> > reporting correctly cards that don't support link pause, adding
> > an internal unknown link state, and avoiding unnecessary link
> > toggle during firmware reset.
>
> The devlink parts are missing documentation, I'm not sure how it's
> supposed to operate and AFAICT it's not just a normal health reporter
> implementation.

Sorry, missed the documentation update.  My colleagues will respond to
your other comments and we will add the documentation.

>
> Please stop posting patches during the weekend.

This is the first time I'm hearing that patches cannot be posted
during the weekend.

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-03-07 23:56 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-05  8:54 [PATCH net-next 0/9] bnxt_en: Updates Michael Chan
2022-03-05  8:54 ` [PATCH net-next 1/9] bnxt_en: refactor error handling of HWRM_NVM_INSTALL_UPDATE Michael Chan
2022-03-05  8:54 ` [PATCH net-next 2/9] bnxt_en: add more error checks to HWRM_NVM_INSTALL_UPDATE Michael Chan
2022-03-05  8:54 ` [PATCH net-next 3/9] bnxt_en: parse result field when NVRAM package install fails Michael Chan
2022-03-07 22:13   ` Jakub Kicinski
2022-03-05  8:54 ` [PATCH net-next 4/9] bnxt_en: introduce initial link state of unknown Michael Chan
2022-03-05  8:54 ` [PATCH net-next 5/9] bnxt_en: Properly report no pause support on some cards Michael Chan
2022-03-05  8:54 ` [PATCH net-next 6/9] bnxt_en: Eliminate unintended link toggle during FW reset Michael Chan
2022-03-05  8:54 ` [PATCH net-next 7/9] bnxt_en: Do not destroy health reporters during reset Michael Chan
2022-03-05  8:54 ` [PATCH net-next 8/9] bnxt_en: implement hw health reporter Michael Chan
2022-03-07 22:21   ` Jakub Kicinski
2022-03-05  8:54 ` [PATCH net-next 9/9] bnxt_en: add an nvm test for hw diagnose Michael Chan
2022-03-07 22:24   ` Jakub Kicinski
2022-03-05 11:20 ` [PATCH net-next 0/9] bnxt_en: Updates patchwork-bot+netdevbpf
2022-03-07 22:27 ` Jakub Kicinski
2022-03-07 23:56   ` Michael Chan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).