netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net 0/9] bnxt_en: Bug fixes
@ 2021-07-18 19:36 Michael Chan
  2021-07-18 19:36 ` [PATCH net 1/9] bnxt_en: don't disable an already disabled PCI device Michael Chan
                   ` (9 more replies)
  0 siblings, 10 replies; 14+ messages in thread
From: Michael Chan @ 2021-07-18 19:36 UTC (permalink / raw)
  To: davem; +Cc: netdev, kuba, gospo

[-- Attachment #1: Type: text/plain, Size: 1262 bytes --]

Most of the fixes in this series have to do with error recovery.  They
include error path handling when the error recovery has to abort, and
the rediscovery of capabilities (PTP and RoCE) after firmware reset
that may result in capability changes.

Two other fixes are to reject invalid ETS settings and to validate
VLAN protocol in the RX path.

Edwin Peer (1):
  bnxt_en: reject ETS settings that will starve a TC

Kalesh AP (1):
  bnxt_en: don't disable an already disabled PCI device

Michael Chan (5):
  bnxt_en: Refresh RoCE capabilities in bnxt_ulp_probe()
  bnxt_en: Add missing check for BNXT_STATE_ABORT_ERR in
    bnxt_fw_rset_task()
  bnxt_en: Validate vlan protocol ID on RX packets
  bnxt_en: Move bnxt_ptp_init() to bnxt_open()
  bnxt_en: Fix PTP capability discovery

Somnath Kotur (2):
  bnxt_en: fix error path of FW reset
  bnxt_en: Check abort error state in bnxt_half_open_nic()

 drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 85 ++++++++++++++-----
 drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c | 10 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c | 24 ++----
 drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h |  1 -
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c |  9 +-
 5 files changed, 83 insertions(+), 46 deletions(-)

-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH net 1/9] bnxt_en: don't disable an already disabled PCI device
  2021-07-18 19:36 [PATCH net 0/9] bnxt_en: Bug fixes Michael Chan
@ 2021-07-18 19:36 ` Michael Chan
  2021-07-18 19:36 ` [PATCH net 2/9] bnxt_en: reject ETS settings that will starve a TC Michael Chan
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Michael Chan @ 2021-07-18 19:36 UTC (permalink / raw)
  To: davem; +Cc: netdev, kuba, gospo

[-- Attachment #1: Type: text/plain, Size: 1137 bytes --]

From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>

If device is already disabled in reset path and PCI io error is
detected before the device could be enabled, driver could
call pci_disable_device() for already disabled device. Fix this
problem by calling pci_disable_device() only if the device is already
enabled.

Fixes: 6316ea6db93d ("bnxt_en: Enable AER support.")
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index f56245eeef7b..fdfb75a1608d 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -13436,7 +13436,8 @@ static pci_ers_result_t bnxt_io_error_detected(struct pci_dev *pdev,
 	if (netif_running(netdev))
 		bnxt_close(netdev);
 
-	pci_disable_device(pdev);
+	if (pci_is_enabled(pdev))
+		pci_disable_device(pdev);
 	bnxt_free_ctx_mem(bp);
 	kfree(bp->ctx);
 	bp->ctx = NULL;
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net 2/9] bnxt_en: reject ETS settings that will starve a TC
  2021-07-18 19:36 [PATCH net 0/9] bnxt_en: Bug fixes Michael Chan
  2021-07-18 19:36 ` [PATCH net 1/9] bnxt_en: don't disable an already disabled PCI device Michael Chan
@ 2021-07-18 19:36 ` Michael Chan
  2021-07-18 19:36 ` [PATCH net 3/9] bnxt_en: Refresh RoCE capabilities in bnxt_ulp_probe() Michael Chan
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Michael Chan @ 2021-07-18 19:36 UTC (permalink / raw)
  To: davem; +Cc: netdev, kuba, gospo

[-- Attachment #1: Type: text/plain, Size: 1742 bytes --]

From: Edwin Peer <edwin.peer@broadcom.com>

ETS proportions are presented to HWRM_QUEUE_COS2BW_CFG as minimum
bandwidth constraints. Thus, zero is a legal value for a given TC.
However, if all the other TCs sum up to 100%, then at least one
hardware queue will starve, resulting in guaranteed TX timeouts.
Reject such nonsensical configurations.

Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c
index 8e90224c43a2..8a68df4d9e59 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c
@@ -433,6 +433,7 @@ static int bnxt_hwrm_queue_dscp2pri_cfg(struct bnxt *bp, struct dcb_app *app,
 static int bnxt_ets_validate(struct bnxt *bp, struct ieee_ets *ets, u8 *tc)
 {
 	int total_ets_bw = 0;
+	bool zero = false;
 	u8 max_tc = 0;
 	int i;
 
@@ -453,13 +454,20 @@ static int bnxt_ets_validate(struct bnxt *bp, struct ieee_ets *ets, u8 *tc)
 			break;
 		case IEEE_8021QAZ_TSA_ETS:
 			total_ets_bw += ets->tc_tx_bw[i];
+			zero = zero || !ets->tc_tx_bw[i];
 			break;
 		default:
 			return -ENOTSUPP;
 		}
 	}
-	if (total_ets_bw > 100)
+	if (total_ets_bw > 100) {
+		netdev_warn(bp->dev, "rejecting ETS config exceeding available bandwidth\n");
 		return -EINVAL;
+	}
+	if (zero && total_ets_bw == 100) {
+		netdev_warn(bp->dev, "rejecting ETS config starving a TC\n");
+		return -EINVAL;
+	}
 
 	if (max_tc >= bp->max_tc)
 		*tc = bp->max_tc;
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net 3/9] bnxt_en: Refresh RoCE capabilities in bnxt_ulp_probe()
  2021-07-18 19:36 [PATCH net 0/9] bnxt_en: Bug fixes Michael Chan
  2021-07-18 19:36 ` [PATCH net 1/9] bnxt_en: don't disable an already disabled PCI device Michael Chan
  2021-07-18 19:36 ` [PATCH net 2/9] bnxt_en: reject ETS settings that will starve a TC Michael Chan
@ 2021-07-18 19:36 ` Michael Chan
  2021-07-18 19:36 ` [PATCH net 4/9] bnxt_en: Add missing check for BNXT_STATE_ABORT_ERR in bnxt_fw_rset_task() Michael Chan
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Michael Chan @ 2021-07-18 19:36 UTC (permalink / raw)
  To: davem; +Cc: netdev, kuba, gospo

[-- Attachment #1: Type: text/plain, Size: 1505 bytes --]

The capabilities can change after firmware upgrade/downgrade, so we
should get the up-to-date RoCE capabilities everytime bnxt_ulp_probe()
is called.

Fixes: 2151fe0830fd ("bnxt_en: Handle RESET_NOTIFY async event from firmware.")
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
index a918e374f3c5..187ff643ad2a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
@@ -479,16 +479,17 @@ struct bnxt_en_dev *bnxt_ulp_probe(struct net_device *dev)
 		if (!edev)
 			return ERR_PTR(-ENOMEM);
 		edev->en_ops = &bnxt_en_ops_tbl;
-		if (bp->flags & BNXT_FLAG_ROCEV1_CAP)
-			edev->flags |= BNXT_EN_FLAG_ROCEV1_CAP;
-		if (bp->flags & BNXT_FLAG_ROCEV2_CAP)
-			edev->flags |= BNXT_EN_FLAG_ROCEV2_CAP;
 		edev->net = dev;
 		edev->pdev = bp->pdev;
 		edev->l2_db_size = bp->db_size;
 		edev->l2_db_size_nc = bp->db_size;
 		bp->edev = edev;
 	}
+	edev->flags &= ~BNXT_EN_FLAG_ROCE_CAP;
+	if (bp->flags & BNXT_FLAG_ROCEV1_CAP)
+		edev->flags |= BNXT_EN_FLAG_ROCEV1_CAP;
+	if (bp->flags & BNXT_FLAG_ROCEV2_CAP)
+		edev->flags |= BNXT_EN_FLAG_ROCEV2_CAP;
 	return bp->edev;
 }
 EXPORT_SYMBOL(bnxt_ulp_probe);
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net 4/9] bnxt_en: Add missing check for BNXT_STATE_ABORT_ERR in bnxt_fw_rset_task()
  2021-07-18 19:36 [PATCH net 0/9] bnxt_en: Bug fixes Michael Chan
                   ` (2 preceding siblings ...)
  2021-07-18 19:36 ` [PATCH net 3/9] bnxt_en: Refresh RoCE capabilities in bnxt_ulp_probe() Michael Chan
@ 2021-07-18 19:36 ` Michael Chan
  2021-07-18 19:36 ` [PATCH net 5/9] bnxt_en: fix error path of FW reset Michael Chan
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Michael Chan @ 2021-07-18 19:36 UTC (permalink / raw)
  To: davem; +Cc: netdev, kuba, gospo

[-- Attachment #1: Type: text/plain, Size: 1044 bytes --]

In the BNXT_FW_RESET_STATE_POLL_VF state in bnxt_fw_reset_task() after all
VFs have unregistered, we need to check for BNXT_STATE_ABORT_ERR after
we acquire the rtnl_lock.  If the flag is set, we need to abort.

Fixes: 230d1f0de754 ("bnxt_en: Handle firmware reset.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index fdfb75a1608d..39908a3d9460 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -11992,6 +11992,10 @@ static void bnxt_fw_reset_task(struct work_struct *work)
 		}
 		bp->fw_reset_timestamp = jiffies;
 		rtnl_lock();
+		if (test_bit(BNXT_STATE_ABORT_ERR, &bp->state)) {
+			rtnl_unlock();
+			goto fw_reset_abort;
+		}
 		bnxt_fw_reset_close(bp);
 		if (bp->fw_cap & BNXT_FW_CAP_ERR_RECOVER_RELOAD) {
 			bp->fw_reset_state = BNXT_FW_RESET_STATE_POLL_FW_DOWN;
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net 5/9] bnxt_en: fix error path of FW reset
  2021-07-18 19:36 [PATCH net 0/9] bnxt_en: Bug fixes Michael Chan
                   ` (3 preceding siblings ...)
  2021-07-18 19:36 ` [PATCH net 4/9] bnxt_en: Add missing check for BNXT_STATE_ABORT_ERR in bnxt_fw_rset_task() Michael Chan
@ 2021-07-18 19:36 ` Michael Chan
  2021-07-18 19:36 ` [PATCH net 6/9] bnxt_en: Validate vlan protocol ID on RX packets Michael Chan
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Michael Chan @ 2021-07-18 19:36 UTC (permalink / raw)
  To: davem; +Cc: netdev, kuba, gospo

[-- Attachment #1: Type: text/plain, Size: 3446 bytes --]

From: Somnath Kotur <somnath.kotur@broadcom.com>

When bnxt_open() fails in the firmware reset path, the driver needs to
gracefully abort, but it is executing code that should be invoked only
in the success path.  Define a function to abort FW reset and
consolidate all error paths to call this new function.

Fixes: dab62e7c2de7 ("bnxt_en: Implement faster recovery for firmware fatal error.")
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 31 +++++++++++++++--------
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 39908a3d9460..f2f1136fd492 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -11959,10 +11959,21 @@ static bool bnxt_fw_reset_timeout(struct bnxt *bp)
 			  (bp->fw_reset_max_dsecs * HZ / 10));
 }
 
+static void bnxt_fw_reset_abort(struct bnxt *bp, int rc)
+{
+	clear_bit(BNXT_STATE_IN_FW_RESET, &bp->state);
+	if (bp->fw_reset_state != BNXT_FW_RESET_STATE_POLL_VF) {
+		bnxt_ulp_start(bp, rc);
+		bnxt_dl_health_status_update(bp, false);
+	}
+	bp->fw_reset_state = 0;
+	dev_close(bp->dev);
+}
+
 static void bnxt_fw_reset_task(struct work_struct *work)
 {
 	struct bnxt *bp = container_of(work, struct bnxt, fw_reset_task.work);
-	int rc;
+	int rc = 0;
 
 	if (!test_bit(BNXT_STATE_IN_FW_RESET, &bp->state)) {
 		netdev_err(bp->dev, "bnxt_fw_reset_task() called when not in fw reset mode!\n");
@@ -11993,8 +12004,9 @@ static void bnxt_fw_reset_task(struct work_struct *work)
 		bp->fw_reset_timestamp = jiffies;
 		rtnl_lock();
 		if (test_bit(BNXT_STATE_ABORT_ERR, &bp->state)) {
+			bnxt_fw_reset_abort(bp, rc);
 			rtnl_unlock();
-			goto fw_reset_abort;
+			return;
 		}
 		bnxt_fw_reset_close(bp);
 		if (bp->fw_cap & BNXT_FW_CAP_ERR_RECOVER_RELOAD) {
@@ -12043,6 +12055,7 @@ static void bnxt_fw_reset_task(struct work_struct *work)
 			if (val == 0xffff) {
 				if (bnxt_fw_reset_timeout(bp)) {
 					netdev_err(bp->dev, "Firmware reset aborted, PCI config space invalid\n");
+					rc = -ETIMEDOUT;
 					goto fw_reset_abort;
 				}
 				bnxt_queue_fw_reset_work(bp, HZ / 1000);
@@ -12052,6 +12065,7 @@ static void bnxt_fw_reset_task(struct work_struct *work)
 		clear_bit(BNXT_STATE_FW_FATAL_COND, &bp->state);
 		if (pci_enable_device(bp->pdev)) {
 			netdev_err(bp->dev, "Cannot re-enable PCI device\n");
+			rc = -ENODEV;
 			goto fw_reset_abort;
 		}
 		pci_set_master(bp->pdev);
@@ -12078,9 +12092,10 @@ static void bnxt_fw_reset_task(struct work_struct *work)
 		}
 		rc = bnxt_open(bp->dev);
 		if (rc) {
-			netdev_err(bp->dev, "bnxt_open_nic() failed\n");
-			clear_bit(BNXT_STATE_IN_FW_RESET, &bp->state);
-			dev_close(bp->dev);
+			netdev_err(bp->dev, "bnxt_open() failed during FW reset\n");
+			bnxt_fw_reset_abort(bp, rc);
+			rtnl_unlock();
+			return;
 		}
 
 		bp->fw_reset_state = 0;
@@ -12107,12 +12122,8 @@ static void bnxt_fw_reset_task(struct work_struct *work)
 		netdev_err(bp->dev, "fw_health_status 0x%x\n", sts);
 	}
 fw_reset_abort:
-	clear_bit(BNXT_STATE_IN_FW_RESET, &bp->state);
-	if (bp->fw_reset_state != BNXT_FW_RESET_STATE_POLL_VF)
-		bnxt_dl_health_status_update(bp, false);
-	bp->fw_reset_state = 0;
 	rtnl_lock();
-	dev_close(bp->dev);
+	bnxt_fw_reset_abort(bp, rc);
 	rtnl_unlock();
 }
 
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net 6/9] bnxt_en: Validate vlan protocol ID on RX packets
  2021-07-18 19:36 [PATCH net 0/9] bnxt_en: Bug fixes Michael Chan
                   ` (4 preceding siblings ...)
  2021-07-18 19:36 ` [PATCH net 5/9] bnxt_en: fix error path of FW reset Michael Chan
@ 2021-07-18 19:36 ` Michael Chan
  2021-07-18 19:36 ` [PATCH net 7/9] bnxt_en: Check abort error state in bnxt_half_open_nic() Michael Chan
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Michael Chan @ 2021-07-18 19:36 UTC (permalink / raw)
  To: davem; +Cc: netdev, kuba, gospo

[-- Attachment #1: Type: text/plain, Size: 2170 bytes --]

Only pass supported VLAN protocol IDs for stripped VLAN tags to the
stack.  The stack will hit WARN() if the protocol ID is unsupported.

Existing firmware sets up the chip to strip 0x8100, 0x88a8, 0x9100.
Only the 1st two protocols are supported by the kernel.

Fixes: a196e96bb68f ("bnxt_en: clean up VLAN feature bit handling")
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index f2f1136fd492..169f093e01de 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1671,11 +1671,16 @@ static inline struct sk_buff *bnxt_tpa_end(struct bnxt *bp,
 
 	if ((tpa_info->flags2 & RX_CMP_FLAGS2_META_FORMAT_VLAN) &&
 	    (skb->dev->features & BNXT_HW_FEATURE_VLAN_ALL_RX)) {
-		u16 vlan_proto = tpa_info->metadata >>
-			RX_CMP_FLAGS2_METADATA_TPID_SFT;
+		__be16 vlan_proto = htons(tpa_info->metadata >>
+					  RX_CMP_FLAGS2_METADATA_TPID_SFT);
 		u16 vtag = tpa_info->metadata & RX_CMP_FLAGS2_METADATA_TCI_MASK;
 
-		__vlan_hwaccel_put_tag(skb, htons(vlan_proto), vtag);
+		if (eth_type_vlan(vlan_proto)) {
+			__vlan_hwaccel_put_tag(skb, vlan_proto, vtag);
+		} else {
+			dev_kfree_skb(skb);
+			return NULL;
+		}
 	}
 
 	skb_checksum_none_assert(skb);
@@ -1897,9 +1902,15 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
 	    (skb->dev->features & BNXT_HW_FEATURE_VLAN_ALL_RX)) {
 		u32 meta_data = le32_to_cpu(rxcmp1->rx_cmp_meta_data);
 		u16 vtag = meta_data & RX_CMP_FLAGS2_METADATA_TCI_MASK;
-		u16 vlan_proto = meta_data >> RX_CMP_FLAGS2_METADATA_TPID_SFT;
+		__be16 vlan_proto = htons(meta_data >>
+					  RX_CMP_FLAGS2_METADATA_TPID_SFT);
 
-		__vlan_hwaccel_put_tag(skb, htons(vlan_proto), vtag);
+		if (eth_type_vlan(vlan_proto)) {
+			__vlan_hwaccel_put_tag(skb, vlan_proto, vtag);
+		} else {
+			dev_kfree_skb(skb);
+			goto next_rx;
+		}
 	}
 
 	skb_checksum_none_assert(skb);
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net 7/9] bnxt_en: Check abort error state in bnxt_half_open_nic()
  2021-07-18 19:36 [PATCH net 0/9] bnxt_en: Bug fixes Michael Chan
                   ` (5 preceding siblings ...)
  2021-07-18 19:36 ` [PATCH net 6/9] bnxt_en: Validate vlan protocol ID on RX packets Michael Chan
@ 2021-07-18 19:36 ` Michael Chan
  2021-07-18 19:36 ` [PATCH net 8/9] bnxt_en: Move bnxt_ptp_init() to bnxt_open() Michael Chan
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Michael Chan @ 2021-07-18 19:36 UTC (permalink / raw)
  To: davem; +Cc: netdev, kuba, gospo

[-- Attachment #1: Type: text/plain, Size: 2223 bytes --]

From: Somnath Kotur <somnath.kotur@broadcom.com>

bnxt_half_open_nic() is called during during ethtool self test and is
protected by rtnl_lock.  Firmware reset can be happening at the same
time.  Only critical portions of the entire firmware reset sequence
are protected by the rtnl_lock.  It is possible that bnxt_half_open_nic()
can be called when the firmware reset sequence is aborting.  In that
case, bnxt_half_open_nic() needs to check if the ABORT_ERR flag is set
and abort if it is.  The ethtool self test will fail but the NIC will be
brought to a consistent IF_DOWN state.

Without this patch, if bnxt_half_open_nic() were to continue in this
error state, it may crash like this:

  bnxt_en 0000:82:00.1 enp130s0f1np1: FW reset in progress during close, FW reset will be aborted
  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
  ...
  Process ethtool (pid: 333327, stack limit = 0x0000000046476577)
  Call trace:
  bnxt_alloc_mem+0x444/0xef0 [bnxt_en]
  bnxt_half_open_nic+0x24/0xb8 [bnxt_en]
  bnxt_self_test+0x2dc/0x390 [bnxt_en]
  ethtool_self_test+0xe0/0x1f8
  dev_ethtool+0x1744/0x22d0
  dev_ioctl+0x190/0x3e0
  sock_ioctl+0x238/0x480
  do_vfs_ioctl+0xc4/0x758
  ksys_ioctl+0x84/0xb8
  __arm64_sys_ioctl+0x28/0x38
  el0_svc_handler+0xb0/0x180
  el0_svc+0x8/0xc

Fixes: a1301f08c5ac ("bnxt_en: Check abort error state in bnxt_open_nic().")
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 169f093e01de..31eb3c00851a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -10208,6 +10208,12 @@ int bnxt_half_open_nic(struct bnxt *bp)
 {
 	int rc = 0;
 
+	if (test_bit(BNXT_STATE_ABORT_ERR, &bp->state)) {
+		netdev_err(bp->dev, "A previous firmware reset has not completed, aborting half open\n");
+		rc = -ENODEV;
+		goto half_open_err;
+	}
+
 	rc = bnxt_alloc_mem(bp, false);
 	if (rc) {
 		netdev_err(bp->dev, "bnxt_alloc_mem err: %x\n", rc);
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net 8/9] bnxt_en: Move bnxt_ptp_init() to bnxt_open()
  2021-07-18 19:36 [PATCH net 0/9] bnxt_en: Bug fixes Michael Chan
                   ` (6 preceding siblings ...)
  2021-07-18 19:36 ` [PATCH net 7/9] bnxt_en: Check abort error state in bnxt_half_open_nic() Michael Chan
@ 2021-07-18 19:36 ` Michael Chan
  2021-07-19 10:43   ` Jakub Kicinski
  2021-07-18 19:36 ` [PATCH net 9/9] bnxt_en: Fix PTP capability discovery Michael Chan
  2021-07-19 16:00 ` [PATCH net 0/9] bnxt_en: Bug fixes patchwork-bot+netdevbpf
  9 siblings, 1 reply; 14+ messages in thread
From: Michael Chan @ 2021-07-18 19:36 UTC (permalink / raw)
  To: davem; +Cc: netdev, kuba, gospo, Richard Cochran

[-- Attachment #1: Type: text/plain, Size: 4847 bytes --]

The device needs to be in ifup state for PTP to function, so move
bnxt_ptp_init() to bnxt_open().  This means that the PHC will be
registered during bnxt_open().

This also makes firmware reset work correctly.  PTP configurations
may change after firmware upgrade or downgrade.  bnxt_open() will
be called after firmware reset, so it will work properly.

bnxt_ptp_start() is now incorporated into bnxt_ptp_init().  We now
also need to call bnxt_ptp_clear() in bnxt_close().

Fixes: 93cb62d98e9c ("bnxt_en: Enable hardware PTP support")
Cc: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 16 +++++++------
 drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c | 24 ++++++-------------
 drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h |  1 -
 3 files changed, 16 insertions(+), 25 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 31eb3c00851a..b8b73c210995 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -10134,7 +10134,6 @@ static int __bnxt_open_nic(struct bnxt *bp, bool irq_re_init, bool link_re_init)
 		}
 	}
 
-	bnxt_ptp_start(bp);
 	rc = bnxt_init_nic(bp, irq_re_init);
 	if (rc) {
 		netdev_err(bp->dev, "bnxt_init_nic err: %x\n", rc);
@@ -10273,9 +10272,16 @@ static int bnxt_open(struct net_device *dev)
 	rc = bnxt_hwrm_if_change(bp, true);
 	if (rc)
 		return rc;
+
+	if (bnxt_ptp_init(bp)) {
+		netdev_warn(dev, "PTP initialization failed.\n");
+		kfree(bp->ptp_cfg);
+		bp->ptp_cfg = NULL;
+	}
 	rc = __bnxt_open_nic(bp, true, true);
 	if (rc) {
 		bnxt_hwrm_if_change(bp, false);
+		bnxt_ptp_clear(bp);
 	} else {
 		if (test_and_clear_bit(BNXT_STATE_FW_RESET_DET, &bp->state)) {
 			if (!test_bit(BNXT_STATE_IN_FW_RESET, &bp->state)) {
@@ -10366,6 +10372,7 @@ static int bnxt_close(struct net_device *dev)
 {
 	struct bnxt *bp = netdev_priv(dev);
 
+	bnxt_ptp_clear(bp);
 	bnxt_hwmon_close(bp);
 	bnxt_close_nic(bp, true, true);
 	bnxt_hwrm_shutdown_link(bp);
@@ -11352,6 +11359,7 @@ static void bnxt_fw_reset_close(struct bnxt *bp)
 		bnxt_clear_int_mode(bp);
 		pci_disable_device(bp->pdev);
 	}
+	bnxt_ptp_clear(bp);
 	__bnxt_close_nic(bp, true, false);
 	bnxt_vf_reps_free(bp);
 	bnxt_clear_int_mode(bp);
@@ -12694,7 +12702,6 @@ static void bnxt_remove_one(struct pci_dev *pdev)
 	if (BNXT_PF(bp))
 		devlink_port_type_clear(&bp->dl_port);
 
-	bnxt_ptp_clear(bp);
 	pci_disable_pcie_error_reporting(pdev);
 	unregister_netdev(dev);
 	clear_bit(BNXT_STATE_IN_FW_RESET, &bp->state);
@@ -13278,11 +13285,6 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 				   rc);
 	}
 
-	if (bnxt_ptp_init(bp)) {
-		netdev_warn(dev, "PTP initialization failed.\n");
-		kfree(bp->ptp_cfg);
-		bp->ptp_cfg = NULL;
-	}
 	bnxt_inv_fw_health_reg(bp);
 	bnxt_dl_register(bp);
 
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c
index f698b6bd4ff8..9089e7f3fbd4 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c
@@ -385,22 +385,6 @@ int bnxt_get_rx_ts_p5(struct bnxt *bp, u64 *ts, u32 pkt_ts)
 	return 0;
 }
 
-void bnxt_ptp_start(struct bnxt *bp)
-{
-	struct bnxt_ptp_cfg *ptp = bp->ptp_cfg;
-
-	if (!ptp)
-		return;
-
-	if (bp->flags & BNXT_FLAG_CHIP_P5) {
-		spin_lock_bh(&ptp->ptp_lock);
-		ptp->current_time = bnxt_refclk_read(bp, NULL);
-		WRITE_ONCE(ptp->old_time, ptp->current_time);
-		spin_unlock_bh(&ptp->ptp_lock);
-		ptp_schedule_worker(ptp->ptp_clock, 0);
-	}
-}
-
 static const struct ptp_clock_info bnxt_ptp_caps = {
 	.owner		= THIS_MODULE,
 	.name		= "bnxt clock",
@@ -450,7 +434,13 @@ int bnxt_ptp_init(struct bnxt *bp)
 		bnxt_unmap_ptp_regs(bp);
 		return err;
 	}
-
+	if (bp->flags & BNXT_FLAG_CHIP_P5) {
+		spin_lock_bh(&ptp->ptp_lock);
+		ptp->current_time = bnxt_refclk_read(bp, NULL);
+		WRITE_ONCE(ptp->old_time, ptp->current_time);
+		spin_unlock_bh(&ptp->ptp_lock);
+		ptp_schedule_worker(ptp->ptp_clock, 0);
+	}
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h
index 6b6245750e20..4135ea3ec788 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h
@@ -75,7 +75,6 @@ int bnxt_hwtstamp_set(struct net_device *dev, struct ifreq *ifr);
 int bnxt_hwtstamp_get(struct net_device *dev, struct ifreq *ifr);
 int bnxt_get_tx_ts_p5(struct bnxt *bp, struct sk_buff *skb);
 int bnxt_get_rx_ts_p5(struct bnxt *bp, u64 *ts, u32 pkt_ts);
-void bnxt_ptp_start(struct bnxt *bp);
 int bnxt_ptp_init(struct bnxt *bp);
 void bnxt_ptp_clear(struct bnxt *bp);
 #endif
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net 9/9] bnxt_en: Fix PTP capability discovery
  2021-07-18 19:36 [PATCH net 0/9] bnxt_en: Bug fixes Michael Chan
                   ` (7 preceding siblings ...)
  2021-07-18 19:36 ` [PATCH net 8/9] bnxt_en: Move bnxt_ptp_init() to bnxt_open() Michael Chan
@ 2021-07-18 19:36 ` Michael Chan
  2021-07-19 16:00 ` [PATCH net 0/9] bnxt_en: Bug fixes patchwork-bot+netdevbpf
  9 siblings, 0 replies; 14+ messages in thread
From: Michael Chan @ 2021-07-18 19:36 UTC (permalink / raw)
  To: davem; +Cc: netdev, kuba, gospo, Richard Cochran

[-- Attachment #1: Type: text/plain, Size: 1384 bytes --]

The current PTP initialization logic does not account for firmware
reset that may cause PTP capability to change.  The valid pointer
bp->ptp_cfg is used to indicate that the device is capable of PTP
and that it has been initialized.  So we must clean up bp->ptp_cfg
and free it if the firmware after reset does not support PTP.

Fixes: 93cb62d98e9c ("bnxt_en: Enable hardware PTP support")
Cc: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index b8b73c210995..4db162cee911 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -7574,8 +7574,12 @@ static int __bnxt_hwrm_func_qcaps(struct bnxt *bp)
 		bp->flags &= ~BNXT_FLAG_WOL_CAP;
 		if (flags & FUNC_QCAPS_RESP_FLAGS_WOL_MAGICPKT_SUPPORTED)
 			bp->flags |= BNXT_FLAG_WOL_CAP;
-		if (flags & FUNC_QCAPS_RESP_FLAGS_PTP_SUPPORTED)
+		if (flags & FUNC_QCAPS_RESP_FLAGS_PTP_SUPPORTED) {
 			__bnxt_hwrm_ptp_qcfg(bp);
+		} else {
+			kfree(bp->ptp_cfg);
+			bp->ptp_cfg = NULL;
+		}
 	} else {
 #ifdef CONFIG_BNXT_SRIOV
 		struct bnxt_vf_info *vf = &bp->vf;
-- 
2.18.1


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH net 8/9] bnxt_en: Move bnxt_ptp_init() to bnxt_open()
  2021-07-18 19:36 ` [PATCH net 8/9] bnxt_en: Move bnxt_ptp_init() to bnxt_open() Michael Chan
@ 2021-07-19 10:43   ` Jakub Kicinski
  2021-07-19 17:07     ` Richard Cochran
  0 siblings, 1 reply; 14+ messages in thread
From: Jakub Kicinski @ 2021-07-19 10:43 UTC (permalink / raw)
  To: Michael Chan, Richard Cochran; +Cc: davem, netdev, gospo

On Sun, 18 Jul 2021 15:36:32 -0400, Michael Chan wrote:
> The device needs to be in ifup state for PTP to function, so move
> bnxt_ptp_init() to bnxt_open().  This means that the PHC will be
> registered during bnxt_open().

I think it's an anti-pattern to have the clock registered only when
the device is up. Right, Richard?

IIRC Intel parts did it in the past because they had the clock hooked
up to the MAC/PHY so the clock was not actually ticking. But seems like
a wrong trade off to unreg PTP for SW convenience. Or maybe I'm
biased against the live FW reset :) Let's see if Richard agrees.

> This also makes firmware reset work correctly.  PTP configurations
> may change after firmware upgrade or downgrade.  bnxt_open() will
> be called after firmware reset, so it will work properly.
> 
> bnxt_ptp_start() is now incorporated into bnxt_ptp_init().  We now
> also need to call bnxt_ptp_clear() in bnxt_close().

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net 0/9] bnxt_en: Bug fixes
  2021-07-18 19:36 [PATCH net 0/9] bnxt_en: Bug fixes Michael Chan
                   ` (8 preceding siblings ...)
  2021-07-18 19:36 ` [PATCH net 9/9] bnxt_en: Fix PTP capability discovery Michael Chan
@ 2021-07-19 16:00 ` patchwork-bot+netdevbpf
  9 siblings, 0 replies; 14+ messages in thread
From: patchwork-bot+netdevbpf @ 2021-07-19 16:00 UTC (permalink / raw)
  To: Michael Chan; +Cc: davem, netdev, kuba, gospo

Hello:

This series was applied to netdev/net.git (refs/heads/master):

On Sun, 18 Jul 2021 15:36:24 -0400 you wrote:
> Most of the fixes in this series have to do with error recovery.  They
> include error path handling when the error recovery has to abort, and
> the rediscovery of capabilities (PTP and RoCE) after firmware reset
> that may result in capability changes.
> 
> Two other fixes are to reject invalid ETS settings and to validate
> VLAN protocol in the RX path.
> 
> [...]

Here is the summary with links:
  - [net,1/9] bnxt_en: don't disable an already disabled PCI device
    https://git.kernel.org/netdev/net/c/c81cfb6256d9
  - [net,2/9] bnxt_en: reject ETS settings that will starve a TC
    https://git.kernel.org/netdev/net/c/c08c59653415
  - [net,3/9] bnxt_en: Refresh RoCE capabilities in bnxt_ulp_probe()
    https://git.kernel.org/netdev/net/c/2c9f046bc377
  - [net,4/9] bnxt_en: Add missing check for BNXT_STATE_ABORT_ERR in bnxt_fw_rset_task()
    https://git.kernel.org/netdev/net/c/6cd657cb3ee6
  - [net,5/9] bnxt_en: fix error path of FW reset
    https://git.kernel.org/netdev/net/c/3958b1da725a
  - [net,6/9] bnxt_en: Validate vlan protocol ID on RX packets
    https://git.kernel.org/netdev/net/c/96bdd4b9ea7e
  - [net,7/9] bnxt_en: Check abort error state in bnxt_half_open_nic()
    https://git.kernel.org/netdev/net/c/11a39259ff79
  - [net,8/9] bnxt_en: Move bnxt_ptp_init() to bnxt_open()
    https://git.kernel.org/netdev/net/c/d7859afb6880
  - [net,9/9] bnxt_en: Fix PTP capability discovery
    https://git.kernel.org/netdev/net/c/de5bf19414fe

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net 8/9] bnxt_en: Move bnxt_ptp_init() to bnxt_open()
  2021-07-19 10:43   ` Jakub Kicinski
@ 2021-07-19 17:07     ` Richard Cochran
  2021-07-20  5:29       ` Michael Chan
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Cochran @ 2021-07-19 17:07 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Michael Chan, davem, netdev, gospo

On Mon, Jul 19, 2021 at 12:43:23PM +0200, Jakub Kicinski wrote:
> On Sun, 18 Jul 2021 15:36:32 -0400, Michael Chan wrote:
> > The device needs to be in ifup state for PTP to function, so move
> > bnxt_ptp_init() to bnxt_open().  This means that the PHC will be
> > registered during bnxt_open().
> 
> I think it's an anti-pattern to have the clock registered only when
> the device is up. Right, Richard?

Yes, indeed.

> IIRC Intel parts did it in the past because they had the clock hooked
> up to the MAC/PHY so the clock was not actually ticking. But seems like
> a wrong trade off to unreg PTP for SW convenience. Or maybe I'm
> biased against the live FW reset :) Let's see if Richard agrees.

Totally agree!

Ideally the PHC appears as soon as the driver instantiates for a HW
device.  Telling time from the clock is independent from the network
interface being up or down.

Some drivers are lazy and fail to decouple these two orthogonal
issues.  In some (all?) cases, there is no HW limitation involved,
just sloppy driver work.

Thanks,
Richard




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net 8/9] bnxt_en: Move bnxt_ptp_init() to bnxt_open()
  2021-07-19 17:07     ` Richard Cochran
@ 2021-07-20  5:29       ` Michael Chan
  0 siblings, 0 replies; 14+ messages in thread
From: Michael Chan @ 2021-07-20  5:29 UTC (permalink / raw)
  To: Richard Cochran; +Cc: Jakub Kicinski, David Miller, Netdev, Andrew Gospodarek

[-- Attachment #1: Type: text/plain, Size: 393 bytes --]

On Mon, Jul 19, 2021 at 10:07 AM Richard Cochran
<richardcochran@gmail.com> wrote:

> Ideally the PHC appears as soon as the driver instantiates for a HW
> device.  Telling time from the clock is independent from the network
> interface being up or down.
>
OK.  Will improve this in follow on patches to keep the PHC registered
unless the firmware after reset no longer supports PTP.  Thanks.

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-07-20  5:30 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-18 19:36 [PATCH net 0/9] bnxt_en: Bug fixes Michael Chan
2021-07-18 19:36 ` [PATCH net 1/9] bnxt_en: don't disable an already disabled PCI device Michael Chan
2021-07-18 19:36 ` [PATCH net 2/9] bnxt_en: reject ETS settings that will starve a TC Michael Chan
2021-07-18 19:36 ` [PATCH net 3/9] bnxt_en: Refresh RoCE capabilities in bnxt_ulp_probe() Michael Chan
2021-07-18 19:36 ` [PATCH net 4/9] bnxt_en: Add missing check for BNXT_STATE_ABORT_ERR in bnxt_fw_rset_task() Michael Chan
2021-07-18 19:36 ` [PATCH net 5/9] bnxt_en: fix error path of FW reset Michael Chan
2021-07-18 19:36 ` [PATCH net 6/9] bnxt_en: Validate vlan protocol ID on RX packets Michael Chan
2021-07-18 19:36 ` [PATCH net 7/9] bnxt_en: Check abort error state in bnxt_half_open_nic() Michael Chan
2021-07-18 19:36 ` [PATCH net 8/9] bnxt_en: Move bnxt_ptp_init() to bnxt_open() Michael Chan
2021-07-19 10:43   ` Jakub Kicinski
2021-07-19 17:07     ` Richard Cochran
2021-07-20  5:29       ` Michael Chan
2021-07-18 19:36 ` [PATCH net 9/9] bnxt_en: Fix PTP capability discovery Michael Chan
2021-07-19 16:00 ` [PATCH net 0/9] bnxt_en: Bug fixes patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).