* [PATCH net-next 0/2] be2net: patch-set
       [not found] <Suresh.Reddy@broadcom.com>
@ 2017-05-25  2:24 ` Suresh Reddy
  2017-05-25  2:24   ` [PATCH net-next 1/2] be2net: Fix UE detection logic for BE3 Suresh Reddy
                     ` (2 more replies)
  2017-09-13 15:12 ` [PATCH net] be2net: fix TSO6/GSO issue causing TX-stall on Lancer/BEx Suresh Reddy
                   ` (4 subsequent siblings)
  5 siblings, 3 replies; 23+ messages in thread
From: Suresh Reddy @ 2017-05-25  2:24 UTC (permalink / raw)
  To: netdev; +Cc: Suresh Reddy

Hi Dave, Please consider applying these two patches to net-next

Suresh Reddy (2):
  be2net: Fix UE detection logic for BE3
  be2net: Update the driver version to 11.4.0.0

 drivers/net/ethernet/emulex/benet/be.h      |  2 +-
 drivers/net/ethernet/emulex/benet/be_hw.h   |  3 +++
 drivers/net/ethernet/emulex/benet/be_main.c | 27 +++++++++++++++++++--------
 3 files changed, 23 insertions(+), 9 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH net-next 1/2] be2net: Fix UE detection logic for BE3
  2017-05-25  2:24 ` [PATCH net-next 0/2] be2net: patch-set Suresh Reddy
@ 2017-05-25  2:24   ` Suresh Reddy
  2017-05-25  2:24   ` [PATCH net-next 2/2] be2net: Update the driver version to 11.4.0.0 Suresh Reddy
  2017-05-25 18:45   ` [PATCH net-next 0/2] be2net: patch-set David Miller
  2 siblings, 0 replies; 23+ messages in thread
From: Suresh Reddy @ 2017-05-25  2:24 UTC (permalink / raw)
  To: netdev; +Cc: Suresh Reddy

On certain platforms BE3 chips may indicate spurious UEs (unrecoverable
errors). Because of this, the UE detection logic was disabled in the
driver for BE3 chips. As a result, a failover does not occur even in
the case of a real UE. This patch re-enables UE detection on BE3 and,
if a UE is detected, reads the POST register. If the POST register
reports either a FAT_LOG_START or an ARMFW_UE state, a valid UE has
occurred in the chip.

Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
---
 drivers/net/ethernet/emulex/benet/be_hw.h   |  3 +++
 drivers/net/ethernet/emulex/benet/be_main.c | 27 +++++++++++++++++++--------
 2 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_hw.h b/drivers/net/ethernet/emulex/benet/be_hw.h
index 36e4232..c967f45 100644
--- a/drivers/net/ethernet/emulex/benet/be_hw.h
+++ b/drivers/net/ethernet/emulex/benet/be_hw.h
@@ -49,6 +49,9 @@
 #define POST_STAGE_BE_RESET		0x3 /* Host wants to reset chip */
 #define POST_STAGE_ARMFW_RDY		0xc000	/* FW is done with POST */
 #define POST_STAGE_RECOVERABLE_ERR	0xE000	/* Recoverable err detected */
+/* FW has detected a UE and is dumping FAT log data */
+#define POST_STAGE_FAT_LOG_START       0x0D00
+#define POST_STAGE_ARMFW_UE            0xF000  /*FW has asserted an UE*/
 
 /* Lancer SLIPORT registers */
 #define SLIPORT_STATUS_OFFSET		0x404
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index f3a09ab..8000551 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -3241,8 +3241,9 @@ void be_detect_error(struct be_adapter *adapter)
 {
 	u32 ue_lo = 0, ue_hi = 0, ue_lo_mask = 0, ue_hi_mask = 0;
 	u32 sliport_status = 0, sliport_err1 = 0, sliport_err2 = 0;
-	u32 i;
 	struct device *dev = &adapter->pdev->dev;
+	u16 val;
+	u32 i;
 
 	if (be_check_error(adapter, BE_ERROR_HW))
 		return;
@@ -3280,15 +3281,25 @@ void be_detect_error(struct be_adapter *adapter)
 		ue_lo = (ue_lo & ~ue_lo_mask);
 		ue_hi = (ue_hi & ~ue_hi_mask);
 
-		/* On certain platforms BE hardware can indicate spurious UEs.
-		 * Allow HW to stop working completely in case of a real UE.
-		 * Hence not setting the hw_error for UE detection.
-		 */
-
 		if (ue_lo || ue_hi) {
+			/* On certain platforms BE3 hardware can indicate
+			 * spurious UEs. In case of a UE in the chip,
+			 * the POST register correctly reports either a
+			 * FAT_LOG_START state (FW is currently dumping
+			 * FAT log data) or a ARMFW_UE state. Check for the
+			 * above states to ascertain if the UE is valid or not.
+			 */
+			if (BE3_chip(adapter)) {
+				val = be_POST_stage_get(adapter);
+				if ((val & POST_STAGE_FAT_LOG_START)
+				     != POST_STAGE_FAT_LOG_START &&
+				    (val & POST_STAGE_ARMFW_UE)
+				     != POST_STAGE_ARMFW_UE)
+					return;
+			}
+
 			dev_err(dev, "Error detected in the adapter");
-			if (skyhawk_chip(adapter))
-				be_set_error(adapter, BE_ERROR_UE);
+			be_set_error(adapter, BE_ERROR_UE);
 
 			for (i = 0; ue_lo; ue_lo >>= 1, i++) {
 				if (ue_lo & 1)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread
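
The POST-stage test added by this patch can be tried outside the driver. What follows is a minimal userspace sketch, not driver code: the two mask values are copied from the be_hw.h hunk above, while the function name `be3_ue_is_valid` and the plain `uint16_t` argument (standing in for the value returned by `be_POST_stage_get()`) are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the be_hw.h hunk in this patch. */
#define POST_STAGE_FAT_LOG_START 0x0D00
#define POST_STAGE_ARMFW_UE      0xF000

/*
 * Hypothetical helper: returns true when the POST stage value confirms
 * that a UE flagged on BE3 is real, i.e. all bits of at least one of
 * the two patterns are set. This is the condition added to
 * be_detect_error(), inverted for readability.
 */
bool be3_ue_is_valid(uint16_t post_stage)
{
	return (post_stage & POST_STAGE_FAT_LOG_START) ==
			POST_STAGE_FAT_LOG_START ||
	       (post_stage & POST_STAGE_ARMFW_UE) == POST_STAGE_ARMFW_UE;
}
```

Under these two masks a POST stage of 0xC000 (POST_STAGE_ARMFW_RDY) leads to the UE being treated as spurious, and so does 0xE000 (POST_STAGE_RECOVERABLE_ERR).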

* [PATCH net-next 2/2] be2net: Update the driver version to 11.4.0.0
  2017-05-25  2:24 ` [PATCH net-next 0/2] be2net: patch-set Suresh Reddy
  2017-05-25  2:24   ` [PATCH net-next 1/2] be2net: Fix UE detection logic for BE3 Suresh Reddy
@ 2017-05-25  2:24   ` Suresh Reddy
  2017-05-25 18:45   ` [PATCH net-next 0/2] be2net: patch-set David Miller
  2 siblings, 0 replies; 23+ messages in thread
From: Suresh Reddy @ 2017-05-25  2:24 UTC (permalink / raw)
  To: netdev; +Cc: Suresh Reddy

Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
---
 drivers/net/ethernet/emulex/benet/be.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/emulex/benet/be.h b/drivers/net/ethernet/emulex/benet/be.h
index 5056624..674cf9d 100644
--- a/drivers/net/ethernet/emulex/benet/be.h
+++ b/drivers/net/ethernet/emulex/benet/be.h
@@ -37,7 +37,7 @@
 #include "be_hw.h"
 #include "be_roce.h"
 
-#define DRV_VER			"11.1.0.0"
+#define DRV_VER			"11.4.0.0"
 #define DRV_NAME		"be2net"
 #define BE_NAME			"Emulex BladeEngine2"
 #define BE3_NAME		"Emulex BladeEngine3"
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH net-next 0/2] be2net: patch-set
  2017-05-25  2:24 ` [PATCH net-next 0/2] be2net: patch-set Suresh Reddy
  2017-05-25  2:24   ` [PATCH net-next 1/2] be2net: Fix UE detection logic for BE3 Suresh Reddy
  2017-05-25  2:24   ` [PATCH net-next 2/2] be2net: Update the driver version to 11.4.0.0 Suresh Reddy
@ 2017-05-25 18:45   ` David Miller
  2 siblings, 0 replies; 23+ messages in thread
From: David Miller @ 2017-05-25 18:45 UTC (permalink / raw)
  To: suresh.reddy; +Cc: netdev

From: Suresh Reddy <suresh.reddy@broadcom.com>
Date: Wed, 24 May 2017 22:24:37 -0400

> Hi Dave, Please consider applying these two patches to net-next

Series applied.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH net] be2net: fix TSO6/GSO issue causing TX-stall on Lancer/BEx
       [not found] <Suresh.Reddy@broadcom.com>
  2017-05-25  2:24 ` [PATCH net-next 0/2] be2net: patch-set Suresh Reddy
@ 2017-09-13 15:12 ` Suresh Reddy
  2017-09-13 16:33   ` David Miller
  2018-02-06 13:52 ` [PATCH net 0/2] be2net: patch-set Suresh Reddy
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 23+ messages in thread
From: Suresh Reddy @ 2017-09-13 15:12 UTC (permalink / raw)
  To: netdev

IPv6 TSO requests with extension headers are a problem for the
Lancer and BEx chips. The workaround is to disable the TSO6 feature
for such packets.

Also, on Lancer chips an MSS of less than 256 results in a TX stall.
Fix this by disabling GSO when the MSS is less than 256.

Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
---
 drivers/net/ethernet/emulex/benet/be.h      |  8 ++++++++
 drivers/net/ethernet/emulex/benet/be_main.c | 14 ++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/drivers/net/ethernet/emulex/benet/be.h b/drivers/net/ethernet/emulex/benet/be.h
index 674cf9d..8984c49 100644
--- a/drivers/net/ethernet/emulex/benet/be.h
+++ b/drivers/net/ethernet/emulex/benet/be.h
@@ -930,6 +930,14 @@ static inline bool is_ipv4_pkt(struct sk_buff *skb)
 	return skb->protocol == htons(ETH_P_IP) && ip_hdr(skb)->version == 4;
 }
 
+static inline bool is_ipv6_ext_hdr(struct sk_buff *skb)
+{
+	if (ip_hdr(skb)->version == 6)
+		return ipv6_ext_hdr(ipv6_hdr(skb)->nexthdr);
+	else
+		return false;
+}
+
 #define be_error_recovering(adapter)	\
 		(adapter->flags & BE_FLAGS_TRY_RECOVERY)
 
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 319eee3..0e3d9f39 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -5089,6 +5089,20 @@ static netdev_features_t be_features_check(struct sk_buff *skb,
 	struct be_adapter *adapter = netdev_priv(dev);
 	u8 l4_hdr = 0;
 
+	if (skb_is_gso(skb)) {
+		/* IPv6 TSO requests with extension hdrs are a problem
+		 * to Lancer and BE3 HW. Disable TSO6 feature.
+		 */
+		if (!skyhawk_chip(adapter) && is_ipv6_ext_hdr(skb))
+			features &= ~NETIF_F_TSO6;
+
+		/* Lancer cannot handle the packet with MSS less than 256.
+		 * Disable the GSO support in such cases
+		 */
+		if (lancer_chip(adapter) && skb_shinfo(skb)->gso_size < 256)
+			features &= ~NETIF_F_GSO_MASK;
+	}
+
 	/* The code below restricts offload features for some tunneled and
 	 * Q-in-Q packets.
 	 * Offload features for normal (non tunnel) packets are unchanged.
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread
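
The two rules in this patch can be modelled as a pure function. This is a userspace sketch under stated assumptions, not the driver's `be_features_check()`: `features_t`, the `F_*` bit values, `enum chip` and both function names are hypothetical stand-ins, and the list of next-header numbers reflects my reading of what the kernel's `ipv6_ext_hdr()` tests, which should be checked against the tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel types and flags; the bit values are
 * illustrative and do not match the kernel's netdev_features_t layout.
 */
typedef uint64_t features_t;
#define F_TSO      (1ULL << 0)
#define F_TSO6     (1ULL << 1)
#define F_GSO_MASK (F_TSO | F_TSO6)

enum chip { CHIP_BE3, CHIP_LANCER, CHIP_SKYHAWK };

/* IANA protocol numbers treated as IPv6 extension headers here:
 * hop-by-hop (0), routing (43), fragment (44), AH (51),
 * no-next-header (59), destination options (60).
 */
bool is_ext_hdr(uint8_t nexthdr)
{
	switch (nexthdr) {
	case 0: case 43: case 44: case 51: case 59: case 60:
		return true;
	default:
		return false;
	}
}

features_t sketch_features_check(enum chip chip, bool is_gso, bool is_ipv6,
				 uint8_t nexthdr, unsigned int gso_size,
				 features_t features)
{
	if (!is_gso)
		return features;

	/* Non-Skyhawk chips: no TSO6 when an extension header follows. */
	if (chip != CHIP_SKYHAWK && is_ipv6 && is_ext_hdr(nexthdr))
		features &= ~F_TSO6;

	/* Lancer: an MSS below 256 stalls TX, so drop all GSO offloads. */
	if (chip == CHIP_LANCER && gso_size < 256)
		features &= ~F_GSO_MASK;

	return features;
}
```

Clearing a feature bit here does not drop the packet; in the kernel the stack then segments such a packet in software before handing it to the driver.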

* Re: [PATCH net] be2net: fix TSO6/GSO issue causing TX-stall on Lancer/BEx
  2017-09-13 15:12 ` [PATCH net] be2net: fix TSO6/GSO issue causing TX-stall on Lancer/BEx Suresh Reddy
@ 2017-09-13 16:33   ` David Miller
  0 siblings, 0 replies; 23+ messages in thread
From: David Miller @ 2017-09-13 16:33 UTC (permalink / raw)
  To: suresh.reddy; +Cc: netdev

From: Suresh Reddy <suresh.reddy@broadcom.com>
Date: Wed, 13 Sep 2017 11:12:42 -0400

> IPv6 TSO requests with extension headers are a problem for the
> Lancer and BEx chips. The workaround is to disable the TSO6 feature
> for such packets.
> 
> Also, on Lancer chips an MSS of less than 256 results in a TX stall.
> Fix this by disabling GSO when the MSS is less than 256.
> 
> Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>

Applied.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH net 0/2] be2net: patch-set
       [not found] <Suresh.Reddy@broadcom.com>
  2017-05-25  2:24 ` [PATCH net-next 0/2] be2net: patch-set Suresh Reddy
  2017-09-13 15:12 ` [PATCH net] be2net: fix TSO6/GSO issue causing TX-stall on Lancer/BEx Suresh Reddy
@ 2018-02-06 13:52 ` Suresh Reddy
  2018-02-06 13:52   ` [PATCH net 1/2] be2net: Fix HW stall issue in Lancer Suresh Reddy
                     ` (2 more replies)
  2018-05-28  5:26 ` [PATCH net] be2net: Fix error detection logic for BE3 Suresh Reddy
                   ` (2 subsequent siblings)
  5 siblings, 3 replies; 23+ messages in thread
From: Suresh Reddy @ 2018-02-06 13:52 UTC (permalink / raw)
  To: netdev

Hi Dave, Please consider applying these two patches to net

Suresh Reddy (2):
  be2net: Fix HW stall issue in Lancer
  be2net: Handle transmit completion errors in Lancer

 drivers/net/ethernet/emulex/benet/be.h         |   7 +-
 drivers/net/ethernet/emulex/benet/be_ethtool.c |   1 +
 drivers/net/ethernet/emulex/benet/be_hw.h      |   1 +
 drivers/net/ethernet/emulex/benet/be_main.c    | 113 +++++++++++++++----------
 4 files changed, 73 insertions(+), 49 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH net 1/2] be2net: Fix HW stall issue in Lancer
  2018-02-06 13:52 ` [PATCH net 0/2] be2net: patch-set Suresh Reddy
@ 2018-02-06 13:52   ` Suresh Reddy
  2018-02-06 13:52   ` [PATCH net 2/2] be2net: Handle transmit completion errors " Suresh Reddy
  2018-02-06 16:48   ` [PATCH net 0/2] be2net: patch-set David Miller
  2 siblings, 0 replies; 23+ messages in thread
From: Suresh Reddy @ 2018-02-06 13:52 UTC (permalink / raw)
  To: netdev

Lancer HW cannot handle a TSO packet with a single segment.
Disable TSO/GSO for such packets.

Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
---
 drivers/net/ethernet/emulex/benet/be_main.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index c6e859a..130fa82 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -5097,9 +5097,12 @@ static netdev_features_t be_features_check(struct sk_buff *skb,
 			features &= ~NETIF_F_TSO6;
 
 		/* Lancer cannot handle the packet with MSS less than 256.
+		 * Also it can't handle a TSO packet with a single segment
 		 * Disable the GSO support in such cases
 		 */
-		if (lancer_chip(adapter) && skb_shinfo(skb)->gso_size < 256)
+		if (lancer_chip(adapter) &&
+		    (skb_shinfo(skb)->gso_size < 256 ||
+		     skb_shinfo(skb)->gso_segs == 1))
 			features &= ~NETIF_F_GSO_MASK;
 	}
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH net 2/2] be2net: Handle transmit completion errors in Lancer
  2018-02-06 13:52 ` [PATCH net 0/2] be2net: patch-set Suresh Reddy
  2018-02-06 13:52   ` [PATCH net 1/2] be2net: Fix HW stall issue in Lancer Suresh Reddy
@ 2018-02-06 13:52   ` Suresh Reddy
  2018-02-06 16:48   ` [PATCH net 0/2] be2net: patch-set David Miller
  2 siblings, 0 replies; 23+ messages in thread
From: Suresh Reddy @ 2018-02-06 13:52 UTC (permalink / raw)
  To: netdev

If the driver receives a TX CQE with a status of 0x1, 0x9 or 0xb,
the completion indexes should not be used. The driver must stop
consuming CQEs from this TXQ/CQ. From this point onwards the TXQ is
in a bad state, and the driver should destroy and recreate it.

0x1: LANCER_TX_COMP_LSO_ERR
0x9: LANCER_TX_COMP_SGE_ERR
0xb: LANCER_TX_COMP_PARITY_ERR

Reset the adapter if the driver sees one of these errors in a TX
completion. Also add an SGE error counter to the ethtool stats.

Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
---
 drivers/net/ethernet/emulex/benet/be.h         |   7 +-
 drivers/net/ethernet/emulex/benet/be_ethtool.c |   1 +
 drivers/net/ethernet/emulex/benet/be_hw.h      |   1 +
 drivers/net/ethernet/emulex/benet/be_main.c    | 108 ++++++++++++++-----------
 4 files changed, 69 insertions(+), 48 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be.h b/drivers/net/ethernet/emulex/benet/be.h
index 8984c49..382891f 100644
--- a/drivers/net/ethernet/emulex/benet/be.h
+++ b/drivers/net/ethernet/emulex/benet/be.h
@@ -248,6 +248,7 @@ struct be_tx_stats {
 	u32 tx_spoof_check_err;
 	u32 tx_qinq_err;
 	u32 tx_internal_parity_err;
+	u32 tx_sge_err;
 	struct u64_stats_sync sync;
 	struct u64_stats_sync sync_compl;
 };
@@ -944,8 +945,10 @@ static inline bool is_ipv6_ext_hdr(struct sk_buff *skb)
 #define BE_ERROR_EEH		1
 #define BE_ERROR_UE		BIT(1)
 #define BE_ERROR_FW		BIT(2)
-#define BE_ERROR_HW		(BE_ERROR_EEH | BE_ERROR_UE)
-#define BE_ERROR_ANY		(BE_ERROR_EEH | BE_ERROR_UE | BE_ERROR_FW)
+#define BE_ERROR_TX		BIT(3)
+#define BE_ERROR_HW		(BE_ERROR_EEH | BE_ERROR_UE | BE_ERROR_TX)
+#define BE_ERROR_ANY		(BE_ERROR_EEH | BE_ERROR_UE | BE_ERROR_FW | \
+				 BE_ERROR_TX)
 #define BE_CLEAR_ALL		0xFF
 
 static inline u8 be_check_error(struct be_adapter *adapter, u32 err_type)
diff --git a/drivers/net/ethernet/emulex/benet/be_ethtool.c b/drivers/net/ethernet/emulex/benet/be_ethtool.c
index 7d1819c..7f7e206 100644
--- a/drivers/net/ethernet/emulex/benet/be_ethtool.c
+++ b/drivers/net/ethernet/emulex/benet/be_ethtool.c
@@ -189,6 +189,7 @@ struct be_ethtool_stat {
 	 * packet data. This counter is applicable only for Lancer adapters.
 	 */
 	{DRVSTAT_TX_INFO(tx_internal_parity_err)},
+	{DRVSTAT_TX_INFO(tx_sge_err)},
 	{DRVSTAT_TX_INFO(tx_bytes)},
 	{DRVSTAT_TX_INFO(tx_pkts)},
 	{DRVSTAT_TX_INFO(tx_vxlan_offload_pkts)},
diff --git a/drivers/net/ethernet/emulex/benet/be_hw.h b/drivers/net/ethernet/emulex/benet/be_hw.h
index c967f45..db5f92f 100644
--- a/drivers/net/ethernet/emulex/benet/be_hw.h
+++ b/drivers/net/ethernet/emulex/benet/be_hw.h
@@ -261,6 +261,7 @@ struct be_eth_hdr_wrb {
 #define LANCER_TX_COMP_HSW_DROP_MAC_ERR		0x3
 #define LANCER_TX_COMP_HSW_DROP_VLAN_ERR	0x5
 #define LANCER_TX_COMP_QINQ_ERR			0x7
+#define LANCER_TX_COMP_SGE_ERR			0x9
 #define LANCER_TX_COMP_PARITY_ERR		0xb
 #define LANCER_TX_COMP_DMA_ERR			0xd
 
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 130fa82..2300072 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -2584,7 +2584,48 @@ static void be_post_rx_frags(struct be_rx_obj *rxo, gfp_t gfp, u32 frags_needed)
 	}
 }
 
-static struct be_tx_compl_info *be_tx_compl_get(struct be_tx_obj *txo)
+static inline void be_update_tx_err(struct be_tx_obj *txo, u8 status)
+{
+	switch (status) {
+	case BE_TX_COMP_HDR_PARSE_ERR:
+		tx_stats(txo)->tx_hdr_parse_err++;
+		break;
+	case BE_TX_COMP_NDMA_ERR:
+		tx_stats(txo)->tx_dma_err++;
+		break;
+	case BE_TX_COMP_ACL_ERR:
+		tx_stats(txo)->tx_spoof_check_err++;
+		break;
+	}
+}
+
+static inline void lancer_update_tx_err(struct be_tx_obj *txo, u8 status)
+{
+	switch (status) {
+	case LANCER_TX_COMP_LSO_ERR:
+		tx_stats(txo)->tx_tso_err++;
+		break;
+	case LANCER_TX_COMP_HSW_DROP_MAC_ERR:
+	case LANCER_TX_COMP_HSW_DROP_VLAN_ERR:
+		tx_stats(txo)->tx_spoof_check_err++;
+		break;
+	case LANCER_TX_COMP_QINQ_ERR:
+		tx_stats(txo)->tx_qinq_err++;
+		break;
+	case LANCER_TX_COMP_PARITY_ERR:
+		tx_stats(txo)->tx_internal_parity_err++;
+		break;
+	case LANCER_TX_COMP_DMA_ERR:
+		tx_stats(txo)->tx_dma_err++;
+		break;
+	case LANCER_TX_COMP_SGE_ERR:
+		tx_stats(txo)->tx_sge_err++;
+		break;
+	}
+}
+
+static struct be_tx_compl_info *be_tx_compl_get(struct be_adapter *adapter,
+						struct be_tx_obj *txo)
 {
 	struct be_queue_info *tx_cq = &txo->cq;
 	struct be_tx_compl_info *txcp = &txo->txcp;
@@ -2600,6 +2641,24 @@ static struct be_tx_compl_info *be_tx_compl_get(struct be_tx_obj *txo)
 	txcp->status = GET_TX_COMPL_BITS(status, compl);
 	txcp->end_index = GET_TX_COMPL_BITS(wrb_index, compl);
 
+	if (txcp->status) {
+		if (lancer_chip(adapter)) {
+			lancer_update_tx_err(txo, txcp->status);
+			/* Reset the adapter in case of TSO,
+			 * SGE or Parity error
+			 */
+			if (txcp->status == LANCER_TX_COMP_LSO_ERR ||
+			    txcp->status == LANCER_TX_COMP_PARITY_ERR ||
+			    txcp->status == LANCER_TX_COMP_SGE_ERR)
+				be_set_error(adapter, BE_ERROR_TX);
+		} else {
+			be_update_tx_err(txo, txcp->status);
+		}
+	}
+
+	if (be_check_error(adapter, BE_ERROR_TX))
+		return NULL;
+
 	compl->dw[offsetof(struct amap_eth_tx_compl, valid) / 32] = 0;
 	queue_tail_inc(tx_cq);
 	return txcp;
@@ -2742,7 +2801,7 @@ static void be_tx_compl_clean(struct be_adapter *adapter)
 			cmpl = 0;
 			num_wrbs = 0;
 			txq = &txo->q;
-			while ((txcp = be_tx_compl_get(txo))) {
+			while ((txcp = be_tx_compl_get(adapter, txo))) {
 				num_wrbs +=
 					be_tx_compl_process(adapter, txo,
 							    txcp->end_index);
@@ -3121,42 +3180,6 @@ static int be_process_rx(struct be_rx_obj *rxo, struct napi_struct *napi,
 	return work_done;
 }
 
-static inline void be_update_tx_err(struct be_tx_obj *txo, u8 status)
-{
-	switch (status) {
-	case BE_TX_COMP_HDR_PARSE_ERR:
-		tx_stats(txo)->tx_hdr_parse_err++;
-		break;
-	case BE_TX_COMP_NDMA_ERR:
-		tx_stats(txo)->tx_dma_err++;
-		break;
-	case BE_TX_COMP_ACL_ERR:
-		tx_stats(txo)->tx_spoof_check_err++;
-		break;
-	}
-}
-
-static inline void lancer_update_tx_err(struct be_tx_obj *txo, u8 status)
-{
-	switch (status) {
-	case LANCER_TX_COMP_LSO_ERR:
-		tx_stats(txo)->tx_tso_err++;
-		break;
-	case LANCER_TX_COMP_HSW_DROP_MAC_ERR:
-	case LANCER_TX_COMP_HSW_DROP_VLAN_ERR:
-		tx_stats(txo)->tx_spoof_check_err++;
-		break;
-	case LANCER_TX_COMP_QINQ_ERR:
-		tx_stats(txo)->tx_qinq_err++;
-		break;
-	case LANCER_TX_COMP_PARITY_ERR:
-		tx_stats(txo)->tx_internal_parity_err++;
-		break;
-	case LANCER_TX_COMP_DMA_ERR:
-		tx_stats(txo)->tx_dma_err++;
-		break;
-	}
-}
 
 static void be_process_tx(struct be_adapter *adapter, struct be_tx_obj *txo,
 			  int idx)
@@ -3164,16 +3187,9 @@ static void be_process_tx(struct be_adapter *adapter, struct be_tx_obj *txo,
 	int num_wrbs = 0, work_done = 0;
 	struct be_tx_compl_info *txcp;
 
-	while ((txcp = be_tx_compl_get(txo))) {
+	while ((txcp = be_tx_compl_get(adapter, txo))) {
 		num_wrbs += be_tx_compl_process(adapter, txo, txcp->end_index);
 		work_done++;
-
-		if (txcp->status) {
-			if (lancer_chip(adapter))
-				lancer_update_tx_err(txo, txcp->status);
-			else
-				be_update_tx_err(txo, txcp->status);
-		}
 	}
 
 	if (work_done) {
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread
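
The stop-consuming behaviour described in this patch can be sketched in isolation. The status values below come from the commit message and the be_hw.h hunk; the two function names, the flat array of statuses standing in for a completion queue, and the `tx_error` flag standing in for `BE_ERROR_TX` are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Status codes listed in the commit message (0x9 is new in this patch). */
#define LANCER_TX_COMP_LSO_ERR    0x1
#define LANCER_TX_COMP_SGE_ERR    0x9
#define LANCER_TX_COMP_PARITY_ERR 0xb

/* Hypothetical predicate: true when a Lancer TX completion status means
 * the TXQ can no longer be trusted and the adapter should be reset.
 */
bool lancer_tx_status_is_fatal(uint8_t status)
{
	return status == LANCER_TX_COMP_LSO_ERR ||
	       status == LANCER_TX_COMP_SGE_ERR ||
	       status == LANCER_TX_COMP_PARITY_ERR;
}

/* Hypothetical consumer loop over an array of completion statuses.
 * Returns how many completions were consumed before a fatal status
 * latched the error. The fatal entry itself is not consumed, which
 * corresponds to be_tx_compl_get() returning NULL once the error is set.
 */
size_t consume_completions(const uint8_t *status, size_t n, bool *tx_error)
{
	size_t consumed = 0;

	*tx_error = false;
	for (size_t i = 0; i < n; i++) {
		if (lancer_tx_status_is_fatal(status[i])) {
			*tx_error = true;
			break;
		}
		consumed++;
	}
	return consumed;
}
```

Non-fatal statuses such as 0x3 or 0x7 are still consumed in this sketch; in the patch they only bump a counter.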

* Re: [PATCH net 0/2] be2net: patch-set
  2018-02-06 13:52 ` [PATCH net 0/2] be2net: patch-set Suresh Reddy
  2018-02-06 13:52   ` [PATCH net 1/2] be2net: Fix HW stall issue in Lancer Suresh Reddy
  2018-02-06 13:52   ` [PATCH net 2/2] be2net: Handle transmit completion errors " Suresh Reddy
@ 2018-02-06 16:48   ` David Miller
  2 siblings, 0 replies; 23+ messages in thread
From: David Miller @ 2018-02-06 16:48 UTC (permalink / raw)
  To: suresh.reddy; +Cc: netdev

From: Suresh Reddy <suresh.reddy@broadcom.com>
Date: Tue,  6 Feb 2018 08:52:40 -0500

> Hi Dave, Please consider applying these two patches to net

Series applied, thank you.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH net] be2net: Fix error detection logic for BE3
       [not found] <Suresh.Reddy@broadcom.com>
                   ` (2 preceding siblings ...)
  2018-02-06 13:52 ` [PATCH net 0/2] be2net: patch-set Suresh Reddy
@ 2018-05-28  5:26 ` Suresh Reddy
  2018-05-29 14:58   ` David Miller
  2018-07-23 14:25 ` [PATCH net-next 0/2] be2net: patch-set Suresh Reddy
  2018-07-31 15:39 ` [PATCH V2 net-next 0/2] be2net: patch-set Suresh Reddy
  5 siblings, 1 reply; 23+ messages in thread
From: Suresh Reddy @ 2018-05-28  5:26 UTC (permalink / raw)
  To: netdev

Check for 0xE000 (RECOVERABLE_ERR) along with ARMFW UE (0xF000)
in be_detect_error() to determine whether the error is valid.

Fixes: 673c96e5a ("be2net: Fix UE detection logic for BE3")
Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
---
 drivers/net/ethernet/emulex/benet/be_main.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index c697e79..8f75500 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -3309,7 +3309,9 @@ void be_detect_error(struct be_adapter *adapter)
 				if ((val & POST_STAGE_FAT_LOG_START)
 				     != POST_STAGE_FAT_LOG_START &&
 				    (val & POST_STAGE_ARMFW_UE)
-				     != POST_STAGE_ARMFW_UE)
+				     != POST_STAGE_ARMFW_UE &&
+				    (val & POST_STAGE_RECOVERABLE_ERR)
+				     != POST_STAGE_RECOVERABLE_ERR)
 					return;
 			}
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread
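
The effect of the extra test is easiest to see by evaluating the condition for a few POST stage values. This is a userspace sketch, not driver code: the mask values are taken from be_hw.h as shown earlier in this thread, and the helper names `post_stage_has` and `be3_error_is_valid` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define POST_STAGE_FAT_LOG_START   0x0D00
#define POST_STAGE_RECOVERABLE_ERR 0xE000
#define POST_STAGE_ARMFW_UE        0xF000

/* Hypothetical helper: true when every bit of `pattern` is set in `val`. */
bool post_stage_has(uint16_t val, uint16_t pattern)
{
	return (val & pattern) == pattern;
}

/* Hypothetical helper mirroring the condition after this fix: the error
 * is treated as valid when any of the three patterns is fully present.
 */
bool be3_error_is_valid(uint16_t val)
{
	return post_stage_has(val, POST_STAGE_FAT_LOG_START) ||
	       post_stage_has(val, POST_STAGE_ARMFW_UE) ||
	       post_stage_has(val, POST_STAGE_RECOVERABLE_ERR);
}
```

With this condition a POST stage of 0xE000 is accepted as a valid error, while 0xC000 (POST_STAGE_ARMFW_RDY) still is not. Because every bit of 0xE000 is also set in 0xF000, any value that passes the ARMFW_UE test in this sketch also passes the RECOVERABLE_ERR test; both are kept here only to mirror the driver.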

* Re: [PATCH net] be2net: Fix error detection logic for BE3
  2018-05-28  5:26 ` [PATCH net] be2net: Fix error detection logic for BE3 Suresh Reddy
@ 2018-05-29 14:58   ` David Miller
  0 siblings, 0 replies; 23+ messages in thread
From: David Miller @ 2018-05-29 14:58 UTC (permalink / raw)
  To: suresh.reddy; +Cc: netdev

From: Suresh Reddy <suresh.reddy@broadcom.com>
Date: Mon, 28 May 2018 01:26:06 -0400

> Check for 0xE000 (RECOVERABLE_ERR) along with ARMFW UE (0xF000)
> in be_detect_error() to determine whether the error is valid.
> 
> Fixes: 673c96e5a ("be2net: Fix UE detection logic for BE3")
> Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>

Applied and queued up for -stable.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH net-next 0/2] be2net: patch-set
       [not found] <Suresh.Reddy@broadcom.com>
                   ` (3 preceding siblings ...)
  2018-05-28  5:26 ` [PATCH net] be2net: Fix error detection logic for BE3 Suresh Reddy
@ 2018-07-23 14:25 ` Suresh Reddy
  2018-07-23 14:25   ` [PATCH net-next 1/2] be2net: Collect the transmit queue data in Tx timeout Suresh Reddy
  2018-07-23 14:25   ` [PATCH net-next 2/2] be2net: Update the driver version to 12.0.0.0 Suresh Reddy
  2018-07-31 15:39 ` [PATCH V2 net-next 0/2] be2net: patch-set Suresh Reddy
  5 siblings, 2 replies; 23+ messages in thread
From: Suresh Reddy @ 2018-07-23 14:25 UTC (permalink / raw)
  To: netdev

Hi Dave, Please consider applying these two patches to net-next

Suresh Reddy (2):
  be2net: Collect the transmit queue data in Tx timeout
  be2net: Update the driver version to 12.0.0.0

 drivers/net/ethernet/emulex/benet/be.h      |  2 +-
 drivers/net/ethernet/emulex/benet/be_main.c | 80 ++++++++++++++++++++++++++++-
 2 files changed, 80 insertions(+), 2 deletions(-)

-- 
2.10.1

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH net-next 1/2] be2net: Collect the transmit queue data in Tx timeout
  2018-07-23 14:25 ` [PATCH net-next 0/2] be2net: patch-set Suresh Reddy
@ 2018-07-23 14:25   ` Suresh Reddy
  2018-07-23 18:23     ` David Miller
  2018-07-23 14:25   ` [PATCH net-next 2/2] be2net: Update the driver version to 12.0.0.0 Suresh Reddy
  1 sibling, 1 reply; 23+ messages in thread
From: Suresh Reddy @ 2018-07-23 14:25 UTC (permalink / raw)
  To: netdev

The driver dumps tx_queue, tx_compl and pending SKB information in
tx_timeout. This debug data is used to identify the cause of the timeout.

Also reset Lancer chip in tx_timeout.

Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
---
 drivers/net/ethernet/emulex/benet/be_main.c | 80 ++++++++++++++++++++++++++++-
 1 file changed, 79 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 05e4c0b..580cdec 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -1412,6 +1412,83 @@ static netdev_tx_t be_xmit(struct sk_buff *skb, struct net_device *netdev)
 	return NETDEV_TX_OK;
 }
 
+static void be_tx_timeout(struct net_device *netdev)
+{
+	struct be_adapter *adapter = netdev_priv(netdev);
+	struct device *dev = &adapter->pdev->dev;
+	struct be_tx_obj *txo;
+	struct sk_buff *skb;
+	struct tcphdr *tcphdr;
+	struct udphdr *udphdr;
+	u32 *entry;
+	int status;
+	int i, j;
+
+	for_all_tx_queues(adapter, txo, i) {
+		dev_info(dev, "TXQ Dump: %d H: %d T: %d used: %d, qid: 0x%x\n",
+			 i, txo->q.head, txo->q.tail,
+			 atomic_read(&txo->q.used), txo->q.id);
+
+		entry = txo->q.dma_mem.va;
+		for (j = 0; j < TX_Q_LEN * 4; j += 4) {
+			if (entry[j] != 0 || entry[j + 1] != 0 ||
+			    entry[j + 2] != 0 || entry[j + 3] != 0) {
+				dev_info(dev, "Entry %d 0x%x 0x%x 0x%x 0x%x\n",
+					 j, entry[j], entry[j + 1],
+					 entry[j + 2], entry[j + 3]);
+			}
+		}
+
+		entry = txo->cq.dma_mem.va;
+		dev_info(dev, "TXCQ Dump: %d  H: %d T: %d used: %d\n",
+			 i, txo->cq.head, txo->cq.tail,
+			 atomic_read(&txo->cq.used));
+		for (j = 0; j < TX_CQ_LEN * 4; j += 4) {
+			if (entry[j] != 0 || entry[j + 1] != 0 ||
+			    entry[j + 2] != 0 || entry[j + 3] != 0) {
+				dev_info(dev, "Entry %d 0x%x 0x%x 0x%x 0x%x\n",
+					 j, entry[j], entry[j + 1],
+					 entry[j + 2], entry[j + 3]);
+			}
+		}
+
+		for (j = 0; j < TX_Q_LEN; j++) {
+			if (txo->sent_skb_list[j]) {
+				skb = txo->sent_skb_list[j];
+				if (ip_hdr(skb)->protocol == IPPROTO_TCP) {
+					tcphdr = tcp_hdr(skb);
+					dev_info(dev, "TCP source port %d\n",
+						 ntohs(tcphdr->source));
+					dev_info(dev, "TCP dest port %d\n",
+						 ntohs(tcphdr->dest));
+					dev_info(dev, "TCP sequence num %u\n",
+						 ntohl(tcphdr->seq));
+					dev_info(dev, "TCP ack_seq %u\n",
+						 ntohl(tcphdr->ack_seq));
+				} else if (ip_hdr(skb)->protocol ==
+					   IPPROTO_UDP) {
+					udphdr = udp_hdr(skb);
+					dev_info(dev, "UDP source port %d\n",
+						 ntohs(udphdr->source));
+					dev_info(dev, "UDP dest port %d\n",
+						 ntohs(udphdr->dest));
+				}
+				dev_info(dev, "skb[%d] %p len %d proto 0x%x\n",
+					 j, skb, skb->len, skb->protocol);
+			}
+		}
+	}
+
+	if (lancer_chip(adapter)) {
+		dev_info(dev, "Initiating reset due to tx timeout\n");
+		dev_info(dev, "Resetting adapter\n");
+		status = lancer_physdev_ctrl(adapter,
+					     PHYSDEV_CONTROL_FW_RESET_MASK);
+		if (status)
+			dev_err(dev, "Reset failed .. Reboot server\n");
+	}
+}
+
 static inline bool be_in_all_promisc(struct be_adapter *adapter)
 {
 	return (adapter->if_flags & BE_IF_FLAGS_ALL_PROMISCUOUS) ==
@@ -3274,7 +3351,7 @@ void be_detect_error(struct be_adapter *adapter)
 			/* Do not log error messages if its a FW reset */
 			if (sliport_err1 == SLIPORT_ERROR_FW_RESET1 &&
 			    sliport_err2 == SLIPORT_ERROR_FW_RESET2) {
-				dev_info(dev, "Firmware update in progress\n");
+				dev_info(dev, "Reset is in progress\n");
 			} else {
 				dev_err(dev, "Error detected in the card\n");
 				dev_err(dev, "ERR: sliport status 0x%x\n",
@@ -5218,6 +5295,7 @@ static const struct net_device_ops be_netdev_ops = {
 	.ndo_get_vf_config	= be_get_vf_config,
 	.ndo_set_vf_link_state  = be_set_vf_link_state,
 	.ndo_set_vf_spoofchk    = be_set_vf_spoofchk,
+	.ndo_tx_timeout		= be_tx_timeout,
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	.ndo_poll_controller	= be_netpoll,
 #endif
-- 
2.10.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread
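
The queue dump in this patch walks the ring four 32-bit words at a time and logs only entries that are not all zero. A minimal userspace sketch of that scan follows; the function name `dump_nonzero_entries` is hypothetical, `ring` stands in for `txo->q.dma_mem.va`, `nentries` for `TX_Q_LEN`, and `printf` replaces `dev_info`. Unlike the patch, which prints the word offset `j`, this sketch prints the entry index.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: print every 16-byte entry (four 32-bit words) of
 * a ring that is not all zero, and return how many entries were printed.
 */
size_t dump_nonzero_entries(const uint32_t *ring, size_t nentries)
{
	size_t printed = 0;

	for (size_t j = 0; j < nentries * 4; j += 4) {
		/* OR the four words: zero only if the whole entry is zero. */
		if (ring[j] | ring[j + 1] | ring[j + 2] | ring[j + 3]) {
			printf("Entry %zu 0x%" PRIx32 " 0x%" PRIx32
			       " 0x%" PRIx32 " 0x%" PRIx32 "\n",
			       j / 4, ring[j], ring[j + 1],
			       ring[j + 2], ring[j + 3]);
			printed++;
		}
	}
	return printed;
}
```

Skipping all-zero entries keeps the log short when most of the ring is idle, at the cost of also hiding any entry that is legitimately zero.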

* [PATCH net-next 2/2] be2net: Update the driver version to 12.0.0.0
  2018-07-23 14:25 ` [PATCH net-next 0/2] be2net: patch-set Suresh Reddy
  2018-07-23 14:25   ` [PATCH net-next 1/2] be2net: Collect the transmit queue data in Tx timeout Suresh Reddy
@ 2018-07-23 14:25   ` Suresh Reddy
  1 sibling, 0 replies; 23+ messages in thread
From: Suresh Reddy @ 2018-07-23 14:25 UTC (permalink / raw)
  To: netdev

Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
---
 drivers/net/ethernet/emulex/benet/be.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/emulex/benet/be.h b/drivers/net/ethernet/emulex/benet/be.h
index 7005949..d80fe03 100644
--- a/drivers/net/ethernet/emulex/benet/be.h
+++ b/drivers/net/ethernet/emulex/benet/be.h
@@ -37,7 +37,7 @@
 #include "be_hw.h"
 #include "be_roce.h"
 
-#define DRV_VER			"11.4.0.0"
+#define DRV_VER			"12.0.0.0"
 #define DRV_NAME		"be2net"
 #define BE_NAME			"Emulex BladeEngine2"
 #define BE3_NAME		"Emulex BladeEngine3"
-- 
2.10.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH net-next 1/2] be2net: Collect the transmit queue data in Tx timeout
  2018-07-23 14:25   ` [PATCH net-next 1/2] be2net: Collect the transmit queue data in Tx timeout Suresh Reddy
@ 2018-07-23 18:23     ` David Miller
  2018-07-25 12:44       ` Suresh Kumar Reddy Reddygari
  0 siblings, 1 reply; 23+ messages in thread
From: David Miller @ 2018-07-23 18:23 UTC (permalink / raw)
  To: suresh.reddy; +Cc: netdev

From: Suresh Reddy <suresh.reddy@broadcom.com>
Date: Mon, 23 Jul 2018 10:25:23 -0400

> The driver dumps tx_queue, tx_compl and pending SKB information in
> tx_timeout. This debug data is used to identify the cause of the timeout.
> 
> Also reset Lancer chip in tx_timeout.
> 
> Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>

The purpose of the tx timeout NDO operation is to do whatever is
necessary to handle the TX timeout.

Outputting debugging information is useful, but is secondary.

I see that you do reset the Lancer, but that is far from what really
needs to happen here.

When you get a TX timeout, the hardware is not processing TX ring
entries, nor signalling completion any longer.

Therefore the only way to get things going again is to reset all of
the TX side data structure and logic.  This means shutting down the TX
engine, freeing up all of the TX SKBs in the ring, resetting the TX
ring software state, and then finally reprogramming the head/tail
pointer registers and re-enabling TX DMA processing.

^ permalink raw reply	[flat|nested] 23+ messages in thread
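
The recovery order described in the reply above can be sketched on a toy ring. Nothing here comes from be2net: `struct tx_ring`, its fields, `RING_LEN` and `tx_ring_recover` are hypothetical, `free()` stands in for releasing an SKB, and writing real head/tail registers is device-specific, so it is represented only by zeroing the software indices.

```c
#include <stddef.h>
#include <stdlib.h>

#define RING_LEN 8

/* Hypothetical software view of a TX ring. */
struct tx_ring {
	void *pending[RING_LEN]; /* buffers handed to HW, not yet completed */
	unsigned int head;       /* next slot the driver fills */
	unsigned int tail;       /* next slot expected to complete */
	unsigned int used;       /* slots currently owned by HW */
	int dma_enabled;         /* stand-in for the HW enable bit */
};

/* Stop TX, free pending buffers, reset software state, re-enable TX.
 * Returns the number of buffers that were freed.
 */
unsigned int tx_ring_recover(struct tx_ring *r)
{
	unsigned int freed = 0;

	r->dma_enabled = 0;			/* 1. shut down the TX engine */

	for (size_t i = 0; i < RING_LEN; i++) {	/* 2. free pending buffers */
		if (r->pending[i]) {
			free(r->pending[i]);
			r->pending[i] = NULL;
			freed++;
		}
	}

	r->head = 0;				/* 3. reset ring state */
	r->tail = 0;
	r->used = 0;

	r->dma_enabled = 1;			/* 4. re-enable TX DMA */
	return freed;
}
```

The ordering matters in this sketch for the same reason it does in hardware: buffers are released only after the engine is stopped, so nothing can still be reading them.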

* Re: [PATCH net-next 1/2] be2net: Collect the transmit queue data in Tx timeout
  2018-07-23 18:23     ` David Miller
@ 2018-07-25 12:44       ` Suresh Kumar Reddy Reddygari
  2018-07-30 10:53         ` Suresh Kumar Reddy Reddygari
  0 siblings, 1 reply; 23+ messages in thread
From: Suresh Kumar Reddy Reddygari @ 2018-07-25 12:44 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

On Mon, Jul 23, 2018 at 11:53 PM, David Miller <davem@davemloft.net> wrote:
> From: Suresh Reddy <suresh.reddy@broadcom.com>
> Date: Mon, 23 Jul 2018 10:25:23 -0400
>
>> Driver dumps tx_queue, tx_compl, pending SKBs information in tx_timeout.
>> This debug data used to idenfiy the cause of the time out.
>>
>> Also reset Lancer chip in tx_timeout.
>>
>> Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
>
> The purpose of the tx timeout NDO operation is to do whatever is
> necessary to handle the TX timeout.
>
> Outputting debugging information is useful, but is secondary.
>
> I see that you do reset the Lancer, but that is far from what really
> needs to happen here.
>
> When you get a TX timeout, the hardware is not processing TX ring
> entries, nor signalling completion any longer.
>
> Therefore the only way to get things going again is to reset all of
> the TX side data structure and logic.  This means shutting down the TX
> engine, freeing up all of the TX SKBs in the ring, resetting the TX
> ring software state, and then finally reprogramming the head/tail
> pointer registers and re-enabling TX DMA processing.

The current patch does recover from a TX-timeout by resetting the chip itself,
which *includes* resetting the TX block. Freeing up TX SKBs and rings and
re-creating/registering them is also being done.

Now, this recovery is not supported on BE3 and Skyhawk NICs but is supported
on Lancer NICs. That's why this patch performs recovery in the case of
Lancer NICs but only gathers diagnostic information in the case of BE3 and
Skyhawk NICs.

Regards,
Suresh.

* Re: [PATCH net-next 1/2] be2net: Collect the transmit queue data in Tx timeout
  2018-07-25 12:44       ` Suresh Kumar Reddy Reddygari
@ 2018-07-30 10:53         ` Suresh Kumar Reddy Reddygari
  2018-07-30 16:17           ` David Miller
  0 siblings, 1 reply; 23+ messages in thread
From: Suresh Kumar Reddy Reddygari @ 2018-07-30 10:53 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

On Wed, Jul 25, 2018 at 6:14 PM, Suresh Kumar Reddy Reddygari
<suresh.reddy@broadcom.com> wrote:
> On Mon, Jul 23, 2018 at 11:53 PM, David Miller <davem@davemloft.net> wrote:
>> From: Suresh Reddy <suresh.reddy@broadcom.com>
>> Date: Mon, 23 Jul 2018 10:25:23 -0400
>>
>>> Driver dumps tx_queue, tx_compl, pending SKBs information in tx_timeout.
>>> This debug data used to idenfiy the cause of the time out.
>>>
>>> Also reset Lancer chip in tx_timeout.
>>>
...
>>
>> The purpose of the tx timeout NDO operation is to do whatever is
>> necessary to handle the TX timeout.
>>
>> Outputting debugging information is useful, but is secondary.
>>
>> I see that you do reset the Lancer, but that is far from what really
>> needs to happen here.
>>
>> When you get a TX timeout, the hardware is not processing TX ring
>> entries, nor signalling completion any longer.
>>
>> Therefore the only way to get things going again is to reset all of
>> the TX side data structure and logic.  This means shutting down the TX
>> engine, freeing up all of the TX SKBs in the ring, resetting the TX
>> ring software state, and then finally reprogramming the head/tail
>> pointer registers and re-enabling TX DMA processing.

Hi David,

I am clarifying the patch (Lancer reset) again, as I didn't see a reply
from you after my earlier clarification.

> +static void be_tx_timeout(struct net_device *netdev)
> +{
...
> +
> +       if (lancer_chip(adapter)) {
> +               dev_info(dev, "Initiating reset due to tx timeout\n");
> +               dev_info(dev, "Resetting adapter\n");
> +               status = lancer_physdev_ctrl(adapter,
> +                                            PHYSDEV_CONTROL_FW_RESET_MASK);

This patch does recover from a TX-timeout by resetting the chip itself,
which *includes* resetting the TX block. Freeing up TX SKBs and rings and
re-creating/registering them is also being done.

The driver and firmware do the following in the chip reset path.

1. When the driver sets the PHYSDEV_CONTROL_FW_RESET_MASK bit, Lancer
    firmware goes into an error state and indicates this back to the
    driver via a bit in a doorbell register.
2. The driver detects this and calls be_err_recover().
    be_close() and be_clear() are called in this error recovery path.
    a) In be_close(), we clean up all pending TX queue entries and SKBs.
    b) In be_clear(), we destroy TX, RX and all other queues.
3. be_resume() is called after be_cleanup() in the error recovery path.
    In this routine, we create TX, RX queues and other resources
    which were destroyed in the cleanup path.
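
The three steps above can be modelled as a small state machine. The sketch
below is a user-space illustration under stated assumptions, not driver
code: struct model_adapter and the model_*() functions are hypothetical
names, and the firmware's transition into the error state is collapsed
into a single flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the adapter state relevant to recovery. */
struct model_adapter {
	bool is_lancer;			/* recovery is modelled only for Lancer */
	bool fw_error;			/* "error" bit the firmware reports back */
	bool queues_exist;		/* TX/RX and other queues are created */
	unsigned int pending_tx;	/* TX entries not yet completed */
	unsigned int queue_generation;	/* bumped each time queues are re-created */
};

/* Step 1: the timeout handler requests a firmware reset (Lancer only). */
bool model_tx_timeout(struct model_adapter *a)
{
	if (!a->is_lancer)
		return false;		/* other chips: diagnostics only */
	a->fw_error = true;		/* firmware enters the error state */
	return true;
}

/* Steps 2 and 3: run once the driver notices the error indication. */
bool model_err_recover(struct model_adapter *a)
{
	if (!a->fw_error)
		return false;		/* nothing to recover from */

	a->pending_tx = 0;		/* close: drop pending TX entries */
	a->queues_exist = false;	/* clear: destroy all queues */

	a->queues_exist = true;		/* resume: re-create the queues */
	a->queue_generation++;
	a->fw_error = false;
	return true;
}
```

For a non-Lancer adapter, model_tx_timeout() returns false and
model_err_recover() leaves the state untouched, which mirrors the
diagnostics-only behaviour described for BE3 and Skyhawk.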

Now, this recovery is supported only on Lancer NICs (and not on BE3
and Skyhawk NICs). That's why this patch only gathers diagnostic information
in the case of BE3 and Skyhawk NICs, but on Lancer NICs does the extra step
of resetting and recovering.

Please let me know if this makes sense. I will send a v2 patch with this
explanation in the commit message.

Regards,
Suresh.

* Re: [PATCH net-next 1/2] be2net: Collect the transmit queue data in Tx timeout
  2018-07-30 10:53         ` Suresh Kumar Reddy Reddygari
@ 2018-07-30 16:17           ` David Miller
  0 siblings, 0 replies; 23+ messages in thread
From: David Miller @ 2018-07-30 16:17 UTC (permalink / raw)
  To: suresh.reddy; +Cc: netdev

From: Suresh Kumar Reddy Reddygari <suresh.reddy@broadcom.com>
Date: Mon, 30 Jul 2018 16:23:12 +0530

> I am clarifying again about the patch (Lancer reset) as I didnt see a reply
> from you after my clarification.

Ok, please resubmit this patch.

Thank you.

* [PATCH V2 net-next 0/2] be2net: patch-set
       [not found] <Suresh.Reddy@broadcom.com>
                   ` (4 preceding siblings ...)
  2018-07-23 14:25 ` [PATCH net-next 0/2] be2net: patch-set Suresh Reddy
@ 2018-07-31 15:39 ` Suresh Reddy
  2018-07-31 15:39   ` [PATCH V2 net-next 1/2] be2net: gather debug info and reset adapter (only for Lancer) on a tx-timeout Suresh Reddy
                     ` (2 more replies)
  5 siblings, 3 replies; 23+ messages in thread
From: Suresh Reddy @ 2018-07-31 15:39 UTC (permalink / raw)
  To: netdev

v1->v2 : Modified the subject line and commit log.

Please consider applying these two patches to net-next.

Suresh Reddy (2):
  be2net: gather debug info and reset adapter (only for Lancer) on a
    tx-timeout
  be2net: Update the driver version to 12.0.0.0

 drivers/net/ethernet/emulex/benet/be.h      |  2 +-
 drivers/net/ethernet/emulex/benet/be_main.c | 80 ++++++++++++++++++++++++++++-
 2 files changed, 80 insertions(+), 2 deletions(-)

-- 
2.10.1

* [PATCH V2 net-next 1/2] be2net: gather debug info and reset adapter (only for Lancer) on a tx-timeout
  2018-07-31 15:39 ` [PATCH V2 net-next 0/2] be2net: patch-set Suresh Reddy
@ 2018-07-31 15:39   ` Suresh Reddy
  2018-07-31 15:39   ` [PATCH V2 net-next 2/2] be2net: Update the driver version to 12.0.0.0 Suresh Reddy
  2018-08-01 16:39   ` [PATCH V2 net-next 0/2] be2net: patch-set David Miller
  2 siblings, 0 replies; 23+ messages in thread
From: Suresh Reddy @ 2018-07-31 15:39 UTC (permalink / raw)
  To: netdev

This patch handles a TX-timeout as follows:

1) This patch gathers and prints the following info that can
   help in diagnosing the cause of a TX-timeout.
   a) TX queue and completion queue entries.
   b) SKB and TCP/UDP header details.

2) For Lancer NICs (TX-timeout recovery is not supported for
   BE3/Skyhawk-R NICs), it recovers from the TX timeout as follows:

   a) On a TX-timeout, driver sets the PHYSDEV_CONTROL_FW_RESET_MASK
      bit in the PHYSDEV_CONTROL register. Lancer firmware goes into
      an error state and indicates this back to the driver via a bit
      in a doorbell register.
   b) Driver detects this and calls be_err_recover(). DMA is disabled,
      all pending TX skbs are unmapped and freed (be_close()). All rings
      are destroyed (be_clear()).
   c) The driver waits for the FW to re-initialize and re-creates all
      rings along with other data structs (be_resume())

Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
---
 drivers/net/ethernet/emulex/benet/be_main.c | 80 ++++++++++++++++++++++++++++-
 1 file changed, 79 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 05e4c0b..580cdec 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -1412,6 +1412,83 @@ static netdev_tx_t be_xmit(struct sk_buff *skb, struct net_device *netdev)
 	return NETDEV_TX_OK;
 }
 
+static void be_tx_timeout(struct net_device *netdev)
+{
+	struct be_adapter *adapter = netdev_priv(netdev);
+	struct device *dev = &adapter->pdev->dev;
+	struct be_tx_obj *txo;
+	struct sk_buff *skb;
+	struct tcphdr *tcphdr;
+	struct udphdr *udphdr;
+	u32 *entry;
+	int status;
+	int i, j;
+
+	for_all_tx_queues(adapter, txo, i) {
+		dev_info(dev, "TXQ Dump: %d H: %d T: %d used: %d, qid: 0x%x\n",
+			 i, txo->q.head, txo->q.tail,
+			 atomic_read(&txo->q.used), txo->q.id);
+
+		entry = txo->q.dma_mem.va;
+		for (j = 0; j < TX_Q_LEN * 4; j += 4) {
+			if (entry[j] != 0 || entry[j + 1] != 0 ||
+			    entry[j + 2] != 0 || entry[j + 3] != 0) {
+				dev_info(dev, "Entry %d 0x%x 0x%x 0x%x 0x%x\n",
+					 j, entry[j], entry[j + 1],
+					 entry[j + 2], entry[j + 3]);
+			}
+		}
+
+		entry = txo->cq.dma_mem.va;
+		dev_info(dev, "TXCQ Dump: %d  H: %d T: %d used: %d\n",
+			 i, txo->cq.head, txo->cq.tail,
+			 atomic_read(&txo->cq.used));
+		for (j = 0; j < TX_CQ_LEN * 4; j += 4) {
+			if (entry[j] != 0 || entry[j + 1] != 0 ||
+			    entry[j + 2] != 0 || entry[j + 3] != 0) {
+				dev_info(dev, "Entry %d 0x%x 0x%x 0x%x 0x%x\n",
+					 j, entry[j], entry[j + 1],
+					 entry[j + 2], entry[j + 3]);
+			}
+		}
+
+		for (j = 0; j < TX_Q_LEN; j++) {
+			if (txo->sent_skb_list[j]) {
+				skb = txo->sent_skb_list[j];
+				if (ip_hdr(skb)->protocol == IPPROTO_TCP) {
+					tcphdr = tcp_hdr(skb);
+					dev_info(dev, "TCP source port %d\n",
+						 ntohs(tcphdr->source));
+					dev_info(dev, "TCP dest port %d\n",
+						 ntohs(tcphdr->dest));
+					dev_info(dev, "TCP sequence num %u\n",
+						 ntohl(tcphdr->seq));
+					dev_info(dev, "TCP ack_seq %u\n",
+						 ntohl(tcphdr->ack_seq));
+				} else if (ip_hdr(skb)->protocol ==
+					   IPPROTO_UDP) {
+					udphdr = udp_hdr(skb);
+					dev_info(dev, "UDP source port %d\n",
+						 ntohs(udphdr->source));
+					dev_info(dev, "UDP dest port %d\n",
+						 ntohs(udphdr->dest));
+				}
+				dev_info(dev, "skb[%d] %p len %d proto 0x%x\n",
+					 j, skb, skb->len, skb->protocol);
+			}
+		}
+	}
+
+	if (lancer_chip(adapter)) {
+		dev_info(dev, "Initiating reset due to tx timeout\n");
+		dev_info(dev, "Resetting adapter\n");
+		status = lancer_physdev_ctrl(adapter,
+					     PHYSDEV_CONTROL_FW_RESET_MASK);
+		if (status)
+			dev_err(dev, "Reset failed .. Reboot server\n");
+	}
+}
+
 static inline bool be_in_all_promisc(struct be_adapter *adapter)
 {
 	return (adapter->if_flags & BE_IF_FLAGS_ALL_PROMISCUOUS) ==
@@ -3274,7 +3351,7 @@ void be_detect_error(struct be_adapter *adapter)
 			/* Do not log error messages if its a FW reset */
 			if (sliport_err1 == SLIPORT_ERROR_FW_RESET1 &&
 			    sliport_err2 == SLIPORT_ERROR_FW_RESET2) {
-				dev_info(dev, "Firmware update in progress\n");
+				dev_info(dev, "Reset is in progress\n");
 			} else {
 				dev_err(dev, "Error detected in the card\n");
 				dev_err(dev, "ERR: sliport status 0x%x\n",
@@ -5218,6 +5295,7 @@ static const struct net_device_ops be_netdev_ops = {
 	.ndo_get_vf_config	= be_get_vf_config,
 	.ndo_set_vf_link_state  = be_set_vf_link_state,
 	.ndo_set_vf_spoofchk    = be_set_vf_spoofchk,
+	.ndo_tx_timeout		= be_tx_timeout,
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	.ndo_poll_controller	= be_netpoll,
 #endif
-- 
2.10.1
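
A note on the header dump in be_tx_timeout(): the TCP sequence and
acknowledgment numbers are 32-bit fields carried in network byte order,
so they need a 32-bit conversion before printing, while the port numbers
are 16-bit fields. The user-space sketch below illustrates the
difference; read_be32() and read_be32_truncated() are hypothetical
helpers written for this example, not kernel functions.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit network-order field (e.g. a TCP sequence number). */
uint32_t read_be32(const unsigned char *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));	/* avoid unaligned access */
	return ntohl(v);		/* 32-bit conversion keeps all four bytes */
}

/*
 * What happens if the same field goes through a 16-bit conversion:
 * the value is truncated to 16 bits first, so half of it is lost.
 * Which half survives depends on the host byte order.
 */
uint16_t read_be32_truncated(const unsigned char *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return ntohs((uint16_t)v);
}
```

With the bytes 12 34 56 78 on the wire, read_be32() yields 0x12345678 on
any host, whereas the truncated variant can never return more than 16
bits of it.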

* [PATCH V2 net-next 2/2] be2net: Update the driver version to 12.0.0.0
  2018-07-31 15:39 ` [PATCH V2 net-next 0/2] be2net: patch-set Suresh Reddy
  2018-07-31 15:39   ` [PATCH V2 net-next 1/2] be2net: gather debug info and reset adapter (only for Lancer) on a tx-timeout Suresh Reddy
@ 2018-07-31 15:39   ` Suresh Reddy
  2018-08-01 16:39   ` [PATCH V2 net-next 0/2] be2net: patch-set David Miller
  2 siblings, 0 replies; 23+ messages in thread
From: Suresh Reddy @ 2018-07-31 15:39 UTC (permalink / raw)
  To: netdev

Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
---
 drivers/net/ethernet/emulex/benet/be.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/emulex/benet/be.h b/drivers/net/ethernet/emulex/benet/be.h
index 7005949..d80fe03 100644
--- a/drivers/net/ethernet/emulex/benet/be.h
+++ b/drivers/net/ethernet/emulex/benet/be.h
@@ -37,7 +37,7 @@
 #include "be_hw.h"
 #include "be_roce.h"
 
-#define DRV_VER			"11.4.0.0"
+#define DRV_VER			"12.0.0.0"
 #define DRV_NAME		"be2net"
 #define BE_NAME			"Emulex BladeEngine2"
 #define BE3_NAME		"Emulex BladeEngine3"
-- 
2.10.1

* Re: [PATCH V2 net-next 0/2] be2net: patch-set
  2018-07-31 15:39 ` [PATCH V2 net-next 0/2] be2net: patch-set Suresh Reddy
  2018-07-31 15:39   ` [PATCH V2 net-next 1/2] be2net: gather debug info and reset adapter (only for Lancer) on a tx-timeout Suresh Reddy
  2018-07-31 15:39   ` [PATCH V2 net-next 2/2] be2net: Update the driver version to 12.0.0.0 Suresh Reddy
@ 2018-08-01 16:39   ` David Miller
  2 siblings, 0 replies; 23+ messages in thread
From: David Miller @ 2018-08-01 16:39 UTC (permalink / raw)
  To: suresh.reddy; +Cc: netdev

From: Suresh Reddy <suresh.reddy@broadcom.com>
Date: Tue, 31 Jul 2018 11:39:41 -0400

> v1->v2 : Modified the subject line and commit log.
> 
> Please consider applying these two patches to net-next.

Series applied, thanks.

Thread overview: 23+ messages
     [not found] <Suresh.Reddy@broadcom.com>
2017-05-25  2:24 ` [PATCH net-next 0/2] be2net: patch-set Suresh Reddy
2017-05-25  2:24   ` [PATCH net-next 1/2] be2net: Fix UE detection logic for BE3 Suresh Reddy
2017-05-25  2:24   ` [PATCH net-next 2/2] be2net: Update the driver version to 11.4.0.0 Suresh Reddy
2017-05-25 18:45   ` [PATCH net-next 0/2] be2net: patch-set David Miller
2017-09-13 15:12 ` [PATCH net] be2net: fix TSO6/GSO issue causing TX-stall on Lancer/BEx Suresh Reddy
2017-09-13 16:33   ` David Miller
2018-02-06 13:52 ` [PATCH net 0/2] be2net: patch-set Suresh Reddy
2018-02-06 13:52   ` [PATCH net 1/2] be2net: Fix HW stall issue in Lancer Suresh Reddy
2018-02-06 13:52   ` [PATCH net 2/2] be2net: Handle transmit completion errors " Suresh Reddy
2018-02-06 16:48   ` [PATCH net 0/2] be2net: patch-set David Miller
2018-05-28  5:26 ` [PATCH net] be2net: Fix error detection logic for BE3 Suresh Reddy
2018-05-29 14:58   ` David Miller
2018-07-23 14:25 ` [PATCH net-next 0/2] be2net: patch-set Suresh Reddy
2018-07-23 14:25   ` [PATCH net-next 1/2] be2net: Collect the transmit queue data in Tx timeout Suresh Reddy
2018-07-23 18:23     ` David Miller
2018-07-25 12:44       ` Suresh Kumar Reddy Reddygari
2018-07-30 10:53         ` Suresh Kumar Reddy Reddygari
2018-07-30 16:17           ` David Miller
2018-07-23 14:25   ` [PATCH net-next 2/2] be2net: Update the driver version to 12.0.0.0 Suresh Reddy
2018-07-31 15:39 ` [PATCH V2 net-next 0/2] be2net: patch-set Suresh Reddy
2018-07-31 15:39   ` [PATCH V2 net-next 1/2] be2net: gather debug info and reset adapter (only for Lancer) on a tx-timeout Suresh Reddy
2018-07-31 15:39   ` [PATCH V2 net-next 2/2] be2net: Update the driver version to 12.0.0.0 Suresh Reddy
2018-08-01 16:39   ` [PATCH V2 net-next 0/2] be2net: patch-set David Miller