netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [pull request][net 0/4] Mellanox, mlx5 fixes 2019-08-22
@ 2019-08-22 20:41 Saeed Mahameed
  2019-08-22 20:41 ` [net 1/4] net/mlx5: Fix crdump chunks print Saeed Mahameed
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Saeed Mahameed @ 2019-08-22 20:41 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Saeed Mahameed

Hi Dave,

This series introduces some fixes to mlx5 driver.

1) Form Moshe, two fixes for firmware health reporter
2) From Eran, two ktls fixes.

Please pull and let me know if there is any problem.

No -stable this time :) ..

Thanks,
Saeed.

---
The following changes since commit cc07db5a5b100bc8eaab5097a23d72f858979750:

  gve: Copy and paste bug in gve_get_stats() (2019-08-21 20:50:26 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-fixes-2019-08-22

for you to fetch changes up to a195784c105b2907b45fd62307d9ce821da9dc20:

  net/mlx5e: Remove ethernet segment from dump WQE (2019-08-22 13:38:48 -0700)

----------------------------------------------------------------
mlx5-fixes-2019-08-22

----------------------------------------------------------------
Eran Ben Elisha (2):
      net/mlx5e: Add num bytes metadata to WQE info
      net/mlx5e: Remove ethernet segment from dump WQE

Moshe Shemesh (2):
      net/mlx5: Fix crdump chunks print
      net/mlx5: Fix delay in fw fatal report handling due to fw report

 .../ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c | 38 ++++++++++------------
 drivers/net/ethernet/mellanox/mlx5/core/health.c   | 22 +++++++------
 2 files changed, 29 insertions(+), 31 deletions(-)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [net 1/4] net/mlx5: Fix crdump chunks print
  2019-08-22 20:41 [pull request][net 0/4] Mellanox, mlx5 fixes 2019-08-22 Saeed Mahameed
@ 2019-08-22 20:41 ` Saeed Mahameed
  2019-08-22 20:41 ` [net 2/4] net/mlx5: Fix delay in fw fatal report handling due to fw report Saeed Mahameed
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Saeed Mahameed @ 2019-08-22 20:41 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Moshe Shemesh, Eran Ben Elisha, Saeed Mahameed

From: Moshe Shemesh <moshe@mellanox.com>

Crdump repeats itself every chunk of 256bytes.
That is due to bug of missing progressing offset while copying the data
from buffer to devlink_fmsg.

Fixes: 9b1f29823605 ("net/mlx5: Add support for FW fatal reporter dump")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/health.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index 9314777d99e3..cc5887f52679 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -590,7 +590,8 @@ mlx5_fw_fatal_reporter_dump(struct devlink_health_reporter *reporter,
 			data_size = crdump_size - offset;
 		else
 			data_size = MLX5_CR_DUMP_CHUNK_SIZE;
-		err = devlink_fmsg_binary_put(fmsg, cr_data, data_size);
+		err = devlink_fmsg_binary_put(fmsg, (char *)cr_data + offset,
+					      data_size);
 		if (err)
 			goto free_data;
 	}
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [net 2/4] net/mlx5: Fix delay in fw fatal report handling due to fw report
  2019-08-22 20:41 [pull request][net 0/4] Mellanox, mlx5 fixes 2019-08-22 Saeed Mahameed
  2019-08-22 20:41 ` [net 1/4] net/mlx5: Fix crdump chunks print Saeed Mahameed
@ 2019-08-22 20:41 ` Saeed Mahameed
  2019-08-22 20:41 ` [net 3/4] net/mlx5e: Add num bytes metadata to WQE info Saeed Mahameed
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Saeed Mahameed @ 2019-08-22 20:41 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Moshe Shemesh, Eran Ben Elisha, Saeed Mahameed

From: Moshe Shemesh <moshe@mellanox.com>

When fw fatal error occurs, poll health() first detects and reports on a
fw error. Afterwards, it detects and reports on the fw fatal error
itself.

That can cause a long delay in fw fatal error handling which waits in a
queue for the fw error handling to be finished. The fw error handle will
try asking for fw core dump command while fw in fatal state may not
respond and driver will wait for command timeout.

Changing the flow to detect and handle first fw fatal errors and only if
no fatal error detected look for a fw error to handle.

Fixes: d1bf0e2cc4a6 ("net/mlx5: Report devlink health on FW issues")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/health.c  | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index cc5887f52679..d685122d9ff7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -701,6 +701,16 @@ static void poll_health(struct timer_list *t)
 	if (dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
 		goto out;
 
+	fatal_error = check_fatal_sensors(dev);
+
+	if (fatal_error && !health->fatal_error) {
+		mlx5_core_err(dev, "Fatal error %u detected\n", fatal_error);
+		dev->priv.health.fatal_error = fatal_error;
+		print_health_info(dev);
+		mlx5_trigger_health_work(dev);
+		goto out;
+	}
+
 	count = ioread32be(health->health_counter);
 	if (count == health->prev)
 		++health->miss_counter;
@@ -719,15 +729,6 @@ static void poll_health(struct timer_list *t)
 	if (health->synd && health->synd != prev_synd)
 		queue_work(health->wq, &health->report_work);
 
-	fatal_error = check_fatal_sensors(dev);
-
-	if (fatal_error && !health->fatal_error) {
-		mlx5_core_err(dev, "Fatal error %u detected\n", fatal_error);
-		dev->priv.health.fatal_error = fatal_error;
-		print_health_info(dev);
-		mlx5_trigger_health_work(dev);
-	}
-
 out:
 	mod_timer(&health->timer, get_next_poll_jiffies());
 }
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [net 3/4] net/mlx5e: Add num bytes metadata to WQE info
  2019-08-22 20:41 [pull request][net 0/4] Mellanox, mlx5 fixes 2019-08-22 Saeed Mahameed
  2019-08-22 20:41 ` [net 1/4] net/mlx5: Fix crdump chunks print Saeed Mahameed
  2019-08-22 20:41 ` [net 2/4] net/mlx5: Fix delay in fw fatal report handling due to fw report Saeed Mahameed
@ 2019-08-22 20:41 ` Saeed Mahameed
  2019-08-22 20:41 ` [net 4/4] net/mlx5e: Remove ethernet segment from dump WQE Saeed Mahameed
  2019-08-24 23:27 ` [pull request][net 0/4] Mellanox, mlx5 fixes 2019-08-22 David Miller
  4 siblings, 0 replies; 6+ messages in thread
From: Saeed Mahameed @ 2019-08-22 20:41 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Saeed Mahameed

From: Eran Ben Elisha <eranbe@mellanox.com>

For TLS WQEs, metadata info did not include num_bytes. Due to this issue,
tx_tls_dump_bytes counter did not increment.

Modify tx_fill_wi() to fill num bytes. When it is called for non-traffic
WQE, zero is expected.

Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c   | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
index 8b93101e1a09..0681735ea398 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
@@ -109,13 +109,15 @@ build_progress_params(struct mlx5e_tx_wqe *wqe, u16 pc, u32 sqn,
 
 static void tx_fill_wi(struct mlx5e_txqsq *sq,
 		       u16 pi, u8 num_wqebbs,
-		       skb_frag_t *resync_dump_frag)
+		       skb_frag_t *resync_dump_frag,
+		       u32 num_bytes)
 {
 	struct mlx5e_tx_wqe_info *wi = &sq->db.wqe_info[pi];
 
 	wi->skb              = NULL;
 	wi->num_wqebbs       = num_wqebbs;
 	wi->resync_dump_frag = resync_dump_frag;
+	wi->num_bytes        = num_bytes;
 }
 
 void mlx5e_ktls_tx_offload_set_pending(struct mlx5e_ktls_offload_context_tx *priv_tx)
@@ -143,7 +145,7 @@ post_static_params(struct mlx5e_txqsq *sq,
 
 	umr_wqe = mlx5e_sq_fetch_wqe(sq, MLX5E_KTLS_STATIC_UMR_WQE_SZ, &pi);
 	build_static_params(umr_wqe, sq->pc, sq->sqn, priv_tx, fence);
-	tx_fill_wi(sq, pi, MLX5E_KTLS_STATIC_WQEBBS, NULL);
+	tx_fill_wi(sq, pi, MLX5E_KTLS_STATIC_WQEBBS, NULL, 0);
 	sq->pc += MLX5E_KTLS_STATIC_WQEBBS;
 }
 
@@ -157,7 +159,7 @@ post_progress_params(struct mlx5e_txqsq *sq,
 
 	wqe = mlx5e_sq_fetch_wqe(sq, MLX5E_KTLS_PROGRESS_WQE_SZ, &pi);
 	build_progress_params(wqe, sq->pc, sq->sqn, priv_tx, fence);
-	tx_fill_wi(sq, pi, MLX5E_KTLS_PROGRESS_WQEBBS, NULL);
+	tx_fill_wi(sq, pi, MLX5E_KTLS_PROGRESS_WQEBBS, NULL, 0);
 	sq->pc += MLX5E_KTLS_PROGRESS_WQEBBS;
 }
 
@@ -296,7 +298,7 @@ tx_post_resync_dump(struct mlx5e_txqsq *sq, struct sk_buff *skb,
 	dseg->byte_count = cpu_to_be32(fsz);
 	mlx5e_dma_push(sq, dma_addr, fsz, MLX5E_DMA_MAP_PAGE);
 
-	tx_fill_wi(sq, pi, num_wqebbs, frag);
+	tx_fill_wi(sq, pi, num_wqebbs, frag, fsz);
 	sq->pc += num_wqebbs;
 
 	WARN(num_wqebbs > MLX5E_KTLS_MAX_DUMP_WQEBBS,
@@ -323,7 +325,7 @@ static void tx_post_fence_nop(struct mlx5e_txqsq *sq)
 	struct mlx5_wq_cyc *wq = &sq->wq;
 	u16 pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc);
 
-	tx_fill_wi(sq, pi, 1, NULL);
+	tx_fill_wi(sq, pi, 1, NULL, 0);
 
 	mlx5e_post_nop_fence(wq, sq->sqn, &sq->pc);
 }
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [net 4/4] net/mlx5e: Remove ethernet segment from dump WQE
  2019-08-22 20:41 [pull request][net 0/4] Mellanox, mlx5 fixes 2019-08-22 Saeed Mahameed
                   ` (2 preceding siblings ...)
  2019-08-22 20:41 ` [net 3/4] net/mlx5e: Add num bytes metadata to WQE info Saeed Mahameed
@ 2019-08-22 20:41 ` Saeed Mahameed
  2019-08-24 23:27 ` [pull request][net 0/4] Mellanox, mlx5 fixes 2019-08-22 David Miller
  4 siblings, 0 replies; 6+ messages in thread
From: Saeed Mahameed @ 2019-08-22 20:41 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Saeed Mahameed

From: Eran Ben Elisha <eranbe@mellanox.com>

Dump WQE shall not include Ethernet segment. Define mlx5e_dump_wqe to be
used for "Dump WQEs" instead of sharing it with the general mlx5e_tx_wqe
layout.

Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../mellanox/mlx5/core/en_accel/ktls_tx.c     | 26 +++++++------------
 1 file changed, 10 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
index 0681735ea398..7833ddef0427 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
@@ -250,43 +250,37 @@ tx_post_resync_params(struct mlx5e_txqsq *sq,
 	mlx5e_ktls_tx_post_param_wqes(sq, priv_tx, skip_static_post, true);
 }
 
+struct mlx5e_dump_wqe {
+	struct mlx5_wqe_ctrl_seg ctrl;
+	struct mlx5_wqe_data_seg data;
+};
+
 static int
 tx_post_resync_dump(struct mlx5e_txqsq *sq, struct sk_buff *skb,
 		    skb_frag_t *frag, u32 tisn, bool first)
 {
 	struct mlx5_wqe_ctrl_seg *cseg;
-	struct mlx5_wqe_eth_seg  *eseg;
 	struct mlx5_wqe_data_seg *dseg;
-	struct mlx5e_tx_wqe *wqe;
+	struct mlx5e_dump_wqe *wqe;
 	dma_addr_t dma_addr = 0;
-	u16 ds_cnt, ds_cnt_inl;
 	u8  num_wqebbs;
-	u16 pi, ihs;
+	u16 ds_cnt;
 	int fsz;
-
-	ds_cnt = sizeof(*wqe) / MLX5_SEND_WQE_DS;
-	ihs    = eth_get_headlen(skb->dev, skb->data, skb_headlen(skb));
-	ds_cnt_inl = DIV_ROUND_UP(ihs - INL_HDR_START_SZ, MLX5_SEND_WQE_DS);
-	ds_cnt += ds_cnt_inl;
-	ds_cnt += 1; /* one frag */
+	u16 pi;
 
 	wqe = mlx5e_sq_fetch_wqe(sq, sizeof(*wqe), &pi);
 
+	ds_cnt = sizeof(*wqe) / MLX5_SEND_WQE_DS;
 	num_wqebbs = DIV_ROUND_UP(ds_cnt, MLX5_SEND_WQEBB_NUM_DS);
 
 	cseg = &wqe->ctrl;
-	eseg = &wqe->eth;
-	dseg =  wqe->data;
+	dseg = &wqe->data;
 
 	cseg->opmod_idx_opcode = cpu_to_be32((sq->pc << 8)  | MLX5_OPCODE_DUMP);
 	cseg->qpn_ds           = cpu_to_be32((sq->sqn << 8) | ds_cnt);
 	cseg->tisn             = cpu_to_be32(tisn << 8);
 	cseg->fm_ce_se         = first ? MLX5_FENCE_MODE_INITIATOR_SMALL : 0;
 
-	eseg->inline_hdr.sz = cpu_to_be16(ihs);
-	memcpy(eseg->inline_hdr.start, skb->data, ihs);
-	dseg += ds_cnt_inl;
-
 	fsz = skb_frag_size(frag);
 	dma_addr = skb_frag_dma_map(sq->pdev, frag, 0, fsz,
 				    DMA_TO_DEVICE);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [pull request][net 0/4] Mellanox, mlx5 fixes 2019-08-22
  2019-08-22 20:41 [pull request][net 0/4] Mellanox, mlx5 fixes 2019-08-22 Saeed Mahameed
                   ` (3 preceding siblings ...)
  2019-08-22 20:41 ` [net 4/4] net/mlx5e: Remove ethernet segment from dump WQE Saeed Mahameed
@ 2019-08-24 23:27 ` David Miller
  4 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2019-08-24 23:27 UTC (permalink / raw)
  To: saeedm; +Cc: netdev

From: Saeed Mahameed <saeedm@mellanox.com>
Date: Thu, 22 Aug 2019 20:41:34 +0000

> This series introduces some fixes to mlx5 driver.
> 
> 1) Form Moshe, two fixes for firmware health reporter
> 2) From Eran, two ktls fixes.
> 
> Please pull and let me know if there is any problem.
> 
> No -stable this time :) ..

:)  Pulled, thanks!

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-08-24 23:27 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-22 20:41 [pull request][net 0/4] Mellanox, mlx5 fixes 2019-08-22 Saeed Mahameed
2019-08-22 20:41 ` [net 1/4] net/mlx5: Fix crdump chunks print Saeed Mahameed
2019-08-22 20:41 ` [net 2/4] net/mlx5: Fix delay in fw fatal report handling due to fw report Saeed Mahameed
2019-08-22 20:41 ` [net 3/4] net/mlx5e: Add num bytes metadata to WQE info Saeed Mahameed
2019-08-22 20:41 ` [net 4/4] net/mlx5e: Remove ethernet segment from dump WQE Saeed Mahameed
2019-08-24 23:27 ` [pull request][net 0/4] Mellanox, mlx5 fixes 2019-08-22 David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).