* [pull request][net-next V2 00/13] Mellanox, mlx5 RX and XDP improvements
@ 2019-04-23 19:14 Saeed Mahameed
  2019-04-23 19:14 ` [net-next V2 01/13] net/mlx5e: RX, Support multiple outstanding UMR posts Saeed Mahameed
                   ` (13 more replies)
  0 siblings, 14 replies; 16+ messages in thread
From: Saeed Mahameed @ 2019-04-23 19:14 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon, Saeed Mahameed

Hi Dave,

This series includes updates to the mlx5e driver RX data path and some
significant XDP RX/TX improvements to overcome or mitigate HW and PCIe
bottlenecks.

For more information please see tag log below.

Please pull and let me know if there is any problem.

Please note that the series starts with a merge of the mlx5-next branch,
to resolve and avoid a dependency on the rdma tree. I also merged
v5.1-rc1 into mlx5-next, since we forgot to reset the branch in the last
merge window. I hope this is OK with you; next time I will avoid such
merges with Linus's tree.

v1->v2:
 - Drop 1st patch "prefetch for small L1_CACHE_BYTES". We will have to
   introduce a new netdev helper function that any driver can use, and we
   will resubmit it as a standalone patch later.

Thanks,
Saeed.

---
The following changes since commit 3839f99d21688d3062ebd3cc06db46edb3b99ac1:

  Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux (2019-04-23 11:57:33 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-updates-2019-04-22

for you to fetch changes up to f8ebecf2e32a62137dc5a98b2c94b1db37a0f9f8:

  net/mlx5e: Use #define for the WQE wait timeout constant (2019-04-23 12:09:22 -0700)

----------------------------------------------------------------
mlx5-updates-2019-04-22

This series includes updates to the mlx5e driver RX data path and some
significant XDP RX/TX improvements to overcome or mitigate HW and PCIe
bottlenecks.

From Tariq:
1) Some enhancements in rq->flags
2) Stabilize RX packet rate (on Striding RQ) with
multiple outstanding UMR posts
In this patch, we add support for multiple outstanding UMR posts,
to allow faster gap closure between consuming MPWQEs and reposting
them back into the WQ.

Performance test:
As expected, huge improvement in large-scale (48 cores).

xdp_redirect_map, 64B UDP multi-stream.
Redirect from ConnectX-5 100Gbps to ConnectX-6 100Gbps.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.

Before: Unstable, 7 to 30 Mpps
After:  Stable,   at 70.5 Mpps

From Shay:
3) XDP, Inline small packets into the TX MPWQE in XDP xmit flow

Upon high packet rate with multiple CPUs TX workloads, much of the HCA's
resources are spent on prefetching TX descriptors, thus affecting
transmission rates.
This patch comes to mitigate this problem by moving some workload to the
CPU and reducing the HW data prefetch overhead for small packets (<= 256B).

When forwarding packets with XDP, a packet that is smaller
than a certain size (set to ~256 bytes) is sent inline within
its WQE TX descriptor (mem-copied) when the hardware TX queue is congested
beyond a pre-defined watermark.

Performance:
    Tested packet rate for UDP 64Byte multi-stream
    over two dual port ConnectX-5 100Gbps NICs.
    CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz

    * Tested with hyper-threading disabled

    XDP_TX:

    |          | before | after   |       |
    | 24 rings | 51Mpps | 116Mpps | +126% |
    | 1 ring   | 12Mpps | 12Mpps  | same  |

    XDP_REDIRECT:

    ** Below is the transmit rate, not the redirection rate
    which might be larger, and is not affected by this patch.

    |          | before  | after   |      |
    | 32 rings | 64Mpps  | 92Mpps  | +43% |
    | 1 ring   | 6.4Mpps | 6.4Mpps | same |

As we can see, the feature significantly improves scaling, without
hurting single ring performance.

From Maxim:
4) Some trivial refactoring and code improvements prior to a larger series
to support AF_XDP.

-Saeed.

----------------------------------------------------------------
Maxim Mikityanskiy (8):
      net/mlx5e: Remove unused parameter
      net/mlx5e: Report mlx5e_xdp_set errors
      net/mlx5e: Move parameter calculation functions to en/params.c
      net/mlx5e: Add an underflow warning comment
      net/mlx5e: Remove unused parameter
      net/mlx5e: Take HW interrupt trigger into a function
      net/mlx5e: Remove unused rx_page_reuse stat
      net/mlx5e: Use #define for the WQE wait timeout constant

Shay Agroskin (2):
      net/mlx5e: XDP, Add TX MPWQE session counter
      net/mlx5e: XDP, Inline small packets into the TX MPWQE in XDP xmit flow

Tariq Toukan (3):
      net/mlx5e: RX, Support multiple outstanding UMR posts
      net/mlx5e: XDP, Fix shifted flag index in RQ bitmap
      net/mlx5e: XDP, Enhance RQ indication for XDP redirect flush

 drivers/net/ethernet/mellanox/mlx5/core/Makefile   |   3 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |  20 ++-
 .../net/ethernet/mellanox/mlx5/core/en/params.c    | 104 +++++++++++++
 .../net/ethernet/mellanox/mlx5/core/en/params.h    |  22 +++
 drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c   |  30 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h   |  57 ++++++-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 172 ++++++---------------
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    | 130 ++++++++++------
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c |  15 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.h |   8 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c  |  11 ++
 drivers/net/ethernet/mellanox/mlx5/core/wq.h       |  12 ++
 include/linux/mlx5/qp.h                            |   1 +
 13 files changed, 390 insertions(+), 195 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/params.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/params.h


* [net-next V2 01/13] net/mlx5e: RX, Support multiple outstanding UMR posts
  2019-04-23 19:14 [pull request][net-next V2 00/13] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
@ 2019-04-23 19:14 ` Saeed Mahameed
  2019-04-23 19:14 ` [net-next V2 02/13] net/mlx5e: XDP, Fix shifted flag index in RQ bitmap Saeed Mahameed
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2019-04-23 19:14 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon, Tariq Toukan,
	Saeed Mahameed

From: Tariq Toukan <tariqt@mellanox.com>

The buffer mapping of the Multi-Packet WQEs (of Striding RQ)
is done via UMR posts, one UMR WQE per RX MPWQE.

A single MPWQE is capable of serving many incoming packets,
usually larger than the budget of a single napi cycle.
Hence, posting a single UMR WQE per napi cycle (and handling its
completion in the next cycle) works fine in many common cases,
but not always.

When an XDP program is loaded, every MPWQE is capable of serving fewer
packets, to satisfy the packet-per-page requirement.
Thus, for the same number of packets, more MPWQEs (and UMR posts)
are needed (twice as many for the default MTU), leaving less latency
room for the UMR completions.

In this patch, we add support for multiple outstanding UMR posts,
to allow faster gap closure between consuming MPWQEs and reposting
them back into the WQ.

For better SW and HW locality, we combine the UMR posts in bulks of
(at least) two.
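
The accounting described above can be sketched in plain C. This is only a
simplified userspace model, not the driver code: struct rq_model and the
three helper functions are hypothetical names introduced for illustration,
and only the bulk threshold of two is taken from this patch.

```c
#include <assert.h>
#include <stdint.h>

#define UMR_WQE_BULK 2 /* minimum number of UMR posts issued together */

/* Hypothetical, simplified stand-in for the RQ state the driver tracks. */
struct rq_model {
	uint8_t wq_missing;      /* MPWQEs consumed and not yet back in the WQ */
	uint8_t umr_in_progress; /* UMR posts issued but not yet completed */
};

/* How many new UMR WQEs to post now; 0 when the gap is below the bulk size. */
static uint8_t umr_posts_needed(const struct rq_model *rq)
{
	uint8_t missing;

	assert(rq->umr_in_progress <= rq->wq_missing);
	missing = rq->wq_missing - rq->umr_in_progress;

	return missing < UMR_WQE_BULK ? 0 : missing;
}

/* Record that a bulk of n UMR WQEs was posted. */
static void umr_posted(struct rq_model *rq, uint8_t n)
{
	rq->umr_in_progress += n;
}

/* Record n UMR completions: the matching MPWQEs return to the WQ. */
static void umr_completed(struct rq_model *rq, uint8_t n)
{
	assert(n <= rq->umr_in_progress);
	rq->umr_in_progress -= n;
	rq->wq_missing -= n;
}
```

In this model a single missing MPWQE is left alone until a second one is
consumed, and MPWQEs that already have a UMR in flight are never posted twice.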

This is expected to improve packet rate at high CPU scale.

Performance test:
As expected, huge improvement in large-scale (48 cores).

xdp_redirect_map, 64B UDP multi-stream.
Redirect from ConnectX-5 100Gbps to ConnectX-6 100Gbps.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.

Before: Unstable, 7 to 30 Mpps
After:  Stable,   at 70.5 Mpps

No degradation in other tested scenarios.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  10 +-
 .../net/ethernet/mellanox/mlx5/core/en_main.c |  36 ++++-
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   | 130 ++++++++++++------
 drivers/net/ethernet/mellanox/mlx5/core/wq.h  |  12 ++
 4 files changed, 136 insertions(+), 52 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 51e109fdeec1..abd2c67fe419 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -461,10 +461,10 @@ struct mlx5e_xdpsq {
 
 struct mlx5e_icosq {
 	/* data path */
+	u16                        cc;
+	u16                        pc;
 
-	/* dirtied @xmit */
-	u16                        pc ____cacheline_aligned_in_smp;
-
+	struct mlx5_wqe_ctrl_seg  *doorbell_cseg;
 	struct mlx5e_cq            cq;
 
 	/* write@xmit, read@completion */
@@ -562,8 +562,10 @@ struct mlx5e_rq {
 			struct mlx5e_mpw_info *info;
 			mlx5e_fp_skb_from_cqe_mpwrq skb_from_cqe_mpwrq;
 			u16                    num_strides;
+			u16                    actual_wq_head;
 			u8                     log_stride_sz;
-			bool                   umr_in_progress;
+			u8                     umr_in_progress;
+			u8                     umr_last_bulk;
 		} mpwqe;
 	};
 	struct {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 5c127fccad60..7ab195ac7299 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -903,10 +903,14 @@ static void mlx5e_free_rx_descs(struct mlx5e_rq *rq)
 
 	if (rq->wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ) {
 		struct mlx5_wq_ll *wq = &rq->mpwqe.wq;
+		u16 head = wq->head;
+		int i;
 
-		/* UMR WQE (if in progress) is always at wq->head */
-		if (rq->mpwqe.umr_in_progress)
-			rq->dealloc_wqe(rq, wq->head);
+		/* Outstanding UMR WQEs (in progress) start at wq->head */
+		for (i = 0; i < rq->mpwqe.umr_in_progress; i++) {
+			rq->dealloc_wqe(rq, head);
+			head = mlx5_wq_ll_get_wqe_next_ix(wq, head);
+		}
 
 		while (!mlx5_wq_ll_is_empty(wq)) {
 			struct mlx5e_rx_wqe_ll *wqe;
@@ -1092,7 +1096,7 @@ static void mlx5e_free_icosq_db(struct mlx5e_icosq *sq)
 
 static int mlx5e_alloc_icosq_db(struct mlx5e_icosq *sq, int numa)
 {
-	u8 wq_sz = mlx5_wq_cyc_get_size(&sq->wq);
+	int wq_sz = mlx5_wq_cyc_get_size(&sq->wq);
 
 	sq->db.ico_wqe = kvzalloc_node(array_size(wq_sz,
 						  sizeof(*sq->db.ico_wqe)),
@@ -2108,6 +2112,13 @@ static inline u8 mlx5e_get_rqwq_log_stride(u8 wq_type, int ndsegs)
 	return order_base_2(sz);
 }
 
+static u8 mlx5e_get_rq_log_wq_sz(void *rqc)
+{
+	void *wq = MLX5_ADDR_OF(rqc, rqc, wq);
+
+	return MLX5_GET(wq, wq, log_wq_sz);
+}
+
 static void mlx5e_build_rq_param(struct mlx5e_priv *priv,
 				 struct mlx5e_params *params,
 				 struct mlx5e_rq_param *param)
@@ -2274,13 +2285,28 @@ static void mlx5e_build_xdpsq_param(struct mlx5e_priv *priv,
 	param->is_mpw = MLX5E_GET_PFLAG(params, MLX5E_PFLAG_XDP_TX_MPWQE);
 }
 
+static u8 mlx5e_build_icosq_log_wq_sz(struct mlx5e_params *params,
+				      struct mlx5e_rq_param *rqp)
+{
+	switch (params->rq_wq_type) {
+	case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
+		return order_base_2(MLX5E_UMR_WQEBBS) +
+			mlx5e_get_rq_log_wq_sz(rqp->rqc);
+	default: /* MLX5_WQ_TYPE_CYCLIC */
+		return MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE;
+	}
+}
+
 static void mlx5e_build_channel_param(struct mlx5e_priv *priv,
 				      struct mlx5e_params *params,
 				      struct mlx5e_channel_param *cparam)
 {
-	u8 icosq_log_wq_sz = MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE;
+	u8 icosq_log_wq_sz;
 
 	mlx5e_build_rq_param(priv, params, &cparam->rq);
+
+	icosq_log_wq_sz = mlx5e_build_icosq_log_wq_sz(params, &cparam->rq);
+
 	mlx5e_build_sq_param(priv, params, &cparam->sq);
 	mlx5e_build_xdpsq_param(priv, params, &cparam->xdp_sq);
 	mlx5e_build_icosq_param(priv, icosq_log_wq_sz, &cparam->icosq);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index c3b3002ff62f..13133e7f088e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -409,14 +409,15 @@ mlx5e_free_rx_mpwqe(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, bool recycle
 			mlx5e_page_release(rq, &dma_info[i], recycle);
 }
 
-static void mlx5e_post_rx_mpwqe(struct mlx5e_rq *rq)
+static void mlx5e_post_rx_mpwqe(struct mlx5e_rq *rq, u8 n)
 {
 	struct mlx5_wq_ll *wq = &rq->mpwqe.wq;
-	struct mlx5e_rx_wqe_ll *wqe = mlx5_wq_ll_get_wqe(wq, wq->head);
 
-	rq->mpwqe.umr_in_progress = false;
+	do {
+		u16 next_wqe_index = mlx5_wq_ll_get_wqe_next_ix(wq, wq->head);
 
-	mlx5_wq_ll_push(wq, be16_to_cpu(wqe->next.next_wqe_index));
+		mlx5_wq_ll_push(wq, next_wqe_index);
+	} while (--n);
 
 	/* ensure wqes are visible to device before updating doorbell record */
 	dma_wmb();
@@ -426,7 +427,7 @@ static void mlx5e_post_rx_mpwqe(struct mlx5e_rq *rq)
 
 static inline u16 mlx5e_icosq_wrap_cnt(struct mlx5e_icosq *sq)
 {
-	return sq->pc >> MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE;
+	return mlx5_wq_cyc_get_ctr_wrap_cnt(&sq->wq, sq->pc);
 }
 
 static inline void mlx5e_fill_icosq_frag_edge(struct mlx5e_icosq *sq,
@@ -478,8 +479,6 @@ static int mlx5e_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix)
 	bitmap_zero(wi->xdp_xmit_bitmap, MLX5_MPWRQ_PAGES_PER_WQE);
 	wi->consumed_strides = 0;
 
-	rq->mpwqe.umr_in_progress = true;
-
 	umr_wqe->ctrl.opmod_idx_opcode =
 		cpu_to_be32((sq->pc << MLX5_WQE_CTRL_WQE_INDEX_SHIFT) |
 			    MLX5_OPCODE_UMR);
@@ -487,7 +486,8 @@ static int mlx5e_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix)
 
 	sq->db.ico_wqe[pi].opcode = MLX5_OPCODE_UMR;
 	sq->pc += MLX5E_UMR_WQEBBS;
-	mlx5e_notify_hw(wq, sq->pc, sq->uar_map, &umr_wqe->ctrl);
+
+	sq->doorbell_cseg = &umr_wqe->ctrl;
 
 	return 0;
 
@@ -542,37 +542,13 @@ bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq)
 	return !!err;
 }
 
-static inline void mlx5e_poll_ico_single_cqe(struct mlx5e_cq *cq,
-					     struct mlx5e_icosq *sq,
-					     struct mlx5e_rq *rq,
-					     struct mlx5_cqe64 *cqe)
-{
-	struct mlx5_wq_cyc *wq = &sq->wq;
-	u16 ci = mlx5_wq_cyc_ctr2ix(wq, be16_to_cpu(cqe->wqe_counter));
-	struct mlx5e_sq_wqe_info *icowi = &sq->db.ico_wqe[ci];
-
-	mlx5_cqwq_pop(&cq->wq);
-
-	if (unlikely(get_cqe_opcode(cqe) != MLX5_CQE_REQ)) {
-		netdev_WARN_ONCE(cq->channel->netdev,
-				 "Bad OP in ICOSQ CQE: 0x%x\n", get_cqe_opcode(cqe));
-		return;
-	}
-
-	if (likely(icowi->opcode == MLX5_OPCODE_UMR)) {
-		mlx5e_post_rx_mpwqe(rq);
-		return;
-	}
-
-	if (unlikely(icowi->opcode != MLX5_OPCODE_NOP))
-		netdev_WARN_ONCE(cq->channel->netdev,
-				 "Bad OPCODE in ICOSQ WQE info: 0x%x\n", icowi->opcode);
-}
-
 static void mlx5e_poll_ico_cq(struct mlx5e_cq *cq, struct mlx5e_rq *rq)
 {
 	struct mlx5e_icosq *sq = container_of(cq, struct mlx5e_icosq, cq);
 	struct mlx5_cqe64 *cqe;
+	u8  completed_umr = 0;
+	u16 sqcc;
+	int i;
 
 	if (unlikely(!test_bit(MLX5E_SQ_STATE_ENABLED, &sq->state)))
 		return;
@@ -581,28 +557,96 @@ static void mlx5e_poll_ico_cq(struct mlx5e_cq *cq, struct mlx5e_rq *rq)
 	if (likely(!cqe))
 		return;
 
-	/* by design, there's only a single cqe */
-	mlx5e_poll_ico_single_cqe(cq, sq, rq, cqe);
+	/* sq->cc must be updated only after mlx5_cqwq_update_db_record(),
+	 * otherwise a cq overrun may occur
+	 */
+	sqcc = sq->cc;
+
+	i = 0;
+	do {
+		u16 wqe_counter;
+		bool last_wqe;
+
+		mlx5_cqwq_pop(&cq->wq);
+
+		wqe_counter = be16_to_cpu(cqe->wqe_counter);
+
+		if (unlikely(get_cqe_opcode(cqe) != MLX5_CQE_REQ)) {
+			netdev_WARN_ONCE(cq->channel->netdev,
+					 "Bad OP in ICOSQ CQE: 0x%x\n", get_cqe_opcode(cqe));
+			break;
+		}
+		do {
+			struct mlx5e_sq_wqe_info *wi;
+			u16 ci;
+
+			last_wqe = (sqcc == wqe_counter);
+
+			ci = mlx5_wq_cyc_ctr2ix(&sq->wq, sqcc);
+			wi = &sq->db.ico_wqe[ci];
+
+			if (likely(wi->opcode == MLX5_OPCODE_UMR)) {
+				sqcc += MLX5E_UMR_WQEBBS;
+				completed_umr++;
+			} else if (likely(wi->opcode == MLX5_OPCODE_NOP)) {
+				sqcc++;
+			} else {
+				netdev_WARN_ONCE(cq->channel->netdev,
+						 "Bad OPCODE in ICOSQ WQE info: 0x%x\n",
+						 wi->opcode);
+			}
+
+		} while (!last_wqe);
+
+	} while ((++i < MLX5E_TX_CQ_POLL_BUDGET) && (cqe = mlx5_cqwq_get_cqe(&cq->wq)));
+
+	sq->cc = sqcc;
 
 	mlx5_cqwq_update_db_record(&cq->wq);
+
+	if (likely(completed_umr)) {
+		mlx5e_post_rx_mpwqe(rq, completed_umr);
+		rq->mpwqe.umr_in_progress -= completed_umr;
+	}
 }
 
 bool mlx5e_post_rx_mpwqes(struct mlx5e_rq *rq)
 {
+	struct mlx5e_icosq *sq = &rq->channel->icosq;
 	struct mlx5_wq_ll *wq = &rq->mpwqe.wq;
+	u8  missing, i;
+	u16 head;
 
 	if (unlikely(!test_bit(MLX5E_RQ_STATE_ENABLED, &rq->state)))
 		return false;
 
-	mlx5e_poll_ico_cq(&rq->channel->icosq.cq, rq);
+	mlx5e_poll_ico_cq(&sq->cq, rq);
+
+	missing = mlx5_wq_ll_missing(wq) - rq->mpwqe.umr_in_progress;
 
-	if (mlx5_wq_ll_is_full(wq))
+	if (unlikely(rq->mpwqe.umr_in_progress > rq->mpwqe.umr_last_bulk))
+		rq->stats->congst_umr++;
+
+#define UMR_WQE_BULK (2)
+	if (likely(missing < UMR_WQE_BULK))
 		return false;
 
-	if (!rq->mpwqe.umr_in_progress)
-		mlx5e_alloc_rx_mpwqe(rq, wq->head);
-	else
-		rq->stats->congst_umr += mlx5_wq_ll_missing(wq) > 2;
+	head = rq->mpwqe.actual_wq_head;
+	i = missing;
+	do {
+		if (unlikely(mlx5e_alloc_rx_mpwqe(rq, head)))
+			break;
+		head = mlx5_wq_ll_get_wqe_next_ix(wq, head);
+	} while (--i);
+
+	rq->mpwqe.umr_last_bulk    = missing - i;
+	if (sq->doorbell_cseg) {
+		mlx5e_notify_hw(&sq->wq, sq->pc, sq->uar_map, sq->doorbell_cseg);
+		sq->doorbell_cseg = NULL;
+	}
+
+	rq->mpwqe.umr_in_progress += rq->mpwqe.umr_last_bulk;
+	rq->mpwqe.actual_wq_head   = head;
 
 	return false;
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/wq.h b/drivers/net/ethernet/mellanox/mlx5/core/wq.h
index ea934a48c90a..1f87cce421e0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/wq.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/wq.h
@@ -134,6 +134,11 @@ static inline void mlx5_wq_cyc_update_db_record(struct mlx5_wq_cyc *wq)
 	*wq->db = cpu_to_be32(wq->wqe_ctr);
 }
 
+static inline u16 mlx5_wq_cyc_get_ctr_wrap_cnt(struct mlx5_wq_cyc *wq, u16 ctr)
+{
+	return ctr >> wq->fbc.log_sz;
+}
+
 static inline u16 mlx5_wq_cyc_ctr2ix(struct mlx5_wq_cyc *wq, u16 ctr)
 {
 	return ctr & wq->fbc.sz_m1;
@@ -243,6 +248,13 @@ static inline void *mlx5_wq_ll_get_wqe(struct mlx5_wq_ll *wq, u16 ix)
 	return mlx5_frag_buf_get_wqe(&wq->fbc, ix);
 }
 
+static inline u16 mlx5_wq_ll_get_wqe_next_ix(struct mlx5_wq_ll *wq, u16 ix)
+{
+	struct mlx5_wqe_srq_next_seg *wqe = mlx5_wq_ll_get_wqe(wq, ix);
+
+	return be16_to_cpu(wqe->next_wqe_index);
+}
+
 static inline void mlx5_wq_ll_push(struct mlx5_wq_ll *wq, u16 head_next)
 {
 	wq->head = head_next;
-- 
2.20.1



* [net-next V2 02/13] net/mlx5e: XDP, Fix shifted flag index in RQ bitmap
  2019-04-23 19:14 [pull request][net-next V2 00/13] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
  2019-04-23 19:14 ` [net-next V2 01/13] net/mlx5e: RX, Support multiple outstanding UMR posts Saeed Mahameed
@ 2019-04-23 19:14 ` Saeed Mahameed
  2019-04-23 19:14 ` [net-next V2 03/13] net/mlx5e: XDP, Enhance RQ indication for XDP redirect flush Saeed Mahameed
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2019-04-23 19:14 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon, Tariq Toukan,
	Shay Agroskin, Saeed Mahameed

From: Tariq Toukan <tariqt@mellanox.com>

Values in enum mlx5e_rq_flag are used as bit indices.
The intention was to use them with no BIT(i) wrapping.

No functional bug fix here, as the same (shifted) flag bit
is used for all set, test, and clear operations.
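
To illustrate why this was harmless but still wrong, here is a small
userspace sketch. model_set_bit() and model_test_bit() are hypothetical
stand-ins for the kernel's index-based bitops, limited to a single word,
and the two enums are illustrative names only.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define BIT(n)        (1UL << (n))
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Like __set_bit(): takes a bit INDEX, not a mask (single-word model). */
static void model_set_bit(unsigned int nr, unsigned long *word)
{
	assert(nr < BITS_PER_LONG);
	*word |= 1UL << nr;
}

/* Like test_bit(): takes a bit INDEX, not a mask (single-word model). */
static bool model_test_bit(unsigned int nr, const unsigned long *word)
{
	assert(nr < BITS_PER_LONG);
	return (*word >> nr) & 1UL;
}

/* Before: the enum value is the mask 0x1, which the bitops read as index 1. */
enum flag_as_mask  { FLAG_XMIT_AS_MASK = BIT(0) };

/* After: the enum value is the index 0. */
enum flag_as_index { FLAG_XMIT_AS_INDEX };
```

With the mask-valued enum, setting the flag lights bit 1 (word value 0x2)
rather than bit 0. Set, test, and clear all agree on that bit, so nothing
breaks, but the bit position is not the one the enum suggests.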

Fixes: 121e89275471 ("net/mlx5e: Refactor RQ XDP_TX indication")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index abd2c67fe419..8b37264ba107 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -531,7 +531,7 @@ typedef bool (*mlx5e_fp_post_rx_wqes)(struct mlx5e_rq *rq);
 typedef void (*mlx5e_fp_dealloc_wqe)(struct mlx5e_rq*, u16);
 
 enum mlx5e_rq_flag {
-	MLX5E_RQ_FLAG_XDP_XMIT = BIT(0),
+	MLX5E_RQ_FLAG_XDP_XMIT,
 };
 
 struct mlx5e_rq_frag_info {
-- 
2.20.1



* [net-next V2 03/13] net/mlx5e: XDP, Enhance RQ indication for XDP redirect flush
  2019-04-23 19:14 [pull request][net-next V2 00/13] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
  2019-04-23 19:14 ` [net-next V2 01/13] net/mlx5e: RX, Support multiple outstanding UMR posts Saeed Mahameed
  2019-04-23 19:14 ` [net-next V2 02/13] net/mlx5e: XDP, Fix shifted flag index in RQ bitmap Saeed Mahameed
@ 2019-04-23 19:14 ` Saeed Mahameed
  2019-04-23 19:14 ` [net-next V2 04/13] net/mlx5e: XDP, Add TX MPWQE session counter Saeed Mahameed
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2019-04-23 19:14 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon, Tariq Toukan,
	Shay Agroskin, Saeed Mahameed

From: Tariq Toukan <tariqt@mellanox.com>

The XDP redirect flush indication belongs to the receive queue,
not to its XDP send queue.

For this, use a new bit on rq->flags.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h     | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 8b37264ba107..a3700c57b073 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -428,7 +428,6 @@ struct mlx5e_xdpsq {
 	/* dirtied @completion */
 	u32                        xdpi_fifo_cc;
 	u16                        cc;
-	bool                       redirect_flush;
 
 	/* dirtied @xmit */
 	u32                        xdpi_fifo_pc ____cacheline_aligned_in_smp;
@@ -532,6 +531,7 @@ typedef void (*mlx5e_fp_dealloc_wqe)(struct mlx5e_rq*, u16);
 
 enum mlx5e_rq_flag {
 	MLX5E_RQ_FLAG_XDP_XMIT,
+	MLX5E_RQ_FLAG_XDP_REDIRECT,
 };
 
 struct mlx5e_rq_frag_info {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index 03b2a9f9c589..9e7ed599ae0a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -85,7 +85,7 @@ bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct mlx5e_dma_info *di,
 		if (unlikely(err))
 			goto xdp_abort;
 		__set_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags);
-		rq->xdpsq.redirect_flush = true;
+		__set_bit(MLX5E_RQ_FLAG_XDP_REDIRECT, rq->flags);
 		mlx5e_page_dma_unmap(rq, di);
 		rq->stats->xdp_redirect++;
 		return true;
@@ -419,9 +419,9 @@ void mlx5e_xdp_rx_poll_complete(struct mlx5e_rq *rq)
 
 	mlx5e_xmit_xdp_doorbell(xdpsq);
 
-	if (xdpsq->redirect_flush) {
+	if (test_bit(MLX5E_RQ_FLAG_XDP_REDIRECT, rq->flags)) {
 		xdp_do_flush_map();
-		xdpsq->redirect_flush = false;
+		__clear_bit(MLX5E_RQ_FLAG_XDP_REDIRECT, rq->flags);
 	}
 }
 
-- 
2.20.1



* [net-next V2 04/13] net/mlx5e: XDP, Add TX MPWQE session counter
  2019-04-23 19:14 [pull request][net-next V2 00/13] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
                   ` (2 preceding siblings ...)
  2019-04-23 19:14 ` [net-next V2 03/13] net/mlx5e: XDP, Enhance RQ indication for XDP redirect flush Saeed Mahameed
@ 2019-04-23 19:14 ` Saeed Mahameed
  2019-04-23 19:14 ` [net-next V2 05/13] net/mlx5e: XDP, Inline small packets into the TX MPWQE in XDP xmit flow Saeed Mahameed
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2019-04-23 19:14 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon, Shay Agroskin,
	Saeed Mahameed

From: Shay Agroskin <shayag@mellanox.com>

This counter tracks how many TX MPWQE sessions are started in the XDP SQ
in the XDP TX/REDIRECT flow. It is reported in both the per-channel and
global stats.
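
A rough userspace sketch of the per-channel versus global relationship
follows. The struct and function names are hypothetical and only model the
idea: each SQ bumps its own counter when a session starts, and the global
value is the sum over channels.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-channel XDP SQ counters (model only). */
struct xdpsq_stats_model {
	uint64_t xmit;  /* packets transmitted */
	uint64_t mpwqe; /* MPWQE sessions started */
};

/* Called once whenever a new MPWQE session is opened on this SQ. */
static void model_session_start(struct xdpsq_stats_model *stats)
{
	stats->mpwqe++;
}

/* Global counter: sum of the per-channel session counters. */
static uint64_t model_total_mpwqe(const struct xdpsq_stats_model *ch, size_t n)
{
	uint64_t total = 0;
	size_t i;

	assert(ch != NULL || n == 0);
	for (i = 0; i < n; i++)
		total += ch[i].mpwqe;

	return total;
}
```

Comparing the xmit and mpwqe counters of a channel gives a rough idea of how
many packets each session carries on average.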

Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c   | 2 ++
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 6 ++++++
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 3 +++
 3 files changed, 11 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index 9e7ed599ae0a..a9075b526ab9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -105,6 +105,7 @@ bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct mlx5e_dma_info *di,
 static void mlx5e_xdp_mpwqe_session_start(struct mlx5e_xdpsq *sq)
 {
 	struct mlx5e_xdp_mpwqe *session = &sq->mpwqe;
+	struct mlx5e_xdpsq_stats *stats = sq->stats;
 	struct mlx5_wq_cyc *wq = &sq->wq;
 	u8  wqebbs;
 	u16 pi;
@@ -131,6 +132,7 @@ static void mlx5e_xdp_mpwqe_session_start(struct mlx5e_xdpsq *sq)
 		       MLX5E_XDP_MPW_MAX_WQEBBS);
 
 	session->max_ds_count = MLX5_SEND_WQEBB_NUM_DS * wqebbs;
+	stats->mpwqe++;
 }
 
 static void mlx5e_xdp_mpwqe_complete(struct mlx5e_xdpsq *sq)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index b75aa8b8bf04..80ee48dcc0a3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -65,6 +65,7 @@ static const struct counter_desc sw_stats_desc[] = {
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_drop) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_redirect) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_xmit) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_mpwqe) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_full) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_err) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_cqe) },
@@ -79,6 +80,7 @@ static const struct counter_desc sw_stats_desc[] = {
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_queue_wake) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_cqe_err) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xdp_xmit) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xdp_mpwqe) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xdp_full) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xdp_err) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xdp_cqes) },
@@ -160,6 +162,7 @@ static void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
 		s->rx_xdp_drop     += rq_stats->xdp_drop;
 		s->rx_xdp_redirect += rq_stats->xdp_redirect;
 		s->rx_xdp_tx_xmit  += xdpsq_stats->xmit;
+		s->rx_xdp_tx_mpwqe += xdpsq_stats->mpwqe;
 		s->rx_xdp_tx_full  += xdpsq_stats->full;
 		s->rx_xdp_tx_err   += xdpsq_stats->err;
 		s->rx_xdp_tx_cqe   += xdpsq_stats->cqes;
@@ -185,6 +188,7 @@ static void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
 		s->ch_eq_rearm    += ch_stats->eq_rearm;
 		/* xdp redirect */
 		s->tx_xdp_xmit    += xdpsq_red_stats->xmit;
+		s->tx_xdp_mpwqe += xdpsq_red_stats->mpwqe;
 		s->tx_xdp_full    += xdpsq_red_stats->full;
 		s->tx_xdp_err     += xdpsq_red_stats->err;
 		s->tx_xdp_cqes    += xdpsq_red_stats->cqes;
@@ -1245,6 +1249,7 @@ static const struct counter_desc sq_stats_desc[] = {
 
 static const struct counter_desc rq_xdpsq_stats_desc[] = {
 	{ MLX5E_DECLARE_RQ_XDPSQ_STAT(struct mlx5e_xdpsq_stats, xmit) },
+	{ MLX5E_DECLARE_RQ_XDPSQ_STAT(struct mlx5e_xdpsq_stats, mpwqe) },
 	{ MLX5E_DECLARE_RQ_XDPSQ_STAT(struct mlx5e_xdpsq_stats, full) },
 	{ MLX5E_DECLARE_RQ_XDPSQ_STAT(struct mlx5e_xdpsq_stats, err) },
 	{ MLX5E_DECLARE_RQ_XDPSQ_STAT(struct mlx5e_xdpsq_stats, cqes) },
@@ -1252,6 +1257,7 @@ static const struct counter_desc rq_xdpsq_stats_desc[] = {
 
 static const struct counter_desc xdpsq_stats_desc[] = {
 	{ MLX5E_DECLARE_XDPSQ_STAT(struct mlx5e_xdpsq_stats, xmit) },
+	{ MLX5E_DECLARE_XDPSQ_STAT(struct mlx5e_xdpsq_stats, mpwqe) },
 	{ MLX5E_DECLARE_XDPSQ_STAT(struct mlx5e_xdpsq_stats, full) },
 	{ MLX5E_DECLARE_XDPSQ_STAT(struct mlx5e_xdpsq_stats, err) },
 	{ MLX5E_DECLARE_XDPSQ_STAT(struct mlx5e_xdpsq_stats, cqes) },
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index 16c3b785f282..1f05ffa086b1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -77,6 +77,7 @@ struct mlx5e_sw_stats {
 	u64 rx_xdp_drop;
 	u64 rx_xdp_redirect;
 	u64 rx_xdp_tx_xmit;
+	u64 rx_xdp_tx_mpwqe;
 	u64 rx_xdp_tx_full;
 	u64 rx_xdp_tx_err;
 	u64 rx_xdp_tx_cqe;
@@ -91,6 +92,7 @@ struct mlx5e_sw_stats {
 	u64 tx_queue_wake;
 	u64 tx_cqe_err;
 	u64 tx_xdp_xmit;
+	u64 tx_xdp_mpwqe;
 	u64 tx_xdp_full;
 	u64 tx_xdp_err;
 	u64 tx_xdp_cqes;
@@ -241,6 +243,7 @@ struct mlx5e_sq_stats {
 
 struct mlx5e_xdpsq_stats {
 	u64 xmit;
+	u64 mpwqe;
 	u64 full;
 	u64 err;
 	/* dirtied @completion */
-- 
2.20.1



* [net-next V2 05/13] net/mlx5e: XDP, Inline small packets into the TX MPWQE in XDP xmit flow
  2019-04-23 19:14 [pull request][net-next V2 00/13] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
                   ` (3 preceding siblings ...)
  2019-04-23 19:14 ` [net-next V2 04/13] net/mlx5e: XDP, Add TX MPWQE session counter Saeed Mahameed
@ 2019-04-23 19:14 ` Saeed Mahameed
  2019-04-23 19:14 ` [net-next V2 06/13] net/mlx5e: Remove unused parameter Saeed Mahameed
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2019-04-23 19:14 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon, Shay Agroskin,
	Tariq Toukan, Saeed Mahameed

From: Shay Agroskin <shayag@mellanox.com>

Upon high packet rate with multiple CPUs TX workloads, much of the HCA's
resources are spent on prefetching TX descriptors, thus affecting
transmission rates.
This patch comes to mitigate this problem by moving some workload to the
CPU and reducing the HW data prefetch overhead for small packets (<= 256B).

When forwarding packets with XDP, a packet that is smaller
than a certain size (set to ~256 bytes) is sent inline within
its WQE TX descriptor (mem-copied) when the hardware TX queue is congested
beyond a pre-defined watermark.

This is added to better utilize the HW resources (which now perform
one less packet data prefetch) and allow better scalability, at the
expense of CPU usage (which now 'memcpy's the packet into the WQE).

To load balance between HW and CPU and get max packet rate, we use
watermarks to detect how much the HW is congested and move the work
loads back and forth between HW and CPU.
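
The watermark scheme can be sketched as a two-threshold hysteresis. This is
a userspace model under stated assumptions: the struct, the function names,
and the two watermark values are hypothetical and chosen for illustration;
only the ~256 byte size limit comes from the text above.

```c
#include <stdbool.h>
#include <stdint.h>

#define INLINE_WATERMARK_LOW   10  /* hypothetical: stop inlining at or below */
#define INLINE_WATERMARK_HIGH 128  /* hypothetical: start inlining at or above */
#define INLINE_MAX_LEN        256  /* approximate limit from the description */

/* Hypothetical model of a TX queue with producer and consumer counters. */
struct txq_model {
	uint16_t pc;        /* descriptors produced */
	uint16_t cc;        /* descriptors completed */
	bool     inline_on; /* current inline mode */
};

/* Toggle inline mode based on how many descriptors are still outstanding. */
static void model_update_inline_state(struct txq_model *q)
{
	/* Unsigned 16-bit subtraction stays correct across counter wrap. */
	uint16_t outstanding = (uint16_t)(q->pc - q->cc);

	if (q->inline_on) {
		if (outstanding <= INLINE_WATERMARK_LOW)
			q->inline_on = false;
		return;
	}

	if (outstanding >= INLINE_WATERMARK_HIGH)
		q->inline_on = true;
}

/* Copy the packet into the WQE only when congested and the packet is small. */
static bool model_should_inline(const struct txq_model *q, uint32_t len)
{
	return q->inline_on && len <= INLINE_MAX_LEN;
}
```

Using two thresholds rather than one keeps the queue from flapping between
modes when the number of outstanding descriptors hovers around a single
cut-off.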

Performance:
Tested packet rate for UDP 64Byte multi-stream
over two dual port ConnectX-5 100Gbps NICs.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz

* Tested with hyper-threading disabled

XDP_TX:

|          | before | after   |       |
| 24 rings | 51Mpps | 116Mpps | +126% |
| 1 ring   | 12Mpps | 12Mpps  | same  |

XDP_REDIRECT:

** Below is the transmit rate, not the redirection rate
which might be larger, and is not affected by this patch.

| rings    | before  | after   | change |
| 32 rings | 64Mpps  | 92Mpps  | +43%   |
| 1 ring   | 6.4Mpps | 6.4Mpps | same   |

As we can see, the feature significantly improves scaling without
hurting single-ring performance.

Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  5 +-
 .../net/ethernet/mellanox/mlx5/core/en/xdp.c  | 22 ++++---
 .../net/ethernet/mellanox/mlx5/core/en/xdp.h  | 57 ++++++++++++++++++-
 .../net/ethernet/mellanox/mlx5/core/en_main.c |  2 +-
 .../ethernet/mellanox/mlx5/core/en_stats.c    |  8 ++-
 .../ethernet/mellanox/mlx5/core/en_stats.h    |  3 +
 include/linux/mlx5/qp.h                       |  1 +
 7 files changed, 83 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index a3700c57b073..c190061763fd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -409,14 +409,17 @@ struct mlx5e_xdp_info_fifo {
 
 struct mlx5e_xdp_wqe_info {
 	u8 num_wqebbs;
-	u8 num_ds;
+	u8 num_pkts;
 };
 
 struct mlx5e_xdp_mpwqe {
 	/* Current MPWQE session */
 	struct mlx5e_tx_wqe *wqe;
 	u8                   ds_count;
+	u8                   pkt_count;
 	u8                   max_ds_count;
+	u8                   complete;
+	u8                   inline_on;
 };
 
 struct mlx5e_xdpsq;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index a9075b526ab9..c3d4efbf60ba 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -113,7 +113,9 @@ static void mlx5e_xdp_mpwqe_session_start(struct mlx5e_xdpsq *sq)
 	mlx5e_xdpsq_fetch_wqe(sq, &session->wqe);
 
 	prefetchw(session->wqe->data);
-	session->ds_count = MLX5E_XDP_TX_EMPTY_DS_COUNT;
+	session->ds_count  = MLX5E_XDP_TX_EMPTY_DS_COUNT;
+	session->pkt_count = 0;
+	session->complete  = 0;
 
 	pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc);
 
@@ -132,6 +134,9 @@ static void mlx5e_xdp_mpwqe_session_start(struct mlx5e_xdpsq *sq)
 		       MLX5E_XDP_MPW_MAX_WQEBBS);
 
 	session->max_ds_count = MLX5_SEND_WQEBB_NUM_DS * wqebbs;
+
+	mlx5e_xdp_update_inline_state(sq);
+
 	stats->mpwqe++;
 }
 
@@ -149,7 +154,7 @@ static void mlx5e_xdp_mpwqe_complete(struct mlx5e_xdpsq *sq)
 	cseg->qpn_ds = cpu_to_be32((sq->sqn << 8) | ds_count);
 
 	wi->num_wqebbs = DIV_ROUND_UP(ds_count, MLX5_SEND_WQEBB_NUM_DS);
-	wi->num_ds     = ds_count - MLX5E_XDP_TX_EMPTY_DS_COUNT;
+	wi->num_pkts   = session->pkt_count;
 
 	sq->pc += wi->num_wqebbs;
 
@@ -164,11 +169,9 @@ static bool mlx5e_xmit_xdp_frame_mpwqe(struct mlx5e_xdpsq *sq,
 	struct mlx5e_xdp_mpwqe *session = &sq->mpwqe;
 	struct mlx5e_xdpsq_stats *stats = sq->stats;
 
-	dma_addr_t dma_addr    = xdpi->dma_addr;
 	struct xdp_frame *xdpf = xdpi->xdpf;
-	unsigned int dma_len   = xdpf->len;
 
-	if (unlikely(sq->hw_mtu < dma_len)) {
+	if (unlikely(sq->hw_mtu < xdpf->len)) {
 		stats->err++;
 		return false;
 	}
@@ -185,9 +188,10 @@ static bool mlx5e_xmit_xdp_frame_mpwqe(struct mlx5e_xdpsq *sq,
 		mlx5e_xdp_mpwqe_session_start(sq);
 	}
 
-	mlx5e_xdp_mpwqe_add_dseg(sq, dma_addr, dma_len);
+	mlx5e_xdp_mpwqe_add_dseg(sq, xdpi, stats);
 
-	if (unlikely(session->ds_count == session->max_ds_count))
+	if (unlikely(session->complete ||
+		     session->ds_count == session->max_ds_count))
 		mlx5e_xdp_mpwqe_complete(sq);
 
 	mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, xdpi);
@@ -301,7 +305,7 @@ bool mlx5e_poll_xdpsq_cq(struct mlx5e_cq *cq, struct mlx5e_rq *rq)
 
 			sqcc += wi->num_wqebbs;
 
-			for (j = 0; j < wi->num_ds; j++) {
+			for (j = 0; j < wi->num_pkts; j++) {
 				struct mlx5e_xdp_info xdpi =
 					mlx5e_xdpi_fifo_pop(xdpi_fifo);
 
@@ -342,7 +346,7 @@ void mlx5e_free_xdpsq_descs(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq)
 
 		sq->cc += wi->num_wqebbs;
 
-		for (i = 0; i < wi->num_ds; i++) {
+		for (i = 0; i < wi->num_pkts; i++) {
 			struct mlx5e_xdp_info xdpi =
 				mlx5e_xdpi_fifo_pop(xdpi_fifo);
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
index ee27a7c8cd87..858e7a2a13ca 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
@@ -75,16 +75,68 @@ static inline void mlx5e_xmit_xdp_doorbell(struct mlx5e_xdpsq *sq)
 	}
 }
 
+/* Enable inline WQEs to shift some load from a congested HCA (HW) to
+ * a less congested cpu (SW).
+ */
+static inline void mlx5e_xdp_update_inline_state(struct mlx5e_xdpsq *sq)
+{
+	u16 outstanding = sq->xdpi_fifo_pc - sq->xdpi_fifo_cc;
+	struct mlx5e_xdp_mpwqe *session = &sq->mpwqe;
+
+#define MLX5E_XDP_INLINE_WATERMARK_LOW	10
+#define MLX5E_XDP_INLINE_WATERMARK_HIGH 128
+
+	if (session->inline_on) {
+		if (outstanding <= MLX5E_XDP_INLINE_WATERMARK_LOW)
+			session->inline_on = 0;
+		return;
+	}
+
+	/* inline is false */
+	if (outstanding >= MLX5E_XDP_INLINE_WATERMARK_HIGH)
+		session->inline_on = 1;
+}
+
 static inline void
-mlx5e_xdp_mpwqe_add_dseg(struct mlx5e_xdpsq *sq, dma_addr_t dma_addr, u16 dma_len)
+mlx5e_xdp_mpwqe_add_dseg(struct mlx5e_xdpsq *sq, struct mlx5e_xdp_info *xdpi,
+			 struct mlx5e_xdpsq_stats *stats)
 {
 	struct mlx5e_xdp_mpwqe *session = &sq->mpwqe;
+	dma_addr_t dma_addr    = xdpi->dma_addr;
+	struct xdp_frame *xdpf = xdpi->xdpf;
 	struct mlx5_wqe_data_seg *dseg =
-		(struct mlx5_wqe_data_seg *)session->wqe + session->ds_count++;
+		(struct mlx5_wqe_data_seg *)session->wqe + session->ds_count;
+	u16 dma_len = xdpf->len;
 
+	session->pkt_count++;
+
+#define MLX5E_XDP_INLINE_WQE_SZ_THRSD (256 - sizeof(struct mlx5_wqe_inline_seg))
+
+	if (session->inline_on && dma_len <= MLX5E_XDP_INLINE_WQE_SZ_THRSD) {
+		struct mlx5_wqe_inline_seg *inline_dseg =
+			(struct mlx5_wqe_inline_seg *)dseg;
+		u16 ds_len = sizeof(*inline_dseg) + dma_len;
+		u16 ds_cnt = DIV_ROUND_UP(ds_len, MLX5_SEND_WQE_DS);
+
+		if (unlikely(session->ds_count + ds_cnt > session->max_ds_count)) {
+			/* Not enough space for inline wqe, send with memory pointer */
+			session->complete = true;
+			goto no_inline;
+		}
+
+		inline_dseg->byte_count = cpu_to_be32(dma_len | MLX5_INLINE_SEG);
+		memcpy(inline_dseg->data, xdpf->data, dma_len);
+
+		session->ds_count += ds_cnt;
+		stats->inlnw++;
+		return;
+	}
+
+no_inline:
 	dseg->addr       = cpu_to_be64(dma_addr);
 	dseg->byte_count = cpu_to_be32(dma_len);
 	dseg->lkey       = sq->mkey_be;
+	session->ds_count++;
 }
 
 static inline void mlx5e_xdpsq_fetch_wqe(struct mlx5e_xdpsq *sq,
@@ -111,5 +163,4 @@ mlx5e_xdpi_fifo_pop(struct mlx5e_xdp_info_fifo *fifo)
 {
 	return fifo->xi[(*fifo->cc)++ & fifo->mask];
 }
-
 #endif
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 7ab195ac7299..23df6f486c6a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1532,7 +1532,7 @@ static int mlx5e_open_xdpsq(struct mlx5e_channel *c,
 			dseg->lkey = sq->mkey_be;
 
 			wi->num_wqebbs = 1;
-			wi->num_ds     = 1;
+			wi->num_pkts   = 1;
 		}
 	}
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index 80ee48dcc0a3..ca0ff3b3fbd1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -66,6 +66,7 @@ static const struct counter_desc sw_stats_desc[] = {
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_redirect) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_xmit) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_mpwqe) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_inlnw) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_full) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_err) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_cqe) },
@@ -81,6 +82,7 @@ static const struct counter_desc sw_stats_desc[] = {
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_cqe_err) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xdp_xmit) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xdp_mpwqe) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xdp_inlnw) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xdp_full) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xdp_err) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xdp_cqes) },
@@ -163,6 +165,7 @@ static void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
 		s->rx_xdp_redirect += rq_stats->xdp_redirect;
 		s->rx_xdp_tx_xmit  += xdpsq_stats->xmit;
 		s->rx_xdp_tx_mpwqe += xdpsq_stats->mpwqe;
+		s->rx_xdp_tx_inlnw += xdpsq_stats->inlnw;
 		s->rx_xdp_tx_full  += xdpsq_stats->full;
 		s->rx_xdp_tx_err   += xdpsq_stats->err;
 		s->rx_xdp_tx_cqe   += xdpsq_stats->cqes;
@@ -188,7 +191,8 @@ static void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
 		s->ch_eq_rearm    += ch_stats->eq_rearm;
 		/* xdp redirect */
 		s->tx_xdp_xmit    += xdpsq_red_stats->xmit;
-		s->tx_xdp_mpwqe += xdpsq_red_stats->mpwqe;
+		s->tx_xdp_mpwqe   += xdpsq_red_stats->mpwqe;
+		s->tx_xdp_inlnw   += xdpsq_red_stats->inlnw;
 		s->tx_xdp_full    += xdpsq_red_stats->full;
 		s->tx_xdp_err     += xdpsq_red_stats->err;
 		s->tx_xdp_cqes    += xdpsq_red_stats->cqes;
@@ -1250,6 +1254,7 @@ static const struct counter_desc sq_stats_desc[] = {
 static const struct counter_desc rq_xdpsq_stats_desc[] = {
 	{ MLX5E_DECLARE_RQ_XDPSQ_STAT(struct mlx5e_xdpsq_stats, xmit) },
 	{ MLX5E_DECLARE_RQ_XDPSQ_STAT(struct mlx5e_xdpsq_stats, mpwqe) },
+	{ MLX5E_DECLARE_RQ_XDPSQ_STAT(struct mlx5e_xdpsq_stats, inlnw) },
 	{ MLX5E_DECLARE_RQ_XDPSQ_STAT(struct mlx5e_xdpsq_stats, full) },
 	{ MLX5E_DECLARE_RQ_XDPSQ_STAT(struct mlx5e_xdpsq_stats, err) },
 	{ MLX5E_DECLARE_RQ_XDPSQ_STAT(struct mlx5e_xdpsq_stats, cqes) },
@@ -1258,6 +1263,7 @@ static const struct counter_desc rq_xdpsq_stats_desc[] = {
 static const struct counter_desc xdpsq_stats_desc[] = {
 	{ MLX5E_DECLARE_XDPSQ_STAT(struct mlx5e_xdpsq_stats, xmit) },
 	{ MLX5E_DECLARE_XDPSQ_STAT(struct mlx5e_xdpsq_stats, mpwqe) },
+	{ MLX5E_DECLARE_XDPSQ_STAT(struct mlx5e_xdpsq_stats, inlnw) },
 	{ MLX5E_DECLARE_XDPSQ_STAT(struct mlx5e_xdpsq_stats, full) },
 	{ MLX5E_DECLARE_XDPSQ_STAT(struct mlx5e_xdpsq_stats, err) },
 	{ MLX5E_DECLARE_XDPSQ_STAT(struct mlx5e_xdpsq_stats, cqes) },
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index 1f05ffa086b1..ac3c7c2a0964 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -78,6 +78,7 @@ struct mlx5e_sw_stats {
 	u64 rx_xdp_redirect;
 	u64 rx_xdp_tx_xmit;
 	u64 rx_xdp_tx_mpwqe;
+	u64 rx_xdp_tx_inlnw;
 	u64 rx_xdp_tx_full;
 	u64 rx_xdp_tx_err;
 	u64 rx_xdp_tx_cqe;
@@ -93,6 +94,7 @@ struct mlx5e_sw_stats {
 	u64 tx_cqe_err;
 	u64 tx_xdp_xmit;
 	u64 tx_xdp_mpwqe;
+	u64 tx_xdp_inlnw;
 	u64 tx_xdp_full;
 	u64 tx_xdp_err;
 	u64 tx_xdp_cqes;
@@ -244,6 +246,7 @@ struct mlx5e_sq_stats {
 struct mlx5e_xdpsq_stats {
 	u64 xmit;
 	u64 mpwqe;
+	u64 inlnw;
 	u64 full;
 	u64 err;
 	/* dirtied @completion */
diff --git a/include/linux/mlx5/qp.h b/include/linux/mlx5/qp.h
index 0343c81d4c5f..3ba4edbd17a6 100644
--- a/include/linux/mlx5/qp.h
+++ b/include/linux/mlx5/qp.h
@@ -395,6 +395,7 @@ struct mlx5_wqe_signature_seg {
 
 struct mlx5_wqe_inline_seg {
 	__be32	byte_count;
+	__be32	data[0];
 };
 
 enum mlx5_sig_type {
-- 
2.20.1



* [net-next V2 06/13] net/mlx5e: Remove unused parameter
  2019-04-23 19:14 [pull request][net-next V2 00/13] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
                   ` (4 preceding siblings ...)
  2019-04-23 19:14 ` [net-next V2 05/13] net/mlx5e: XDP, Inline small packets into the TX MPWQE in XDP xmit flow Saeed Mahameed
@ 2019-04-23 19:14 ` Saeed Mahameed
  2019-04-23 19:14 ` [net-next V2 07/13] net/mlx5e: Report mlx5e_xdp_set errors Saeed Mahameed
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2019-04-23 19:14 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon,
	Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@mellanox.com>

params is unused in mlx5e_init_di_list.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 23df6f486c6a..8185773a7bed 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -470,7 +470,6 @@ static void mlx5e_init_frags_partition(struct mlx5e_rq *rq)
 }
 
 static int mlx5e_init_di_list(struct mlx5e_rq *rq,
-			      struct mlx5e_params *params,
 			      int wq_sz, int cpu)
 {
 	int len = wq_sz << rq->wqe.info.log_num_frags;
@@ -598,7 +597,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
 			goto err_free;
 		}
 
-		err = mlx5e_init_di_list(rq, params, wq_sz, c->cpu);
+		err = mlx5e_init_di_list(rq, wq_sz, c->cpu);
 		if (err)
 			goto err_free;
 		rq->post_wqes = mlx5e_post_rx_wqes;
-- 
2.20.1



* [net-next V2 07/13] net/mlx5e: Report mlx5e_xdp_set errors
  2019-04-23 19:14 [pull request][net-next V2 00/13] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
                   ` (5 preceding siblings ...)
  2019-04-23 19:14 ` [net-next V2 06/13] net/mlx5e: Remove unused parameter Saeed Mahameed
@ 2019-04-23 19:14 ` Saeed Mahameed
  2019-04-23 19:14 ` [net-next V2 08/13] net/mlx5e: Move parameter calculation functions to en/params.c Saeed Mahameed
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2019-04-23 19:14 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon,
	Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@mellanox.com>

If the channels fail to reopen after setting an XDP program, return the
error code instead of 0. A proper fix is still needed, as now any error
while reopening the channels brings the interface down. This patch only
adds error reporting.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 8185773a7bed..a3397d5bfa76 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4313,7 +4313,7 @@ static int mlx5e_xdp_set(struct net_device *netdev, struct bpf_prog *prog)
 		mlx5e_set_rq_type(priv->mdev, &priv->channels.params);
 
 	if (was_opened && reset)
-		mlx5e_open_locked(netdev);
+		err = mlx5e_open_locked(netdev);
 
 	if (!test_bit(MLX5E_STATE_OPENED, &priv->state) || reset)
 		goto unlock;
-- 
2.20.1



* [net-next V2 08/13] net/mlx5e: Move parameter calculation functions to en/params.c
  2019-04-23 19:14 [pull request][net-next V2 00/13] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
                   ` (6 preceding siblings ...)
  2019-04-23 19:14 ` [net-next V2 07/13] net/mlx5e: Report mlx5e_xdp_set errors Saeed Mahameed
@ 2019-04-23 19:14 ` Saeed Mahameed
  2019-04-23 19:14 ` [net-next V2 09/13] net/mlx5e: Add an underflow warning comment Saeed Mahameed
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2019-04-23 19:14 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon,
	Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@mellanox.com>

This commit moves the parameter calculation functions to a separate file
for better modularity and code sharing with future features.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/Makefile  |   3 +-
 .../ethernet/mellanox/mlx5/core/en/params.c   | 102 ++++++++++++++++++
 .../ethernet/mellanox/mlx5/core/en/params.h   |  23 ++++
 .../net/ethernet/mellanox/mlx5/core/en_main.c |  99 +----------------
 4 files changed, 128 insertions(+), 99 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/params.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/params.h

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index 1a16f6d73cbc..3dbbe3b643b3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -22,7 +22,8 @@ mlx5_core-y :=	main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
 #
 mlx5_core-$(CONFIG_MLX5_CORE_EN) += en_main.o en_common.o en_fs.o en_ethtool.o \
 		en_tx.o en_rx.o en_dim.o en_txrx.o en/xdp.o en_stats.o \
-		en_selftest.o en/port.o en/monitor_stats.o en/reporter_tx.o
+		en_selftest.o en/port.o en/monitor_stats.o en/reporter_tx.o \
+		en/params.o
 
 #
 # Netdev extra
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
new file mode 100644
index 000000000000..658337c3bba1
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
@@ -0,0 +1,102 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/* Copyright (c) 2019 Mellanox Technologies. */
+
+#include "en/params.h"
+
+u32 mlx5e_rx_get_linear_frag_sz(struct mlx5e_params *params)
+{
+	u16 hw_mtu = MLX5E_SW2HW_MTU(params, params->sw_mtu);
+	u16 linear_rq_headroom = params->xdp_prog ?
+		XDP_PACKET_HEADROOM : MLX5_RX_HEADROOM;
+	u32 frag_sz;
+
+	linear_rq_headroom += NET_IP_ALIGN;
+
+	frag_sz = MLX5_SKB_FRAG_SZ(linear_rq_headroom + hw_mtu);
+
+	if (params->xdp_prog && frag_sz < PAGE_SIZE)
+		frag_sz = PAGE_SIZE;
+
+	return frag_sz;
+}
+
+u8 mlx5e_mpwqe_log_pkts_per_wqe(struct mlx5e_params *params)
+{
+	u32 linear_frag_sz = mlx5e_rx_get_linear_frag_sz(params);
+
+	return MLX5_MPWRQ_LOG_WQE_SZ - order_base_2(linear_frag_sz);
+}
+
+bool mlx5e_rx_is_linear_skb(struct mlx5_core_dev *mdev,
+			    struct mlx5e_params *params)
+{
+	u32 frag_sz = mlx5e_rx_get_linear_frag_sz(params);
+
+	return !params->lro_en && frag_sz <= PAGE_SIZE;
+}
+
+#define MLX5_MAX_MPWQE_LOG_WQE_STRIDE_SZ ((BIT(__mlx5_bit_sz(wq, log_wqe_stride_size)) - 1) + \
+					  MLX5_MPWQE_LOG_STRIDE_SZ_BASE)
+bool mlx5e_rx_mpwqe_is_linear_skb(struct mlx5_core_dev *mdev,
+				  struct mlx5e_params *params)
+{
+	u32 frag_sz = mlx5e_rx_get_linear_frag_sz(params);
+	s8 signed_log_num_strides_param;
+	u8 log_num_strides;
+
+	if (!mlx5e_rx_is_linear_skb(mdev, params))
+		return false;
+
+	if (order_base_2(frag_sz) > MLX5_MAX_MPWQE_LOG_WQE_STRIDE_SZ)
+		return false;
+
+	if (MLX5_CAP_GEN(mdev, ext_stride_num_range))
+		return true;
+
+	log_num_strides = MLX5_MPWRQ_LOG_WQE_SZ - order_base_2(frag_sz);
+	signed_log_num_strides_param =
+		(s8)log_num_strides - MLX5_MPWQE_LOG_NUM_STRIDES_BASE;
+
+	return signed_log_num_strides_param >= 0;
+}
+
+u8 mlx5e_mpwqe_get_log_rq_size(struct mlx5e_params *params)
+{
+	if (params->log_rq_mtu_frames <
+	    mlx5e_mpwqe_log_pkts_per_wqe(params) + MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE_MPW)
+		return MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE_MPW;
+
+	return params->log_rq_mtu_frames - mlx5e_mpwqe_log_pkts_per_wqe(params);
+}
+
+u8 mlx5e_mpwqe_get_log_stride_size(struct mlx5_core_dev *mdev,
+				   struct mlx5e_params *params)
+{
+	if (mlx5e_rx_mpwqe_is_linear_skb(mdev, params))
+		return order_base_2(mlx5e_rx_get_linear_frag_sz(params));
+
+	return MLX5_MPWRQ_DEF_LOG_STRIDE_SZ(mdev);
+}
+
+u8 mlx5e_mpwqe_get_log_num_strides(struct mlx5_core_dev *mdev,
+				   struct mlx5e_params *params)
+{
+	return MLX5_MPWRQ_LOG_WQE_SZ -
+		mlx5e_mpwqe_get_log_stride_size(mdev, params);
+}
+
+u16 mlx5e_get_rq_headroom(struct mlx5_core_dev *mdev,
+			  struct mlx5e_params *params)
+{
+	u16 linear_rq_headroom = params->xdp_prog ?
+		XDP_PACKET_HEADROOM : MLX5_RX_HEADROOM;
+	bool is_linear_skb;
+
+	linear_rq_headroom += NET_IP_ALIGN;
+
+	is_linear_skb = (params->rq_wq_type == MLX5_WQ_TYPE_CYCLIC) ?
+		mlx5e_rx_is_linear_skb(mdev, params) :
+		mlx5e_rx_mpwqe_is_linear_skb(mdev, params);
+
+	return is_linear_skb ? linear_rq_headroom : 0;
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.h b/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
new file mode 100644
index 000000000000..0ef1436c4c76
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2019 Mellanox Technologies. */
+
+#ifndef __MLX5_EN_PARAMS_H__
+#define __MLX5_EN_PARAMS_H__
+
+#include "en.h"
+
+u32 mlx5e_rx_get_linear_frag_sz(struct mlx5e_params *params);
+u8 mlx5e_mpwqe_log_pkts_per_wqe(struct mlx5e_params *params);
+bool mlx5e_rx_is_linear_skb(struct mlx5_core_dev *mdev,
+			    struct mlx5e_params *params);
+bool mlx5e_rx_mpwqe_is_linear_skb(struct mlx5_core_dev *mdev,
+				  struct mlx5e_params *params);
+u8 mlx5e_mpwqe_get_log_rq_size(struct mlx5e_params *params);
+u8 mlx5e_mpwqe_get_log_stride_size(struct mlx5_core_dev *mdev,
+				   struct mlx5e_params *params);
+u8 mlx5e_mpwqe_get_log_num_strides(struct mlx5_core_dev *mdev,
+				   struct mlx5e_params *params);
+u16 mlx5e_get_rq_headroom(struct mlx5_core_dev *mdev,
+			  struct mlx5e_params *params);
+
+#endif /* __MLX5_EN_PARAMS_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index a3397d5bfa76..8cab86a558ab 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -55,6 +55,7 @@
 #include "lib/eq.h"
 #include "en/monitor_stats.h"
 #include "en/reporter.h"
+#include "en/params.h"
 
 struct mlx5e_rq_param {
 	u32			rqc[MLX5_ST_SZ_DW(rqc)];
@@ -103,104 +104,6 @@ bool mlx5e_check_fragmented_striding_rq_cap(struct mlx5_core_dev *mdev)
 	return true;
 }
 
-static u32 mlx5e_rx_get_linear_frag_sz(struct mlx5e_params *params)
-{
-	u16 hw_mtu = MLX5E_SW2HW_MTU(params, params->sw_mtu);
-	u16 linear_rq_headroom = params->xdp_prog ?
-		XDP_PACKET_HEADROOM : MLX5_RX_HEADROOM;
-	u32 frag_sz;
-
-	linear_rq_headroom += NET_IP_ALIGN;
-
-	frag_sz = MLX5_SKB_FRAG_SZ(linear_rq_headroom + hw_mtu);
-
-	if (params->xdp_prog && frag_sz < PAGE_SIZE)
-		frag_sz = PAGE_SIZE;
-
-	return frag_sz;
-}
-
-static u8 mlx5e_mpwqe_log_pkts_per_wqe(struct mlx5e_params *params)
-{
-	u32 linear_frag_sz = mlx5e_rx_get_linear_frag_sz(params);
-
-	return MLX5_MPWRQ_LOG_WQE_SZ - order_base_2(linear_frag_sz);
-}
-
-static bool mlx5e_rx_is_linear_skb(struct mlx5_core_dev *mdev,
-				   struct mlx5e_params *params)
-{
-	u32 frag_sz = mlx5e_rx_get_linear_frag_sz(params);
-
-	return !params->lro_en && frag_sz <= PAGE_SIZE;
-}
-
-#define MLX5_MAX_MPWQE_LOG_WQE_STRIDE_SZ ((BIT(__mlx5_bit_sz(wq, log_wqe_stride_size)) - 1) + \
-					  MLX5_MPWQE_LOG_STRIDE_SZ_BASE)
-static bool mlx5e_rx_mpwqe_is_linear_skb(struct mlx5_core_dev *mdev,
-					 struct mlx5e_params *params)
-{
-	u32 frag_sz = mlx5e_rx_get_linear_frag_sz(params);
-	s8 signed_log_num_strides_param;
-	u8 log_num_strides;
-
-	if (!mlx5e_rx_is_linear_skb(mdev, params))
-		return false;
-
-	if (order_base_2(frag_sz) > MLX5_MAX_MPWQE_LOG_WQE_STRIDE_SZ)
-		return false;
-
-	if (MLX5_CAP_GEN(mdev, ext_stride_num_range))
-		return true;
-
-	log_num_strides = MLX5_MPWRQ_LOG_WQE_SZ - order_base_2(frag_sz);
-	signed_log_num_strides_param =
-		(s8)log_num_strides - MLX5_MPWQE_LOG_NUM_STRIDES_BASE;
-
-	return signed_log_num_strides_param >= 0;
-}
-
-static u8 mlx5e_mpwqe_get_log_rq_size(struct mlx5e_params *params)
-{
-	if (params->log_rq_mtu_frames <
-	    mlx5e_mpwqe_log_pkts_per_wqe(params) + MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE_MPW)
-		return MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE_MPW;
-
-	return params->log_rq_mtu_frames - mlx5e_mpwqe_log_pkts_per_wqe(params);
-}
-
-static u8 mlx5e_mpwqe_get_log_stride_size(struct mlx5_core_dev *mdev,
-					  struct mlx5e_params *params)
-{
-	if (mlx5e_rx_mpwqe_is_linear_skb(mdev, params))
-		return order_base_2(mlx5e_rx_get_linear_frag_sz(params));
-
-	return MLX5_MPWRQ_DEF_LOG_STRIDE_SZ(mdev);
-}
-
-static u8 mlx5e_mpwqe_get_log_num_strides(struct mlx5_core_dev *mdev,
-					  struct mlx5e_params *params)
-{
-	return MLX5_MPWRQ_LOG_WQE_SZ -
-		mlx5e_mpwqe_get_log_stride_size(mdev, params);
-}
-
-static u16 mlx5e_get_rq_headroom(struct mlx5_core_dev *mdev,
-				 struct mlx5e_params *params)
-{
-	u16 linear_rq_headroom = params->xdp_prog ?
-		XDP_PACKET_HEADROOM : MLX5_RX_HEADROOM;
-	bool is_linear_skb;
-
-	linear_rq_headroom += NET_IP_ALIGN;
-
-	is_linear_skb = (params->rq_wq_type == MLX5_WQ_TYPE_CYCLIC) ?
-		mlx5e_rx_is_linear_skb(mdev, params) :
-		mlx5e_rx_mpwqe_is_linear_skb(mdev, params);
-
-	return is_linear_skb ? linear_rq_headroom : 0;
-}
-
 void mlx5e_init_rq_type_params(struct mlx5_core_dev *mdev,
 			       struct mlx5e_params *params)
 {
-- 
2.20.1



* [net-next V2 09/13] net/mlx5e: Add an underflow warning comment
  2019-04-23 19:14 [pull request][net-next V2 00/13] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
                   ` (7 preceding siblings ...)
  2019-04-23 19:14 ` [net-next V2 08/13] net/mlx5e: Move parameter calculation functions to en/params.c Saeed Mahameed
@ 2019-04-23 19:14 ` Saeed Mahameed
  2019-04-23 19:14 ` [net-next V2 10/13] net/mlx5e: Remove unused parameter Saeed Mahameed
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2019-04-23 19:14 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon,
	Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@mellanox.com>

mlx5e_mpwqe_get_log_rq_size calculates the number of WQEs (N) based on
the requested number of frames in the RQ (F) and the number of packets
per WQE (P). It ensures that N is not less than the minimum number of
WQEs in an RQ (N_min). Arithmetically, it means that F / P >= N_min
should be true. This function deals with logarithms, so it should check
that log(F) - log(P) >= log(N_min). However, if F < P, this expression
will cause an unsigned underflow. Check log(F) >= log(P) + log(N_min)
instead.
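
To make the hazard concrete, here is a small standalone sketch comparing
the subtract-then-clamp form with the add-then-compare form. The function
names and the MIN_LOG_RQ_SIZE value are hypothetical stand-ins, not the
driver's definitions.

```c
#include <stdint.h>

#define MIN_LOG_RQ_SIZE 2 /* hypothetical stand-in for log(N_min) */

/* Naive form: subtract first, then clamp to the minimum.
 * When log_f < log_p the u8 result wraps around, so the clamp never fires.
 */
static uint8_t log_rq_size_naive(uint8_t log_f, uint8_t log_p)
{
	uint8_t n = (uint8_t)(log_f - log_p); /* wraps modulo 256 */

	return n < MIN_LOG_RQ_SIZE ? MIN_LOG_RQ_SIZE : n;
}

/* Safe form: compare log(F) against log(P) + log(N_min), so the
 * subtraction is only reached when it cannot go below the minimum.
 */
static uint8_t log_rq_size_safe(uint8_t log_f, uint8_t log_p)
{
	if (log_f < log_p + MIN_LOG_RQ_SIZE)
		return MIN_LOG_RQ_SIZE;

	return (uint8_t)(log_f - log_p);
}
```

With log(F) = 3 and log(P) = 6, the naive form yields 253 while the safe
form returns the minimum; for inputs where F / P >= N_min the two agree.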

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/params.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
index 658337c3bba1..fa6661ea6310 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
@@ -62,11 +62,14 @@ bool mlx5e_rx_mpwqe_is_linear_skb(struct mlx5_core_dev *mdev,
 
 u8 mlx5e_mpwqe_get_log_rq_size(struct mlx5e_params *params)
 {
+	u8 log_pkts_per_wqe = mlx5e_mpwqe_log_pkts_per_wqe(params);
+
+	/* Numbers are unsigned, don't subtract to avoid underflow. */
 	if (params->log_rq_mtu_frames <
-	    mlx5e_mpwqe_log_pkts_per_wqe(params) + MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE_MPW)
+	    log_pkts_per_wqe + MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE_MPW)
 		return MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE_MPW;
 
-	return params->log_rq_mtu_frames - mlx5e_mpwqe_log_pkts_per_wqe(params);
+	return params->log_rq_mtu_frames - log_pkts_per_wqe;
 }
 
 u8 mlx5e_mpwqe_get_log_stride_size(struct mlx5_core_dev *mdev,
-- 
2.20.1



* [net-next V2 10/13] net/mlx5e: Remove unused parameter
  2019-04-23 19:14 [pull request][net-next V2 00/13] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
                   ` (8 preceding siblings ...)
  2019-04-23 19:14 ` [net-next V2 09/13] net/mlx5e: Add an underflow warning comment Saeed Mahameed
@ 2019-04-23 19:14 ` Saeed Mahameed
  2019-04-23 19:14 ` [net-next V2 11/13] net/mlx5e: Take HW interrupt trigger into a function Saeed Mahameed
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2019-04-23 19:14 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon,
	Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@mellanox.com>

mdev is unused in mlx5e_rx_is_linear_skb.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/params.c |  7 +++----
 drivers/net/ethernet/mellanox/mlx5/core/en/params.h |  3 +--
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c   | 10 +++++-----
 3 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
index fa6661ea6310..d3744bffbae3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
@@ -27,8 +27,7 @@ u8 mlx5e_mpwqe_log_pkts_per_wqe(struct mlx5e_params *params)
 	return MLX5_MPWRQ_LOG_WQE_SZ - order_base_2(linear_frag_sz);
 }
 
-bool mlx5e_rx_is_linear_skb(struct mlx5_core_dev *mdev,
-			    struct mlx5e_params *params)
+bool mlx5e_rx_is_linear_skb(struct mlx5e_params *params)
 {
 	u32 frag_sz = mlx5e_rx_get_linear_frag_sz(params);
 
@@ -44,7 +43,7 @@ bool mlx5e_rx_mpwqe_is_linear_skb(struct mlx5_core_dev *mdev,
 	s8 signed_log_num_strides_param;
 	u8 log_num_strides;
 
-	if (!mlx5e_rx_is_linear_skb(mdev, params))
+	if (!mlx5e_rx_is_linear_skb(params))
 		return false;
 
 	if (order_base_2(frag_sz) > MLX5_MAX_MPWQE_LOG_WQE_STRIDE_SZ)
@@ -98,7 +97,7 @@ u16 mlx5e_get_rq_headroom(struct mlx5_core_dev *mdev,
 	linear_rq_headroom += NET_IP_ALIGN;
 
 	is_linear_skb = (params->rq_wq_type == MLX5_WQ_TYPE_CYCLIC) ?
-		mlx5e_rx_is_linear_skb(mdev, params) :
+		mlx5e_rx_is_linear_skb(params) :
 		mlx5e_rx_mpwqe_is_linear_skb(mdev, params);
 
 	return is_linear_skb ? linear_rq_headroom : 0;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.h b/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
index 0ef1436c4c76..b106a0236f36 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
@@ -8,8 +8,7 @@
 
 u32 mlx5e_rx_get_linear_frag_sz(struct mlx5e_params *params);
 u8 mlx5e_mpwqe_log_pkts_per_wqe(struct mlx5e_params *params);
-bool mlx5e_rx_is_linear_skb(struct mlx5_core_dev *mdev,
-			    struct mlx5e_params *params);
+bool mlx5e_rx_is_linear_skb(struct mlx5e_params *params);
 bool mlx5e_rx_mpwqe_is_linear_skb(struct mlx5_core_dev *mdev,
 				  struct mlx5e_params *params);
 u8 mlx5e_mpwqe_get_log_rq_size(struct mlx5e_params *params);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 8cab86a558ab..1c328dbb6fe0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -518,7 +518,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
 			goto err_free;
 		}
 
-		rq->wqe.skb_from_cqe = mlx5e_rx_is_linear_skb(mdev, params) ?
+		rq->wqe.skb_from_cqe = mlx5e_rx_is_linear_skb(params) ?
 			mlx5e_skb_from_cqe_linear :
 			mlx5e_skb_from_cqe_nonlinear;
 		rq->mkey_be = c->mkey_be;
@@ -1960,7 +1960,7 @@ static void mlx5e_build_rq_frags_info(struct mlx5_core_dev *mdev,
 		byte_count += MLX5E_METADATA_ETHER_LEN;
 #endif
 
-	if (mlx5e_rx_is_linear_skb(mdev, params)) {
+	if (mlx5e_rx_is_linear_skb(params)) {
 		int frag_stride;
 
 		frag_stride = mlx5e_rx_get_linear_frag_sz(params);
@@ -3722,7 +3722,7 @@ int mlx5e_change_mtu(struct net_device *netdev, int new_mtu,
 	new_channels.params.sw_mtu = new_mtu;
 
 	if (params->xdp_prog &&
-	    !mlx5e_rx_is_linear_skb(priv->mdev, &new_channels.params)) {
+	    !mlx5e_rx_is_linear_skb(&new_channels.params)) {
 		netdev_err(netdev, "MTU(%d) > %d is not allowed while XDP enabled\n",
 			   new_mtu, MLX5E_XDP_MAX_MTU);
 		err = -EINVAL;
@@ -4163,7 +4163,7 @@ static int mlx5e_xdp_allowed(struct mlx5e_priv *priv, struct bpf_prog *prog)
 	new_channels.params = priv->channels.params;
 	new_channels.params.xdp_prog = prog;
 
-	if (!mlx5e_rx_is_linear_skb(priv->mdev, &new_channels.params)) {
+	if (!mlx5e_rx_is_linear_skb(&new_channels.params)) {
 		netdev_warn(netdev, "XDP is not allowed with MTU(%d) > %d\n",
 			    new_channels.params.sw_mtu, MLX5E_XDP_MAX_MTU);
 		return -EINVAL;
@@ -4506,7 +4506,7 @@ void mlx5e_build_rq_params(struct mlx5_core_dev *mdev,
 	if (!slow_pci_heuristic(mdev) &&
 	    mlx5e_striding_rq_possible(mdev, params) &&
 	    (mlx5e_rx_mpwqe_is_linear_skb(mdev, params) ||
-	     !mlx5e_rx_is_linear_skb(mdev, params)))
+	     !mlx5e_rx_is_linear_skb(params)))
 		MLX5E_SET_PFLAG(params, MLX5E_PFLAG_RX_STRIDING_RQ, true);
 	mlx5e_set_rq_type(mdev, params);
 	mlx5e_init_rq_type_params(mdev, params);
-- 
2.20.1



* [net-next V2 11/13] net/mlx5e: Take HW interrupt trigger into a function
  2019-04-23 19:14 [pull request][net-next V2 00/13] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
                   ` (9 preceding siblings ...)
  2019-04-23 19:14 ` [net-next V2 10/13] net/mlx5e: Remove unused parameter Saeed Mahameed
@ 2019-04-23 19:14 ` Saeed Mahameed
  2019-04-23 19:14 ` [net-next V2 12/13] net/mlx5e: Remove unused rx_page_reuse stat Saeed Mahameed
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2019-04-23 19:14 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon,
	Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@mellanox.com>

mlx5e_trigger_irq posts a NOP to the ICO SQ just to trigger an IRQ and
enter the NAPI poll on the right CPU according to the affinity. Use it
in mlx5e_activate_rq.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |  1 +
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 10 +---------
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c | 11 +++++++++++
 3 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index c190061763fd..7e0c3d4de108 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -778,6 +778,7 @@ netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev);
 netdev_tx_t mlx5e_sq_xmit(struct mlx5e_txqsq *sq, struct sk_buff *skb,
 			  struct mlx5e_tx_wqe *wqe, u16 pi, bool xmit_more);
 
+void mlx5e_trigger_irq(struct mlx5e_icosq *sq);
 void mlx5e_completion_event(struct mlx5_core_cq *mcq);
 void mlx5e_cq_error_event(struct mlx5_core_cq *mcq, enum mlx5_event event);
 int mlx5e_napi_poll(struct napi_struct *napi, int budget);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 1c328dbb6fe0..8ae17dcad487 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -877,16 +877,8 @@ static int mlx5e_open_rq(struct mlx5e_channel *c,
 
 static void mlx5e_activate_rq(struct mlx5e_rq *rq)
 {
-	struct mlx5e_icosq *sq = &rq->channel->icosq;
-	struct mlx5_wq_cyc *wq = &sq->wq;
-	struct mlx5e_tx_wqe *nopwqe;
-
-	u16 pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc);
-
 	set_bit(MLX5E_RQ_STATE_ENABLED, &rq->state);
-	sq->db.ico_wqe[pi].opcode     = MLX5_OPCODE_NOP;
-	nopwqe = mlx5e_post_nop(wq, sq->sqn, &sq->pc);
-	mlx5e_notify_hw(wq, sq->pc, sq->uar_map, &nopwqe->ctrl);
+	mlx5e_trigger_irq(&rq->channel->icosq);
 }
 
 static void mlx5e_deactivate_rq(struct mlx5e_rq *rq)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
index b4af5e19f6ac..f9862bf75491 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
@@ -71,6 +71,17 @@ static void mlx5e_handle_rx_dim(struct mlx5e_rq *rq)
 	net_dim(&rq->dim, dim_sample);
 }
 
+void mlx5e_trigger_irq(struct mlx5e_icosq *sq)
+{
+	struct mlx5_wq_cyc *wq = &sq->wq;
+	struct mlx5e_tx_wqe *nopwqe;
+	u16 pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc);
+
+	sq->db.ico_wqe[pi].opcode = MLX5_OPCODE_NOP;
+	nopwqe = mlx5e_post_nop(wq, sq->sqn, &sq->pc);
+	mlx5e_notify_hw(wq, sq->pc, sq->uar_map, &nopwqe->ctrl);
+}
+
 int mlx5e_napi_poll(struct napi_struct *napi, int budget)
 {
 	struct mlx5e_channel *c = container_of(napi, struct mlx5e_channel,
-- 
2.20.1
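
Note on the patch above: the helper reads the producer index before posting the NOP, records the opcode for that slot, then rings the doorbell with the advanced counter. A minimal, self-contained sketch of that ordering follows. It is a simplified model, not the driver code: every sketch_* name, the ring size, and the "doorbell" field are hypothetical stand-ins, and the assumption that the counter-to-index mapping is a power-of-two mask is mine rather than something this patch shows.

```c
#include <stdint.h>

/* Hypothetical ring size; must be a power of two for the mask below. */
#define SKETCH_RING_LOG_SZ 3
#define SKETCH_RING_SZ     (1u << SKETCH_RING_LOG_SZ)

enum sketch_opcode { SKETCH_OP_NONE = 0, SKETCH_OP_NOP = 1 };

/* Simplified stand-in for an internal control send queue. */
struct sketch_sq {
	uint16_t pc;                               /* free-running producer counter */
	uint16_t doorbell;                         /* last counter value shown to "HW" */
	enum sketch_opcode opcode[SKETCH_RING_SZ]; /* per-slot bookkeeping */
};

/* Map a free-running counter to a slot index (assumed to be a mask). */
static uint16_t sketch_ctr2ix(uint16_t ctr)
{
	return ctr & (SKETCH_RING_SZ - 1);
}

/* Same three steps as the helper in the patch, on the model above. */
static void sketch_trigger_irq(struct sketch_sq *sq)
{
	uint16_t pi = sketch_ctr2ix(sq->pc); /* index of the slot about to be used */

	sq->opcode[pi] = SKETCH_OP_NOP;      /* remember what this slot holds */
	sq->pc++;                            /* "post" the NOP: advance the counter */
	sq->doorbell = sq->pc;               /* "notify HW" with the new counter */
}
```

In this model the index is taken before the counter advances, so the bookkeeping entry lands in the slot the NOP occupies, and the index wraps to 0 once the counter reaches the ring size.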



* [net-next V2 12/13] net/mlx5e: Remove unused rx_page_reuse stat
  2019-04-23 19:14 [pull request][net-next V2 00/13] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
                   ` (10 preceding siblings ...)
  2019-04-23 19:14 ` [net-next V2 11/13] net/mlx5e: Take HW interrupt trigger into a function Saeed Mahameed
@ 2019-04-23 19:14 ` Saeed Mahameed
  2019-04-23 19:15 ` [net-next V2 13/13] net/mlx5e: Use #define for the WQE wait timeout constant Saeed Mahameed
  2019-04-23 23:25 ` [pull request][net-next V2 00/13] Mellanox, mlx5 RX and XDP improvements Jakub Kicinski
  13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2019-04-23 19:14 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon,
	Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@mellanox.com>

Remove the no longer used page_reuse stat of RQs.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 3 ---
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 2 --
 2 files changed, 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index ca0ff3b3fbd1..483d321d2151 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -93,7 +93,6 @@ static const struct counter_desc sw_stats_desc[] = {
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_buff_alloc_err) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cqe_compress_blks) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cqe_compress_pkts) },
-	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_page_reuse) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_reuse) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_full) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_empty) },
@@ -176,7 +175,6 @@ static void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
 		s->rx_buff_alloc_err += rq_stats->buff_alloc_err;
 		s->rx_cqe_compress_blks += rq_stats->cqe_compress_blks;
 		s->rx_cqe_compress_pkts += rq_stats->cqe_compress_pkts;
-		s->rx_page_reuse  += rq_stats->page_reuse;
 		s->rx_cache_reuse += rq_stats->cache_reuse;
 		s->rx_cache_full  += rq_stats->cache_full;
 		s->rx_cache_empty += rq_stats->cache_empty;
@@ -1220,7 +1218,6 @@ static const struct counter_desc rq_stats_desc[] = {
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, buff_alloc_err) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cqe_compress_blks) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cqe_compress_pkts) },
-	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, page_reuse) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cache_reuse) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cache_full) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cache_empty) },
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index ac3c7c2a0964..cdddcc46971b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -105,7 +105,6 @@ struct mlx5e_sw_stats {
 	u64 rx_buff_alloc_err;
 	u64 rx_cqe_compress_blks;
 	u64 rx_cqe_compress_pkts;
-	u64 rx_page_reuse;
 	u64 rx_cache_reuse;
 	u64 rx_cache_full;
 	u64 rx_cache_empty;
@@ -205,7 +204,6 @@ struct mlx5e_rq_stats {
 	u64 buff_alloc_err;
 	u64 cqe_compress_blks;
 	u64 cqe_compress_pkts;
-	u64 page_reuse;
 	u64 cache_reuse;
 	u64 cache_full;
 	u64 cache_empty;
-- 
2.20.1



* [net-next V2 13/13] net/mlx5e: Use #define for the WQE wait timeout constant
  2019-04-23 19:14 [pull request][net-next V2 00/13] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
                   ` (11 preceding siblings ...)
  2019-04-23 19:14 ` [net-next V2 12/13] net/mlx5e: Remove unused rx_page_reuse stat Saeed Mahameed
@ 2019-04-23 19:15 ` Saeed Mahameed
  2019-04-23 23:25 ` [pull request][net-next V2 00/13] Mellanox, mlx5 RX and XDP improvements Jakub Kicinski
  13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2019-04-23 19:15 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon,
	Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@mellanox.com>

Create a #define for the timeout of mlx5e_wait_for_min_rx_wqes to
clarify the meaning of a magic number.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 8ae17dcad487..69a9d67396ec 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2256,14 +2256,18 @@ static void mlx5e_activate_channels(struct mlx5e_channels *chs)
 		mlx5e_activate_channel(chs->c[i]);
 }
 
+#define MLX5E_RQ_WQES_TIMEOUT 20000 /* msecs */
+
 static int mlx5e_wait_channels_min_rx_wqes(struct mlx5e_channels *chs)
 {
 	int err = 0;
 	int i;
 
-	for (i = 0; i < chs->num; i++)
-		err |= mlx5e_wait_for_min_rx_wqes(&chs->c[i]->rq,
-						  err ? 0 : 20000);
+	for (i = 0; i < chs->num; i++) {
+		int timeout = err ? 0 : MLX5E_RQ_WQES_TIMEOUT;
+
+		err |= mlx5e_wait_for_min_rx_wqes(&chs->c[i]->rq, timeout);
+	}
 
 	return err ? -ETIMEDOUT : 0;
 }
-- 
2.20.1
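
Note on the patch above: the loop waits with the full timeout only until the first failure; after that, the remaining queues are polled with a timeout of 0, so a stuck device costs at most one full timeout rather than one per channel. A rough, self-contained sketch of that pattern follows. The sketch_* names and the fake wait function are hypothetical stand-ins, and the stand-in simply returns nonzero on timeout, which is an assumption about the real helper's return convention.

```c
#include <errno.h>

#define SKETCH_RQ_WQES_TIMEOUT 20000 /* msecs, same value as in the patch */

/* Hypothetical queue model: becomes ready after a fixed delay. */
struct sketch_rq {
	int ready_after_ms; /* when the queue would have enough WQEs */
	int waited_ms;      /* how long the last wait actually took */
};

/* Stand-in for the per-queue wait: 0 on success, nonzero on timeout. */
static int sketch_wait_for_min_rx_wqes(struct sketch_rq *rq, int timeout_ms)
{
	if (rq->ready_after_ms <= timeout_ms) {
		rq->waited_ms = rq->ready_after_ms;
		return 0;
	}
	rq->waited_ms = timeout_ms;
	return 1;
}

/* Same control flow as the patched loop, on the model above. */
static int sketch_wait_channels(struct sketch_rq *rqs, int num)
{
	int err = 0;
	int i;

	for (i = 0; i < num; i++) {
		/* After the first failure, do not block on later queues. */
		int timeout = err ? 0 : SKETCH_RQ_WQES_TIMEOUT;

		err |= sketch_wait_for_min_rx_wqes(&rqs[i], timeout);
	}

	return err ? -ETIMEDOUT : 0;
}
```

With three queues where the second never becomes ready in time, the second wait consumes the full 20000 ms and the third returns immediately.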



* Re: [pull request][net-next V2 00/13] Mellanox, mlx5 RX and XDP improvements
  2019-04-23 19:14 [pull request][net-next V2 00/13] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
                   ` (12 preceding siblings ...)
  2019-04-23 19:15 ` [net-next V2 13/13] net/mlx5e: Use #define for the WQE wait timeout constant Saeed Mahameed
@ 2019-04-23 23:25 ` Jakub Kicinski
  2019-04-24  0:08   ` David Miller
  13 siblings, 1 reply; 16+ messages in thread
From: Jakub Kicinski @ 2019-04-23 23:25 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: David S. Miller, netdev, Jesper Dangaard Brouer, Jonathan Lemon

On Tue, 23 Apr 2019 12:14:47 -0700, Saeed Mahameed wrote:
> Hi Dave,
> 
> This series includes updates to mlx5e driver RX data path and some
> significant XDP RX/TX improvements to overcome/mitigate HW and PCIE
> bottlenecks.
> 
> For more information please see tag log below.
> 
> Please pull and let me know if there is any problem.
> 
> Please note that the series starts with a merge of mlx5-next branch,
> to resolve and avoid dependency with rdma tree, and I just merged
> v5.1-rc1 into mlx5-next since we forgot to reset the branch on last
> merge window, i hope this is ok with you, next time i will avoid such
> merges with linus tree.
> 
> v1->v2:
>  - Drop 1st patch "prefetch for small L1_CACHE_BYTES", we will have to
>    introduce a new netdev helper function to be used by any driver, we will
>    resubmit it as standalone patch later. 

Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>

> The following changes since commit 3839f99d21688d3062ebd3cc06db46edb3b99ac1:
> 
>   Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux (2019-04-23 11:57:33 -0700)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-updates-2019-04-22
> 
> for you to fetch changes up to f8ebecf2e32a62137dc5a98b2c94b1db37a0f9f8:
> 
>   net/mlx5e: Use #define for the WQE wait timeout constant (2019-04-23 12:09:22 -0700)

FWIW I tried to pull:

git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git

to compare with v1, and I don't see the 2019-04-22 tag.


* Re: [pull request][net-next V2 00/13] Mellanox, mlx5 RX and XDP improvements
  2019-04-23 23:25 ` [pull request][net-next V2 00/13] Mellanox, mlx5 RX and XDP improvements Jakub Kicinski
@ 2019-04-24  0:08   ` David Miller
  0 siblings, 0 replies; 16+ messages in thread
From: David Miller @ 2019-04-24  0:08 UTC (permalink / raw)
  To: jakub.kicinski; +Cc: saeedm, netdev, brouer, bsd

From: Jakub Kicinski <jakub.kicinski@netronome.com>
Date: Tue, 23 Apr 2019 16:25:38 -0700

> On Tue, 23 Apr 2019 12:14:47 -0700, Saeed Mahameed wrote:
>> Hi Dave,
>> 
>> This series includes updates to mlx5e driver RX data path and some
>> significant XDP RX/TX improvements to overcome/mitigate HW and PCIE
>> bottlenecks.
>> 
>> For more information please see tag log below.
>> 
>> Please pull and let me know if there is any problem.
>> 
>> Please note that the series starts with a merge of mlx5-next branch,
>> to resolve and avoid dependency with rdma tree, and I just merged
>> v5.1-rc1 into mlx5-next since we forgot to reset the branch on last
>> merge window, i hope this is ok with you, next time i will avoid such
>> merges with linus tree.
>> 
>> v1->v2:
>>  - Drop 1st patch "prefetch for small L1_CACHE_BYTES", we will have to
>>    introduce a new netdev helper function to be used by any driver, we will
>>    resubmit it as standalone patch later. 
> 
> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>

Pulled, thanks everyone.


Thread overview: 16+ messages
2019-04-23 19:14 [pull request][net-next V2 00/13] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
2019-04-23 19:14 ` [net-next V2 01/13] net/mlx5e: RX, Support multiple outstanding UMR posts Saeed Mahameed
2019-04-23 19:14 ` [net-next V2 02/13] net/mlx5e: XDP, Fix shifted flag index in RQ bitmap Saeed Mahameed
2019-04-23 19:14 ` [net-next V2 03/13] net/mlx5e: XDP, Enhance RQ indication for XDP redirect flush Saeed Mahameed
2019-04-23 19:14 ` [net-next V2 04/13] net/mlx5e: XDP, Add TX MPWQE session counter Saeed Mahameed
2019-04-23 19:14 ` [net-next V2 05/13] net/mlx5e: XDP, Inline small packets into the TX MPWQE in XDP xmit flow Saeed Mahameed
2019-04-23 19:14 ` [net-next V2 06/13] net/mlx5e: Remove unused parameter Saeed Mahameed
2019-04-23 19:14 ` [net-next V2 07/13] net/mlx5e: Report mlx5e_xdp_set errors Saeed Mahameed
2019-04-23 19:14 ` [net-next V2 08/13] net/mlx5e: Move parameter calculation functions to en/params.c Saeed Mahameed
2019-04-23 19:14 ` [net-next V2 09/13] net/mlx5e: Add an underflow warning comment Saeed Mahameed
2019-04-23 19:14 ` [net-next V2 10/13] net/mlx5e: Remove unused parameter Saeed Mahameed
2019-04-23 19:14 ` [net-next V2 11/13] net/mlx5e: Take HW interrupt trigger into a function Saeed Mahameed
2019-04-23 19:14 ` [net-next V2 12/13] net/mlx5e: Remove unused rx_page_reuse stat Saeed Mahameed
2019-04-23 19:15 ` [net-next V2 13/13] net/mlx5e: Use #define for the WQE wait timeout constant Saeed Mahameed
2019-04-23 23:25 ` [pull request][net-next V2 00/13] Mellanox, mlx5 RX and XDP improvements Jakub Kicinski
2019-04-24  0:08   ` David Miller
