* [pull request][net-next 00/14] Mellanox, mlx5 RX and XDP improvements
@ 2019-04-22 22:32 Saeed Mahameed
  2019-04-22 22:32 ` [net-next 01/14] net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES Saeed Mahameed
                   ` (13 more replies)
  0 siblings, 14 replies; 25+ messages in thread
From: Saeed Mahameed @ 2019-04-22 22:32 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon, Saeed Mahameed

Hi Dave,

This series includes updates to mlx5e driver RX data path and some
significant XDP RX/TX improvements to overcome/mitigate HW and PCIE
bottlenecks.

For more information please see tag log below.

Please pull and let me know if there is any problem.

Please note that the series starts with a merge of the mlx5-next branch,
to resolve and avoid a dependency with the rdma tree. I also merged
v5.1-rc1 into mlx5-next, since we forgot to reset the branch during the
last merge window. I hope this is OK with you; next time I will avoid
such merges with Linus' tree.

Thanks,
Saeed.

---
The following changes since commit 47eae9c6922daf35559d6ecc4408f573df251b20:

  Merge branch 'mlx5-next-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux (2019-04-22 15:27:29 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-updates-2019-04-22

for you to fetch changes up to 9800e88c14d58c45f3d8b8faaa1cbd624267fe70:

  net/mlx5e: Use #define for the WQE wait timeout constant (2019-04-22 15:29:47 -0700)

----------------------------------------------------------------
mlx5-updates-2019-04-22

This series includes updates to mlx5e driver RX data path and some
significant XDP RX/TX improvements to overcome/mitigate HW and PCIE
bottlenecks.

From Tariq:
1) Add an additional prefetch for small L1_CACHE_BYTES
2) Some enhancements in rq->flags
3) Stabilize RX packet rate (on Striding RQ) with multiple outstanding
UMR posts. This allows faster gap closure between consuming MPWQEs and
reposting them back into the WQ.

Performance test:
As expected, a huge improvement at large scale (48 cores).

xdp_redirect_map, 64B UDP multi-stream.
Redirect from ConnectX-5 100Gbps to ConnectX-6 100Gbps.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.

Before: Unstable, 7 to 30 Mpps
After:  Stable,   at 70.5 Mpps

From Shay:
4) XDP, Inline small packets into the TX MPWQE in XDP xmit flow

Under high packet rate with multi-CPU TX workloads, much of the HCA's
resources are spent prefetching TX descriptors, limiting transmission
rates.
This patch mitigates the problem by moving some of the work to the
CPU, reducing the HW data prefetch overhead for small packets (<= 256B).

When forwarding packets with XDP, a packet smaller than a certain size
(set to ~256 bytes) is sent inline within its WQE TX descriptor
(mem-copied) when the hardware TX queue is congested beyond a
pre-defined watermark.
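The decision described above can be sketched in userspace C. This is a minimal model of the policy, not the driver's code: the `MLX5E_SKETCH_*` names, the congestion watermark value, and the function signature are all illustrative assumptions; only the ~256-byte threshold comes from the text.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative constants modeling the patch's policy (assumed values). */
#define MLX5E_SKETCH_INLINE_MAX_LEN  256u /* "~256 bytes" from the text      */
#define MLX5E_SKETCH_CONG_WATERMARK  8u   /* pre-defined watermark (assumed) */

/* Inline (mem-copy) the packet into its TX WQE only when it is small AND
 * the HW TX queue is congested; otherwise let the HCA DMA-fetch it. */
static bool mlx5e_sketch_should_inline(size_t pkt_len,
				       unsigned int outstanding_wqes)
{
	return pkt_len <= MLX5E_SKETCH_INLINE_MAX_LEN &&
	       outstanding_wqes > MLX5E_SKETCH_CONG_WATERMARK;
}
```

The key design point is the congestion gate: inlining trades CPU cycles (the memcpy) for HCA prefetch bandwidth, so it only pays off when the HW queue is already backed up.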

Performance:
    Tested packet rate for UDP 64Byte multi-stream
    over two dual port ConnectX-5 100Gbps NICs.
    CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz

    * Tested with hyper-threading disabled

    XDP_TX:

    |          | before | after   | change |
    | 24 rings | 51Mpps | 116Mpps | +126%  |
    | 1 ring   | 12Mpps | 12Mpps  | same   |

    XDP_REDIRECT:

    ** Below is the transmit rate, not the redirection rate
    which might be larger, and is not affected by this patch.

    |          | before  | after   | change |
    | 32 rings | 64Mpps  | 92Mpps  | +43%   |
    | 1 ring   | 6.4Mpps | 6.4Mpps | same   |

As we can see, the feature significantly improves scaling without
hurting single-ring performance.

From Maxim:
5) Some trivial refactoring and code improvements prior to a larger series
to support AF_XDP.

-Saeed.

----------------------------------------------------------------
Maxim Mikityanskiy (8):
      net/mlx5e: Remove unused parameter
      net/mlx5e: Report mlx5e_xdp_set errors
      net/mlx5e: Move parameter calculation functions to en/params.c
      net/mlx5e: Add an underflow warning comment
      net/mlx5e: Remove unused parameter
      net/mlx5e: Take HW interrupt trigger into a function
      net/mlx5e: Remove unused rx_page_reuse stat
      net/mlx5e: Use #define for the WQE wait timeout constant

Shay Agroskin (2):
      net/mlx5e: XDP, Add TX MPWQE session counter
      net/mlx5e: XDP, Inline small packets into the TX MPWQE in XDP xmit flow

Tariq Toukan (4):
      net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES
      net/mlx5e: RX, Support multiple outstanding UMR posts
      net/mlx5e: XDP, Fix shifted flag index in RQ bitmap
      net/mlx5e: XDP, Enhance RQ indication for XDP redirect flush

 drivers/net/ethernet/mellanox/mlx5/core/Makefile   |   3 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |  37 ++++-
 .../net/ethernet/mellanox/mlx5/core/en/params.c    | 104 +++++++++++++
 .../net/ethernet/mellanox/mlx5/core/en/params.h    |  22 +++
 drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c   |  34 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h   |  57 ++++++-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 172 ++++++---------------
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    | 143 +++++++++++------
 .../net/ethernet/mellanox/mlx5/core/en_selftest.c  |   3 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c |  15 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.h |   8 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c  |  11 ++
 drivers/net/ethernet/mellanox/mlx5/core/wq.h       |  12 ++
 include/linux/mlx5/qp.h                            |   1 +
 14 files changed, 416 insertions(+), 206 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/params.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/params.h

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [net-next 01/14] net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES
  2019-04-22 22:32 [pull request][net-next 00/14] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
@ 2019-04-22 22:32 ` Saeed Mahameed
  2019-04-23  2:46   ` Jakub Kicinski
  2019-04-22 22:32 ` [net-next 02/14] net/mlx5e: RX, Support multiple outstanding UMR posts Saeed Mahameed
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 25+ messages in thread
From: Saeed Mahameed @ 2019-04-22 22:32 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon, Tariq Toukan,
	Saeed Mahameed

From: Tariq Toukan <tariqt@mellanox.com>

A single cacheline might not contain the packet header for
small L1_CACHE_BYTES values.
Issue an additional prefetch in this case.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h    | 17 +++++++++++++++++
 .../net/ethernet/mellanox/mlx5/core/en/xdp.c    |  4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 13 ++++++-------
 .../ethernet/mellanox/mlx5/core/en_selftest.c   |  3 +--
 4 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 51e109fdeec1..6147be23a9b9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -50,6 +50,7 @@
 #include <net/xdp.h>
 #include <linux/net_dim.h>
 #include <linux/bits.h>
+#include <linux/prefetch.h>
 #include "wq.h"
 #include "mlx5_core.h"
 #include "en_stats.h"
@@ -986,6 +987,22 @@ static inline void mlx5e_cq_arm(struct mlx5e_cq *cq)
 	mlx5_cq_arm(mcq, MLX5_CQ_DB_REQ_NOT, mcq->uar->map, cq->wq.cc);
 }
 
+static inline void mlx5e_prefetch(void *p)
+{
+	prefetch(p);
+#if L1_CACHE_BYTES < 128
+	prefetch(p + L1_CACHE_BYTES);
+#endif
+}
+
+static inline void mlx5e_prefetchw(void *p)
+{
+	prefetchw(p);
+#if L1_CACHE_BYTES < 128
+	prefetchw(p + L1_CACHE_BYTES);
+#endif
+}
+
 extern const struct ethtool_ops mlx5e_ethtool_ops;
 #ifdef CONFIG_MLX5_CORE_EN_DCB
 extern const struct dcbnl_rtnl_ops mlx5e_dcbnl_ops;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index 03b2a9f9c589..6778bdeff1a2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -111,7 +111,7 @@ static void mlx5e_xdp_mpwqe_session_start(struct mlx5e_xdpsq *sq)
 
 	mlx5e_xdpsq_fetch_wqe(sq, &session->wqe);
 
-	prefetchw(session->wqe->data);
+	mlx5e_prefetchw(session->wqe->data);
 	session->ds_count = MLX5E_XDP_TX_EMPTY_DS_COUNT;
 
 	pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc);
@@ -209,7 +209,7 @@ static bool mlx5e_xmit_xdp_frame(struct mlx5e_xdpsq *sq, struct mlx5e_xdp_info *
 
 	struct mlx5e_xdpsq_stats *stats = sq->stats;
 
-	prefetchw(wqe);
+	mlx5e_prefetchw(wqe);
 
 	if (unlikely(dma_len < MLX5E_XDP_MIN_INLINE || sq->hw_mtu < dma_len)) {
 		stats->err++;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index c3b3002ff62f..0148ca6ed4ae 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -30,7 +30,6 @@
  * SOFTWARE.
  */
 
-#include <linux/prefetch.h>
 #include <linux/ip.h>
 #include <linux/ipv6.h>
 #include <linux/tcp.h>
@@ -963,8 +962,8 @@ mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe,
 
 	dma_sync_single_range_for_cpu(rq->pdev, di->addr, wi->offset,
 				      frag_size, DMA_FROM_DEVICE);
-	prefetchw(va); /* xdp_frame data area */
-	prefetch(data);
+	mlx5e_prefetchw(va); /* xdp_frame data area */
+	mlx5e_prefetch(data);
 
 	if (unlikely(get_cqe_opcode(cqe) != MLX5_CQE_RESP_SEND)) {
 		rq->stats->wqe_err++;
@@ -1013,7 +1012,7 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe,
 		return NULL;
 	}
 
-	prefetchw(skb->data);
+	mlx5e_prefetchw(skb->data);
 
 	while (byte_cnt) {
 		u16 frag_consumed_bytes =
@@ -1130,7 +1129,7 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w
 		return NULL;
 	}
 
-	prefetchw(skb->data);
+	mlx5e_prefetchw(skb->data);
 
 	if (unlikely(frag_offset >= PAGE_SIZE)) {
 		di++;
@@ -1182,8 +1181,8 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi,
 
 	dma_sync_single_range_for_cpu(rq->pdev, di->addr, head_offset,
 				      frag_size, DMA_FROM_DEVICE);
-	prefetchw(va); /* xdp_frame data area */
-	prefetch(data);
+	mlx5e_prefetchw(va); /* xdp_frame data area */
+	mlx5e_prefetch(data);
 
 	rcu_read_lock();
 	consumed = mlx5e_xdp_handle(rq, di, va, &rx_headroom, &cqe_bcnt32);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
index 4382ef85488c..1b57c10f773f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
@@ -30,7 +30,6 @@
  * SOFTWARE.
  */
 
-#include <linux/prefetch.h>
 #include <linux/ip.h>
 #include <linux/udp.h>
 #include <net/udp.h>
@@ -124,7 +123,7 @@ static struct sk_buff *mlx5e_test_get_udp_skb(struct mlx5e_priv *priv)
 		return NULL;
 	}
 
-	prefetchw(skb->data);
+	mlx5e_prefetchw(skb->data);
 	skb_reserve(skb, NET_IP_ALIGN);
 
 	/*  Reserve for ethernet and IP header  */
-- 
2.20.1



* [net-next 02/14] net/mlx5e: RX, Support multiple outstanding UMR posts
  2019-04-22 22:32 [pull request][net-next 00/14] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
  2019-04-22 22:32 ` [net-next 01/14] net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES Saeed Mahameed
@ 2019-04-22 22:32 ` Saeed Mahameed
  2019-04-22 22:32 ` [net-next 03/14] net/mlx5e: XDP, Fix shifted flag index in RQ bitmap Saeed Mahameed
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Saeed Mahameed @ 2019-04-22 22:32 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon, Tariq Toukan,
	Saeed Mahameed

From: Tariq Toukan <tariqt@mellanox.com>

The buffer mapping of the Multi-Packet WQEs (of Striding RQ)
is done via UMR posts, one UMR WQE per RX MPWQE.

A single MPWQE is capable of serving many incoming packets,
usually larger than the budget of a single napi cycle.
Hence, posting a single UMR WQE per napi cycle (and handling its
completion in the next cycle) works fine in many common cases,
but not always.

When an XDP program is loaded, every MPWQE serves fewer packets,
to satisfy the packet-per-page requirement.
Thus, for the same number of packets, more MPWQEs (and UMR posts)
are needed (twice as many for the default MTU), leaving less latency
room for the UMR completions.

In this patch, we add support for multiple outstanding UMR posts,
to allow faster gap closure between consuming MPWQEs and reposting
them back into the WQ.

For better SW and HW locality, we combine the UMR posts in bulks of
(at least) two.

This is expected to improve packet rate at high CPU scale.
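The bulk-posting pattern this commit message describes can be modeled in a few lines of userspace C. This is a sketch of the general doorbell-batching idea under assumed names (`sketch_*`), not the driver's actual structures: several WQEs are posted back to back, only the last control segment is remembered, and the doorbell is rung once per bulk.

```c
#include <stddef.h>

/* Toy model of an ICOSQ that defers its doorbell (names are illustrative). */
struct sketch_icosq {
	unsigned int pc;            /* producer counter                 */
	const void  *doorbell_cseg; /* ctrl segment of last posted WQE  */
	unsigned int doorbells;     /* doorbell rings, observable in test */
};

/* Post one WQE: advance the producer counter and defer the doorbell. */
static void sketch_post_umr(struct sketch_icosq *sq, const void *cseg)
{
	sq->pc++;
	sq->doorbell_cseg = cseg;
}

/* Ring the doorbell once for the whole bulk, if anything was posted. */
static void sketch_flush_doorbell(struct sketch_icosq *sq)
{
	if (sq->doorbell_cseg) {
		sq->doorbells++;
		sq->doorbell_cseg = NULL;
	}
}
```

Batching the doorbell this way cuts per-WQE MMIO writes, which is where the SW/HW locality win of posting "in bulks of (at least) two" comes from.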

Performance test:
As expected, a huge improvement at large scale (48 cores).

xdp_redirect_map, 64B UDP multi-stream.
Redirect from ConnectX-5 100Gbps to ConnectX-6 100Gbps.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.

Before: Unstable, 7 to 30 Mpps
After:  Stable,   at 70.5 Mpps

No degradation in other tested scenarios.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  10 +-
 .../net/ethernet/mellanox/mlx5/core/en_main.c |  36 ++++-
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   | 130 ++++++++++++------
 drivers/net/ethernet/mellanox/mlx5/core/wq.h  |  12 ++
 4 files changed, 136 insertions(+), 52 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 6147be23a9b9..6b9f661a0577 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -462,10 +462,10 @@ struct mlx5e_xdpsq {
 
 struct mlx5e_icosq {
 	/* data path */
+	u16                        cc;
+	u16                        pc;
 
-	/* dirtied @xmit */
-	u16                        pc ____cacheline_aligned_in_smp;
-
+	struct mlx5_wqe_ctrl_seg  *doorbell_cseg;
 	struct mlx5e_cq            cq;
 
 	/* write@xmit, read@completion */
@@ -563,8 +563,10 @@ struct mlx5e_rq {
 			struct mlx5e_mpw_info *info;
 			mlx5e_fp_skb_from_cqe_mpwrq skb_from_cqe_mpwrq;
 			u16                    num_strides;
+			u16                    actual_wq_head;
 			u8                     log_stride_sz;
-			bool                   umr_in_progress;
+			u8                     umr_in_progress;
+			u8                     umr_last_bulk;
 		} mpwqe;
 	};
 	struct {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 5c127fccad60..7ab195ac7299 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -903,10 +903,14 @@ static void mlx5e_free_rx_descs(struct mlx5e_rq *rq)
 
 	if (rq->wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ) {
 		struct mlx5_wq_ll *wq = &rq->mpwqe.wq;
+		u16 head = wq->head;
+		int i;
 
-		/* UMR WQE (if in progress) is always at wq->head */
-		if (rq->mpwqe.umr_in_progress)
-			rq->dealloc_wqe(rq, wq->head);
+		/* Outstanding UMR WQEs (in progress) start at wq->head */
+		for (i = 0; i < rq->mpwqe.umr_in_progress; i++) {
+			rq->dealloc_wqe(rq, head);
+			head = mlx5_wq_ll_get_wqe_next_ix(wq, head);
+		}
 
 		while (!mlx5_wq_ll_is_empty(wq)) {
 			struct mlx5e_rx_wqe_ll *wqe;
@@ -1092,7 +1096,7 @@ static void mlx5e_free_icosq_db(struct mlx5e_icosq *sq)
 
 static int mlx5e_alloc_icosq_db(struct mlx5e_icosq *sq, int numa)
 {
-	u8 wq_sz = mlx5_wq_cyc_get_size(&sq->wq);
+	int wq_sz = mlx5_wq_cyc_get_size(&sq->wq);
 
 	sq->db.ico_wqe = kvzalloc_node(array_size(wq_sz,
 						  sizeof(*sq->db.ico_wqe)),
@@ -2108,6 +2112,13 @@ static inline u8 mlx5e_get_rqwq_log_stride(u8 wq_type, int ndsegs)
 	return order_base_2(sz);
 }
 
+static u8 mlx5e_get_rq_log_wq_sz(void *rqc)
+{
+	void *wq = MLX5_ADDR_OF(rqc, rqc, wq);
+
+	return MLX5_GET(wq, wq, log_wq_sz);
+}
+
 static void mlx5e_build_rq_param(struct mlx5e_priv *priv,
 				 struct mlx5e_params *params,
 				 struct mlx5e_rq_param *param)
@@ -2274,13 +2285,28 @@ static void mlx5e_build_xdpsq_param(struct mlx5e_priv *priv,
 	param->is_mpw = MLX5E_GET_PFLAG(params, MLX5E_PFLAG_XDP_TX_MPWQE);
 }
 
+static u8 mlx5e_build_icosq_log_wq_sz(struct mlx5e_params *params,
+				      struct mlx5e_rq_param *rqp)
+{
+	switch (params->rq_wq_type) {
+	case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
+		return order_base_2(MLX5E_UMR_WQEBBS) +
+			mlx5e_get_rq_log_wq_sz(rqp->rqc);
+	default: /* MLX5_WQ_TYPE_CYCLIC */
+		return MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE;
+	}
+}
+
 static void mlx5e_build_channel_param(struct mlx5e_priv *priv,
 				      struct mlx5e_params *params,
 				      struct mlx5e_channel_param *cparam)
 {
-	u8 icosq_log_wq_sz = MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE;
+	u8 icosq_log_wq_sz;
 
 	mlx5e_build_rq_param(priv, params, &cparam->rq);
+
+	icosq_log_wq_sz = mlx5e_build_icosq_log_wq_sz(params, &cparam->rq);
+
 	mlx5e_build_sq_param(priv, params, &cparam->sq);
 	mlx5e_build_xdpsq_param(priv, params, &cparam->xdp_sq);
 	mlx5e_build_icosq_param(priv, icosq_log_wq_sz, &cparam->icosq);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 0148ca6ed4ae..0f3da3ae858f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -408,14 +408,15 @@ mlx5e_free_rx_mpwqe(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, bool recycle
 			mlx5e_page_release(rq, &dma_info[i], recycle);
 }
 
-static void mlx5e_post_rx_mpwqe(struct mlx5e_rq *rq)
+static void mlx5e_post_rx_mpwqe(struct mlx5e_rq *rq, u8 n)
 {
 	struct mlx5_wq_ll *wq = &rq->mpwqe.wq;
-	struct mlx5e_rx_wqe_ll *wqe = mlx5_wq_ll_get_wqe(wq, wq->head);
 
-	rq->mpwqe.umr_in_progress = false;
+	do {
+		u16 next_wqe_index = mlx5_wq_ll_get_wqe_next_ix(wq, wq->head);
 
-	mlx5_wq_ll_push(wq, be16_to_cpu(wqe->next.next_wqe_index));
+		mlx5_wq_ll_push(wq, next_wqe_index);
+	} while (--n);
 
 	/* ensure wqes are visible to device before updating doorbell record */
 	dma_wmb();
@@ -425,7 +426,7 @@ static void mlx5e_post_rx_mpwqe(struct mlx5e_rq *rq)
 
 static inline u16 mlx5e_icosq_wrap_cnt(struct mlx5e_icosq *sq)
 {
-	return sq->pc >> MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE;
+	return mlx5_wq_cyc_get_ctr_wrap_cnt(&sq->wq, sq->pc);
 }
 
 static inline void mlx5e_fill_icosq_frag_edge(struct mlx5e_icosq *sq,
@@ -477,8 +478,6 @@ static int mlx5e_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix)
 	bitmap_zero(wi->xdp_xmit_bitmap, MLX5_MPWRQ_PAGES_PER_WQE);
 	wi->consumed_strides = 0;
 
-	rq->mpwqe.umr_in_progress = true;
-
 	umr_wqe->ctrl.opmod_idx_opcode =
 		cpu_to_be32((sq->pc << MLX5_WQE_CTRL_WQE_INDEX_SHIFT) |
 			    MLX5_OPCODE_UMR);
@@ -486,7 +485,8 @@ static int mlx5e_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix)
 
 	sq->db.ico_wqe[pi].opcode = MLX5_OPCODE_UMR;
 	sq->pc += MLX5E_UMR_WQEBBS;
-	mlx5e_notify_hw(wq, sq->pc, sq->uar_map, &umr_wqe->ctrl);
+
+	sq->doorbell_cseg = &umr_wqe->ctrl;
 
 	return 0;
 
@@ -541,37 +541,13 @@ bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq)
 	return !!err;
 }
 
-static inline void mlx5e_poll_ico_single_cqe(struct mlx5e_cq *cq,
-					     struct mlx5e_icosq *sq,
-					     struct mlx5e_rq *rq,
-					     struct mlx5_cqe64 *cqe)
-{
-	struct mlx5_wq_cyc *wq = &sq->wq;
-	u16 ci = mlx5_wq_cyc_ctr2ix(wq, be16_to_cpu(cqe->wqe_counter));
-	struct mlx5e_sq_wqe_info *icowi = &sq->db.ico_wqe[ci];
-
-	mlx5_cqwq_pop(&cq->wq);
-
-	if (unlikely(get_cqe_opcode(cqe) != MLX5_CQE_REQ)) {
-		netdev_WARN_ONCE(cq->channel->netdev,
-				 "Bad OP in ICOSQ CQE: 0x%x\n", get_cqe_opcode(cqe));
-		return;
-	}
-
-	if (likely(icowi->opcode == MLX5_OPCODE_UMR)) {
-		mlx5e_post_rx_mpwqe(rq);
-		return;
-	}
-
-	if (unlikely(icowi->opcode != MLX5_OPCODE_NOP))
-		netdev_WARN_ONCE(cq->channel->netdev,
-				 "Bad OPCODE in ICOSQ WQE info: 0x%x\n", icowi->opcode);
-}
-
 static void mlx5e_poll_ico_cq(struct mlx5e_cq *cq, struct mlx5e_rq *rq)
 {
 	struct mlx5e_icosq *sq = container_of(cq, struct mlx5e_icosq, cq);
 	struct mlx5_cqe64 *cqe;
+	u8  completed_umr = 0;
+	u16 sqcc;
+	int i;
 
 	if (unlikely(!test_bit(MLX5E_SQ_STATE_ENABLED, &sq->state)))
 		return;
@@ -580,28 +556,96 @@ static void mlx5e_poll_ico_cq(struct mlx5e_cq *cq, struct mlx5e_rq *rq)
 	if (likely(!cqe))
 		return;
 
-	/* by design, there's only a single cqe */
-	mlx5e_poll_ico_single_cqe(cq, sq, rq, cqe);
+	/* sq->cc must be updated only after mlx5_cqwq_update_db_record(),
+	 * otherwise a cq overrun may occur
+	 */
+	sqcc = sq->cc;
+
+	i = 0;
+	do {
+		u16 wqe_counter;
+		bool last_wqe;
+
+		mlx5_cqwq_pop(&cq->wq);
+
+		wqe_counter = be16_to_cpu(cqe->wqe_counter);
+
+		if (unlikely(get_cqe_opcode(cqe) != MLX5_CQE_REQ)) {
+			netdev_WARN_ONCE(cq->channel->netdev,
+					 "Bad OP in ICOSQ CQE: 0x%x\n", get_cqe_opcode(cqe));
+			break;
+		}
+		do {
+			struct mlx5e_sq_wqe_info *wi;
+			u16 ci;
+
+			last_wqe = (sqcc == wqe_counter);
+
+			ci = mlx5_wq_cyc_ctr2ix(&sq->wq, sqcc);
+			wi = &sq->db.ico_wqe[ci];
+
+			if (likely(wi->opcode == MLX5_OPCODE_UMR)) {
+				sqcc += MLX5E_UMR_WQEBBS;
+				completed_umr++;
+			} else if (likely(wi->opcode == MLX5_OPCODE_NOP)) {
+				sqcc++;
+			} else {
+				netdev_WARN_ONCE(cq->channel->netdev,
+						 "Bad OPCODE in ICOSQ WQE info: 0x%x\n",
+						 wi->opcode);
+			}
+
+		} while (!last_wqe);
+
+	} while ((++i < MLX5E_TX_CQ_POLL_BUDGET) && (cqe = mlx5_cqwq_get_cqe(&cq->wq)));
+
+	sq->cc = sqcc;
 
 	mlx5_cqwq_update_db_record(&cq->wq);
+
+	if (likely(completed_umr)) {
+		mlx5e_post_rx_mpwqe(rq, completed_umr);
+		rq->mpwqe.umr_in_progress -= completed_umr;
+	}
 }
 
 bool mlx5e_post_rx_mpwqes(struct mlx5e_rq *rq)
 {
+	struct mlx5e_icosq *sq = &rq->channel->icosq;
 	struct mlx5_wq_ll *wq = &rq->mpwqe.wq;
+	u8  missing, i;
+	u16 head;
 
 	if (unlikely(!test_bit(MLX5E_RQ_STATE_ENABLED, &rq->state)))
 		return false;
 
-	mlx5e_poll_ico_cq(&rq->channel->icosq.cq, rq);
+	mlx5e_poll_ico_cq(&sq->cq, rq);
+
+	missing = mlx5_wq_ll_missing(wq) - rq->mpwqe.umr_in_progress;
 
-	if (mlx5_wq_ll_is_full(wq))
+	if (unlikely(rq->mpwqe.umr_in_progress > rq->mpwqe.umr_last_bulk))
+		rq->stats->congst_umr++;
+
+#define UMR_WQE_BULK (2)
+	if (likely(missing < UMR_WQE_BULK))
 		return false;
 
-	if (!rq->mpwqe.umr_in_progress)
-		mlx5e_alloc_rx_mpwqe(rq, wq->head);
-	else
-		rq->stats->congst_umr += mlx5_wq_ll_missing(wq) > 2;
+	head = rq->mpwqe.actual_wq_head;
+	i = missing;
+	do {
+		if (unlikely(mlx5e_alloc_rx_mpwqe(rq, head)))
+			break;
+		head = mlx5_wq_ll_get_wqe_next_ix(wq, head);
+	} while (--i);
+
+	rq->mpwqe.umr_last_bulk    = missing - i;
+	if (sq->doorbell_cseg) {
+		mlx5e_notify_hw(&sq->wq, sq->pc, sq->uar_map, sq->doorbell_cseg);
+		sq->doorbell_cseg = NULL;
+	}
+
+	rq->mpwqe.umr_in_progress += rq->mpwqe.umr_last_bulk;
+	rq->mpwqe.actual_wq_head   = head;
 
 	return false;
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/wq.h b/drivers/net/ethernet/mellanox/mlx5/core/wq.h
index ea934a48c90a..1f87cce421e0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/wq.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/wq.h
@@ -134,6 +134,11 @@ static inline void mlx5_wq_cyc_update_db_record(struct mlx5_wq_cyc *wq)
 	*wq->db = cpu_to_be32(wq->wqe_ctr);
 }
 
+static inline u16 mlx5_wq_cyc_get_ctr_wrap_cnt(struct mlx5_wq_cyc *wq, u16 ctr)
+{
+	return ctr >> wq->fbc.log_sz;
+}
+
 static inline u16 mlx5_wq_cyc_ctr2ix(struct mlx5_wq_cyc *wq, u16 ctr)
 {
 	return ctr & wq->fbc.sz_m1;
@@ -243,6 +248,13 @@ static inline void *mlx5_wq_ll_get_wqe(struct mlx5_wq_ll *wq, u16 ix)
 	return mlx5_frag_buf_get_wqe(&wq->fbc, ix);
 }
 
+static inline u16 mlx5_wq_ll_get_wqe_next_ix(struct mlx5_wq_ll *wq, u16 ix)
+{
+	struct mlx5_wqe_srq_next_seg *wqe = mlx5_wq_ll_get_wqe(wq, ix);
+
+	return be16_to_cpu(wqe->next_wqe_index);
+}
+
 static inline void mlx5_wq_ll_push(struct mlx5_wq_ll *wq, u16 head_next)
 {
 	wq->head = head_next;
-- 
2.20.1



* [net-next 03/14] net/mlx5e: XDP, Fix shifted flag index in RQ bitmap
  2019-04-22 22:32 [pull request][net-next 00/14] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
  2019-04-22 22:32 ` [net-next 01/14] net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES Saeed Mahameed
  2019-04-22 22:32 ` [net-next 02/14] net/mlx5e: RX, Support multiple outstanding UMR posts Saeed Mahameed
@ 2019-04-22 22:32 ` Saeed Mahameed
  2019-04-22 22:32 ` [net-next 04/14] net/mlx5e: XDP, Enhance RQ indication for XDP redirect flush Saeed Mahameed
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Saeed Mahameed @ 2019-04-22 22:32 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon, Tariq Toukan,
	Shay Agroskin, Saeed Mahameed

From: Tariq Toukan <tariqt@mellanox.com>

Values in enum mlx5e_rq_flag are used as bit indices.
The intention was to use them without BIT(i) wrapping.

No functional bug is fixed here, as the same (shifted) flag bit
is used for all set, test, and clear operations.
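The index-vs-mask confusion is easy to demonstrate in plain C. The sketch below uses illustrative `SKETCH_*` names and a stand-in for the kernel's `__set_bit()` (which takes a bit *number*, not a mask): the old enum value selected bit index 1 instead of 0, consistently across set/test/clear, which is why nothing broke.

```c
#define SKETCH_BIT(n) (1UL << (n))

/* Old definition: BIT(0) == 1, so bit-index helpers acted on bit 1. */
enum sketch_rq_flag_old { SKETCH_XDP_XMIT_OLD = SKETCH_BIT(0) }; /* == 1 */
/* Fixed definition: plain enumerator, a true bit index.            */
enum sketch_rq_flag_new { SKETCH_XDP_XMIT_NEW };                 /* == 0 */

/* Minimal stand-in for the kernel's __set_bit(): nr is a bit index. */
static unsigned long sketch_set_bit(unsigned int nr, unsigned long word)
{
	return word | (1UL << nr);
}
```

With the old enum, `sketch_set_bit(SKETCH_XDP_XMIT_OLD, 0)` sets bit 1 (mask 2); the fixed enum sets bit 0 (mask 1) as intended.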

Fixes: 121e89275471 ("net/mlx5e: Refactor RQ XDP_TX indication")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 6b9f661a0577..d77511b42ab8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -532,7 +532,7 @@ typedef bool (*mlx5e_fp_post_rx_wqes)(struct mlx5e_rq *rq);
 typedef void (*mlx5e_fp_dealloc_wqe)(struct mlx5e_rq*, u16);
 
 enum mlx5e_rq_flag {
-	MLX5E_RQ_FLAG_XDP_XMIT = BIT(0),
+	MLX5E_RQ_FLAG_XDP_XMIT,
 };
 
 struct mlx5e_rq_frag_info {
-- 
2.20.1



* [net-next 04/14] net/mlx5e: XDP, Enhance RQ indication for XDP redirect flush
  2019-04-22 22:32 [pull request][net-next 00/14] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
                   ` (2 preceding siblings ...)
  2019-04-22 22:32 ` [net-next 03/14] net/mlx5e: XDP, Fix shifted flag index in RQ bitmap Saeed Mahameed
@ 2019-04-22 22:32 ` Saeed Mahameed
  2019-04-22 22:32 ` [net-next 05/14] net/mlx5e: XDP, Add TX MPWQE session counter Saeed Mahameed
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Saeed Mahameed @ 2019-04-22 22:32 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon, Tariq Toukan,
	Shay Agroskin, Saeed Mahameed

From: Tariq Toukan <tariqt@mellanox.com>

The XDP redirect flush indication belongs to the receive queue,
not to its XDP send queue.

For this, use a new bit on rq->flags.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h     | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index d77511b42ab8..33f49b41027e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -429,7 +429,6 @@ struct mlx5e_xdpsq {
 	/* dirtied @completion */
 	u32                        xdpi_fifo_cc;
 	u16                        cc;
-	bool                       redirect_flush;
 
 	/* dirtied @xmit */
 	u32                        xdpi_fifo_pc ____cacheline_aligned_in_smp;
@@ -533,6 +532,7 @@ typedef void (*mlx5e_fp_dealloc_wqe)(struct mlx5e_rq*, u16);
 
 enum mlx5e_rq_flag {
 	MLX5E_RQ_FLAG_XDP_XMIT,
+	MLX5E_RQ_FLAG_XDP_REDIRECT,
 };
 
 struct mlx5e_rq_frag_info {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index 6778bdeff1a2..03dbb1b8669b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -85,7 +85,7 @@ bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct mlx5e_dma_info *di,
 		if (unlikely(err))
 			goto xdp_abort;
 		__set_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags);
-		rq->xdpsq.redirect_flush = true;
+		__set_bit(MLX5E_RQ_FLAG_XDP_REDIRECT, rq->flags);
 		mlx5e_page_dma_unmap(rq, di);
 		rq->stats->xdp_redirect++;
 		return true;
@@ -419,9 +419,9 @@ void mlx5e_xdp_rx_poll_complete(struct mlx5e_rq *rq)
 
 	mlx5e_xmit_xdp_doorbell(xdpsq);
 
-	if (xdpsq->redirect_flush) {
+	if (test_bit(MLX5E_RQ_FLAG_XDP_REDIRECT, rq->flags)) {
 		xdp_do_flush_map();
-		xdpsq->redirect_flush = false;
+		__clear_bit(MLX5E_RQ_FLAG_XDP_REDIRECT, rq->flags);
 	}
 }
 
-- 
2.20.1



* [net-next 05/14] net/mlx5e: XDP, Add TX MPWQE session counter
  2019-04-22 22:32 [pull request][net-next 00/14] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
                   ` (3 preceding siblings ...)
  2019-04-22 22:32 ` [net-next 04/14] net/mlx5e: XDP, Enhance RQ indication for XDP redirect flush Saeed Mahameed
@ 2019-04-22 22:32 ` Saeed Mahameed
  2019-04-22 22:32 ` [net-next 06/14] net/mlx5e: XDP, Inline small packets into the TX MPWQE in XDP xmit flow Saeed Mahameed
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Saeed Mahameed @ 2019-04-22 22:32 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon, Shay Agroskin,
	Saeed Mahameed

From: Shay Agroskin <shayag@mellanox.com>

This counter tracks how many TX MPWQE sessions are started on the XDP SQ
in the XDP TX/REDIRECT flow. It is counted in both per-channel and
global stats.

Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c   | 2 ++
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 6 ++++++
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 3 +++
 3 files changed, 11 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index 03dbb1b8669b..96f5ea0cf544 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -105,6 +105,7 @@ bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct mlx5e_dma_info *di,
 static void mlx5e_xdp_mpwqe_session_start(struct mlx5e_xdpsq *sq)
 {
 	struct mlx5e_xdp_mpwqe *session = &sq->mpwqe;
+	struct mlx5e_xdpsq_stats *stats = sq->stats;
 	struct mlx5_wq_cyc *wq = &sq->wq;
 	u8  wqebbs;
 	u16 pi;
@@ -131,6 +132,7 @@ static void mlx5e_xdp_mpwqe_session_start(struct mlx5e_xdpsq *sq)
 		       MLX5E_XDP_MPW_MAX_WQEBBS);
 
 	session->max_ds_count = MLX5_SEND_WQEBB_NUM_DS * wqebbs;
+	stats->mpwqe++;
 }
 
 static void mlx5e_xdp_mpwqe_complete(struct mlx5e_xdpsq *sq)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index b75aa8b8bf04..80ee48dcc0a3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -65,6 +65,7 @@ static const struct counter_desc sw_stats_desc[] = {
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_drop) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_redirect) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_xmit) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_mpwqe) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_full) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_err) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_cqe) },
@@ -79,6 +80,7 @@ static const struct counter_desc sw_stats_desc[] = {
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_queue_wake) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_cqe_err) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xdp_xmit) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xdp_mpwqe) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xdp_full) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xdp_err) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xdp_cqes) },
@@ -160,6 +162,7 @@ static void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
 		s->rx_xdp_drop     += rq_stats->xdp_drop;
 		s->rx_xdp_redirect += rq_stats->xdp_redirect;
 		s->rx_xdp_tx_xmit  += xdpsq_stats->xmit;
+		s->rx_xdp_tx_mpwqe += xdpsq_stats->mpwqe;
 		s->rx_xdp_tx_full  += xdpsq_stats->full;
 		s->rx_xdp_tx_err   += xdpsq_stats->err;
 		s->rx_xdp_tx_cqe   += xdpsq_stats->cqes;
@@ -185,6 +188,7 @@ static void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
 		s->ch_eq_rearm    += ch_stats->eq_rearm;
 		/* xdp redirect */
 		s->tx_xdp_xmit    += xdpsq_red_stats->xmit;
+		s->tx_xdp_mpwqe += xdpsq_red_stats->mpwqe;
 		s->tx_xdp_full    += xdpsq_red_stats->full;
 		s->tx_xdp_err     += xdpsq_red_stats->err;
 		s->tx_xdp_cqes    += xdpsq_red_stats->cqes;
@@ -1245,6 +1249,7 @@ static const struct counter_desc sq_stats_desc[] = {
 
 static const struct counter_desc rq_xdpsq_stats_desc[] = {
 	{ MLX5E_DECLARE_RQ_XDPSQ_STAT(struct mlx5e_xdpsq_stats, xmit) },
+	{ MLX5E_DECLARE_RQ_XDPSQ_STAT(struct mlx5e_xdpsq_stats, mpwqe) },
 	{ MLX5E_DECLARE_RQ_XDPSQ_STAT(struct mlx5e_xdpsq_stats, full) },
 	{ MLX5E_DECLARE_RQ_XDPSQ_STAT(struct mlx5e_xdpsq_stats, err) },
 	{ MLX5E_DECLARE_RQ_XDPSQ_STAT(struct mlx5e_xdpsq_stats, cqes) },
@@ -1252,6 +1257,7 @@ static const struct counter_desc rq_xdpsq_stats_desc[] = {
 
 static const struct counter_desc xdpsq_stats_desc[] = {
 	{ MLX5E_DECLARE_XDPSQ_STAT(struct mlx5e_xdpsq_stats, xmit) },
+	{ MLX5E_DECLARE_XDPSQ_STAT(struct mlx5e_xdpsq_stats, mpwqe) },
 	{ MLX5E_DECLARE_XDPSQ_STAT(struct mlx5e_xdpsq_stats, full) },
 	{ MLX5E_DECLARE_XDPSQ_STAT(struct mlx5e_xdpsq_stats, err) },
 	{ MLX5E_DECLARE_XDPSQ_STAT(struct mlx5e_xdpsq_stats, cqes) },
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index 16c3b785f282..1f05ffa086b1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -77,6 +77,7 @@ struct mlx5e_sw_stats {
 	u64 rx_xdp_drop;
 	u64 rx_xdp_redirect;
 	u64 rx_xdp_tx_xmit;
+	u64 rx_xdp_tx_mpwqe;
 	u64 rx_xdp_tx_full;
 	u64 rx_xdp_tx_err;
 	u64 rx_xdp_tx_cqe;
@@ -91,6 +92,7 @@ struct mlx5e_sw_stats {
 	u64 tx_queue_wake;
 	u64 tx_cqe_err;
 	u64 tx_xdp_xmit;
+	u64 tx_xdp_mpwqe;
 	u64 tx_xdp_full;
 	u64 tx_xdp_err;
 	u64 tx_xdp_cqes;
@@ -241,6 +243,7 @@ struct mlx5e_sq_stats {
 
 struct mlx5e_xdpsq_stats {
 	u64 xmit;
+	u64 mpwqe;
 	u64 full;
 	u64 err;
 	/* dirtied @completion */
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [net-next 06/14] net/mlx5e: XDP, Inline small packets into the TX MPWQE in XDP xmit flow
  2019-04-22 22:32 [pull request][net-next 00/14] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
                   ` (4 preceding siblings ...)
  2019-04-22 22:32 ` [net-next 05/14] net/mlx5e: XDP, Add TX MPWQE session counter Saeed Mahameed
@ 2019-04-22 22:32 ` Saeed Mahameed
  2019-04-23 12:53   ` Gal Pressman
  2019-04-22 22:32 ` [net-next 07/14] net/mlx5e: Remove unused parameter Saeed Mahameed
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 25+ messages in thread
From: Saeed Mahameed @ 2019-04-22 22:32 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon, Shay Agroskin,
	Tariq Toukan, Saeed Mahameed

From: Shay Agroskin <shayag@mellanox.com>

Under high-packet-rate TX workloads spanning multiple CPUs, much of the
HCA's resources are spent on prefetching TX descriptors, which limits
the transmission rate.
This patch mitigates the problem by moving some of the work to the CPU,
reducing the HW data prefetch overhead for small packets (<= 256B).

When forwarding packets with XDP, a packet smaller than a certain size
(set to ~256 bytes) is sent inline within its WQE TX descriptor
(mem-copied), whenever the hardware TX queue is congested beyond a
pre-defined watermark.

This better utilizes the HW resources (one less packet data prefetch
is needed) and allows better scalability, at the expense of CPU usage
(which now memcpy's the packet into the WQE).

To load-balance between HW and CPU and reach the maximum packet rate,
watermarks are used to detect how congested the HW is, moving the
workload back and forth between HW and CPU accordingly.
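
The watermark mechanism can be illustrated with a small standalone
sketch of the hysteresis logic (the LOW/HIGH values below match the
constants introduced by this patch; the struct and function names are
simplified stand-ins, not the driver's):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define INLINE_WATERMARK_LOW	10	/* leave inline mode at/below this */
#define INLINE_WATERMARK_HIGH	128	/* enter inline mode at/above this */

/* Simplified stand-in for the xdpsq: producer/consumer counters of
 * outstanding (not yet completed) descriptors, plus the inline flag. */
struct sq_state {
	uint16_t fifo_pc;	/* producer counter */
	uint16_t fifo_cc;	/* consumer counter */
	bool inline_on;
};

/* Hysteresis: once inline mode is on, keep it until the queue drains
 * to the low watermark; once off, keep it until congestion reaches the
 * high watermark. Two thresholds prevent the state from flapping
 * around a single cut-off. */
static void update_inline_state(struct sq_state *sq)
{
	uint16_t outstanding = sq->fifo_pc - sq->fifo_cc; /* wraps safely */

	if (sq->inline_on) {
		if (outstanding <= INLINE_WATERMARK_LOW)
			sq->inline_on = false;
		return;
	}
	if (outstanding >= INLINE_WATERMARK_HIGH)
		sq->inline_on = true;
}
```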

Performance:
Tested packet rate for UDP 64Byte multi-stream
over two dual port ConnectX-5 100Gbps NICs.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz

* Tested with hyper-threading disabled

XDP_TX:

|          | before | after   |       |
| 24 rings | 51Mpps | 116Mpps | +126% |
| 1 ring   | 12Mpps | 12Mpps  | same  |

XDP_REDIRECT:

** Below is the transmit rate, not the redirection rate, which might be
larger and is not affected by this patch.

|          | before  | after   |      |
| 32 rings | 64Mpps  | 92Mpps  | +43% |
| 1 ring   | 6.4Mpps | 6.4Mpps | same |

As we can see, the feature significantly improves scaling without
hurting single-ring performance.

Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  5 +-
 .../net/ethernet/mellanox/mlx5/core/en/xdp.c  | 22 ++++---
 .../net/ethernet/mellanox/mlx5/core/en/xdp.h  | 57 ++++++++++++++++++-
 .../net/ethernet/mellanox/mlx5/core/en_main.c |  2 +-
 .../ethernet/mellanox/mlx5/core/en_stats.c    |  8 ++-
 .../ethernet/mellanox/mlx5/core/en_stats.h    |  3 +
 include/linux/mlx5/qp.h                       |  1 +
 7 files changed, 83 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 33f49b41027e..786c62791ff8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -410,14 +410,17 @@ struct mlx5e_xdp_info_fifo {
 
 struct mlx5e_xdp_wqe_info {
 	u8 num_wqebbs;
-	u8 num_ds;
+	u8 num_pkts;
 };
 
 struct mlx5e_xdp_mpwqe {
 	/* Current MPWQE session */
 	struct mlx5e_tx_wqe *wqe;
 	u8                   ds_count;
+	u8                   pkt_count;
 	u8                   max_ds_count;
+	u8                   complete;
+	u8                   inline_on;
 };
 
 struct mlx5e_xdpsq;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index 96f5ea0cf544..138366b1877c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -113,7 +113,9 @@ static void mlx5e_xdp_mpwqe_session_start(struct mlx5e_xdpsq *sq)
 	mlx5e_xdpsq_fetch_wqe(sq, &session->wqe);
 
 	mlx5e_prefetchw(session->wqe->data);
-	session->ds_count = MLX5E_XDP_TX_EMPTY_DS_COUNT;
+	session->ds_count  = MLX5E_XDP_TX_EMPTY_DS_COUNT;
+	session->pkt_count = 0;
+	session->complete  = 0;
 
 	pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc);
 
@@ -132,6 +134,9 @@ static void mlx5e_xdp_mpwqe_session_start(struct mlx5e_xdpsq *sq)
 		       MLX5E_XDP_MPW_MAX_WQEBBS);
 
 	session->max_ds_count = MLX5_SEND_WQEBB_NUM_DS * wqebbs;
+
+	mlx5e_xdp_update_inline_state(sq);
+
 	stats->mpwqe++;
 }
 
@@ -149,7 +154,7 @@ static void mlx5e_xdp_mpwqe_complete(struct mlx5e_xdpsq *sq)
 	cseg->qpn_ds = cpu_to_be32((sq->sqn << 8) | ds_count);
 
 	wi->num_wqebbs = DIV_ROUND_UP(ds_count, MLX5_SEND_WQEBB_NUM_DS);
-	wi->num_ds     = ds_count - MLX5E_XDP_TX_EMPTY_DS_COUNT;
+	wi->num_pkts   = session->pkt_count;
 
 	sq->pc += wi->num_wqebbs;
 
@@ -164,11 +169,9 @@ static bool mlx5e_xmit_xdp_frame_mpwqe(struct mlx5e_xdpsq *sq,
 	struct mlx5e_xdp_mpwqe *session = &sq->mpwqe;
 	struct mlx5e_xdpsq_stats *stats = sq->stats;
 
-	dma_addr_t dma_addr    = xdpi->dma_addr;
 	struct xdp_frame *xdpf = xdpi->xdpf;
-	unsigned int dma_len   = xdpf->len;
 
-	if (unlikely(sq->hw_mtu < dma_len)) {
+	if (unlikely(sq->hw_mtu < xdpf->len)) {
 		stats->err++;
 		return false;
 	}
@@ -185,9 +188,10 @@ static bool mlx5e_xmit_xdp_frame_mpwqe(struct mlx5e_xdpsq *sq,
 		mlx5e_xdp_mpwqe_session_start(sq);
 	}
 
-	mlx5e_xdp_mpwqe_add_dseg(sq, dma_addr, dma_len);
+	mlx5e_xdp_mpwqe_add_dseg(sq, xdpi, stats);
 
-	if (unlikely(session->ds_count == session->max_ds_count))
+	if (unlikely(session->complete ||
+		     session->ds_count == session->max_ds_count))
 		mlx5e_xdp_mpwqe_complete(sq);
 
 	mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, xdpi);
@@ -301,7 +305,7 @@ bool mlx5e_poll_xdpsq_cq(struct mlx5e_cq *cq, struct mlx5e_rq *rq)
 
 			sqcc += wi->num_wqebbs;
 
-			for (j = 0; j < wi->num_ds; j++) {
+			for (j = 0; j < wi->num_pkts; j++) {
 				struct mlx5e_xdp_info xdpi =
 					mlx5e_xdpi_fifo_pop(xdpi_fifo);
 
@@ -342,7 +346,7 @@ void mlx5e_free_xdpsq_descs(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq)
 
 		sq->cc += wi->num_wqebbs;
 
-		for (i = 0; i < wi->num_ds; i++) {
+		for (i = 0; i < wi->num_pkts; i++) {
 			struct mlx5e_xdp_info xdpi =
 				mlx5e_xdpi_fifo_pop(xdpi_fifo);
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
index ee27a7c8cd87..858e7a2a13ca 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
@@ -75,16 +75,68 @@ static inline void mlx5e_xmit_xdp_doorbell(struct mlx5e_xdpsq *sq)
 	}
 }
 
+/* Enable inline WQEs to shift some load from a congested HCA (HW) to
+ * a less congested cpu (SW).
+ */
+static inline void mlx5e_xdp_update_inline_state(struct mlx5e_xdpsq *sq)
+{
+	u16 outstanding = sq->xdpi_fifo_pc - sq->xdpi_fifo_cc;
+	struct mlx5e_xdp_mpwqe *session = &sq->mpwqe;
+
+#define MLX5E_XDP_INLINE_WATERMARK_LOW	10
+#define MLX5E_XDP_INLINE_WATERMARK_HIGH 128
+
+	if (session->inline_on) {
+		if (outstanding <= MLX5E_XDP_INLINE_WATERMARK_LOW)
+			session->inline_on = 0;
+		return;
+	}
+
+	/* inline is false */
+	if (outstanding >= MLX5E_XDP_INLINE_WATERMARK_HIGH)
+		session->inline_on = 1;
+}
+
 static inline void
-mlx5e_xdp_mpwqe_add_dseg(struct mlx5e_xdpsq *sq, dma_addr_t dma_addr, u16 dma_len)
+mlx5e_xdp_mpwqe_add_dseg(struct mlx5e_xdpsq *sq, struct mlx5e_xdp_info *xdpi,
+			 struct mlx5e_xdpsq_stats *stats)
 {
 	struct mlx5e_xdp_mpwqe *session = &sq->mpwqe;
+	dma_addr_t dma_addr    = xdpi->dma_addr;
+	struct xdp_frame *xdpf = xdpi->xdpf;
 	struct mlx5_wqe_data_seg *dseg =
-		(struct mlx5_wqe_data_seg *)session->wqe + session->ds_count++;
+		(struct mlx5_wqe_data_seg *)session->wqe + session->ds_count;
+	u16 dma_len = xdpf->len;
 
+	session->pkt_count++;
+
+#define MLX5E_XDP_INLINE_WQE_SZ_THRSD (256 - sizeof(struct mlx5_wqe_inline_seg))
+
+	if (session->inline_on && dma_len <= MLX5E_XDP_INLINE_WQE_SZ_THRSD) {
+		struct mlx5_wqe_inline_seg *inline_dseg =
+			(struct mlx5_wqe_inline_seg *)dseg;
+		u16 ds_len = sizeof(*inline_dseg) + dma_len;
+		u16 ds_cnt = DIV_ROUND_UP(ds_len, MLX5_SEND_WQE_DS);
+
+		if (unlikely(session->ds_count + ds_cnt > session->max_ds_count)) {
+			/* Not enough space for inline wqe, send with memory pointer */
+			session->complete = true;
+			goto no_inline;
+		}
+
+		inline_dseg->byte_count = cpu_to_be32(dma_len | MLX5_INLINE_SEG);
+		memcpy(inline_dseg->data, xdpf->data, dma_len);
+
+		session->ds_count += ds_cnt;
+		stats->inlnw++;
+		return;
+	}
+
+no_inline:
 	dseg->addr       = cpu_to_be64(dma_addr);
 	dseg->byte_count = cpu_to_be32(dma_len);
 	dseg->lkey       = sq->mkey_be;
+	session->ds_count++;
 }
 
 static inline void mlx5e_xdpsq_fetch_wqe(struct mlx5e_xdpsq *sq,
@@ -111,5 +163,4 @@ mlx5e_xdpi_fifo_pop(struct mlx5e_xdp_info_fifo *fifo)
 {
 	return fifo->xi[(*fifo->cc)++ & fifo->mask];
 }
-
 #endif
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 7ab195ac7299..23df6f486c6a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1532,7 +1532,7 @@ static int mlx5e_open_xdpsq(struct mlx5e_channel *c,
 			dseg->lkey = sq->mkey_be;
 
 			wi->num_wqebbs = 1;
-			wi->num_ds     = 1;
+			wi->num_pkts   = 1;
 		}
 	}
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index 80ee48dcc0a3..ca0ff3b3fbd1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -66,6 +66,7 @@ static const struct counter_desc sw_stats_desc[] = {
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_redirect) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_xmit) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_mpwqe) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_inlnw) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_full) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_err) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_cqe) },
@@ -81,6 +82,7 @@ static const struct counter_desc sw_stats_desc[] = {
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_cqe_err) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xdp_xmit) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xdp_mpwqe) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xdp_inlnw) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xdp_full) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xdp_err) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xdp_cqes) },
@@ -163,6 +165,7 @@ static void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
 		s->rx_xdp_redirect += rq_stats->xdp_redirect;
 		s->rx_xdp_tx_xmit  += xdpsq_stats->xmit;
 		s->rx_xdp_tx_mpwqe += xdpsq_stats->mpwqe;
+		s->rx_xdp_tx_inlnw += xdpsq_stats->inlnw;
 		s->rx_xdp_tx_full  += xdpsq_stats->full;
 		s->rx_xdp_tx_err   += xdpsq_stats->err;
 		s->rx_xdp_tx_cqe   += xdpsq_stats->cqes;
@@ -188,7 +191,8 @@ static void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
 		s->ch_eq_rearm    += ch_stats->eq_rearm;
 		/* xdp redirect */
 		s->tx_xdp_xmit    += xdpsq_red_stats->xmit;
-		s->tx_xdp_mpwqe += xdpsq_red_stats->mpwqe;
+		s->tx_xdp_mpwqe   += xdpsq_red_stats->mpwqe;
+		s->tx_xdp_inlnw   += xdpsq_red_stats->inlnw;
 		s->tx_xdp_full    += xdpsq_red_stats->full;
 		s->tx_xdp_err     += xdpsq_red_stats->err;
 		s->tx_xdp_cqes    += xdpsq_red_stats->cqes;
@@ -1250,6 +1254,7 @@ static const struct counter_desc sq_stats_desc[] = {
 static const struct counter_desc rq_xdpsq_stats_desc[] = {
 	{ MLX5E_DECLARE_RQ_XDPSQ_STAT(struct mlx5e_xdpsq_stats, xmit) },
 	{ MLX5E_DECLARE_RQ_XDPSQ_STAT(struct mlx5e_xdpsq_stats, mpwqe) },
+	{ MLX5E_DECLARE_RQ_XDPSQ_STAT(struct mlx5e_xdpsq_stats, inlnw) },
 	{ MLX5E_DECLARE_RQ_XDPSQ_STAT(struct mlx5e_xdpsq_stats, full) },
 	{ MLX5E_DECLARE_RQ_XDPSQ_STAT(struct mlx5e_xdpsq_stats, err) },
 	{ MLX5E_DECLARE_RQ_XDPSQ_STAT(struct mlx5e_xdpsq_stats, cqes) },
@@ -1258,6 +1263,7 @@ static const struct counter_desc rq_xdpsq_stats_desc[] = {
 static const struct counter_desc xdpsq_stats_desc[] = {
 	{ MLX5E_DECLARE_XDPSQ_STAT(struct mlx5e_xdpsq_stats, xmit) },
 	{ MLX5E_DECLARE_XDPSQ_STAT(struct mlx5e_xdpsq_stats, mpwqe) },
+	{ MLX5E_DECLARE_XDPSQ_STAT(struct mlx5e_xdpsq_stats, inlnw) },
 	{ MLX5E_DECLARE_XDPSQ_STAT(struct mlx5e_xdpsq_stats, full) },
 	{ MLX5E_DECLARE_XDPSQ_STAT(struct mlx5e_xdpsq_stats, err) },
 	{ MLX5E_DECLARE_XDPSQ_STAT(struct mlx5e_xdpsq_stats, cqes) },
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index 1f05ffa086b1..ac3c7c2a0964 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -78,6 +78,7 @@ struct mlx5e_sw_stats {
 	u64 rx_xdp_redirect;
 	u64 rx_xdp_tx_xmit;
 	u64 rx_xdp_tx_mpwqe;
+	u64 rx_xdp_tx_inlnw;
 	u64 rx_xdp_tx_full;
 	u64 rx_xdp_tx_err;
 	u64 rx_xdp_tx_cqe;
@@ -93,6 +94,7 @@ struct mlx5e_sw_stats {
 	u64 tx_cqe_err;
 	u64 tx_xdp_xmit;
 	u64 tx_xdp_mpwqe;
+	u64 tx_xdp_inlnw;
 	u64 tx_xdp_full;
 	u64 tx_xdp_err;
 	u64 tx_xdp_cqes;
@@ -244,6 +246,7 @@ struct mlx5e_sq_stats {
 struct mlx5e_xdpsq_stats {
 	u64 xmit;
 	u64 mpwqe;
+	u64 inlnw;
 	u64 full;
 	u64 err;
 	/* dirtied @completion */
diff --git a/include/linux/mlx5/qp.h b/include/linux/mlx5/qp.h
index 0343c81d4c5f..3ba4edbd17a6 100644
--- a/include/linux/mlx5/qp.h
+++ b/include/linux/mlx5/qp.h
@@ -395,6 +395,7 @@ struct mlx5_wqe_signature_seg {
 
 struct mlx5_wqe_inline_seg {
 	__be32	byte_count;
+	__be32	data[0];
 };
 
 enum mlx5_sig_type {
-- 
2.20.1



* [net-next 07/14] net/mlx5e: Remove unused parameter
  2019-04-22 22:32 [pull request][net-next 00/14] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
                   ` (5 preceding siblings ...)
  2019-04-22 22:32 ` [net-next 06/14] net/mlx5e: XDP, Inline small packets into the TX MPWQE in XDP xmit flow Saeed Mahameed
@ 2019-04-22 22:32 ` Saeed Mahameed
  2019-04-22 22:33 ` [net-next 08/14] net/mlx5e: Report mlx5e_xdp_set errors Saeed Mahameed
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Saeed Mahameed @ 2019-04-22 22:32 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon,
	Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@mellanox.com>

params is unused in mlx5e_init_di_list.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 23df6f486c6a..8185773a7bed 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -470,7 +470,6 @@ static void mlx5e_init_frags_partition(struct mlx5e_rq *rq)
 }
 
 static int mlx5e_init_di_list(struct mlx5e_rq *rq,
-			      struct mlx5e_params *params,
 			      int wq_sz, int cpu)
 {
 	int len = wq_sz << rq->wqe.info.log_num_frags;
@@ -598,7 +597,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
 			goto err_free;
 		}
 
-		err = mlx5e_init_di_list(rq, params, wq_sz, c->cpu);
+		err = mlx5e_init_di_list(rq, wq_sz, c->cpu);
 		if (err)
 			goto err_free;
 		rq->post_wqes = mlx5e_post_rx_wqes;
-- 
2.20.1



* [net-next 08/14] net/mlx5e: Report mlx5e_xdp_set errors
  2019-04-22 22:32 [pull request][net-next 00/14] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
                   ` (6 preceding siblings ...)
  2019-04-22 22:32 ` [net-next 07/14] net/mlx5e: Remove unused parameter Saeed Mahameed
@ 2019-04-22 22:33 ` Saeed Mahameed
  2019-04-22 22:33 ` [net-next 09/14] net/mlx5e: Move parameter calculation functions to en/params.c Saeed Mahameed
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Saeed Mahameed @ 2019-04-22 22:33 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon,
	Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@mellanox.com>

If the channels fail to reopen after setting an XDP program, return the
error code instead of 0. A proper fix is still needed, as now any error
while reopening the channels brings the interface down. This patch only
adds error reporting.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 8185773a7bed..a3397d5bfa76 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4313,7 +4313,7 @@ static int mlx5e_xdp_set(struct net_device *netdev, struct bpf_prog *prog)
 		mlx5e_set_rq_type(priv->mdev, &priv->channels.params);
 
 	if (was_opened && reset)
-		mlx5e_open_locked(netdev);
+		err = mlx5e_open_locked(netdev);
 
 	if (!test_bit(MLX5E_STATE_OPENED, &priv->state) || reset)
 		goto unlock;
-- 
2.20.1



* [net-next 09/14] net/mlx5e: Move parameter calculation functions to en/params.c
  2019-04-22 22:32 [pull request][net-next 00/14] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
                   ` (7 preceding siblings ...)
  2019-04-22 22:33 ` [net-next 08/14] net/mlx5e: Report mlx5e_xdp_set errors Saeed Mahameed
@ 2019-04-22 22:33 ` Saeed Mahameed
  2019-04-22 22:33 ` [net-next 10/14] net/mlx5e: Add an underflow warning comment Saeed Mahameed
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Saeed Mahameed @ 2019-04-22 22:33 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon,
	Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@mellanox.com>

This commit moves the parameter calculation functions to a separate file
for better modularity and code sharing with future features.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/Makefile  |   3 +-
 .../ethernet/mellanox/mlx5/core/en/params.c   | 102 ++++++++++++++++++
 .../ethernet/mellanox/mlx5/core/en/params.h   |  23 ++++
 .../net/ethernet/mellanox/mlx5/core/en_main.c |  99 +----------------
 4 files changed, 128 insertions(+), 99 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/params.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/params.h

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index 1a16f6d73cbc..3dbbe3b643b3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -22,7 +22,8 @@ mlx5_core-y :=	main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
 #
 mlx5_core-$(CONFIG_MLX5_CORE_EN) += en_main.o en_common.o en_fs.o en_ethtool.o \
 		en_tx.o en_rx.o en_dim.o en_txrx.o en/xdp.o en_stats.o \
-		en_selftest.o en/port.o en/monitor_stats.o en/reporter_tx.o
+		en_selftest.o en/port.o en/monitor_stats.o en/reporter_tx.o \
+		en/params.o
 
 #
 # Netdev extra
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
new file mode 100644
index 000000000000..658337c3bba1
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
@@ -0,0 +1,102 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/* Copyright (c) 2019 Mellanox Technologies. */
+
+#include "en/params.h"
+
+u32 mlx5e_rx_get_linear_frag_sz(struct mlx5e_params *params)
+{
+	u16 hw_mtu = MLX5E_SW2HW_MTU(params, params->sw_mtu);
+	u16 linear_rq_headroom = params->xdp_prog ?
+		XDP_PACKET_HEADROOM : MLX5_RX_HEADROOM;
+	u32 frag_sz;
+
+	linear_rq_headroom += NET_IP_ALIGN;
+
+	frag_sz = MLX5_SKB_FRAG_SZ(linear_rq_headroom + hw_mtu);
+
+	if (params->xdp_prog && frag_sz < PAGE_SIZE)
+		frag_sz = PAGE_SIZE;
+
+	return frag_sz;
+}
+
+u8 mlx5e_mpwqe_log_pkts_per_wqe(struct mlx5e_params *params)
+{
+	u32 linear_frag_sz = mlx5e_rx_get_linear_frag_sz(params);
+
+	return MLX5_MPWRQ_LOG_WQE_SZ - order_base_2(linear_frag_sz);
+}
+
+bool mlx5e_rx_is_linear_skb(struct mlx5_core_dev *mdev,
+			    struct mlx5e_params *params)
+{
+	u32 frag_sz = mlx5e_rx_get_linear_frag_sz(params);
+
+	return !params->lro_en && frag_sz <= PAGE_SIZE;
+}
+
+#define MLX5_MAX_MPWQE_LOG_WQE_STRIDE_SZ ((BIT(__mlx5_bit_sz(wq, log_wqe_stride_size)) - 1) + \
+					  MLX5_MPWQE_LOG_STRIDE_SZ_BASE)
+bool mlx5e_rx_mpwqe_is_linear_skb(struct mlx5_core_dev *mdev,
+				  struct mlx5e_params *params)
+{
+	u32 frag_sz = mlx5e_rx_get_linear_frag_sz(params);
+	s8 signed_log_num_strides_param;
+	u8 log_num_strides;
+
+	if (!mlx5e_rx_is_linear_skb(mdev, params))
+		return false;
+
+	if (order_base_2(frag_sz) > MLX5_MAX_MPWQE_LOG_WQE_STRIDE_SZ)
+		return false;
+
+	if (MLX5_CAP_GEN(mdev, ext_stride_num_range))
+		return true;
+
+	log_num_strides = MLX5_MPWRQ_LOG_WQE_SZ - order_base_2(frag_sz);
+	signed_log_num_strides_param =
+		(s8)log_num_strides - MLX5_MPWQE_LOG_NUM_STRIDES_BASE;
+
+	return signed_log_num_strides_param >= 0;
+}
+
+u8 mlx5e_mpwqe_get_log_rq_size(struct mlx5e_params *params)
+{
+	if (params->log_rq_mtu_frames <
+	    mlx5e_mpwqe_log_pkts_per_wqe(params) + MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE_MPW)
+		return MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE_MPW;
+
+	return params->log_rq_mtu_frames - mlx5e_mpwqe_log_pkts_per_wqe(params);
+}
+
+u8 mlx5e_mpwqe_get_log_stride_size(struct mlx5_core_dev *mdev,
+				   struct mlx5e_params *params)
+{
+	if (mlx5e_rx_mpwqe_is_linear_skb(mdev, params))
+		return order_base_2(mlx5e_rx_get_linear_frag_sz(params));
+
+	return MLX5_MPWRQ_DEF_LOG_STRIDE_SZ(mdev);
+}
+
+u8 mlx5e_mpwqe_get_log_num_strides(struct mlx5_core_dev *mdev,
+				   struct mlx5e_params *params)
+{
+	return MLX5_MPWRQ_LOG_WQE_SZ -
+		mlx5e_mpwqe_get_log_stride_size(mdev, params);
+}
+
+u16 mlx5e_get_rq_headroom(struct mlx5_core_dev *mdev,
+			  struct mlx5e_params *params)
+{
+	u16 linear_rq_headroom = params->xdp_prog ?
+		XDP_PACKET_HEADROOM : MLX5_RX_HEADROOM;
+	bool is_linear_skb;
+
+	linear_rq_headroom += NET_IP_ALIGN;
+
+	is_linear_skb = (params->rq_wq_type == MLX5_WQ_TYPE_CYCLIC) ?
+		mlx5e_rx_is_linear_skb(mdev, params) :
+		mlx5e_rx_mpwqe_is_linear_skb(mdev, params);
+
+	return is_linear_skb ? linear_rq_headroom : 0;
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.h b/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
new file mode 100644
index 000000000000..0ef1436c4c76
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2019 Mellanox Technologies. */
+
+#ifndef __MLX5_EN_PARAMS_H__
+#define __MLX5_EN_PARAMS_H__
+
+#include "en.h"
+
+u32 mlx5e_rx_get_linear_frag_sz(struct mlx5e_params *params);
+u8 mlx5e_mpwqe_log_pkts_per_wqe(struct mlx5e_params *params);
+bool mlx5e_rx_is_linear_skb(struct mlx5_core_dev *mdev,
+			    struct mlx5e_params *params);
+bool mlx5e_rx_mpwqe_is_linear_skb(struct mlx5_core_dev *mdev,
+				  struct mlx5e_params *params);
+u8 mlx5e_mpwqe_get_log_rq_size(struct mlx5e_params *params);
+u8 mlx5e_mpwqe_get_log_stride_size(struct mlx5_core_dev *mdev,
+				   struct mlx5e_params *params);
+u8 mlx5e_mpwqe_get_log_num_strides(struct mlx5_core_dev *mdev,
+				   struct mlx5e_params *params);
+u16 mlx5e_get_rq_headroom(struct mlx5_core_dev *mdev,
+			  struct mlx5e_params *params);
+
+#endif /* __MLX5_EN_PARAMS_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index a3397d5bfa76..8cab86a558ab 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -55,6 +55,7 @@
 #include "lib/eq.h"
 #include "en/monitor_stats.h"
 #include "en/reporter.h"
+#include "en/params.h"
 
 struct mlx5e_rq_param {
 	u32			rqc[MLX5_ST_SZ_DW(rqc)];
@@ -103,104 +104,6 @@ bool mlx5e_check_fragmented_striding_rq_cap(struct mlx5_core_dev *mdev)
 	return true;
 }
 
-static u32 mlx5e_rx_get_linear_frag_sz(struct mlx5e_params *params)
-{
-	u16 hw_mtu = MLX5E_SW2HW_MTU(params, params->sw_mtu);
-	u16 linear_rq_headroom = params->xdp_prog ?
-		XDP_PACKET_HEADROOM : MLX5_RX_HEADROOM;
-	u32 frag_sz;
-
-	linear_rq_headroom += NET_IP_ALIGN;
-
-	frag_sz = MLX5_SKB_FRAG_SZ(linear_rq_headroom + hw_mtu);
-
-	if (params->xdp_prog && frag_sz < PAGE_SIZE)
-		frag_sz = PAGE_SIZE;
-
-	return frag_sz;
-}
-
-static u8 mlx5e_mpwqe_log_pkts_per_wqe(struct mlx5e_params *params)
-{
-	u32 linear_frag_sz = mlx5e_rx_get_linear_frag_sz(params);
-
-	return MLX5_MPWRQ_LOG_WQE_SZ - order_base_2(linear_frag_sz);
-}
-
-static bool mlx5e_rx_is_linear_skb(struct mlx5_core_dev *mdev,
-				   struct mlx5e_params *params)
-{
-	u32 frag_sz = mlx5e_rx_get_linear_frag_sz(params);
-
-	return !params->lro_en && frag_sz <= PAGE_SIZE;
-}
-
-#define MLX5_MAX_MPWQE_LOG_WQE_STRIDE_SZ ((BIT(__mlx5_bit_sz(wq, log_wqe_stride_size)) - 1) + \
-					  MLX5_MPWQE_LOG_STRIDE_SZ_BASE)
-static bool mlx5e_rx_mpwqe_is_linear_skb(struct mlx5_core_dev *mdev,
-					 struct mlx5e_params *params)
-{
-	u32 frag_sz = mlx5e_rx_get_linear_frag_sz(params);
-	s8 signed_log_num_strides_param;
-	u8 log_num_strides;
-
-	if (!mlx5e_rx_is_linear_skb(mdev, params))
-		return false;
-
-	if (order_base_2(frag_sz) > MLX5_MAX_MPWQE_LOG_WQE_STRIDE_SZ)
-		return false;
-
-	if (MLX5_CAP_GEN(mdev, ext_stride_num_range))
-		return true;
-
-	log_num_strides = MLX5_MPWRQ_LOG_WQE_SZ - order_base_2(frag_sz);
-	signed_log_num_strides_param =
-		(s8)log_num_strides - MLX5_MPWQE_LOG_NUM_STRIDES_BASE;
-
-	return signed_log_num_strides_param >= 0;
-}
-
-static u8 mlx5e_mpwqe_get_log_rq_size(struct mlx5e_params *params)
-{
-	if (params->log_rq_mtu_frames <
-	    mlx5e_mpwqe_log_pkts_per_wqe(params) + MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE_MPW)
-		return MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE_MPW;
-
-	return params->log_rq_mtu_frames - mlx5e_mpwqe_log_pkts_per_wqe(params);
-}
-
-static u8 mlx5e_mpwqe_get_log_stride_size(struct mlx5_core_dev *mdev,
-					  struct mlx5e_params *params)
-{
-	if (mlx5e_rx_mpwqe_is_linear_skb(mdev, params))
-		return order_base_2(mlx5e_rx_get_linear_frag_sz(params));
-
-	return MLX5_MPWRQ_DEF_LOG_STRIDE_SZ(mdev);
-}
-
-static u8 mlx5e_mpwqe_get_log_num_strides(struct mlx5_core_dev *mdev,
-					  struct mlx5e_params *params)
-{
-	return MLX5_MPWRQ_LOG_WQE_SZ -
-		mlx5e_mpwqe_get_log_stride_size(mdev, params);
-}
-
-static u16 mlx5e_get_rq_headroom(struct mlx5_core_dev *mdev,
-				 struct mlx5e_params *params)
-{
-	u16 linear_rq_headroom = params->xdp_prog ?
-		XDP_PACKET_HEADROOM : MLX5_RX_HEADROOM;
-	bool is_linear_skb;
-
-	linear_rq_headroom += NET_IP_ALIGN;
-
-	is_linear_skb = (params->rq_wq_type == MLX5_WQ_TYPE_CYCLIC) ?
-		mlx5e_rx_is_linear_skb(mdev, params) :
-		mlx5e_rx_mpwqe_is_linear_skb(mdev, params);
-
-	return is_linear_skb ? linear_rq_headroom : 0;
-}
-
 void mlx5e_init_rq_type_params(struct mlx5_core_dev *mdev,
 			       struct mlx5e_params *params)
 {
-- 
2.20.1



* [net-next 10/14] net/mlx5e: Add an underflow warning comment
  2019-04-22 22:32 [pull request][net-next 00/14] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
                   ` (8 preceding siblings ...)
  2019-04-22 22:33 ` [net-next 09/14] net/mlx5e: Move parameter calculation functions to en/params.c Saeed Mahameed
@ 2019-04-22 22:33 ` Saeed Mahameed
  2019-04-22 22:33 ` [net-next 11/14] net/mlx5e: Remove unused parameter Saeed Mahameed
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Saeed Mahameed @ 2019-04-22 22:33 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon,
	Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@mellanox.com>

mlx5e_mpwqe_get_log_rq_size calculates the number of WQEs (N) based on
the requested number of frames in the RQ (F) and the number of packets
per WQE (P). It ensures that N is not less than the minimum number of
WQEs in an RQ (N_min). Arithmetically, it means that F / P >= N_min
should be true. This function deals with logarithms, so it should check
that log(F) - log(P) >= log(N_min). However, if F < P, this expression
will cause an unsigned underflow. Check log(F) >= log(P) + log(N_min)
instead.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/params.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
index 658337c3bba1..fa6661ea6310 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
@@ -62,11 +62,14 @@ bool mlx5e_rx_mpwqe_is_linear_skb(struct mlx5_core_dev *mdev,
 
 u8 mlx5e_mpwqe_get_log_rq_size(struct mlx5e_params *params)
 {
+	u8 log_pkts_per_wqe = mlx5e_mpwqe_log_pkts_per_wqe(params);
+
+	/* Numbers are unsigned, don't subtract to avoid underflow. */
 	if (params->log_rq_mtu_frames <
-	    mlx5e_mpwqe_log_pkts_per_wqe(params) + MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE_MPW)
+	    log_pkts_per_wqe + MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE_MPW)
 		return MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE_MPW;
 
-	return params->log_rq_mtu_frames - mlx5e_mpwqe_log_pkts_per_wqe(params);
+	return params->log_rq_mtu_frames - log_pkts_per_wqe;
 }
 
 u8 mlx5e_mpwqe_get_log_stride_size(struct mlx5_core_dev *mdev,
-- 
2.20.1




* [net-next 11/14] net/mlx5e: Remove unused parameter
  2019-04-22 22:32 [pull request][net-next 00/14] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
                   ` (9 preceding siblings ...)
  2019-04-22 22:33 ` [net-next 10/14] net/mlx5e: Add an underflow warning comment Saeed Mahameed
@ 2019-04-22 22:33 ` Saeed Mahameed
  2019-04-22 22:33 ` [net-next 12/14] net/mlx5e: Take HW interrupt trigger into a function Saeed Mahameed
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Saeed Mahameed @ 2019-04-22 22:33 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon,
	Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@mellanox.com>

mdev is unused in mlx5e_rx_is_linear_skb.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/params.c |  7 +++----
 drivers/net/ethernet/mellanox/mlx5/core/en/params.h |  3 +--
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c   | 10 +++++-----
 3 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
index fa6661ea6310..d3744bffbae3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
@@ -27,8 +27,7 @@ u8 mlx5e_mpwqe_log_pkts_per_wqe(struct mlx5e_params *params)
 	return MLX5_MPWRQ_LOG_WQE_SZ - order_base_2(linear_frag_sz);
 }
 
-bool mlx5e_rx_is_linear_skb(struct mlx5_core_dev *mdev,
-			    struct mlx5e_params *params)
+bool mlx5e_rx_is_linear_skb(struct mlx5e_params *params)
 {
 	u32 frag_sz = mlx5e_rx_get_linear_frag_sz(params);
 
@@ -44,7 +43,7 @@ bool mlx5e_rx_mpwqe_is_linear_skb(struct mlx5_core_dev *mdev,
 	s8 signed_log_num_strides_param;
 	u8 log_num_strides;
 
-	if (!mlx5e_rx_is_linear_skb(mdev, params))
+	if (!mlx5e_rx_is_linear_skb(params))
 		return false;
 
 	if (order_base_2(frag_sz) > MLX5_MAX_MPWQE_LOG_WQE_STRIDE_SZ)
@@ -98,7 +97,7 @@ u16 mlx5e_get_rq_headroom(struct mlx5_core_dev *mdev,
 	linear_rq_headroom += NET_IP_ALIGN;
 
 	is_linear_skb = (params->rq_wq_type == MLX5_WQ_TYPE_CYCLIC) ?
-		mlx5e_rx_is_linear_skb(mdev, params) :
+		mlx5e_rx_is_linear_skb(params) :
 		mlx5e_rx_mpwqe_is_linear_skb(mdev, params);
 
 	return is_linear_skb ? linear_rq_headroom : 0;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.h b/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
index 0ef1436c4c76..b106a0236f36 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
@@ -8,8 +8,7 @@
 
 u32 mlx5e_rx_get_linear_frag_sz(struct mlx5e_params *params);
 u8 mlx5e_mpwqe_log_pkts_per_wqe(struct mlx5e_params *params);
-bool mlx5e_rx_is_linear_skb(struct mlx5_core_dev *mdev,
-			    struct mlx5e_params *params);
+bool mlx5e_rx_is_linear_skb(struct mlx5e_params *params);
 bool mlx5e_rx_mpwqe_is_linear_skb(struct mlx5_core_dev *mdev,
 				  struct mlx5e_params *params);
 u8 mlx5e_mpwqe_get_log_rq_size(struct mlx5e_params *params);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 8cab86a558ab..1c328dbb6fe0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -518,7 +518,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
 			goto err_free;
 		}
 
-		rq->wqe.skb_from_cqe = mlx5e_rx_is_linear_skb(mdev, params) ?
+		rq->wqe.skb_from_cqe = mlx5e_rx_is_linear_skb(params) ?
 			mlx5e_skb_from_cqe_linear :
 			mlx5e_skb_from_cqe_nonlinear;
 		rq->mkey_be = c->mkey_be;
@@ -1960,7 +1960,7 @@ static void mlx5e_build_rq_frags_info(struct mlx5_core_dev *mdev,
 		byte_count += MLX5E_METADATA_ETHER_LEN;
 #endif
 
-	if (mlx5e_rx_is_linear_skb(mdev, params)) {
+	if (mlx5e_rx_is_linear_skb(params)) {
 		int frag_stride;
 
 		frag_stride = mlx5e_rx_get_linear_frag_sz(params);
@@ -3722,7 +3722,7 @@ int mlx5e_change_mtu(struct net_device *netdev, int new_mtu,
 	new_channels.params.sw_mtu = new_mtu;
 
 	if (params->xdp_prog &&
-	    !mlx5e_rx_is_linear_skb(priv->mdev, &new_channels.params)) {
+	    !mlx5e_rx_is_linear_skb(&new_channels.params)) {
 		netdev_err(netdev, "MTU(%d) > %d is not allowed while XDP enabled\n",
 			   new_mtu, MLX5E_XDP_MAX_MTU);
 		err = -EINVAL;
@@ -4163,7 +4163,7 @@ static int mlx5e_xdp_allowed(struct mlx5e_priv *priv, struct bpf_prog *prog)
 	new_channels.params = priv->channels.params;
 	new_channels.params.xdp_prog = prog;
 
-	if (!mlx5e_rx_is_linear_skb(priv->mdev, &new_channels.params)) {
+	if (!mlx5e_rx_is_linear_skb(&new_channels.params)) {
 		netdev_warn(netdev, "XDP is not allowed with MTU(%d) > %d\n",
 			    new_channels.params.sw_mtu, MLX5E_XDP_MAX_MTU);
 		return -EINVAL;
@@ -4506,7 +4506,7 @@ void mlx5e_build_rq_params(struct mlx5_core_dev *mdev,
 	if (!slow_pci_heuristic(mdev) &&
 	    mlx5e_striding_rq_possible(mdev, params) &&
 	    (mlx5e_rx_mpwqe_is_linear_skb(mdev, params) ||
-	     !mlx5e_rx_is_linear_skb(mdev, params)))
+	     !mlx5e_rx_is_linear_skb(params)))
 		MLX5E_SET_PFLAG(params, MLX5E_PFLAG_RX_STRIDING_RQ, true);
 	mlx5e_set_rq_type(mdev, params);
 	mlx5e_init_rq_type_params(mdev, params);
-- 
2.20.1



* [net-next 12/14] net/mlx5e: Take HW interrupt trigger into a function
  2019-04-22 22:32 [pull request][net-next 00/14] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
                   ` (10 preceding siblings ...)
  2019-04-22 22:33 ` [net-next 11/14] net/mlx5e: Remove unused parameter Saeed Mahameed
@ 2019-04-22 22:33 ` Saeed Mahameed
  2019-04-22 22:33 ` [net-next 13/14] net/mlx5e: Remove unused rx_page_reuse stat Saeed Mahameed
  2019-04-22 22:33 ` [net-next 14/14] net/mlx5e: Use #define for the WQE wait timeout constant Saeed Mahameed
  13 siblings, 0 replies; 25+ messages in thread
From: Saeed Mahameed @ 2019-04-22 22:33 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon,
	Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@mellanox.com>

mlx5e_trigger_irq posts a NOP to the ICO SQ just to trigger an IRQ and
enter the NAPI poll on the right CPU according to the affinity. Use it
in mlx5e_activate_rq.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |  1 +
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 10 +---------
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c | 11 +++++++++++
 3 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 786c62791ff8..b81acd4d72ac 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -779,6 +779,7 @@ netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev);
 netdev_tx_t mlx5e_sq_xmit(struct mlx5e_txqsq *sq, struct sk_buff *skb,
 			  struct mlx5e_tx_wqe *wqe, u16 pi, bool xmit_more);
 
+void mlx5e_trigger_irq(struct mlx5e_icosq *sq);
 void mlx5e_completion_event(struct mlx5_core_cq *mcq);
 void mlx5e_cq_error_event(struct mlx5_core_cq *mcq, enum mlx5_event event);
 int mlx5e_napi_poll(struct napi_struct *napi, int budget);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 1c328dbb6fe0..8ae17dcad487 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -877,16 +877,8 @@ static int mlx5e_open_rq(struct mlx5e_channel *c,
 
 static void mlx5e_activate_rq(struct mlx5e_rq *rq)
 {
-	struct mlx5e_icosq *sq = &rq->channel->icosq;
-	struct mlx5_wq_cyc *wq = &sq->wq;
-	struct mlx5e_tx_wqe *nopwqe;
-
-	u16 pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc);
-
 	set_bit(MLX5E_RQ_STATE_ENABLED, &rq->state);
-	sq->db.ico_wqe[pi].opcode     = MLX5_OPCODE_NOP;
-	nopwqe = mlx5e_post_nop(wq, sq->sqn, &sq->pc);
-	mlx5e_notify_hw(wq, sq->pc, sq->uar_map, &nopwqe->ctrl);
+	mlx5e_trigger_irq(&rq->channel->icosq);
 }
 
 static void mlx5e_deactivate_rq(struct mlx5e_rq *rq)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
index b4af5e19f6ac..f9862bf75491 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
@@ -71,6 +71,17 @@ static void mlx5e_handle_rx_dim(struct mlx5e_rq *rq)
 	net_dim(&rq->dim, dim_sample);
 }
 
+void mlx5e_trigger_irq(struct mlx5e_icosq *sq)
+{
+	struct mlx5_wq_cyc *wq = &sq->wq;
+	struct mlx5e_tx_wqe *nopwqe;
+	u16 pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc);
+
+	sq->db.ico_wqe[pi].opcode = MLX5_OPCODE_NOP;
+	nopwqe = mlx5e_post_nop(wq, sq->sqn, &sq->pc);
+	mlx5e_notify_hw(wq, sq->pc, sq->uar_map, &nopwqe->ctrl);
+}
+
 int mlx5e_napi_poll(struct napi_struct *napi, int budget)
 {
 	struct mlx5e_channel *c = container_of(napi, struct mlx5e_channel,
-- 
2.20.1



* [net-next 13/14] net/mlx5e: Remove unused rx_page_reuse stat
  2019-04-22 22:32 [pull request][net-next 00/14] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
                   ` (11 preceding siblings ...)
  2019-04-22 22:33 ` [net-next 12/14] net/mlx5e: Take HW interrupt trigger into a function Saeed Mahameed
@ 2019-04-22 22:33 ` Saeed Mahameed
  2019-04-22 22:33 ` [net-next 14/14] net/mlx5e: Use #define for the WQE wait timeout constant Saeed Mahameed
  13 siblings, 0 replies; 25+ messages in thread
From: Saeed Mahameed @ 2019-04-22 22:33 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon,
	Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@mellanox.com>

Remove the no longer used page_reuse stat of RQs.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 3 ---
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 2 --
 2 files changed, 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index ca0ff3b3fbd1..483d321d2151 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -93,7 +93,6 @@ static const struct counter_desc sw_stats_desc[] = {
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_buff_alloc_err) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cqe_compress_blks) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cqe_compress_pkts) },
-	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_page_reuse) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_reuse) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_full) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_empty) },
@@ -176,7 +175,6 @@ static void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
 		s->rx_buff_alloc_err += rq_stats->buff_alloc_err;
 		s->rx_cqe_compress_blks += rq_stats->cqe_compress_blks;
 		s->rx_cqe_compress_pkts += rq_stats->cqe_compress_pkts;
-		s->rx_page_reuse  += rq_stats->page_reuse;
 		s->rx_cache_reuse += rq_stats->cache_reuse;
 		s->rx_cache_full  += rq_stats->cache_full;
 		s->rx_cache_empty += rq_stats->cache_empty;
@@ -1220,7 +1218,6 @@ static const struct counter_desc rq_stats_desc[] = {
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, buff_alloc_err) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cqe_compress_blks) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cqe_compress_pkts) },
-	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, page_reuse) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cache_reuse) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cache_full) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cache_empty) },
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index ac3c7c2a0964..cdddcc46971b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -105,7 +105,6 @@ struct mlx5e_sw_stats {
 	u64 rx_buff_alloc_err;
 	u64 rx_cqe_compress_blks;
 	u64 rx_cqe_compress_pkts;
-	u64 rx_page_reuse;
 	u64 rx_cache_reuse;
 	u64 rx_cache_full;
 	u64 rx_cache_empty;
@@ -205,7 +204,6 @@ struct mlx5e_rq_stats {
 	u64 buff_alloc_err;
 	u64 cqe_compress_blks;
 	u64 cqe_compress_pkts;
-	u64 page_reuse;
 	u64 cache_reuse;
 	u64 cache_full;
 	u64 cache_empty;
-- 
2.20.1



* [net-next 14/14] net/mlx5e: Use #define for the WQE wait timeout constant
  2019-04-22 22:32 [pull request][net-next 00/14] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
                   ` (12 preceding siblings ...)
  2019-04-22 22:33 ` [net-next 13/14] net/mlx5e: Remove unused rx_page_reuse stat Saeed Mahameed
@ 2019-04-22 22:33 ` Saeed Mahameed
  13 siblings, 0 replies; 25+ messages in thread
From: Saeed Mahameed @ 2019-04-22 22:33 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon,
	Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@mellanox.com>

Create a #define for the timeout of mlx5e_wait_for_min_rx_wqes to
clarify the meaning of a magic number.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 8ae17dcad487..69a9d67396ec 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2256,14 +2256,18 @@ static void mlx5e_activate_channels(struct mlx5e_channels *chs)
 		mlx5e_activate_channel(chs->c[i]);
 }
 
+#define MLX5E_RQ_WQES_TIMEOUT 20000 /* msecs */
+
 static int mlx5e_wait_channels_min_rx_wqes(struct mlx5e_channels *chs)
 {
 	int err = 0;
 	int i;
 
-	for (i = 0; i < chs->num; i++)
-		err |= mlx5e_wait_for_min_rx_wqes(&chs->c[i]->rq,
-						  err ? 0 : 20000);
+	for (i = 0; i < chs->num; i++) {
+		int timeout = err ? 0 : MLX5E_RQ_WQES_TIMEOUT;
+
+		err |= mlx5e_wait_for_min_rx_wqes(&chs->c[i]->rq, timeout);
+	}
 
 	return err ? -ETIMEDOUT : 0;
 }
-- 
2.20.1



* Re: [net-next 01/14] net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES
  2019-04-22 22:32 ` [net-next 01/14] net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES Saeed Mahameed
@ 2019-04-23  2:46   ` Jakub Kicinski
  2019-04-23 13:23     ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 25+ messages in thread
From: Jakub Kicinski @ 2019-04-23  2:46 UTC (permalink / raw)
  To: Saeed Mahameed, Tariq Toukan
  Cc: David S. Miller, netdev, Jesper Dangaard Brouer, Jonathan Lemon

On Mon, 22 Apr 2019 15:32:53 -0700, Saeed Mahameed wrote:
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> index 51e109fdeec1..6147be23a9b9 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> @@ -50,6 +50,7 @@
>  #include <net/xdp.h>
>  #include <linux/net_dim.h>
>  #include <linux/bits.h>
> +#include <linux/prefetch.h>
>  #include "wq.h"
>  #include "mlx5_core.h"
>  #include "en_stats.h"
> @@ -986,6 +987,22 @@ static inline void mlx5e_cq_arm(struct mlx5e_cq *cq)
>  	mlx5_cq_arm(mcq, MLX5_CQ_DB_REQ_NOT, mcq->uar->map, cq->wq.cc);
>  }
>  
> +static inline void mlx5e_prefetch(void *p)
> +{
> +	prefetch(p);
> +#if L1_CACHE_BYTES < 128
> +	prefetch(p + L1_CACHE_BYTES);
> +#endif
> +}
> +
> +static inline void mlx5e_prefetchw(void *p)
> +{
> +	prefetchw(p);
> +#if L1_CACHE_BYTES < 128
> +	prefetchw(p + L1_CACHE_BYTES);
> +#endif
> +}

All Intel drivers do the exact same thing, perhaps it's time to add a
helper for this?

net_prefetch_headers()

or some such?


* Re: [net-next 06/14] net/mlx5e: XDP, Inline small packets into the TX MPWQE in XDP xmit flow
  2019-04-22 22:32 ` [net-next 06/14] net/mlx5e: XDP, Inline small packets into the TX MPWQE in XDP xmit flow Saeed Mahameed
@ 2019-04-23 12:53   ` Gal Pressman
  2019-04-23 16:46     ` Saeed Mahameed
  0 siblings, 1 reply; 25+ messages in thread
From: Gal Pressman @ 2019-04-23 12:53 UTC (permalink / raw)
  To: Saeed Mahameed, David S. Miller
  Cc: netdev, Jesper Dangaard Brouer, Jonathan Lemon, Shay Agroskin,
	Tariq Toukan

On 23-Apr-19 01:32, Saeed Mahameed wrote:
>  static inline void
> -mlx5e_xdp_mpwqe_add_dseg(struct mlx5e_xdpsq *sq, dma_addr_t dma_addr, u16 dma_len)
> +mlx5e_xdp_mpwqe_add_dseg(struct mlx5e_xdpsq *sq, struct mlx5e_xdp_info *xdpi,
> +			 struct mlx5e_xdpsq_stats *stats)

Passing stats as a function parameter is weird, why not remove and use sq->stats?

>  {
>  	struct mlx5e_xdp_mpwqe *session = &sq->mpwqe;
> +	dma_addr_t dma_addr    = xdpi->dma_addr;
> +	struct xdp_frame *xdpf = xdpi->xdpf;
>  	struct mlx5_wqe_data_seg *dseg =
> -		(struct mlx5_wqe_data_seg *)session->wqe + session->ds_count++;
> +		(struct mlx5_wqe_data_seg *)session->wqe + session->ds_count;
> +	u16 dma_len = xdpf->len;
>  
> +	session->pkt_count++;
> +
> +#define MLX5E_XDP_INLINE_WQE_SZ_THRSD (256 - sizeof(struct mlx5_wqe_inline_seg))
> +
> +	if (session->inline_on && dma_len <= MLX5E_XDP_INLINE_WQE_SZ_THRSD) {
> +		struct mlx5_wqe_inline_seg *inline_dseg =
> +			(struct mlx5_wqe_inline_seg *)dseg;
> +		u16 ds_len = sizeof(*inline_dseg) + dma_len;
> +		u16 ds_cnt = DIV_ROUND_UP(ds_len, MLX5_SEND_WQE_DS);
> +
> +		if (unlikely(session->ds_count + ds_cnt > session->max_ds_count)) {
> +			/* Not enough space for inline wqe, send with memory pointer */
> +			session->complete = true;
> +			goto no_inline;
> +		}
> +
> +		inline_dseg->byte_count = cpu_to_be32(dma_len | MLX5_INLINE_SEG);
> +		memcpy(inline_dseg->data, xdpf->data, dma_len);
> +
> +		session->ds_count += ds_cnt;
> +		stats->inlnw++;
> +		return;
> +	}
> +
> +no_inline:
>  	dseg->addr       = cpu_to_be64(dma_addr);
>  	dseg->byte_count = cpu_to_be32(dma_len);
>  	dseg->lkey       = sq->mkey_be;
> +	session->ds_count++;
>  }


* Re: [net-next 01/14] net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES
  2019-04-23  2:46   ` Jakub Kicinski
@ 2019-04-23 13:23     ` Jesper Dangaard Brouer
  2019-04-23 15:21       ` Alexander Duyck
  0 siblings, 1 reply; 25+ messages in thread
From: Jesper Dangaard Brouer @ 2019-04-23 13:23 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Saeed Mahameed, Tariq Toukan, David S. Miller, netdev,
	Jonathan Lemon, brouer, Alexander Duyck

On Mon, 22 Apr 2019 19:46:47 -0700
Jakub Kicinski <jakub.kicinski@netronome.com> wrote:

> On Mon, 22 Apr 2019 15:32:53 -0700, Saeed Mahameed wrote:
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > index 51e109fdeec1..6147be23a9b9 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > @@ -50,6 +50,7 @@
> >  #include <net/xdp.h>
> >  #include <linux/net_dim.h>
> >  #include <linux/bits.h>
> > +#include <linux/prefetch.h>
> >  #include "wq.h"
> >  #include "mlx5_core.h"
> >  #include "en_stats.h"
> > @@ -986,6 +987,22 @@ static inline void mlx5e_cq_arm(struct mlx5e_cq *cq)
> >  	mlx5_cq_arm(mcq, MLX5_CQ_DB_REQ_NOT, mcq->uar->map, cq->wq.cc);
> >  }
> >  
> > +static inline void mlx5e_prefetch(void *p)
> > +{
> > +	prefetch(p);
> > +#if L1_CACHE_BYTES < 128
> > +	prefetch(p + L1_CACHE_BYTES);
> > +#endif
> > +}
> > +
> > +static inline void mlx5e_prefetchw(void *p)
> > +{
> > +	prefetchw(p);
> > +#if L1_CACHE_BYTES < 128
> > +	prefetchw(p + L1_CACHE_BYTES);
> > +#endif
> > +}  
> 
> All Intel drivers do the exact same thing, perhaps it's time to add a
> helper fot this?
> 
> net_prefetch_headers()
> 
> or some such?

I wonder if Tariq measured any effect from doing this?

Because Intel CPUs will usually already prefetch the next cache-line,
as described in [1]; you can even read (and modify) this via MSR 0x1A4,
e.g. with the tools in [2].  Maybe the Intel guys added it before this
was done in HW, and never cleaned it up?

[1] https://software.intel.com/en-us/articles/disclosure-of-hw-prefetcher-control-on-some-intel-processors

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

[2] http://www.kernel.org/pub/linux/utils/cpu/msr-tools/


* Re: [net-next 01/14] net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES
  2019-04-23 13:23     ` Jesper Dangaard Brouer
@ 2019-04-23 15:21       ` Alexander Duyck
  2019-04-23 16:42         ` Saeed Mahameed
  0 siblings, 1 reply; 25+ messages in thread
From: Alexander Duyck @ 2019-04-23 15:21 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Jakub Kicinski, Saeed Mahameed, Tariq Toukan, David S. Miller,
	Netdev, Jonathan Lemon

On Tue, Apr 23, 2019 at 6:23 AM Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
>
> On Mon, 22 Apr 2019 19:46:47 -0700
> Jakub Kicinski <jakub.kicinski@netronome.com> wrote:
>
> > On Mon, 22 Apr 2019 15:32:53 -0700, Saeed Mahameed wrote:
> > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > index 51e109fdeec1..6147be23a9b9 100644
> > > --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > @@ -50,6 +50,7 @@
> > >  #include <net/xdp.h>
> > >  #include <linux/net_dim.h>
> > >  #include <linux/bits.h>
> > > +#include <linux/prefetch.h>
> > >  #include "wq.h"
> > >  #include "mlx5_core.h"
> > >  #include "en_stats.h"
> > > @@ -986,6 +987,22 @@ static inline void mlx5e_cq_arm(struct mlx5e_cq *cq)
> > >     mlx5_cq_arm(mcq, MLX5_CQ_DB_REQ_NOT, mcq->uar->map, cq->wq.cc);
> > >  }
> > >
> > > +static inline void mlx5e_prefetch(void *p)
> > > +{
> > > +   prefetch(p);
> > > +#if L1_CACHE_BYTES < 128
> > > +   prefetch(p + L1_CACHE_BYTES);
> > > +#endif
> > > +}
> > > +
> > > +static inline void mlx5e_prefetchw(void *p)
> > > +{
> > > +   prefetchw(p);
> > > +#if L1_CACHE_BYTES < 128
> > > +   prefetchw(p + L1_CACHE_BYTES);
> > > +#endif
> > > +}
> >
> > All Intel drivers do the exact same thing, perhaps it's time to add a
> > helper fot this?
> >
> > net_prefetch_headers()
> >
> > or some such?
>
> I wonder if Tariq measured any effect from doing this?
>
> Because Intel CPUs will usually already prefetch the next cache-line,
> as described in [1], you can even read (and modify) this MSR 0x1A4
> e.g. via tools in [2].  Maybe Intel guys added it before this was done
> in HW, and never cleaned it up?
>
> [1] https://software.intel.com/en-us/articles/disclosure-of-hw-prefetcher-control-on-some-intel-processors

The issue is that the adjacent cache line prefetcher can be on or off,
and a network driver shouldn't really be going through and twiddling
those sorts of bits. In some cases having it on can result in more
memory being consumed than is needed. The reason why I enabled the
additional cacheline prefetch for the Intel NICs is that most TCP
packets are at a minimum 68 bytes for just the headers, so there was an
advantage for TCP traffic in making certain we prefetched at least
enough to process the headers.

As far as Jakub comment about combining the functions I would be okay
with that. We just need to make it a static inline function available
to all the network drivers.

Thanks.

- Alex


* Re: [net-next 01/14] net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES
  2019-04-23 15:21       ` Alexander Duyck
@ 2019-04-23 16:42         ` Saeed Mahameed
  2019-04-23 17:27           ` Alexander Duyck
  0 siblings, 1 reply; 25+ messages in thread
From: Saeed Mahameed @ 2019-04-23 16:42 UTC (permalink / raw)
  To: brouer, alexander.duyck; +Cc: davem, netdev, jakub.kicinski, Tariq Toukan, bsd

On Tue, 2019-04-23 at 08:21 -0700, Alexander Duyck wrote:
> On Tue, Apr 23, 2019 at 6:23 AM Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
> > On Mon, 22 Apr 2019 19:46:47 -0700
> > Jakub Kicinski <jakub.kicinski@netronome.com> wrote:
> > 
> > > On Mon, 22 Apr 2019 15:32:53 -0700, Saeed Mahameed wrote:
> > > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > > b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > > index 51e109fdeec1..6147be23a9b9 100644
> > > > --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > > @@ -50,6 +50,7 @@
> > > >  #include <net/xdp.h>
> > > >  #include <linux/net_dim.h>
> > > >  #include <linux/bits.h>
> > > > +#include <linux/prefetch.h>
> > > >  #include "wq.h"
> > > >  #include "mlx5_core.h"
> > > >  #include "en_stats.h"
> > > > @@ -986,6 +987,22 @@ static inline void mlx5e_cq_arm(struct
> > > > mlx5e_cq *cq)
> > > >     mlx5_cq_arm(mcq, MLX5_CQ_DB_REQ_NOT, mcq->uar->map, cq-
> > > > >wq.cc);
> > > >  }
> > > > 
> > > > +static inline void mlx5e_prefetch(void *p)
> > > > +{
> > > > +   prefetch(p);
> > > > +#if L1_CACHE_BYTES < 128
> > > > +   prefetch(p + L1_CACHE_BYTES);
> > > > +#endif
> > > > +}
> > > > +
> > > > +static inline void mlx5e_prefetchw(void *p)
> > > > +{
> > > > +   prefetchw(p);
> > > > +#if L1_CACHE_BYTES < 128
> > > > +   prefetchw(p + L1_CACHE_BYTES);
> > > > +#endif
> > > > +}
> > > 
> > > All Intel drivers do the exact same thing, perhaps it's time to
> > > add a
> > > helper fot this?
> > > 
> > > net_prefetch_headers()
> > > 
> > > or some such?
> > 
> > I wonder if Tariq measured any effect from doing this?
> > 
> > Because Intel CPUs will usually already prefetch the next cache-
> > line,
> > as described in [1], you can even read (and modify) this MSR 0x1A4
> > e.g. via tools in [2].  Maybe Intel guys added it before this was
> > done
> > in HW, and never cleaned it up?
> > 
> > [1] 
> > https://software.intel.com/en-us/articles/disclosure-of-hw-prefetcher-control-on-some-intel-processors
> 
> The issue is the adjacent cache line prefetcher can be on or off and
> a
> network driver shouldn't really be going through and twiddling those
> sort of bits. In some cases having it on can result in more memory
> being consumed then is needed. The reason why I enabled the
> additional
> cacheline prefetch for the Intel NICs is because most TCP packets are
> at a minimum 68 bytes for just the headers so there was an advantage
> for TCP traffic to make certain we prefetched at least enough for us
> to process the headers.
> 

So if the L2 adjacent cache line prefetcher bit is enabled, is this
additional prefetch step redundant? And what is the performance cost in
that case?

> As far as Jakub comment about combining the functions I would be okay
> with that. We just need to make it a static inline function available
> to all the network drivers.
> 

Agreed, will drop this patch for now; Tariq will address it in the next
version.

> Thanks.
> 
> - Alex


* Re: [net-next 06/14] net/mlx5e: XDP, Inline small packets into the TX MPWQE in XDP xmit flow
  2019-04-23 12:53   ` Gal Pressman
@ 2019-04-23 16:46     ` Saeed Mahameed
  0 siblings, 0 replies; 25+ messages in thread
From: Saeed Mahameed @ 2019-04-23 16:46 UTC (permalink / raw)
  To: davem, galpress; +Cc: netdev, Shay Agroskin, Tariq Toukan, brouer, bsd

On Tue, 2019-04-23 at 15:53 +0300, Gal Pressman wrote:
> On 23-Apr-19 01:32, Saeed Mahameed wrote:
> >  static inline void
> > -mlx5e_xdp_mpwqe_add_dseg(struct mlx5e_xdpsq *sq, dma_addr_t
> > dma_addr, u16 dma_len)
> > +mlx5e_xdp_mpwqe_add_dseg(struct mlx5e_xdpsq *sq, struct
> > mlx5e_xdp_info *xdpi,
> > +			 struct mlx5e_xdpsq_stats *stats)
> 
> Passing stats as a function parameter is weird, why not remove and
> use sq->stats?
> 

This is the core function that puts the packet pointer into the hw ring
buffer. It is very performance critical, and since sq->stats is now a
pointer, we want to avoid the double pointer dereference
(sq->stats->tx_packets); we read sq->stats once and pass it all the way
down to the lower level functions.



* Re: [net-next 01/14] net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES
  2019-04-23 16:42         ` Saeed Mahameed
@ 2019-04-23 17:27           ` Alexander Duyck
  2019-04-23 18:24             ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 25+ messages in thread
From: Alexander Duyck @ 2019-04-23 17:27 UTC (permalink / raw)
  To: Saeed Mahameed; +Cc: brouer, davem, netdev, jakub.kicinski, Tariq Toukan, bsd

On Tue, Apr 23, 2019 at 9:42 AM Saeed Mahameed <saeedm@mellanox.com> wrote:
>
> On Tue, 2019-04-23 at 08:21 -0700, Alexander Duyck wrote:
> > On Tue, Apr 23, 2019 at 6:23 AM Jesper Dangaard Brouer
> > <brouer@redhat.com> wrote:
> > > On Mon, 22 Apr 2019 19:46:47 -0700
> > > Jakub Kicinski <jakub.kicinski@netronome.com> wrote:
> > >
> > > > On Mon, 22 Apr 2019 15:32:53 -0700, Saeed Mahameed wrote:
> > > > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > > > b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > > > index 51e109fdeec1..6147be23a9b9 100644
> > > > > --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > > > @@ -50,6 +50,7 @@
> > > > >  #include <net/xdp.h>
> > > > >  #include <linux/net_dim.h>
> > > > >  #include <linux/bits.h>
> > > > > +#include <linux/prefetch.h>
> > > > >  #include "wq.h"
> > > > >  #include "mlx5_core.h"
> > > > >  #include "en_stats.h"
> > > > > @@ -986,6 +987,22 @@ static inline void mlx5e_cq_arm(struct
> > > > > mlx5e_cq *cq)
> > > > >     mlx5_cq_arm(mcq, MLX5_CQ_DB_REQ_NOT, mcq->uar->map, cq-
> > > > > >wq.cc);
> > > > >  }
> > > > >
> > > > > +static inline void mlx5e_prefetch(void *p)
> > > > > +{
> > > > > +   prefetch(p);
> > > > > +#if L1_CACHE_BYTES < 128
> > > > > +   prefetch(p + L1_CACHE_BYTES);
> > > > > +#endif
> > > > > +}
> > > > > +
> > > > > +static inline void mlx5e_prefetchw(void *p)
> > > > > +{
> > > > > +   prefetchw(p);
> > > > > +#if L1_CACHE_BYTES < 128
> > > > > +   prefetchw(p + L1_CACHE_BYTES);
> > > > > +#endif
> > > > > +}
> > > >
> > > > All Intel drivers do the exact same thing, perhaps it's time to
> > > > add a
> > > > helper fot this?
> > > >
> > > > net_prefetch_headers()
> > > >
> > > > or some such?
> > >
> > > I wonder if Tariq measured any effect from doing this?
> > >
> > > Because Intel CPUs will usually already prefetch the next cache-
> > > line,
> > > as described in [1], you can even read (and modify) this MSR 0x1A4
> > > e.g. via tools in [2].  Maybe Intel guys added it before this was
> > > done
> > > in HW, and never cleaned it up?
> > >
> > > [1]
> > > https://software.intel.com/en-us/articles/disclosure-of-hw-prefetcher-control-on-some-intel-processors
> >
> > The issue is the adjacent cache line prefetcher can be on or off and
> > a
> > network driver shouldn't really be going through and twiddling those
> > sort of bits. In some cases having it on can result in more memory
> > being consumed then is needed. The reason why I enabled the
> > additional
> > cacheline prefetch for the Intel NICs is because most TCP packets are
> > at a minimum 68 bytes for just the headers so there was an advantage
> > for TCP traffic to make certain we prefetched at least enough for us
> > to process the headers.
> >
>
> So if L2 adjacent cache line prefetcher bit is enabled then this
> additional prefetch step is redundant ? what is the performance cost in
> this case ?

I don't recall. I don't think it would be anything too significant though.

> > As far as Jakub comment about combining the functions I would be okay
> > with that. We just need to make it a static inline function available
> > to all the network drivers.
> >
>
> Agreed, will drop this patch for now and Tariq will address, in next
> version.
>
> > Thanks.
> >
> > - Alex

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [net-next 01/14] net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES
  2019-04-23 17:27           ` Alexander Duyck
@ 2019-04-23 18:24             ` Jesper Dangaard Brouer
  2019-04-23 18:46               ` Saeed Mahameed
  2019-04-23 20:12               ` Alexander Duyck
  0 siblings, 2 replies; 25+ messages in thread
From: Jesper Dangaard Brouer @ 2019-04-23 18:24 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Saeed Mahameed, davem, netdev, jakub.kicinski, Tariq Toukan, bsd, brouer

On Tue, 23 Apr 2019 10:27:32 -0700
Alexander Duyck <alexander.duyck@gmail.com> wrote:

> On Tue, Apr 23, 2019 at 9:42 AM Saeed Mahameed <saeedm@mellanox.com> wrote:
> >
> > On Tue, 2019-04-23 at 08:21 -0700, Alexander Duyck wrote:  
> > > On Tue, Apr 23, 2019 at 6:23 AM Jesper Dangaard Brouer
> > > <brouer@redhat.com> wrote:  
> > > > On Mon, 22 Apr 2019 19:46:47 -0700
> > > > Jakub Kicinski <jakub.kicinski@netronome.com> wrote:
> > > >  
> > > > > On Mon, 22 Apr 2019 15:32:53 -0700, Saeed Mahameed wrote:  
> > > > > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > > > > b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > > > > index 51e109fdeec1..6147be23a9b9 100644
> > > > > > --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > > > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > > > > @@ -50,6 +50,7 @@
> > > > > >  #include <net/xdp.h>
> > > > > >  #include <linux/net_dim.h>
> > > > > >  #include <linux/bits.h>
> > > > > > +#include <linux/prefetch.h>
> > > > > >  #include "wq.h"
> > > > > >  #include "mlx5_core.h"
> > > > > >  #include "en_stats.h"
> > > > > > @@ -986,6 +987,22 @@ static inline void mlx5e_cq_arm(struct
> > > > > > mlx5e_cq *cq)
> > > > > >     mlx5_cq_arm(mcq, MLX5_CQ_DB_REQ_NOT, mcq->uar->map, cq-  
> > > > > > >wq.cc);  
> > > > > >  }
> > > > > >
> > > > > > +static inline void mlx5e_prefetch(void *p)
> > > > > > +{
> > > > > > +   prefetch(p);
> > > > > > +#if L1_CACHE_BYTES < 128
> > > > > > +   prefetch(p + L1_CACHE_BYTES);
> > > > > > +#endif
> > > > > > +}
> > > > > > +
> > > > > > +static inline void mlx5e_prefetchw(void *p)
> > > > > > +{
> > > > > > +   prefetchw(p);
> > > > > > +#if L1_CACHE_BYTES < 128
> > > > > > +   prefetchw(p + L1_CACHE_BYTES);
> > > > > > +#endif
> > > > > > +}  
> > > > >
> > > > > All Intel drivers do the exact same thing, perhaps it's time to
> > > > > add a
> > > > > helper fot this?
> > > > >
> > > > > net_prefetch_headers()
> > > > >
> > > > > or some such?  
> > > >
> > > > I wonder if Tariq measured any effect from doing this?
> > > >
> > > > Because Intel CPUs will usually already prefetch the next cache-
> > > > line,
> > > > as described in [1], you can even read (and modify) this MSR 0x1A4
> > > > e.g. via tools in [2].  Maybe Intel guys added it before this was
> > > > done
> > > > in HW, and never cleaned it up?
> > > >
> > > > [1]
> > > > https://software.intel.com/en-us/articles/disclosure-of-hw-prefetcher-control-on-some-intel-processors  
> > >
> > > The issue is the adjacent cache line prefetcher can be on or off and
> > > a
> > > network driver shouldn't really be going through and twiddling those
> > > sort of bits. In some cases having it on can result in more memory
> > > being consumed then is needed. The reason why I enabled the
> > > additional
> > > cacheline prefetch for the Intel NICs is because most TCP packets are
> > > at a minimum 68 bytes for just the headers so there was an advantage
> > > for TCP traffic to make certain we prefetched at least enough for us
> > > to process the headers.
> > >  
> >
> > So if L2 adjacent cache line prefetcher bit is enabled then this

Nitpick: it is the DCU prefetcher bit that "Fetches the next cache line
into L1-D cache" in the link [1].

> > additional prefetch step is redundant ? what is the performance cost in
> > this case ?  
> 
> I don't recall. I don't think it would be anything too significant though.

I tried to measure this (approx. 1 year ago) with a prefetch that is not
needed, and AFAICR the overhead was below 1 nanosec, approx. 0.333 ns
(but anyone claiming to measure variations below 2 ns accurately should
be questioned...)

> > > As far as Jakub comment about combining the functions I would be okay
> > > with that. We just need to make it a static inline function available
> > > to all the network drivers.
> > >  
> >
> > Agreed, will drop this patch for now and Tariq will address, in next
> > version.

I don't mind the patch, and Alex provided a good argument why it still
makes sense.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [net-next 01/14] net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES
  2019-04-23 18:24             ` Jesper Dangaard Brouer
@ 2019-04-23 18:46               ` Saeed Mahameed
  2019-04-23 20:12               ` Alexander Duyck
  1 sibling, 0 replies; 25+ messages in thread
From: Saeed Mahameed @ 2019-04-23 18:46 UTC (permalink / raw)
  To: brouer, alexander.duyck; +Cc: davem, netdev, jakub.kicinski, Tariq Toukan, bsd

On Tue, 2019-04-23 at 20:24 +0200, Jesper Dangaard Brouer wrote:
> On Tue, 23 Apr 2019 10:27:32 -0700
> Alexander Duyck <alexander.duyck@gmail.com> wrote:
> 
> > On Tue, Apr 23, 2019 at 9:42 AM Saeed Mahameed <saeedm@mellanox.com
> > > wrote:
> > > On Tue, 2019-04-23 at 08:21 -0700, Alexander Duyck wrote:  
> > > > On Tue, Apr 23, 2019 at 6:23 AM Jesper Dangaard Brouer
> > > > <brouer@redhat.com> wrote:  
> > > > > On Mon, 22 Apr 2019 19:46:47 -0700
> > > > > Jakub Kicinski <jakub.kicinski@netronome.com> wrote:
> > > > >  
> > > > > > On Mon, 22 Apr 2019 15:32:53 -0700, Saeed Mahameed wrote:  
> > > > > > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > > > > > b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > > > > > index 51e109fdeec1..6147be23a9b9 100644
> > > > > > > --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > > > > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > > > > > @@ -50,6 +50,7 @@
> > > > > > >  #include <net/xdp.h>
> > > > > > >  #include <linux/net_dim.h>
> > > > > > >  #include <linux/bits.h>
> > > > > > > +#include <linux/prefetch.h>
> > > > > > >  #include "wq.h"
> > > > > > >  #include "mlx5_core.h"
> > > > > > >  #include "en_stats.h"
> > > > > > > @@ -986,6 +987,22 @@ static inline void
> > > > > > > mlx5e_cq_arm(struct
> > > > > > > mlx5e_cq *cq)
> > > > > > >     mlx5_cq_arm(mcq, MLX5_CQ_DB_REQ_NOT, mcq->uar->map,
> > > > > > > cq-  
> > > > > > > > wq.cc);  
> > > > > > >  }
> > > > > > > 
> > > > > > > +static inline void mlx5e_prefetch(void *p)
> > > > > > > +{
> > > > > > > +   prefetch(p);
> > > > > > > +#if L1_CACHE_BYTES < 128
> > > > > > > +   prefetch(p + L1_CACHE_BYTES);
> > > > > > > +#endif
> > > > > > > +}
> > > > > > > +
> > > > > > > +static inline void mlx5e_prefetchw(void *p)
> > > > > > > +{
> > > > > > > +   prefetchw(p);
> > > > > > > +#if L1_CACHE_BYTES < 128
> > > > > > > +   prefetchw(p + L1_CACHE_BYTES);
> > > > > > > +#endif
> > > > > > > +}  
> > > > > > 
> > > > > > All Intel drivers do the exact same thing, perhaps it's
> > > > > > time to
> > > > > > add a
> > > > > > helper fot this?
> > > > > > 
> > > > > > net_prefetch_headers()
> > > > > > 
> > > > > > or some such?  
> > > > > 
> > > > > I wonder if Tariq measured any effect from doing this?
> > > > > 
> > > > > Because Intel CPUs will usually already prefetch the next
> > > > > cache-
> > > > > line,
> > > > > as described in [1], you can even read (and modify) this MSR
> > > > > 0x1A4
> > > > > e.g. via tools in [2].  Maybe Intel guys added it before this
> > > > > was
> > > > > done
> > > > > in HW, and never cleaned it up?
> > > > > 
> > > > > [1]
> > > > > https://software.intel.com/en-us/articles/disclosure-of-hw-prefetcher-control-on-some-intel-processors  
> > > > 
> > > > The issue is the adjacent cache line prefetcher can be on or
> > > > off and
> > > > a
> > > > network driver shouldn't really be going through and twiddling
> > > > those
> > > > sort of bits. In some cases having it on can result in more
> > > > memory
> > > > being consumed then is needed. The reason why I enabled the
> > > > additional
> > > > cacheline prefetch for the Intel NICs is because most TCP
> > > > packets are
> > > > at a minimum 68 bytes for just the headers so there was an
> > > > advantage
> > > > for TCP traffic to make certain we prefetched at least enough
> > > > for us
> > > > to process the headers.
> > > >  
> > > 
> > > So if L2 adjacent cache line prefetcher bit is enabled then this
> 
> Nitpick: is it the DCU prefetcher bit that "Fetches the next cache
> line
> into L1-D cache" in the link[1].
> 
> > > additional prefetch step is redundant ? what is the performance
> > > cost in
> > > this case ?  
> > 
> > I don't recall. I don't think it would be anything too significant
> > though.
> 
> I tried to measure this (approx 1 year ago), a prefetch that is not
> needed, and AFAICR the overhead was below 1 nanosec, approx 0.333 ns.
> (but anyone claiming to be able to measure below 2 ns variation
> accuracy should be questioned...)
> 
> > > > As far as Jakub comment about combining the functions I would
> > > > be okay
> > > > with that. We just need to make it a static inline function
> > > > available
> > > > to all the network drivers.
> > > >  
> > > 
> > > Agreed, will drop this patch for now and Tariq will address, in
> > > next
> > > version.
> 
> I don't mind the patch, and Alex provided a good argument why is
> still
> makes sense.
> 

Sure, but it is better to have one shared static inline helper that is
used across all drivers, as Jakub and Alex suggested; one day it might
become arch/cacheline dependent, and all drivers would then benefit from
any change to it.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [net-next 01/14] net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES
  2019-04-23 18:24             ` Jesper Dangaard Brouer
  2019-04-23 18:46               ` Saeed Mahameed
@ 2019-04-23 20:12               ` Alexander Duyck
  1 sibling, 0 replies; 25+ messages in thread
From: Alexander Duyck @ 2019-04-23 20:12 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Saeed Mahameed, davem, netdev, jakub.kicinski, Tariq Toukan, bsd

On Tue, Apr 23, 2019 at 11:25 AM Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
>
> On Tue, 23 Apr 2019 10:27:32 -0700
> Alexander Duyck <alexander.duyck@gmail.com> wrote:
>
> > On Tue, Apr 23, 2019 at 9:42 AM Saeed Mahameed <saeedm@mellanox.com> wrote:
> > >
> > > On Tue, 2019-04-23 at 08:21 -0700, Alexander Duyck wrote:
> > > > On Tue, Apr 23, 2019 at 6:23 AM Jesper Dangaard Brouer
> > > > <brouer@redhat.com> wrote:
> > > > > On Mon, 22 Apr 2019 19:46:47 -0700
> > > > > Jakub Kicinski <jakub.kicinski@netronome.com> wrote:
> > > > >
> > > > > > On Mon, 22 Apr 2019 15:32:53 -0700, Saeed Mahameed wrote:
> > > > > > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > > > > > b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > > > > > index 51e109fdeec1..6147be23a9b9 100644
> > > > > > > --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > > > > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > > > > > > @@ -50,6 +50,7 @@
> > > > > > >  #include <net/xdp.h>
> > > > > > >  #include <linux/net_dim.h>
> > > > > > >  #include <linux/bits.h>
> > > > > > > +#include <linux/prefetch.h>
> > > > > > >  #include "wq.h"
> > > > > > >  #include "mlx5_core.h"
> > > > > > >  #include "en_stats.h"
> > > > > > > @@ -986,6 +987,22 @@ static inline void mlx5e_cq_arm(struct
> > > > > > > mlx5e_cq *cq)
> > > > > > >     mlx5_cq_arm(mcq, MLX5_CQ_DB_REQ_NOT, mcq->uar->map, cq-
> > > > > > > >wq.cc);
> > > > > > >  }
> > > > > > >
> > > > > > > +static inline void mlx5e_prefetch(void *p)
> > > > > > > +{
> > > > > > > +   prefetch(p);
> > > > > > > +#if L1_CACHE_BYTES < 128
> > > > > > > +   prefetch(p + L1_CACHE_BYTES);
> > > > > > > +#endif
> > > > > > > +}
> > > > > > > +
> > > > > > > +static inline void mlx5e_prefetchw(void *p)
> > > > > > > +{
> > > > > > > +   prefetchw(p);
> > > > > > > +#if L1_CACHE_BYTES < 128
> > > > > > > +   prefetchw(p + L1_CACHE_BYTES);
> > > > > > > +#endif
> > > > > > > +}
> > > > > >
> > > > > > All Intel drivers do the exact same thing, perhaps it's time to
> > > > > > add a
> > > > > > helper fot this?
> > > > > >
> > > > > > net_prefetch_headers()
> > > > > >
> > > > > > or some such?
> > > > >
> > > > > I wonder if Tariq measured any effect from doing this?
> > > > >
> > > > > Because Intel CPUs will usually already prefetch the next cache-
> > > > > line,
> > > > > as described in [1], you can even read (and modify) this MSR 0x1A4
> > > > > e.g. via tools in [2].  Maybe Intel guys added it before this was
> > > > > done
> > > > > in HW, and never cleaned it up?
> > > > >
> > > > > [1]
> > > > > https://software.intel.com/en-us/articles/disclosure-of-hw-prefetcher-control-on-some-intel-processors
> > > >
> > > > The issue is the adjacent cache line prefetcher can be on or off and
> > > > a
> > > > network driver shouldn't really be going through and twiddling those
> > > > sort of bits. In some cases having it on can result in more memory
> > > > being consumed then is needed. The reason why I enabled the
> > > > additional
> > > > cacheline prefetch for the Intel NICs is because most TCP packets are
> > > > at a minimum 68 bytes for just the headers so there was an advantage
> > > > for TCP traffic to make certain we prefetched at least enough for us
> > > > to process the headers.
> > > >
> > >
> > > So if L2 adjacent cache line prefetcher bit is enabled then this
>
> Nitpick: is it the DCU prefetcher bit that "Fetches the next cache line
> into L1-D cache" in the link[1].

I didn't think the DCU prefetcher kicked in unless you were
sequentially accessing data in a recently fetched cache line. So it
doesn't apply in this case. The definition of it from the x86 software
optimization manual says as much:

Data cache unit (DCU) prefetcher. This prefetcher, also known as the
streaming prefetcher, is
triggered by an ascending access to very recently loaded data. The
processor assumes that this
access is part of a streaming algorithm and automatically fetches the next line.

So basically unless we are doing sequential access the DCU prefetcher
won't kick in, and we aren't really sequential since we are accessing
headers so things get a bit out of order. Based on all this I would
guess the 2 prefetches are needed for x86 if you want a full TCP or
IPv6 header pulled into the L1 cache for instance.

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2019-04-23 20:13 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-22 22:32 [pull request][net-next 00/14] Mellanox, mlx5 RX and XDP improvements Saeed Mahameed
2019-04-22 22:32 ` [net-next 01/14] net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES Saeed Mahameed
2019-04-23  2:46   ` Jakub Kicinski
2019-04-23 13:23     ` Jesper Dangaard Brouer
2019-04-23 15:21       ` Alexander Duyck
2019-04-23 16:42         ` Saeed Mahameed
2019-04-23 17:27           ` Alexander Duyck
2019-04-23 18:24             ` Jesper Dangaard Brouer
2019-04-23 18:46               ` Saeed Mahameed
2019-04-23 20:12               ` Alexander Duyck
2019-04-22 22:32 ` [net-next 02/14] net/mlx5e: RX, Support multiple outstanding UMR posts Saeed Mahameed
2019-04-22 22:32 ` [net-next 03/14] net/mlx5e: XDP, Fix shifted flag index in RQ bitmap Saeed Mahameed
2019-04-22 22:32 ` [net-next 04/14] net/mlx5e: XDP, Enhance RQ indication for XDP redirect flush Saeed Mahameed
2019-04-22 22:32 ` [net-next 05/14] net/mlx5e: XDP, Add TX MPWQE session counter Saeed Mahameed
2019-04-22 22:32 ` [net-next 06/14] net/mlx5e: XDP, Inline small packets into the TX MPWQE in XDP xmit flow Saeed Mahameed
2019-04-23 12:53   ` Gal Pressman
2019-04-23 16:46     ` Saeed Mahameed
2019-04-22 22:32 ` [net-next 07/14] net/mlx5e: Remove unused parameter Saeed Mahameed
2019-04-22 22:33 ` [net-next 08/14] net/mlx5e: Report mlx5e_xdp_set errors Saeed Mahameed
2019-04-22 22:33 ` [net-next 09/14] net/mlx5e: Move parameter calculation functions to en/params.c Saeed Mahameed
2019-04-22 22:33 ` [net-next 10/14] net/mlx5e: Add an underflow warning comment Saeed Mahameed
2019-04-22 22:33 ` [net-next 11/14] net/mlx5e: Remove unused parameter Saeed Mahameed
2019-04-22 22:33 ` [net-next 12/14] net/mlx5e: Take HW interrupt trigger into a function Saeed Mahameed
2019-04-22 22:33 ` [net-next 13/14] net/mlx5e: Remove unused rx_page_reuse stat Saeed Mahameed
2019-04-22 22:33 ` [net-next 14/14] net/mlx5e: Use #define for the WQE wait timeout constant Saeed Mahameed
