* [pull request][net-next 00/15] mlx5 updates 2022-03-17
@ 2022-03-17 18:54 Saeed Mahameed
  2022-03-17 18:54 ` [net-next 01/15] net/mlx5e: Validate MTU when building non-linear legacy RQ fragments info Saeed Mahameed
                   ` (14 more replies)
  0 siblings, 15 replies; 17+ messages in thread
From: Saeed Mahameed @ 2022-03-17 18:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski; +Cc: netdev, Saeed Mahameed

From: Saeed Mahameed <saeedm@nvidia.com>

Hi Dave, Hi Jakub,

This series adds some updates to the mlx5 driver:
 1) Preparation for XDP multi-buffer support
 2) Memory consumption reduction in the SW steering component of the driver

For more information, please see the tag log below.

Please pull and let me know if there is any problem.

Thanks,
Saeed.


The following changes since commit 1abea24af42c35c6eb537e4402836e2cde2a5b13:

  selftests: net: fix array_size.cocci warning (2022-03-17 15:21:16 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-updates-2022-03-17

for you to fetch changes up to 770c9a3a01af178a90368a78c75eb91707c7233c:

  net/mlx5: Remove unused fill page array API function (2022-03-17 11:51:58 -0700)

----------------------------------------------------------------
mlx5-updates-2022-03-17

1) From Maxim Mikityanskiy,
   Datapath improvements in preparation for XDP multi buffer

   This series contains general improvements for the datapath that are
   useful for the upcoming XDP multi buffer support:

   a. Non-linear legacy RQ: validate the MTU for robustness, build the
      linear part of the SKB over the first hardware fragment (instead
      of copying the packet headers), and adjust headroom calculations
      to allow enabling headroom in the non-linear mode (useful for XDP
      multi buffer).

   b. XDP: test for the XDP program before the function call, and
      optimize the parameters of mlx5e_xdp_handle.

2) From Rongwei Liu, DR, reduce steering memory usage
   Currently, the mlx5 driver uses mlx5_htbl/chunk/ste structures to
   organize its steering logic. However, their layout wastes some memory.

   This update reduces the steering memory footprint by:
   a. Adjusting the struct member layout.
   b. Removing duplicated fields and deriving their values with simple
      helper functions instead.

   With 500k TX rules (3 STEs each) plus 500k RX rules (6 STEs each),
   these patches save around 17% of the memory.

3) Three cleanup commits at the end of this series.

----------------------------------------------------------------
Maxim Mikityanskiy (5):
      net/mlx5e: Validate MTU when building non-linear legacy RQ fragments info
      net/mlx5e: Add headroom only to the first fragment in legacy RQ
      net/mlx5e: Build SKB in place over the first fragment in non-linear legacy RQ
      net/mlx5e: Drop the len output parameter from mlx5e_xdp_handle
      net/mlx5e: Drop cqe_bcnt32 from mlx5e_skb_from_cqe_mpwrq_linear

Paul Blakey (1):
      net/mlx5: CT: Remove extra rhashtable remove on tuple entries

Rongwei Liu (6):
      net/mlx5: DR, Adjust structure member to reduce memory hole
      net/mlx5: DR, Remove mr_addr rkey from struct mlx5dr_icm_chunk
      net/mlx5: DR, Remove icm_addr from mlx5dr_icm_chunk to reduce memory
      net/mlx5: DR, Remove num_of_entries byte_size from struct mlx5_dr_icm_chunk
      net/mlx5: DR, Remove 4 members from mlx5dr_ste_htbl to reduce memory
      net/mlx5: DR, Remove hw_ste from mlx5dr_ste to reduce memory

Tariq Toukan (3):
      net/mlx5e: RX, Test the XDP program existence out of the handler
      net/mlx5: Remove unused exported contiguous coherent buffer allocation API
      net/mlx5: Remove unused fill page array API function

 drivers/net/ethernet/mellanox/mlx5/core/alloc.c    |  60 -----------
 .../net/ethernet/mellanox/mlx5/core/en/params.c    |  69 ++++++++++---
 drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c |   1 -
 drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c   |   7 +-
 drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h   |   2 +-
 .../net/ethernet/mellanox/mlx5/core/en/xsk/rx.c    |  16 +--
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    | 111 ++++++++++++---------
 .../mellanox/mlx5/core/steering/dr_action.c        |  12 ++-
 .../ethernet/mellanox/mlx5/core/steering/dr_dbg.c  |  14 ++-
 .../mellanox/mlx5/core/steering/dr_icm_pool.c      |  57 ++++++++---
 .../mellanox/mlx5/core/steering/dr_matcher.c       |  18 ++--
 .../ethernet/mellanox/mlx5/core/steering/dr_rule.c |  71 +++++++------
 .../ethernet/mellanox/mlx5/core/steering/dr_send.c |  34 ++++---
 .../ethernet/mellanox/mlx5/core/steering/dr_ste.c  | 105 ++++++++++---------
 .../mellanox/mlx5/core/steering/dr_table.c         |  18 ++--
 .../mellanox/mlx5/core/steering/dr_types.h         |  31 +++---
 include/linux/mlx5/driver.h                        |   4 -
 17 files changed, 338 insertions(+), 292 deletions(-)


* [net-next 01/15] net/mlx5e: Validate MTU when building non-linear legacy RQ fragments info
  2022-03-17 18:54 [pull request][net-next 00/15] mlx5 updates 2022-03-17 Saeed Mahameed
@ 2022-03-17 18:54 ` Saeed Mahameed
  2022-03-18 11:00   ` patchwork-bot+netdevbpf
  2022-03-17 18:54 ` [net-next 02/15] net/mlx5e: Add headroom only to the first fragment in legacy RQ Saeed Mahameed
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 17+ messages in thread
From: Saeed Mahameed @ 2022-03-17 18:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@nvidia.com>

mlx5e_build_rq_frags_info() assumes that MTU is not bigger than
PAGE_SIZE * MLX5E_MAX_RX_FRAGS, which is 16K for 4K pages. Currently,
the firmware limits MTU to 10K, so the assumption doesn't lead to a bug.

This commit adds an additional driver check for reliability, since the
firmware boundary might change.

The calculation is moved to a separate function with a comment
explaining it. This prepares for the following patches that
introduce XDP multi buffer support.
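
For illustration (assuming 4K pages, MLX5E_MAX_RX_FRAGS == 4 and the
default 2048-byte fragment size; a sketch, not part of the patch):

	mlx5e_max_nonlinear_mtu(2048);      /* 3 * 2048 + 4096 = 10240 */
	mlx5e_max_nonlinear_mtu(PAGE_SIZE); /* 3 * 4096 + 4096 = 16384 */

An MTU above the first bound falls back to page-sized fragments; an MTU
above the second bound is now rejected with -EINVAL instead of relying
on the firmware never to allow it.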

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../ethernet/mellanox/mlx5/core/en/params.c   | 34 +++++++++++++++----
 1 file changed, 27 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
index 0bd8698f7226..0f258e7a65e0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
@@ -392,16 +392,23 @@ void mlx5e_build_create_cq_param(struct mlx5e_create_cq_param *ccp, struct mlx5e
 	};
 }
 
+static int mlx5e_max_nonlinear_mtu(int frag_size)
+{
+	/* Optimization for small packets: the last fragment is bigger than the others. */
+	return (MLX5E_MAX_RX_FRAGS - 1) * frag_size + PAGE_SIZE;
+}
+
 #define DEFAULT_FRAG_SIZE (2048)
 
-static void mlx5e_build_rq_frags_info(struct mlx5_core_dev *mdev,
-				      struct mlx5e_params *params,
-				      struct mlx5e_xsk_param *xsk,
-				      struct mlx5e_rq_frags_info *info)
+static int mlx5e_build_rq_frags_info(struct mlx5_core_dev *mdev,
+				     struct mlx5e_params *params,
+				     struct mlx5e_xsk_param *xsk,
+				     struct mlx5e_rq_frags_info *info)
 {
 	u32 byte_count = MLX5E_SW2HW_MTU(params, params->sw_mtu);
 	int frag_size_max = DEFAULT_FRAG_SIZE;
 	u32 buf_size = 0;
+	int max_mtu;
 	int i;
 
 	if (mlx5_fpga_is_ipsec_device(mdev))
@@ -420,10 +427,18 @@ static void mlx5e_build_rq_frags_info(struct mlx5_core_dev *mdev,
 		goto out;
 	}
 
-	if (byte_count > PAGE_SIZE +
-	    (MLX5E_MAX_RX_FRAGS - 1) * frag_size_max)
+	max_mtu = mlx5e_max_nonlinear_mtu(frag_size_max);
+	if (byte_count > max_mtu) {
 		frag_size_max = PAGE_SIZE;
 
+		max_mtu = mlx5e_max_nonlinear_mtu(frag_size_max);
+		if (byte_count > max_mtu) {
+			mlx5_core_err(mdev, "MTU %u is too big for non-linear legacy RQ (max %d)\n",
+				      params->sw_mtu, max_mtu);
+			return -EINVAL;
+		}
+	}
+
 	i = 0;
 	while (buf_size < byte_count) {
 		int frag_size = byte_count - buf_size;
@@ -444,6 +459,8 @@ static void mlx5e_build_rq_frags_info(struct mlx5_core_dev *mdev,
 out:
 	info->wqe_bulk = max_t(u8, info->wqe_bulk, 8);
 	info->log_num_frags = order_base_2(info->num_frags);
+
+	return 0;
 }
 
 static u8 mlx5e_get_rqwq_log_stride(u8 wq_type, int ndsegs)
@@ -540,6 +557,7 @@ int mlx5e_build_rq_param(struct mlx5_core_dev *mdev,
 	void *rqc = param->rqc;
 	void *wq = MLX5_ADDR_OF(rqc, rqc, wq);
 	int ndsegs = 1;
+	int err;
 
 	switch (params->rq_wq_type) {
 	case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ: {
@@ -579,7 +597,9 @@ int mlx5e_build_rq_param(struct mlx5_core_dev *mdev,
 	}
 	default: /* MLX5_WQ_TYPE_CYCLIC */
 		MLX5_SET(wq, wq, log_wq_sz, params->log_rq_mtu_frames);
-		mlx5e_build_rq_frags_info(mdev, params, xsk, &param->frags_info);
+		err = mlx5e_build_rq_frags_info(mdev, params, xsk, &param->frags_info);
+		if (err)
+			return err;
 		ndsegs = param->frags_info.num_frags;
 	}
 
-- 
2.35.1



* [net-next 02/15] net/mlx5e: Add headroom only to the first fragment in legacy RQ
  2022-03-17 18:54 [pull request][net-next 00/15] mlx5 updates 2022-03-17 Saeed Mahameed
  2022-03-17 18:54 ` [net-next 01/15] net/mlx5e: Validate MTU when building non-linear legacy RQ fragments info Saeed Mahameed
@ 2022-03-17 18:54 ` Saeed Mahameed
  2022-03-17 18:54 ` [net-next 03/15] net/mlx5e: Build SKB in place over the first fragment in non-linear " Saeed Mahameed
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2022-03-17 18:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@nvidia.com>

Currently, rq->buff.headroom is applied to all fragments in legacy RQ.
In the linear mode, there is a non-zero headroom, but there is only one
fragment per packet. In the non-linear mode, the headroom is zero.

This commit changes the logic to apply the headroom only to the first
fragment. The current behavior remains the same for both linear and
non-linear modes. However, it allows the next commit to enable headroom
for the non-linear mode, which will be applied only to the first
fragment.
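
In sketch form, the per-fragment address now becomes (condensed from
the hunk below):

	/* only fragment 0 carries the headroom */
	u16 headroom = (i == 0) ? rq->buff.headroom : 0;

	wqe->data[i].addr = cpu_to_be64(frag->di->addr + frag->offset + headroom);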

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 074a44b281b6..6eda906342c0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -373,12 +373,15 @@ static int mlx5e_alloc_rx_wqe(struct mlx5e_rq *rq, struct mlx5e_rx_wqe_cyc *wqe,
 	int i;
 
 	for (i = 0; i < rq->wqe.info.num_frags; i++, frag++) {
+		u16 headroom;
+
 		err = mlx5e_get_rx_frag(rq, frag);
 		if (unlikely(err))
 			goto free_frags;
 
+		headroom = i == 0 ? rq->buff.headroom : 0;
 		wqe->data[i].addr = cpu_to_be64(frag->di->addr +
-						frag->offset + rq->buff.headroom);
+						frag->offset + headroom);
 	}
 
 	return 0;
-- 
2.35.1



* [net-next 03/15] net/mlx5e: Build SKB in place over the first fragment in non-linear legacy RQ
  2022-03-17 18:54 [pull request][net-next 00/15] mlx5 updates 2022-03-17 Saeed Mahameed
  2022-03-17 18:54 ` [net-next 01/15] net/mlx5e: Validate MTU when building non-linear legacy RQ fragments info Saeed Mahameed
  2022-03-17 18:54 ` [net-next 02/15] net/mlx5e: Add headroom only to the first fragment in legacy RQ Saeed Mahameed
@ 2022-03-17 18:54 ` Saeed Mahameed
  2022-03-17 18:54 ` [net-next 04/15] net/mlx5e: RX, Test the XDP program existence out of the handler Saeed Mahameed
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2022-03-17 18:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@nvidia.com>

As a performance optimization, and in preparation for enabling XDP
multi buffer on non-linear legacy RQ, build the linear part of the SKB
over the first fragment instead of allocating a new buffer and copying
the first 256 bytes there.

To achieve this, add headroom and tailroom to the first fragment.
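
A sketch of the first fragment's layout after this change (sizes are
illustrative, not part of the patch):

	/*
	 * +----------+---------------------------+---------------------+
	 * | headroom | packet data               | skb_shared_info     |
	 * +----------+---------------------------+---------------------+
	 *
	 * first_frag_size_max = SKB_WITH_OVERHEAD(frag_size_max - headroom)
	 */

This lets mlx5e_build_linear_skb() map the SKB's linear part directly
onto the fragment, replacing napi_alloc_skb() plus a header copy.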

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../ethernet/mellanox/mlx5/core/en/params.c   | 43 ++++++++++++-----
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   | 48 ++++++++++---------
 2 files changed, 57 insertions(+), 34 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
index 0f258e7a65e0..5c4711be6fae 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
@@ -188,12 +188,18 @@ u16 mlx5e_get_rq_headroom(struct mlx5_core_dev *mdev,
 			  struct mlx5e_params *params,
 			  struct mlx5e_xsk_param *xsk)
 {
-	bool is_linear_skb = (params->rq_wq_type == MLX5_WQ_TYPE_CYCLIC) ?
-		mlx5e_rx_is_linear_skb(params, xsk) :
-		mlx5e_rx_mpwqe_is_linear_skb(mdev, params, xsk);
+	u16 linear_headroom = mlx5e_get_linear_rq_headroom(params, xsk);
 
-	return is_linear_skb || params->packet_merge.type == MLX5E_PACKET_MERGE_SHAMPO ?
-		mlx5e_get_linear_rq_headroom(params, xsk) : 0;
+	if (params->rq_wq_type == MLX5_WQ_TYPE_CYCLIC)
+		return linear_headroom;
+
+	if (mlx5e_rx_mpwqe_is_linear_skb(mdev, params, xsk))
+		return linear_headroom;
+
+	if (params->packet_merge.type == MLX5E_PACKET_MERGE_SHAMPO)
+		return linear_headroom;
+
+	return 0;
 }
 
 u16 mlx5e_calc_sq_stop_room(struct mlx5_core_dev *mdev, struct mlx5e_params *params)
@@ -392,10 +398,10 @@ void mlx5e_build_create_cq_param(struct mlx5e_create_cq_param *ccp, struct mlx5e
 	};
 }
 
-static int mlx5e_max_nonlinear_mtu(int frag_size)
+static int mlx5e_max_nonlinear_mtu(int first_frag_size, int frag_size)
 {
 	/* Optimization for small packets: the last fragment is bigger than the others. */
-	return (MLX5E_MAX_RX_FRAGS - 1) * frag_size + PAGE_SIZE;
+	return first_frag_size + (MLX5E_MAX_RX_FRAGS - 2) * frag_size + PAGE_SIZE;
 }
 
 #define DEFAULT_FRAG_SIZE (2048)
@@ -407,7 +413,9 @@ static int mlx5e_build_rq_frags_info(struct mlx5_core_dev *mdev,
 {
 	u32 byte_count = MLX5E_SW2HW_MTU(params, params->sw_mtu);
 	int frag_size_max = DEFAULT_FRAG_SIZE;
+	int first_frag_size_max;
 	u32 buf_size = 0;
+	u16 headroom;
 	int max_mtu;
 	int i;
 
@@ -427,11 +435,15 @@ static int mlx5e_build_rq_frags_info(struct mlx5_core_dev *mdev,
 		goto out;
 	}
 
-	max_mtu = mlx5e_max_nonlinear_mtu(frag_size_max);
+	headroom = mlx5e_get_linear_rq_headroom(params, xsk);
+	first_frag_size_max = SKB_WITH_OVERHEAD(frag_size_max - headroom);
+
+	max_mtu = mlx5e_max_nonlinear_mtu(first_frag_size_max, frag_size_max);
 	if (byte_count > max_mtu) {
 		frag_size_max = PAGE_SIZE;
+		first_frag_size_max = SKB_WITH_OVERHEAD(frag_size_max - headroom);
 
-		max_mtu = mlx5e_max_nonlinear_mtu(frag_size_max);
+		max_mtu = mlx5e_max_nonlinear_mtu(first_frag_size_max, frag_size_max);
 		if (byte_count > max_mtu) {
 			mlx5_core_err(mdev, "MTU %u is too big for non-linear legacy RQ (max %d)\n",
 				      params->sw_mtu, max_mtu);
@@ -443,13 +455,22 @@ static int mlx5e_build_rq_frags_info(struct mlx5_core_dev *mdev,
 	while (buf_size < byte_count) {
 		int frag_size = byte_count - buf_size;
 
-		if (i < MLX5E_MAX_RX_FRAGS - 1)
+		if (i == 0)
+			frag_size = min(frag_size, first_frag_size_max);
+		else if (i < MLX5E_MAX_RX_FRAGS - 1)
 			frag_size = min(frag_size, frag_size_max);
 
 		info->arr[i].frag_size = frag_size;
+		buf_size += frag_size;
+
+		if (i == 0) {
+			/* Ensure that headroom and tailroom are included. */
+			frag_size += headroom;
+			frag_size += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+		}
+
 		info->arr[i].frag_stride = roundup_pow_of_two(frag_size);
 
-		buf_size += frag_size;
 		i++;
 	}
 	info->num_frags = i;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 6eda906342c0..b06aac087b2a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -1560,43 +1560,45 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe,
 			     struct mlx5e_wqe_frag_info *wi, u32 cqe_bcnt)
 {
 	struct mlx5e_rq_frag_info *frag_info = &rq->wqe.info.arr[0];
-	struct mlx5e_wqe_frag_info *head_wi = wi;
-	u16 headlen      = min_t(u32, MLX5E_RX_MAX_HEAD, cqe_bcnt);
-	u16 frag_headlen = headlen;
-	u16 byte_cnt     = cqe_bcnt - headlen;
+	u16 rx_headroom = rq->buff.headroom;
+	struct mlx5e_dma_info *di = wi->di;
+	u32 frag_consumed_bytes;
+	u32 first_frag_size;
 	struct sk_buff *skb;
+	void *va;
+
+	va = page_address(di->page) + wi->offset;
+	frag_consumed_bytes = min_t(u32, frag_info->frag_size, cqe_bcnt);
+	first_frag_size = MLX5_SKB_FRAG_SZ(rx_headroom + frag_consumed_bytes);
+
+	dma_sync_single_range_for_cpu(rq->pdev, di->addr, wi->offset,
+				      first_frag_size, DMA_FROM_DEVICE);
+	net_prefetch(va + rx_headroom);
 
 	/* XDP is not supported in this configuration, as incoming packets
 	 * might spread among multiple pages.
 	 */
-	skb = napi_alloc_skb(rq->cq.napi,
-			     ALIGN(MLX5E_RX_MAX_HEAD, sizeof(long)));
-	if (unlikely(!skb)) {
-		rq->stats->buff_alloc_err++;
+	skb = mlx5e_build_linear_skb(rq, va, first_frag_size, rx_headroom,
+				     frag_consumed_bytes, 0);
+	if (unlikely(!skb))
 		return NULL;
-	}
 
-	net_prefetchw(skb->data);
+	page_ref_inc(di->page);
 
-	while (byte_cnt) {
-		u16 frag_consumed_bytes =
-			min_t(u16, frag_info->frag_size - frag_headlen, byte_cnt);
+	cqe_bcnt -= frag_consumed_bytes;
+	frag_info++;
+	wi++;
 
-		mlx5e_add_skb_frag(rq, skb, wi->di, wi->offset + frag_headlen,
+	while (cqe_bcnt) {
+		frag_consumed_bytes = min_t(u32, frag_info->frag_size, cqe_bcnt);
+
+		mlx5e_add_skb_frag(rq, skb, wi->di, wi->offset,
 				   frag_consumed_bytes, frag_info->frag_stride);
-		byte_cnt -= frag_consumed_bytes;
-		frag_headlen = 0;
+		cqe_bcnt -= frag_consumed_bytes;
 		frag_info++;
 		wi++;
 	}
 
-	/* copy header */
-	mlx5e_copy_skb_header(rq->pdev, skb, head_wi->di, head_wi->offset, head_wi->offset,
-			      headlen);
-	/* skb linear part was allocated with headlen and aligned to long */
-	skb->tail += headlen;
-	skb->len  += headlen;
-
 	return skb;
 }
 
-- 
2.35.1



* [net-next 04/15] net/mlx5e: RX, Test the XDP program existence out of the handler
  2022-03-17 18:54 [pull request][net-next 00/15] mlx5 updates 2022-03-17 Saeed Mahameed
                   ` (2 preceding siblings ...)
  2022-03-17 18:54 ` [net-next 03/15] net/mlx5e: Build SKB in place over the first fragment in non-linear " Saeed Mahameed
@ 2022-03-17 18:54 ` Saeed Mahameed
  2022-03-17 18:54 ` [net-next 05/15] net/mlx5e: Drop the len output parameter from mlx5e_xdp_handle Saeed Mahameed
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2022-03-17 18:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Tariq Toukan, Maxim Mikityanskiy, Saeed Mahameed

From: Tariq Toukan <tariqt@nvidia.com>

Instead of returning early inside mlx5e_xdp_handle(), let the caller
check whether an XDP program is loaded. This saves a few unnecessary
function calls and calculations in the !prog case.

Performance test: single core, drop packets in iptables
Before: 3,872,504 pps
After:  3,975,628 pps (+2.66%)
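
The resulting caller pattern, condensed from the hunks below:

	prog = rcu_dereference(rq->xdp_prog);
	if (prog && mlx5e_xdp_handle(rq, di, prog, &cqe_bcnt, &xdp))
		return NULL; /* page/packet was consumed by XDP */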

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/en/xdp.c  |  5 +-
 .../net/ethernet/mellanox/mlx5/core/en/xdp.h  |  1 +
 .../ethernet/mellanox/mlx5/core/en/xsk/rx.c   |  9 +++-
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   | 49 ++++++++++++-------
 4 files changed, 39 insertions(+), 25 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index a7f020399370..fcb84971b138 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -120,15 +120,12 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq,
 
 /* returns true if packet was consumed by xdp */
 bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct mlx5e_dma_info *di,
+		      struct bpf_prog *prog,
 		      u32 *len, struct xdp_buff *xdp)
 {
-	struct bpf_prog *prog = rcu_dereference(rq->xdp_prog);
 	u32 act;
 	int err;
 
-	if (!prog)
-		return false;
-
 	act = bpf_prog_run_xdp(prog, xdp);
 	switch (act) {
 	case XDP_PASS:
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
index c62f11d7ef6a..850540e94bb4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
@@ -48,6 +48,7 @@
 struct mlx5e_xsk_param;
 int mlx5e_xdp_max_mtu(struct mlx5e_params *params, struct mlx5e_xsk_param *xsk);
 bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct mlx5e_dma_info *di,
+		      struct bpf_prog *prog,
 		      u32 *len, struct xdp_buff *xdp);
 void mlx5e_xdp_mpwqe_complete(struct mlx5e_xdpsq *sq);
 bool mlx5e_poll_xdpsq_cq(struct mlx5e_cq *cq);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
index 8e7b877d8a12..162513594862 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
@@ -4,6 +4,7 @@
 #include "rx.h"
 #include "en/xdp.h"
 #include <net/xdp_sock_drv.h>
+#include <linux/filter.h>
 
 /* RX data path */
 
@@ -31,6 +32,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq,
 {
 	struct xdp_buff *xdp = wi->umr.dma_info[page_idx].xsk;
 	u32 cqe_bcnt32 = cqe_bcnt;
+	struct bpf_prog *prog;
 
 	/* Check packet size. Note LRO doesn't use linear SKB */
 	if (unlikely(cqe_bcnt > rq->hw_mtu)) {
@@ -65,7 +67,8 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq,
 	 * allocated first from the Reuse Ring, so it has enough space.
 	 */
 
-	if (likely(mlx5e_xdp_handle(rq, NULL, &cqe_bcnt32, xdp))) {
+	prog = rcu_dereference(rq->xdp_prog);
+	if (likely(prog && mlx5e_xdp_handle(rq, NULL, prog, &cqe_bcnt32, xdp))) {
 		if (likely(__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)))
 			__set_bit(page_idx, wi->xdp_xmit_bitmap); /* non-atomic */
 		return NULL; /* page/packet was consumed by XDP */
@@ -83,6 +86,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq,
 					      u32 cqe_bcnt)
 {
 	struct xdp_buff *xdp = wi->di->xsk;
+	struct bpf_prog *prog;
 
 	/* wi->offset is not used in this function, because xdp->data and the
 	 * DMA address point directly to the necessary place. Furthermore, the
@@ -101,7 +105,8 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq,
 		return NULL;
 	}
 
-	if (likely(mlx5e_xdp_handle(rq, NULL, &cqe_bcnt, xdp)))
+	prog = rcu_dereference(rq->xdp_prog);
+	if (likely(prog && mlx5e_xdp_handle(rq, NULL, prog, &cqe_bcnt, xdp)))
 		return NULL; /* page/packet was consumed by XDP */
 
 	/* XDP_PASS: copy the data from the UMEM to a new SKB. The frame reuse
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index b06aac087b2a..60c640fc430c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -34,6 +34,7 @@
 #include <linux/ipv6.h>
 #include <linux/tcp.h>
 #include <linux/bitmap.h>
+#include <linux/filter.h>
 #include <net/ip6_checksum.h>
 #include <net/page_pool.h>
 #include <net/inet_ecn.h>
@@ -1523,11 +1524,11 @@ mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe,
 {
 	struct mlx5e_dma_info *di = wi->di;
 	u16 rx_headroom = rq->buff.headroom;
-	struct xdp_buff xdp;
+	struct bpf_prog *prog;
 	struct sk_buff *skb;
+	u32 metasize = 0;
 	void *va, *data;
 	u32 frag_size;
-	u32 metasize;
 
 	va             = page_address(di->page) + wi->offset;
 	data           = va + rx_headroom;
@@ -1535,16 +1536,21 @@ mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe,
 
 	dma_sync_single_range_for_cpu(rq->pdev, di->addr, wi->offset,
 				      frag_size, DMA_FROM_DEVICE);
-	net_prefetchw(va); /* xdp_frame data area */
 	net_prefetch(data);
 
-	mlx5e_fill_xdp_buff(rq, va, rx_headroom, cqe_bcnt, &xdp);
-	if (mlx5e_xdp_handle(rq, di, &cqe_bcnt, &xdp))
-		return NULL; /* page/packet was consumed by XDP */
+	prog = rcu_dereference(rq->xdp_prog);
+	if (prog) {
+		struct xdp_buff xdp;
 
-	rx_headroom = xdp.data - xdp.data_hard_start;
+		net_prefetchw(va); /* xdp_frame data area */
+		mlx5e_fill_xdp_buff(rq, va, rx_headroom, cqe_bcnt, &xdp);
+		if (mlx5e_xdp_handle(rq, di, prog, &cqe_bcnt, &xdp))
+			return NULL; /* page/packet was consumed by XDP */
+
+		rx_headroom = xdp.data - xdp.data_hard_start;
+		metasize = xdp.data - xdp.data_meta;
+	}
 	frag_size = MLX5_SKB_FRAG_SZ(rx_headroom + cqe_bcnt);
-	metasize = xdp.data - xdp.data_meta;
 	skb = mlx5e_build_linear_skb(rq, va, frag_size, rx_headroom, cqe_bcnt, metasize);
 	if (unlikely(!skb))
 		return NULL;
@@ -1842,11 +1848,11 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi,
 	struct mlx5e_dma_info *di = &wi->umr.dma_info[page_idx];
 	u16 rx_headroom = rq->buff.headroom;
 	u32 cqe_bcnt32 = cqe_bcnt;
-	struct xdp_buff xdp;
+	struct bpf_prog *prog;
 	struct sk_buff *skb;
+	u32 metasize = 0;
 	void *va, *data;
 	u32 frag_size;
-	u32 metasize;
 
 	/* Check packet size. Note LRO doesn't use linear SKB */
 	if (unlikely(cqe_bcnt > rq->hw_mtu)) {
@@ -1860,19 +1866,24 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi,
 
 	dma_sync_single_range_for_cpu(rq->pdev, di->addr, head_offset,
 				      frag_size, DMA_FROM_DEVICE);
-	net_prefetchw(va); /* xdp_frame data area */
 	net_prefetch(data);
 
-	mlx5e_fill_xdp_buff(rq, va, rx_headroom, cqe_bcnt32, &xdp);
-	if (mlx5e_xdp_handle(rq, di, &cqe_bcnt32, &xdp)) {
-		if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags))
-			__set_bit(page_idx, wi->xdp_xmit_bitmap); /* non-atomic */
-		return NULL; /* page/packet was consumed by XDP */
-	}
+	prog = rcu_dereference(rq->xdp_prog);
+	if (prog) {
+		struct xdp_buff xdp;
 
-	rx_headroom = xdp.data - xdp.data_hard_start;
+		net_prefetchw(va); /* xdp_frame data area */
+		mlx5e_fill_xdp_buff(rq, va, rx_headroom, cqe_bcnt32, &xdp);
+		if (mlx5e_xdp_handle(rq, di, prog, &cqe_bcnt32, &xdp)) {
+			if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags))
+				__set_bit(page_idx, wi->xdp_xmit_bitmap); /* non-atomic */
+			return NULL; /* page/packet was consumed by XDP */
+		}
+
+		rx_headroom = xdp.data - xdp.data_hard_start;
+		metasize = xdp.data - xdp.data_meta;
+	}
 	frag_size = MLX5_SKB_FRAG_SZ(rx_headroom + cqe_bcnt32);
-	metasize = xdp.data - xdp.data_meta;
 	skb = mlx5e_build_linear_skb(rq, va, frag_size, rx_headroom, cqe_bcnt32, metasize);
 	if (unlikely(!skb))
 		return NULL;
-- 
2.35.1



* [net-next 05/15] net/mlx5e: Drop the len output parameter from mlx5e_xdp_handle
  2022-03-17 18:54 [pull request][net-next 00/15] mlx5 updates 2022-03-17 Saeed Mahameed
                   ` (3 preceding siblings ...)
  2022-03-17 18:54 ` [net-next 04/15] net/mlx5e: RX, Test the XDP program existence out of the handler Saeed Mahameed
@ 2022-03-17 18:54 ` Saeed Mahameed
  2022-03-17 18:54 ` [net-next 06/15] net/mlx5e: Drop cqe_bcnt32 from mlx5e_skb_from_cqe_mpwrq_linear Saeed Mahameed
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2022-03-17 18:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@nvidia.com>

The len parameter of mlx5e_xdp_handle is used to output the new packet
length after XDP has processed the packet and returned XDP_PASS.
However, this value can be calculated at the call site, as the caller
knows whether XDP_PASS was returned.

This commit drops the len parameter and moves the calculation to the
caller, reducing the number of parameters passed to the function and
preparing for XDP support in non-linear legacy RQ.
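
On XDP_PASS, the caller now derives the new length itself, next to the
other values it already reads back from the xdp_buff:

	rx_headroom = xdp.data - xdp.data_hard_start;
	metasize    = xdp.data - xdp.data_meta;
	cqe_bcnt    = xdp.data_end - xdp.data;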

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c    |  4 +---
 drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h    |  3 +--
 drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c | 11 +++++------
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c     |  6 ++++--
 4 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index fcb84971b138..6aa77f0e094e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -120,8 +120,7 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq,
 
 /* returns true if packet was consumed by xdp */
 bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct mlx5e_dma_info *di,
-		      struct bpf_prog *prog,
-		      u32 *len, struct xdp_buff *xdp)
+		      struct bpf_prog *prog, struct xdp_buff *xdp)
 {
 	u32 act;
 	int err;
@@ -129,7 +128,6 @@ bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct mlx5e_dma_info *di,
 	act = bpf_prog_run_xdp(prog, xdp);
 	switch (act) {
 	case XDP_PASS:
-		*len = xdp->data_end - xdp->data;
 		return false;
 	case XDP_TX:
 		if (unlikely(!mlx5e_xmit_xdp_buff(rq->xdpsq, rq, di, xdp)))
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
index 850540e94bb4..20d8af66c072 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
@@ -48,8 +48,7 @@
 struct mlx5e_xsk_param;
 int mlx5e_xdp_max_mtu(struct mlx5e_params *params, struct mlx5e_xsk_param *xsk);
 bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct mlx5e_dma_info *di,
-		      struct bpf_prog *prog,
-		      u32 *len, struct xdp_buff *xdp);
+		      struct bpf_prog *prog, struct xdp_buff *xdp);
 void mlx5e_xdp_mpwqe_complete(struct mlx5e_xdpsq *sq);
 bool mlx5e_poll_xdpsq_cq(struct mlx5e_cq *cq);
 void mlx5e_free_xdpsq_descs(struct mlx5e_xdpsq *sq);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
index 162513594862..021da085e603 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
@@ -31,7 +31,6 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq,
 						    u32 page_idx)
 {
 	struct xdp_buff *xdp = wi->umr.dma_info[page_idx].xsk;
-	u32 cqe_bcnt32 = cqe_bcnt;
 	struct bpf_prog *prog;
 
 	/* Check packet size. Note LRO doesn't use linear SKB */
@@ -47,7 +46,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq,
 	 */
 	WARN_ON_ONCE(head_offset);
 
-	xdp->data_end = xdp->data + cqe_bcnt32;
+	xdp->data_end = xdp->data + cqe_bcnt;
 	xdp_set_data_meta_invalid(xdp);
 	xsk_buff_dma_sync_for_cpu(xdp, rq->xsk_pool);
 	net_prefetch(xdp->data);
@@ -68,7 +67,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq,
 	 */
 
 	prog = rcu_dereference(rq->xdp_prog);
-	if (likely(prog && mlx5e_xdp_handle(rq, NULL, prog, &cqe_bcnt32, xdp))) {
+	if (likely(prog && mlx5e_xdp_handle(rq, NULL, prog, xdp))) {
 		if (likely(__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)))
 			__set_bit(page_idx, wi->xdp_xmit_bitmap); /* non-atomic */
 		return NULL; /* page/packet was consumed by XDP */
@@ -77,7 +76,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq,
 	/* XDP_PASS: copy the data from the UMEM to a new SKB and reuse the
 	 * frame. On SKB allocation failure, NULL is returned.
 	 */
-	return mlx5e_xsk_construct_skb(rq, xdp->data, cqe_bcnt32);
+	return mlx5e_xsk_construct_skb(rq, xdp->data, xdp->data_end - xdp->data);
 }
 
 struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq,
@@ -106,12 +105,12 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq,
 	}
 
 	prog = rcu_dereference(rq->xdp_prog);
-	if (likely(prog && mlx5e_xdp_handle(rq, NULL, prog, &cqe_bcnt, xdp)))
+	if (likely(prog && mlx5e_xdp_handle(rq, NULL, prog, xdp)))
 		return NULL; /* page/packet was consumed by XDP */
 
 	/* XDP_PASS: copy the data from the UMEM to a new SKB. The frame reuse
 	 * will be handled by mlx5e_put_rx_frag.
 	 * On SKB allocation failure, NULL is returned.
 	 */
-	return mlx5e_xsk_construct_skb(rq, xdp->data, cqe_bcnt);
+	return mlx5e_xsk_construct_skb(rq, xdp->data, xdp->data_end - xdp->data);
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 60c640fc430c..7c490c0ca370 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -1544,11 +1544,12 @@ mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe,
 
 		net_prefetchw(va); /* xdp_frame data area */
 		mlx5e_fill_xdp_buff(rq, va, rx_headroom, cqe_bcnt, &xdp);
-		if (mlx5e_xdp_handle(rq, di, prog, &cqe_bcnt, &xdp))
+		if (mlx5e_xdp_handle(rq, di, prog, &xdp))
 			return NULL; /* page/packet was consumed by XDP */
 
 		rx_headroom = xdp.data - xdp.data_hard_start;
 		metasize = xdp.data - xdp.data_meta;
+		cqe_bcnt = xdp.data_end - xdp.data;
 	}
 	frag_size = MLX5_SKB_FRAG_SZ(rx_headroom + cqe_bcnt);
 	skb = mlx5e_build_linear_skb(rq, va, frag_size, rx_headroom, cqe_bcnt, metasize);
@@ -1874,7 +1875,7 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi,
 
 		net_prefetchw(va); /* xdp_frame data area */
 		mlx5e_fill_xdp_buff(rq, va, rx_headroom, cqe_bcnt32, &xdp);
-		if (mlx5e_xdp_handle(rq, di, prog, &cqe_bcnt32, &xdp)) {
+		if (mlx5e_xdp_handle(rq, di, prog, &xdp)) {
 			if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags))
 				__set_bit(page_idx, wi->xdp_xmit_bitmap); /* non-atomic */
 			return NULL; /* page/packet was consumed by XDP */
@@ -1882,6 +1883,7 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi,
 
 		rx_headroom = xdp.data - xdp.data_hard_start;
 		metasize = xdp.data - xdp.data_meta;
+		cqe_bcnt32 = xdp.data_end - xdp.data;
 	}
 	frag_size = MLX5_SKB_FRAG_SZ(rx_headroom + cqe_bcnt32);
 	skb = mlx5e_build_linear_skb(rq, va, frag_size, rx_headroom, cqe_bcnt32, metasize);
-- 
2.35.1



* [net-next 06/15] net/mlx5e: Drop cqe_bcnt32 from mlx5e_skb_from_cqe_mpwrq_linear
  2022-03-17 18:54 [pull request][net-next 00/15] mlx5 updates 2022-03-17 Saeed Mahameed
                   ` (4 preceding siblings ...)
  2022-03-17 18:54 ` [net-next 05/15] net/mlx5e: Drop the len output parameter from mlx5e_xdp_handle Saeed Mahameed
@ 2022-03-17 18:54 ` Saeed Mahameed
  2022-03-17 18:54 ` [net-next 07/15] net/mlx5: DR, Adjust structure member to reduce memory hole Saeed Mahameed
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2022-03-17 18:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Maxim Mikityanskiy, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@nvidia.com>

The packet size in mlx5e_skb_from_cqe_mpwrq_linear can't overflow u16,
since the maximum packet size in linear striding RQ is 2^13 bytes
(8192, well within u16 range). Drop the unneeded u32 variable.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 7c490c0ca370..4b8699f39200 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -1848,7 +1848,6 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi,
 {
 	struct mlx5e_dma_info *di = &wi->umr.dma_info[page_idx];
 	u16 rx_headroom = rq->buff.headroom;
-	u32 cqe_bcnt32 = cqe_bcnt;
 	struct bpf_prog *prog;
 	struct sk_buff *skb;
 	u32 metasize = 0;
@@ -1863,7 +1862,7 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi,
 
 	va             = page_address(di->page) + head_offset;
 	data           = va + rx_headroom;
-	frag_size      = MLX5_SKB_FRAG_SZ(rx_headroom + cqe_bcnt32);
+	frag_size      = MLX5_SKB_FRAG_SZ(rx_headroom + cqe_bcnt);
 
 	dma_sync_single_range_for_cpu(rq->pdev, di->addr, head_offset,
 				      frag_size, DMA_FROM_DEVICE);
@@ -1874,7 +1873,7 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi,
 		struct xdp_buff xdp;
 
 		net_prefetchw(va); /* xdp_frame data area */
-		mlx5e_fill_xdp_buff(rq, va, rx_headroom, cqe_bcnt32, &xdp);
+		mlx5e_fill_xdp_buff(rq, va, rx_headroom, cqe_bcnt, &xdp);
 		if (mlx5e_xdp_handle(rq, di, prog, &xdp)) {
 			if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags))
 				__set_bit(page_idx, wi->xdp_xmit_bitmap); /* non-atomic */
@@ -1883,10 +1882,10 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi,
 
 		rx_headroom = xdp.data - xdp.data_hard_start;
 		metasize = xdp.data - xdp.data_meta;
-		cqe_bcnt32 = xdp.data_end - xdp.data;
+		cqe_bcnt = xdp.data_end - xdp.data;
 	}
-	frag_size = MLX5_SKB_FRAG_SZ(rx_headroom + cqe_bcnt32);
-	skb = mlx5e_build_linear_skb(rq, va, frag_size, rx_headroom, cqe_bcnt32, metasize);
+	frag_size = MLX5_SKB_FRAG_SZ(rx_headroom + cqe_bcnt);
+	skb = mlx5e_build_linear_skb(rq, va, frag_size, rx_headroom, cqe_bcnt, metasize);
 	if (unlikely(!skb))
 		return NULL;
 
-- 
2.35.1



* [net-next 07/15] net/mlx5: DR, Adjust structure member to reduce memory hole
  2022-03-17 18:54 [pull request][net-next 00/15] mlx5 updates 2022-03-17 Saeed Mahameed
                   ` (5 preceding siblings ...)
  2022-03-17 18:54 ` [net-next 06/15] net/mlx5e: Drop cqe_bcnt32 from mlx5e_skb_from_cqe_mpwrq_linear Saeed Mahameed
@ 2022-03-17 18:54 ` Saeed Mahameed
  2022-03-17 18:54 ` [net-next 08/15] net/mlx5: DR, Remove mr_addr rkey from struct mlx5dr_icm_chunk Saeed Mahameed
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2022-03-17 18:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Rongwei Liu, Shun Hao, Yevgeny Kliteynik, Saeed Mahameed

From: Rongwei Liu <rongweil@nvidia.com>

According to profiling, mlx5dr_ste and mlx5dr_icm_chunk are the two
hot structures. Their memory layout can be optimized by adjusting
the member order.

Struct mlx5dr_ste size changes from 64 bytes to 56 bytes.

In the upcoming commits, the struct mlx5dr_icm_chunk memory layout
will change on its own as members are removed, so it is kept
untouched here.
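
A rough sketch of where the 8 bytes come from on a 64-bit build
(offsets are illustrative, in the spirit of pahole output):

	/* before: 4-byte hole after refcount, u8 + 7 bytes padding at the end */
	u32 refcount;				/* followed by a 4-byte hole */
	struct list_head miss_list_node;
	...
	u8 ste_chain_location;			/* plus 7 bytes of tail padding */

	/* after: the u8 fills the hole and the tail padding is gone: 64 -> 56 */
	u32 refcount;
	u8 ste_chain_location;
	struct list_head miss_list_node;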

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Reviewed-by: Shun Hao <shunh@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
index 88092fabf55b..e906fef615a4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
@@ -151,6 +151,9 @@ struct mlx5dr_ste {
 	/* refcount: indicates the num of rules that using this ste */
 	u32 refcount;
 
+	/* this ste is part of a rule, located in ste's chain */
+	u8 ste_chain_location;
+
 	/* attached to the miss_list head at each htbl entry */
 	struct list_head miss_list_node;
 
@@ -161,9 +164,6 @@ struct mlx5dr_ste {
 
 	/* The rule this STE belongs to */
 	struct mlx5dr_rule_rx_tx *rule_rx_tx;
-
-	/* this ste is part of a rule, located in ste's chain */
-	u8 ste_chain_location;
 };
 
 struct mlx5dr_ste_htbl_ctrl {
-- 
2.35.1



* [net-next 08/15] net/mlx5: DR, Remove mr_addr rkey from struct mlx5dr_icm_chunk
  2022-03-17 18:54 [pull request][net-next 00/15] mlx5 updates 2022-03-17 Saeed Mahameed
                   ` (6 preceding siblings ...)
  2022-03-17 18:54 ` [net-next 07/15] net/mlx5: DR, Adjust structure member to reduce memory hole Saeed Mahameed
@ 2022-03-17 18:54 ` Saeed Mahameed
  2022-03-17 18:54 ` [net-next 09/15] net/mlx5: DR, Remove icm_addr from mlx5dr_icm_chunk to reduce memory Saeed Mahameed
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2022-03-17 18:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Rongwei Liu, Shun Hao, Yevgeny Kliteynik, Saeed Mahameed

From: Rongwei Liu <rongweil@nvidia.com>

Reduce the memory footprint by removing mr_addr and rkey from
struct mlx5dr_icm_chunk:
1. mr_addr is now calculated by mlx5dr_icm_pool_get_chunk_mr_addr().
2. rkey is now calculated by mlx5dr_icm_pool_get_chunk_rkey().
The two new functions are lightweight and straightforward.

This shaves 8 bytes off struct mlx5dr_icm_chunk, bringing its size
down to 72 bytes.
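
Both helpers boil down to trivial pointer chasing (taken from the
patch):

	/* mr_addr: the pool's entry size times the chunk's segment */
	offset = mlx5dr_icm_pool_dm_type_to_entry_size(chunk->buddy_mem->pool->icm_type);
	mr_addr = (u64)offset * chunk->seg;

	/* rkey: the mkey of the buddy's ICM MR */
	rkey = chunk->buddy_mem->icm_mr->mkey;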

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Reviewed-by: Shun Hao <shunh@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../mellanox/mlx5/core/steering/dr_icm_pool.c      | 14 ++++++++++++--
 .../ethernet/mellanox/mlx5/core/steering/dr_send.c | 11 ++++++-----
 .../ethernet/mellanox/mlx5/core/steering/dr_ste.c  |  2 +-
 .../mellanox/mlx5/core/steering/dr_types.h         |  5 +++--
 4 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_icm_pool.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_icm_pool.c
index e289cfdbce07..672d385a8f40 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_icm_pool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_icm_pool.c
@@ -57,6 +57,18 @@ static int dr_icm_create_dm_mkey(struct mlx5_core_dev *mdev,
 	return mlx5_core_create_mkey(mdev, mkey, in, inlen);
 }
 
+u64 mlx5dr_icm_pool_get_chunk_mr_addr(struct mlx5dr_icm_chunk *chunk)
+{
+	u32 offset = mlx5dr_icm_pool_dm_type_to_entry_size(chunk->buddy_mem->pool->icm_type);
+
+	return (u64)offset * chunk->seg;
+}
+
+u32 mlx5dr_icm_pool_get_chunk_rkey(struct mlx5dr_icm_chunk *chunk)
+{
+	return chunk->buddy_mem->icm_mr->mkey;
+}
+
 static struct mlx5dr_icm_mr *
 dr_icm_pool_mr_create(struct mlx5dr_icm_pool *pool)
 {
@@ -298,8 +310,6 @@ dr_icm_chunk_create(struct mlx5dr_icm_pool *pool,
 
 	offset = mlx5dr_icm_pool_dm_type_to_entry_size(pool->icm_type) * seg;
 
-	chunk->rkey = buddy_mem_pool->icm_mr->mkey;
-	chunk->mr_addr = offset;
 	chunk->icm_addr =
 		(uintptr_t)buddy_mem_pool->icm_mr->icm_start_addr + offset;
 	chunk->num_of_entries =
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c
index 00aef47d7682..57765d231993 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c
@@ -453,7 +453,7 @@ int mlx5dr_send_postsend_ste(struct mlx5dr_domain *dmn, struct mlx5dr_ste *ste,
 	send_info.write.length = size;
 	send_info.write.lkey = 0;
 	send_info.remote_addr = mlx5dr_ste_get_mr_addr(ste) + offset;
-	send_info.rkey = ste->htbl->chunk->rkey;
+	send_info.rkey = mlx5dr_icm_pool_get_chunk_rkey(ste->htbl->chunk);
 
 	return dr_postsend_icm_data(dmn, &send_info);
 }
@@ -512,7 +512,7 @@ int mlx5dr_send_postsend_htbl(struct mlx5dr_domain *dmn,
 		send_info.write.lkey = 0;
 		send_info.remote_addr =
 			mlx5dr_ste_get_mr_addr(htbl->ste_arr + ste_index);
-		send_info.rkey = htbl->chunk->rkey;
+		send_info.rkey = mlx5dr_icm_pool_get_chunk_rkey(htbl->chunk);
 
 		ret = dr_postsend_icm_data(dmn, &send_info);
 		if (ret)
@@ -569,7 +569,7 @@ int mlx5dr_send_postsend_formatted_htbl(struct mlx5dr_domain *dmn,
 		send_info.write.lkey = 0;
 		send_info.remote_addr =
 			mlx5dr_ste_get_mr_addr(htbl->ste_arr + ste_index);
-		send_info.rkey = htbl->chunk->rkey;
+		send_info.rkey = mlx5dr_icm_pool_get_chunk_rkey(htbl->chunk);
 
 		ret = dr_postsend_icm_data(dmn, &send_info);
 		if (ret)
@@ -591,8 +591,9 @@ int mlx5dr_send_postsend_action(struct mlx5dr_domain *dmn,
 	send_info.write.length = action->rewrite->num_of_actions *
 				 DR_MODIFY_ACTION_SIZE;
 	send_info.write.lkey = 0;
-	send_info.remote_addr = action->rewrite->chunk->mr_addr;
-	send_info.rkey = action->rewrite->chunk->rkey;
+	send_info.remote_addr =
+		mlx5dr_icm_pool_get_chunk_mr_addr(action->rewrite->chunk);
+	send_info.rkey = mlx5dr_icm_pool_get_chunk_rkey(action->rewrite->chunk);
 
 	ret = dr_postsend_icm_data(dmn, &send_info);
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c
index 518e949847a3..c1465eb04a5b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c
@@ -122,7 +122,7 @@ u64 mlx5dr_ste_get_mr_addr(struct mlx5dr_ste *ste)
 {
 	u32 index = ste - ste->htbl->ste_arr;
 
-	return ste->htbl->chunk->mr_addr + DR_STE_SIZE * index;
+	return mlx5dr_icm_pool_get_chunk_mr_addr(ste->htbl->chunk) + DR_STE_SIZE * index;
 }
 
 struct list_head *mlx5dr_ste_get_miss_list(struct mlx5dr_ste *ste)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
index e906fef615a4..dd5b013e901c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
@@ -1097,11 +1097,9 @@ int mlx5dr_rule_get_reverse_rule_members(struct mlx5dr_ste **ste_arr,
 struct mlx5dr_icm_chunk {
 	struct mlx5dr_icm_buddy_mem *buddy_mem;
 	struct list_head chunk_list;
-	u32 rkey;
 	u32 num_of_entries;
 	u32 byte_size;
 	u64 icm_addr;
-	u64 mr_addr;
 
 	/* indicates the index of this chunk in the whole memory,
 	 * used for deleting the chunk from the buddy
@@ -1146,6 +1144,9 @@ int mlx5dr_matcher_select_builders(struct mlx5dr_matcher *matcher,
 				   enum mlx5dr_ipv outer_ipv,
 				   enum mlx5dr_ipv inner_ipv);
 
+u64 mlx5dr_icm_pool_get_chunk_mr_addr(struct mlx5dr_icm_chunk *chunk);
+u32 mlx5dr_icm_pool_get_chunk_rkey(struct mlx5dr_icm_chunk *chunk);
+
 static inline int
 mlx5dr_icm_pool_dm_type_to_entry_size(enum mlx5dr_icm_type icm_type)
 {
-- 
2.35.1



* [net-next 09/15] net/mlx5: DR, Remove icm_addr from mlx5dr_icm_chunk to reduce memory
  2022-03-17 18:54 [pull request][net-next 00/15] mlx5 updates 2022-03-17 Saeed Mahameed
                   ` (7 preceding siblings ...)
  2022-03-17 18:54 ` [net-next 08/15] net/mlx5: DR, Remove mr_addr rkey from struct mlx5dr_icm_chunk Saeed Mahameed
@ 2022-03-17 18:54 ` Saeed Mahameed
  2022-03-17 18:54 ` [net-next 10/15] net/mlx5: DR, Remove num_of_entries byte_size from struct mlx5_dr_icm_chunk Saeed Mahameed
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2022-03-17 18:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Rongwei Liu, Shun Hao, Yevgeny Kliteynik, Saeed Mahameed

From: Rongwei Liu <rongweil@nvidia.com>

The icm_addr can be calculated quickly from the buddy memory pool by
mlx5dr_icm_pool_get_chunk_icm_addr(). This function is lightweight
and straightforward.

This removes another 8 bytes, bringing the size of struct
mlx5dr_icm_chunk down to 64 bytes.
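
The replacement is a two-line computation over the buddy's ICM MR
(taken from the patch):

	size = mlx5dr_icm_pool_dm_type_to_entry_size(chunk->buddy_mem->pool->icm_type);
	icm_addr = (u64)chunk->buddy_mem->icm_mr->icm_start_addr + size * chunk->seg;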

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Reviewed-by: Shun Hao <shunh@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../mellanox/mlx5/core/steering/dr_action.c   | 12 +++++++-----
 .../mellanox/mlx5/core/steering/dr_dbg.c      | 11 ++++++++---
 .../mellanox/mlx5/core/steering/dr_icm_pool.c |  9 +++++++--
 .../mellanox/mlx5/core/steering/dr_matcher.c  |  2 +-
 .../mellanox/mlx5/core/steering/dr_rule.c     | 19 +++++++++++--------
 .../mellanox/mlx5/core/steering/dr_ste.c      | 14 +++++++++-----
 .../mellanox/mlx5/core/steering/dr_table.c    | 18 ++++++++++--------
 .../mellanox/mlx5/core/steering/dr_types.h    |  2 +-
 8 files changed, 54 insertions(+), 33 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
index 743422acc3d8..850937cd8bf9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
@@ -570,6 +570,7 @@ int mlx5dr_actions_build_ste_arr(struct mlx5dr_matcher *matcher,
 
 	for (i = 0; i < num_actions; i++) {
 		struct mlx5dr_action_dest_tbl *dest_tbl;
+		struct mlx5dr_icm_chunk *chunk;
 		struct mlx5dr_action *action;
 		int max_actions_type = 1;
 		u32 action_type;
@@ -598,9 +599,9 @@ int mlx5dr_actions_build_ste_arr(struct mlx5dr_matcher *matcher,
 						   matcher->tbl->level,
 						   dest_tbl->tbl->level);
 				}
-				attr.final_icm_addr = rx_rule ?
-					dest_tbl->tbl->rx.s_anchor->chunk->icm_addr :
-					dest_tbl->tbl->tx.s_anchor->chunk->icm_addr;
+				chunk = rx_rule ? dest_tbl->tbl->rx.s_anchor->chunk :
+					dest_tbl->tbl->tx.s_anchor->chunk;
+				attr.final_icm_addr = mlx5dr_icm_pool_get_chunk_icm_addr(chunk);
 			} else {
 				struct mlx5dr_cmd_query_flow_table_details output;
 				int ret;
@@ -1123,7 +1124,8 @@ dr_action_create_reformat_action(struct mlx5dr_domain *dmn,
 		}
 
 		action->rewrite->data = (void *)hw_actions;
-		action->rewrite->index = (action->rewrite->chunk->icm_addr -
+		action->rewrite->index = (mlx5dr_icm_pool_get_chunk_icm_addr
+					  (action->rewrite->chunk) -
 					 dmn->info.caps.hdr_modify_icm_addr) /
 					 ACTION_CACHE_LINE_SIZE;
 
@@ -1702,7 +1704,7 @@ static int dr_action_create_modify_action(struct mlx5dr_domain *dmn,
 	action->rewrite->modify_ttl = modify_ttl;
 	action->rewrite->data = (u8 *)hw_actions;
 	action->rewrite->num_of_actions = num_hw_actions;
-	action->rewrite->index = (chunk->icm_addr -
+	action->rewrite->index = (mlx5dr_icm_pool_get_chunk_icm_addr(chunk) -
 				  dmn->info.caps.hdr_modify_icm_addr) /
 				  ACTION_CACHE_LINE_SIZE;
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c
index d232f1ea34a2..8fd98b628740 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c
@@ -346,16 +346,19 @@ dr_dump_matcher_rx_tx(struct seq_file *file, bool is_rx,
 		      const u64 matcher_id)
 {
 	enum dr_dump_rec_type rec_type;
+	u64 s_icm_addr, e_icm_addr;
 	int i, ret;
 
 	rec_type = is_rx ? DR_DUMP_REC_TYPE_MATCHER_RX :
 			   DR_DUMP_REC_TYPE_MATCHER_TX;
 
+	s_icm_addr = mlx5dr_icm_pool_get_chunk_icm_addr(matcher_rx_tx->s_htbl->chunk);
+	e_icm_addr = mlx5dr_icm_pool_get_chunk_icm_addr(matcher_rx_tx->e_anchor->chunk);
 	seq_printf(file, "%d,0x%llx,0x%llx,%d,0x%llx,0x%llx\n",
 		   rec_type, DR_DBG_PTR_TO_ID(matcher_rx_tx),
 		   matcher_id, matcher_rx_tx->num_of_builders,
-		   dr_dump_icm_to_idx(matcher_rx_tx->s_htbl->chunk->icm_addr),
-		   dr_dump_icm_to_idx(matcher_rx_tx->e_anchor->chunk->icm_addr));
+		   dr_dump_icm_to_idx(s_icm_addr),
+		   dr_dump_icm_to_idx(e_icm_addr));
 
 	for (i = 0; i < matcher_rx_tx->num_of_builders; i++) {
 		ret = dr_dump_matcher_builder(file,
@@ -426,12 +429,14 @@ dr_dump_table_rx_tx(struct seq_file *file, bool is_rx,
 		    const u64 table_id)
 {
 	enum dr_dump_rec_type rec_type;
+	u64 s_icm_addr;
 
 	rec_type = is_rx ? DR_DUMP_REC_TYPE_TABLE_RX :
 			   DR_DUMP_REC_TYPE_TABLE_TX;
 
+	s_icm_addr = mlx5dr_icm_pool_get_chunk_icm_addr(table_rx_tx->s_anchor->chunk);
 	seq_printf(file, "%d,0x%llx,0x%llx\n", rec_type, table_id,
-		   dr_dump_icm_to_idx(table_rx_tx->s_anchor->chunk->icm_addr));
+		   dr_dump_icm_to_idx(s_icm_addr));
 
 	return 0;
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_icm_pool.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_icm_pool.c
index 672d385a8f40..539af89da629 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_icm_pool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_icm_pool.c
@@ -69,6 +69,13 @@ u32 mlx5dr_icm_pool_get_chunk_rkey(struct mlx5dr_icm_chunk *chunk)
 	return chunk->buddy_mem->icm_mr->mkey;
 }
 
+u64 mlx5dr_icm_pool_get_chunk_icm_addr(struct mlx5dr_icm_chunk *chunk)
+{
+	u32 size = mlx5dr_icm_pool_dm_type_to_entry_size(chunk->buddy_mem->pool->icm_type);
+
+	return (u64)chunk->buddy_mem->icm_mr->icm_start_addr + size * chunk->seg;
+}
+
 static struct mlx5dr_icm_mr *
 dr_icm_pool_mr_create(struct mlx5dr_icm_pool *pool)
 {
@@ -310,8 +317,6 @@ dr_icm_chunk_create(struct mlx5dr_icm_pool *pool,
 
 	offset = mlx5dr_icm_pool_dm_type_to_entry_size(pool->icm_type) * seg;
 
-	chunk->icm_addr =
-		(uintptr_t)buddy_mem_pool->icm_mr->icm_start_addr + offset;
 	chunk->num_of_entries =
 		mlx5dr_icm_pool_chunk_size_to_entries(chunk_size);
 	chunk->byte_size =
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_matcher.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_matcher.c
index a4b5b415df90..35154ec9673a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_matcher.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_matcher.c
@@ -705,7 +705,7 @@ static int dr_nic_matcher_connect(struct mlx5dr_domain *dmn,
 
 	/* Connect start hash table to end anchor */
 	info.type = CONNECT_MISS;
-	info.miss_icm_addr = curr_nic_matcher->e_anchor->chunk->icm_addr;
+	info.miss_icm_addr = mlx5dr_icm_pool_get_chunk_icm_addr(curr_nic_matcher->e_anchor->chunk);
 	ret = mlx5dr_ste_htbl_init_and_postsend(dmn, nic_dmn,
 						curr_nic_matcher->s_htbl,
 						&info, false);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c
index b4374578425b..e76c1fda2ac9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c
@@ -41,6 +41,7 @@ dr_rule_create_collision_htbl(struct mlx5dr_matcher *matcher,
 	struct mlx5dr_ste_ctx *ste_ctx = dmn->ste_ctx;
 	struct mlx5dr_ste_htbl *new_htbl;
 	struct mlx5dr_ste *ste;
+	u64 icm_addr;
 
 	/* Create new table for miss entry */
 	new_htbl = mlx5dr_ste_htbl_alloc(dmn->ste_icm_pool,
@@ -54,8 +55,8 @@ dr_rule_create_collision_htbl(struct mlx5dr_matcher *matcher,
 
 	/* One and only entry, never grows */
 	ste = new_htbl->ste_arr;
-	mlx5dr_ste_set_miss_addr(ste_ctx, hw_ste,
-				 nic_matcher->e_anchor->chunk->icm_addr);
+	icm_addr = mlx5dr_icm_pool_get_chunk_icm_addr(nic_matcher->e_anchor->chunk);
+	mlx5dr_ste_set_miss_addr(ste_ctx, hw_ste, icm_addr);
 	mlx5dr_htbl_get(new_htbl);
 
 	return ste;
@@ -235,6 +236,7 @@ dr_rule_rehash_copy_ste(struct mlx5dr_matcher *matcher,
 	bool use_update_list = false;
 	u8 hw_ste[DR_STE_SIZE] = {};
 	struct mlx5dr_ste *new_ste;
+	u64 icm_addr;
 	int new_idx;
 	u8 sb_idx;
 
@@ -243,9 +245,9 @@ dr_rule_rehash_copy_ste(struct mlx5dr_matcher *matcher,
 	mlx5dr_ste_set_bit_mask(hw_ste, nic_matcher->ste_builder[sb_idx].bit_mask);
 
 	/* Copy STE control and tag */
+	icm_addr = mlx5dr_icm_pool_get_chunk_icm_addr(nic_matcher->e_anchor->chunk);
 	memcpy(hw_ste, cur_ste->hw_ste, DR_STE_SIZE_REDUCED);
-	mlx5dr_ste_set_miss_addr(dmn->ste_ctx, hw_ste,
-				 nic_matcher->e_anchor->chunk->icm_addr);
+	mlx5dr_ste_set_miss_addr(dmn->ste_ctx, hw_ste, icm_addr);
 
 	new_idx = mlx5dr_ste_calc_hash_index(hw_ste, new_htbl);
 	new_ste = &new_htbl->ste_arr[new_idx];
@@ -398,7 +400,7 @@ dr_rule_rehash_htbl(struct mlx5dr_rule *rule,
 
 	/* Write new table to HW */
 	info.type = CONNECT_MISS;
-	info.miss_icm_addr = nic_matcher->e_anchor->chunk->icm_addr;
+	info.miss_icm_addr = mlx5dr_icm_pool_get_chunk_icm_addr(nic_matcher->e_anchor->chunk);
 	mlx5dr_ste_set_formatted_ste(dmn->ste_ctx,
 				     dmn->info.caps.gvmi,
 				     nic_dmn->type,
@@ -447,7 +449,7 @@ dr_rule_rehash_htbl(struct mlx5dr_rule *rule,
 		 */
 		mlx5dr_ste_set_hit_addr(dmn->ste_ctx,
 					prev_htbl->ste_arr[0].hw_ste,
-					new_htbl->chunk->icm_addr,
+					mlx5dr_icm_pool_get_chunk_icm_addr(new_htbl->chunk),
 					new_htbl->chunk->num_of_entries);
 
 		ste_to_update = &prev_htbl->ste_arr[0];
@@ -755,6 +757,7 @@ static int dr_rule_handle_empty_entry(struct mlx5dr_matcher *matcher,
 {
 	struct mlx5dr_domain *dmn = matcher->tbl->dmn;
 	struct mlx5dr_ste_send_info *ste_info;
+	u64 icm_addr;
 
 	/* Take ref on table, only on first time this ste is used */
 	mlx5dr_htbl_get(cur_htbl);
@@ -762,8 +765,8 @@ static int dr_rule_handle_empty_entry(struct mlx5dr_matcher *matcher,
 	/* new entry -> new branch */
 	list_add_tail(&ste->miss_list_node, miss_list);
 
-	mlx5dr_ste_set_miss_addr(dmn->ste_ctx, hw_ste,
-				 nic_matcher->e_anchor->chunk->icm_addr);
+	icm_addr = mlx5dr_icm_pool_get_chunk_icm_addr(nic_matcher->e_anchor->chunk);
+	mlx5dr_ste_set_miss_addr(dmn->ste_ctx, hw_ste, icm_addr);
 
 	ste->ste_chain_location = ste_location;
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c
index c1465eb04a5b..0208b859205c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c
@@ -113,9 +113,10 @@ void mlx5dr_ste_set_hit_addr(struct mlx5dr_ste_ctx *ste_ctx,
 
 u64 mlx5dr_ste_get_icm_addr(struct mlx5dr_ste *ste)
 {
+	u64 base_icm_addr = mlx5dr_icm_pool_get_chunk_icm_addr(ste->htbl->chunk);
 	u32 index = ste - ste->htbl->ste_arr;
 
-	return ste->htbl->chunk->icm_addr + DR_STE_SIZE * index;
+	return base_icm_addr + DR_STE_SIZE * index;
 }
 
 u64 mlx5dr_ste_get_mr_addr(struct mlx5dr_ste *ste)
@@ -141,7 +142,8 @@ static void dr_ste_always_hit_htbl(struct mlx5dr_ste_ctx *ste_ctx,
 
 	ste_ctx->set_byte_mask(hw_ste, next_htbl->byte_mask);
 	ste_ctx->set_next_lu_type(hw_ste, next_htbl->lu_type);
-	ste_ctx->set_hit_addr(hw_ste, chunk->icm_addr, chunk->num_of_entries);
+	ste_ctx->set_hit_addr(hw_ste, mlx5dr_icm_pool_get_chunk_icm_addr(chunk),
+			      chunk->num_of_entries);
 
 	dr_ste_set_always_hit((struct dr_hw_ste_format *)ste->hw_ste);
 }
@@ -193,7 +195,7 @@ dr_ste_remove_head_ste(struct mlx5dr_ste_ctx *ste_ctx,
 	 * touches bit_mask area which doesn't exist at ste->hw_ste.
 	 */
 	memcpy(tmp_ste.hw_ste, ste->hw_ste, DR_STE_SIZE_REDUCED);
-	miss_addr = nic_matcher->e_anchor->chunk->icm_addr;
+	miss_addr = mlx5dr_icm_pool_get_chunk_icm_addr(nic_matcher->e_anchor->chunk);
 	dr_ste_always_miss_addr(ste_ctx, &tmp_ste, miss_addr);
 	memcpy(ste->hw_ste, tmp_ste.hw_ste, DR_STE_SIZE_REDUCED);
 
@@ -364,9 +366,10 @@ void mlx5dr_ste_set_hit_addr_by_next_htbl(struct mlx5dr_ste_ctx *ste_ctx,
 					  u8 *hw_ste,
 					  struct mlx5dr_ste_htbl *next_htbl)
 {
+	u64 icm_addr = mlx5dr_icm_pool_get_chunk_icm_addr(next_htbl->chunk);
 	struct mlx5dr_icm_chunk *chunk = next_htbl->chunk;
 
-	ste_ctx->set_hit_addr(hw_ste, chunk->icm_addr, chunk->num_of_entries);
+	ste_ctx->set_hit_addr(hw_ste, icm_addr, chunk->num_of_entries);
 }
 
 void mlx5dr_ste_prepare_for_postsend(struct mlx5dr_ste_ctx *ste_ctx,
@@ -444,7 +447,8 @@ int mlx5dr_ste_create_next_htbl(struct mlx5dr_matcher *matcher,
 
 		/* Write new table to HW */
 		info.type = CONNECT_MISS;
-		info.miss_icm_addr = nic_matcher->e_anchor->chunk->icm_addr;
+		info.miss_icm_addr =
+			mlx5dr_icm_pool_get_chunk_icm_addr(nic_matcher->e_anchor->chunk);
 		if (mlx5dr_ste_htbl_init_and_postsend(dmn, nic_dmn, next_htbl,
 						      &info, false)) {
 			mlx5dr_info(dmn, "Failed writing table to HW\n");
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_table.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_table.c
index f5f2d356e75f..e5f6412baea9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_table.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_table.c
@@ -10,6 +10,7 @@ static int dr_table_set_miss_action_nic(struct mlx5dr_domain *dmn,
 	struct mlx5dr_matcher_rx_tx *last_nic_matcher = NULL;
 	struct mlx5dr_htbl_connect_info info;
 	struct mlx5dr_ste_htbl *last_htbl;
+	struct mlx5dr_icm_chunk *chunk;
 	int ret;
 
 	if (!list_empty(&nic_tbl->nic_matcher_list))
@@ -22,13 +23,14 @@ static int dr_table_set_miss_action_nic(struct mlx5dr_domain *dmn,
 	else
 		last_htbl = nic_tbl->s_anchor;
 
-	if (action)
-		nic_tbl->default_icm_addr =
-			nic_tbl->nic_dmn->type == DR_DOMAIN_NIC_TYPE_RX ?
-				action->dest_tbl->tbl->rx.s_anchor->chunk->icm_addr :
-				action->dest_tbl->tbl->tx.s_anchor->chunk->icm_addr;
-	else
+	if (action) {
+		chunk = nic_tbl->nic_dmn->type == DR_DOMAIN_NIC_TYPE_RX ?
+			action->dest_tbl->tbl->rx.s_anchor->chunk :
+			action->dest_tbl->tbl->tx.s_anchor->chunk;
+		nic_tbl->default_icm_addr = mlx5dr_icm_pool_get_chunk_icm_addr(chunk);
+	} else {
 		nic_tbl->default_icm_addr = nic_tbl->nic_dmn->default_icm_addr;
+	}
 
 	info.type = CONNECT_MISS;
 	info.miss_icm_addr = nic_tbl->default_icm_addr;
@@ -222,10 +224,10 @@ static int dr_table_create_sw_owned_tbl(struct mlx5dr_table *tbl)
 	int ret;
 
 	if (tbl->rx.s_anchor)
-		icm_addr_rx = tbl->rx.s_anchor->chunk->icm_addr;
+		icm_addr_rx = mlx5dr_icm_pool_get_chunk_icm_addr(tbl->rx.s_anchor->chunk);
 
 	if (tbl->tx.s_anchor)
-		icm_addr_tx = tbl->tx.s_anchor->chunk->icm_addr;
+		icm_addr_tx = mlx5dr_icm_pool_get_chunk_icm_addr(tbl->tx.s_anchor->chunk);
 
 	ft_attr.table_type = tbl->table_type;
 	ft_attr.icm_addr_rx = icm_addr_rx;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
index dd5b013e901c..4fe0c8c623ce 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
@@ -1099,7 +1099,6 @@ struct mlx5dr_icm_chunk {
 	struct list_head chunk_list;
 	u32 num_of_entries;
 	u32 byte_size;
-	u64 icm_addr;
 
 	/* indicates the index of this chunk in the whole memory,
 	 * used for deleting the chunk from the buddy
@@ -1146,6 +1145,7 @@ int mlx5dr_matcher_select_builders(struct mlx5dr_matcher *matcher,
 
 u64 mlx5dr_icm_pool_get_chunk_mr_addr(struct mlx5dr_icm_chunk *chunk);
 u32 mlx5dr_icm_pool_get_chunk_rkey(struct mlx5dr_icm_chunk *chunk);
+u64 mlx5dr_icm_pool_get_chunk_icm_addr(struct mlx5dr_icm_chunk *chunk);
 
 static inline int
 mlx5dr_icm_pool_dm_type_to_entry_size(enum mlx5dr_icm_type icm_type)
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [net-next 10/15] net/mlx5: DR, Remove num_of_entries byte_size from struct mlx5_dr_icm_chunk
  2022-03-17 18:54 [pull request][net-next 00/15] mlx5 updates 2022-03-17 Saeed Mahameed
                   ` (8 preceding siblings ...)
  2022-03-17 18:54 ` [net-next 09/15] net/mlx5: DR, Remove icm_addr from mlx5dr_icm_chunk to reduce memory Saeed Mahameed
@ 2022-03-17 18:54 ` Saeed Mahameed
  2022-03-17 18:54 ` [net-next 11/15] net/mlx5: DR, Remove 4 members from mlx5dr_ste_htbl to reduce memory Saeed Mahameed
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2022-03-17 18:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Rongwei Liu, Shun Hao, Yevgeny Kliteynik, Saeed Mahameed

From: Rongwei Liu <rongweil@nvidia.com>

The goal is to reduce memory consumption at a large scale of flow rules.

The removed fields, num_of_entries and byte_size, can be calculated
quickly from the buddy memory pool:
1. num_of_entries is derived via mlx5dr_icm_pool_get_chunk_num_of_entries().
2. byte_size is derived via mlx5dr_icm_pool_get_chunk_byte_size().

Store the chunk size in struct mlx5dr_icm_chunk to speed up these
calculations; the duplicate in struct mlx5dr_ste_htbl will be removed
in an upcoming commit.

This commit reduces struct mlx5dr_icm_chunk by 8 bytes; its current
size is 56 bytes.
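
For illustration, a minimal caller-side sketch of the change (not part
of the diff below; 'chunk' is assumed to be a valid ICM chunk pointer):

        u32 bytes, entries;

        /* before: values cached as struct members */
        bytes = chunk->byte_size;
        entries = chunk->num_of_entries;

        /* after: derived on demand from chunk->size and the pool type */
        bytes = mlx5dr_icm_pool_get_chunk_byte_size(chunk);
        entries = mlx5dr_icm_pool_get_chunk_num_of_entries(chunk);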

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Reviewed-by: Shun Hao <shunh@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../mellanox/mlx5/core/steering/dr_icm_pool.c | 34 ++++++++++++-------
 .../mellanox/mlx5/core/steering/dr_rule.c     |  2 +-
 .../mellanox/mlx5/core/steering/dr_send.c     | 12 +++----
 .../mellanox/mlx5/core/steering/dr_ste.c      | 16 +++++----
 .../mellanox/mlx5/core/steering/dr_types.h    |  5 +--
 5 files changed, 42 insertions(+), 27 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_icm_pool.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_icm_pool.c
index 539af89da629..4ca67fa24cc6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_icm_pool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_icm_pool.c
@@ -76,6 +76,17 @@ u64 mlx5dr_icm_pool_get_chunk_icm_addr(struct mlx5dr_icm_chunk *chunk)
 	return (u64)chunk->buddy_mem->icm_mr->icm_start_addr + size * chunk->seg;
 }
 
+u32 mlx5dr_icm_pool_get_chunk_byte_size(struct mlx5dr_icm_chunk *chunk)
+{
+	return mlx5dr_icm_pool_chunk_size_to_byte(chunk->size,
+			chunk->buddy_mem->pool->icm_type);
+}
+
+u32 mlx5dr_icm_pool_get_chunk_num_of_entries(struct mlx5dr_icm_chunk *chunk)
+{
+	return mlx5dr_icm_pool_chunk_size_to_entries(chunk->size);
+}
+
 static struct mlx5dr_icm_mr *
 dr_icm_pool_mr_create(struct mlx5dr_icm_pool *pool)
 {
@@ -177,12 +188,13 @@ static void dr_icm_chunk_ste_init(struct mlx5dr_icm_chunk *chunk, int offset)
 
 static void dr_icm_chunk_ste_cleanup(struct mlx5dr_icm_chunk *chunk)
 {
+	int num_of_entries = mlx5dr_icm_pool_get_chunk_num_of_entries(chunk);
 	struct mlx5dr_icm_buddy_mem *buddy = chunk->buddy_mem;
 
 	memset(chunk->hw_ste_arr, 0,
-	       chunk->num_of_entries * dr_icm_buddy_get_ste_size(buddy));
+	       num_of_entries * dr_icm_buddy_get_ste_size(buddy));
 	memset(chunk->ste_arr, 0,
-	       chunk->num_of_entries * sizeof(chunk->ste_arr[0]));
+	       num_of_entries * sizeof(chunk->ste_arr[0]));
 }
 
 static enum mlx5dr_icm_type
@@ -196,7 +208,7 @@ static void dr_icm_chunk_destroy(struct mlx5dr_icm_chunk *chunk,
 {
 	enum mlx5dr_icm_type icm_type = get_chunk_icm_type(chunk);
 
-	buddy->used_memory -= chunk->byte_size;
+	buddy->used_memory -= mlx5dr_icm_pool_get_chunk_byte_size(chunk);
 	list_del(&chunk->chunk_list);
 
 	if (icm_type == DR_ICM_TYPE_STE)
@@ -317,17 +329,14 @@ dr_icm_chunk_create(struct mlx5dr_icm_pool *pool,
 
 	offset = mlx5dr_icm_pool_dm_type_to_entry_size(pool->icm_type) * seg;
 
-	chunk->num_of_entries =
-		mlx5dr_icm_pool_chunk_size_to_entries(chunk_size);
-	chunk->byte_size =
-		mlx5dr_icm_pool_chunk_size_to_byte(chunk_size, pool->icm_type);
 	chunk->seg = seg;
+	chunk->size = chunk_size;
 	chunk->buddy_mem = buddy_mem_pool;
 
 	if (pool->icm_type == DR_ICM_TYPE_STE)
 		dr_icm_chunk_ste_init(chunk, offset);
 
-	buddy_mem_pool->used_memory += chunk->byte_size;
+	buddy_mem_pool->used_memory += mlx5dr_icm_pool_get_chunk_byte_size(chunk);
 	INIT_LIST_HEAD(&chunk->chunk_list);
 
 	/* chunk now is part of the used_list */
@@ -351,6 +360,7 @@ static bool dr_icm_pool_is_sync_required(struct mlx5dr_icm_pool *pool)
 static int dr_icm_pool_sync_all_buddy_pools(struct mlx5dr_icm_pool *pool)
 {
 	struct mlx5dr_icm_buddy_mem *buddy, *tmp_buddy;
+	u32 num_entries;
 	int err;
 
 	err = mlx5dr_cmd_sync_steering(pool->dmn->mdev);
@@ -363,9 +373,9 @@ static int dr_icm_pool_sync_all_buddy_pools(struct mlx5dr_icm_pool *pool)
 		struct mlx5dr_icm_chunk *chunk, *tmp_chunk;
 
 		list_for_each_entry_safe(chunk, tmp_chunk, &buddy->hot_list, chunk_list) {
-			mlx5dr_buddy_free_mem(buddy, chunk->seg,
-					      ilog2(chunk->num_of_entries));
-			pool->hot_memory_size -= chunk->byte_size;
+			num_entries = mlx5dr_icm_pool_get_chunk_num_of_entries(chunk);
+			mlx5dr_buddy_free_mem(buddy, chunk->seg, ilog2(num_entries));
+			pool->hot_memory_size -= mlx5dr_icm_pool_get_chunk_byte_size(chunk);
 			dr_icm_chunk_destroy(chunk, buddy);
 		}
 
@@ -463,7 +473,7 @@ void mlx5dr_icm_free_chunk(struct mlx5dr_icm_chunk *chunk)
 	/* move the memory to the waiting list AKA "hot" */
 	mutex_lock(&pool->mutex);
 	list_move_tail(&chunk->chunk_list, &buddy->hot_list);
-	pool->hot_memory_size += chunk->byte_size;
+	pool->hot_memory_size += mlx5dr_icm_pool_get_chunk_byte_size(chunk);
 
 	/* Check if we have chunks that are waiting for sync-ste */
 	if (dr_icm_pool_is_sync_required(pool))
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c
index e76c1fda2ac9..91be9d9d95a8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c
@@ -450,7 +450,7 @@ dr_rule_rehash_htbl(struct mlx5dr_rule *rule,
 		mlx5dr_ste_set_hit_addr(dmn->ste_ctx,
 					prev_htbl->ste_arr[0].hw_ste,
 					mlx5dr_icm_pool_get_chunk_icm_addr(new_htbl->chunk),
-					new_htbl->chunk->num_of_entries);
+					mlx5dr_icm_pool_get_chunk_num_of_entries(new_htbl->chunk));
 
 		ste_to_update = &prev_htbl->ste_arr[0];
 	} else {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c
index 57765d231993..e0470dbd3116 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c
@@ -407,17 +407,17 @@ static int dr_get_tbl_copy_details(struct mlx5dr_domain *dmn,
 				   int *iterations,
 				   int *num_stes)
 {
+	u32 chunk_byte_size = mlx5dr_icm_pool_get_chunk_byte_size(htbl->chunk);
 	int alloc_size;
 
-	if (htbl->chunk->byte_size > dmn->send_ring->max_post_send_size) {
-		*iterations = htbl->chunk->byte_size /
-			dmn->send_ring->max_post_send_size;
+	if (chunk_byte_size > dmn->send_ring->max_post_send_size) {
+		*iterations = chunk_byte_size / dmn->send_ring->max_post_send_size;
 		*byte_size = dmn->send_ring->max_post_send_size;
 		alloc_size = *byte_size;
 		*num_stes = *byte_size / DR_STE_SIZE;
 	} else {
 		*iterations = 1;
-		*num_stes = htbl->chunk->num_of_entries;
+		*num_stes = mlx5dr_icm_pool_get_chunk_num_of_entries(htbl->chunk);
 		alloc_size = *num_stes * DR_STE_SIZE;
 	}
 
@@ -462,7 +462,7 @@ int mlx5dr_send_postsend_htbl(struct mlx5dr_domain *dmn,
 			      struct mlx5dr_ste_htbl *htbl,
 			      u8 *formatted_ste, u8 *mask)
 {
-	u32 byte_size = htbl->chunk->byte_size;
+	u32 byte_size = mlx5dr_icm_pool_get_chunk_byte_size(htbl->chunk);
 	int num_stes_per_iter;
 	int iterations;
 	u8 *data;
@@ -530,7 +530,7 @@ int mlx5dr_send_postsend_formatted_htbl(struct mlx5dr_domain *dmn,
 					u8 *ste_init_data,
 					bool update_hw_ste)
 {
-	u32 byte_size = htbl->chunk->byte_size;
+	u32 byte_size = mlx5dr_icm_pool_get_chunk_byte_size(htbl->chunk);
 	int iterations;
 	int num_stes;
 	u8 *copy_dst;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c
index 0208b859205c..3ff568e80e0e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c
@@ -25,6 +25,7 @@ bool mlx5dr_ste_supp_ttl_cs_recalc(struct mlx5dr_cmd_caps *caps)
 
 u32 mlx5dr_ste_calc_hash_index(u8 *hw_ste_p, struct mlx5dr_ste_htbl *htbl)
 {
+	u32 num_entries = mlx5dr_icm_pool_get_chunk_num_of_entries(htbl->chunk);
 	struct dr_hw_ste_format *hw_ste = (struct dr_hw_ste_format *)hw_ste_p;
 	u8 masked[DR_STE_SIZE_TAG] = {};
 	u32 crc32, index;
@@ -32,7 +33,7 @@ u32 mlx5dr_ste_calc_hash_index(u8 *hw_ste_p, struct mlx5dr_ste_htbl *htbl)
 	int i;
 
 	/* Don't calculate CRC if the result is predicted */
-	if (htbl->chunk->num_of_entries == 1 || htbl->byte_mask == 0)
+	if (num_entries == 1 || htbl->byte_mask == 0)
 		return 0;
 
 	/* Mask tag using byte mask, bit per byte */
@@ -45,7 +46,7 @@ u32 mlx5dr_ste_calc_hash_index(u8 *hw_ste_p, struct mlx5dr_ste_htbl *htbl)
 	}
 
 	crc32 = dr_ste_crc32_calc(masked, DR_STE_SIZE_TAG);
-	index = crc32 & (htbl->chunk->num_of_entries - 1);
+	index = crc32 & (num_entries - 1);
 
 	return index;
 }
@@ -143,7 +144,7 @@ static void dr_ste_always_hit_htbl(struct mlx5dr_ste_ctx *ste_ctx,
 	ste_ctx->set_byte_mask(hw_ste, next_htbl->byte_mask);
 	ste_ctx->set_next_lu_type(hw_ste, next_htbl->lu_type);
 	ste_ctx->set_hit_addr(hw_ste, mlx5dr_icm_pool_get_chunk_icm_addr(chunk),
-			      chunk->num_of_entries);
+			      mlx5dr_icm_pool_get_chunk_num_of_entries(chunk));
 
 	dr_ste_set_always_hit((struct dr_hw_ste_format *)ste->hw_ste);
 }
@@ -367,9 +368,10 @@ void mlx5dr_ste_set_hit_addr_by_next_htbl(struct mlx5dr_ste_ctx *ste_ctx,
 					  struct mlx5dr_ste_htbl *next_htbl)
 {
 	u64 icm_addr = mlx5dr_icm_pool_get_chunk_icm_addr(next_htbl->chunk);
-	struct mlx5dr_icm_chunk *chunk = next_htbl->chunk;
+	u32 num_entries =
+		mlx5dr_icm_pool_get_chunk_num_of_entries(next_htbl->chunk);
 
-	ste_ctx->set_hit_addr(hw_ste, icm_addr, chunk->num_of_entries);
+	ste_ctx->set_hit_addr(hw_ste, icm_addr, num_entries);
 }
 
 void mlx5dr_ste_prepare_for_postsend(struct mlx5dr_ste_ctx *ste_ctx,
@@ -474,6 +476,7 @@ struct mlx5dr_ste_htbl *mlx5dr_ste_htbl_alloc(struct mlx5dr_icm_pool *pool,
 {
 	struct mlx5dr_icm_chunk *chunk;
 	struct mlx5dr_ste_htbl *htbl;
+	u32 num_entries;
 	int i;
 
 	htbl = kzalloc(sizeof(*htbl), GFP_KERNEL);
@@ -491,8 +494,9 @@ struct mlx5dr_ste_htbl *mlx5dr_ste_htbl_alloc(struct mlx5dr_icm_pool *pool,
 	htbl->hw_ste_arr = chunk->hw_ste_arr;
 	htbl->miss_list = chunk->miss_list;
 	htbl->refcount = 0;
+	num_entries = mlx5dr_icm_pool_get_chunk_num_of_entries(chunk);
 
-	for (i = 0; i < chunk->num_of_entries; i++) {
+	for (i = 0; i < num_entries; i++) {
 		struct mlx5dr_ste *ste = &htbl->ste_arr[i];
 
 		ste->hw_ste = htbl->hw_ste_arr + i * DR_STE_SIZE_REDUCED;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
index 4fe0c8c623ce..9660296d36aa 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
@@ -1097,13 +1097,12 @@ int mlx5dr_rule_get_reverse_rule_members(struct mlx5dr_ste **ste_arr,
 struct mlx5dr_icm_chunk {
 	struct mlx5dr_icm_buddy_mem *buddy_mem;
 	struct list_head chunk_list;
-	u32 num_of_entries;
-	u32 byte_size;
 
 	/* indicates the index of this chunk in the whole memory,
 	 * used for deleting the chunk from the buddy
 	 */
 	unsigned int seg;
+	enum mlx5dr_icm_chunk_size size;
 
 	/* Memory optimisation */
 	struct mlx5dr_ste *ste_arr;
@@ -1146,6 +1145,8 @@ int mlx5dr_matcher_select_builders(struct mlx5dr_matcher *matcher,
 u64 mlx5dr_icm_pool_get_chunk_mr_addr(struct mlx5dr_icm_chunk *chunk);
 u32 mlx5dr_icm_pool_get_chunk_rkey(struct mlx5dr_icm_chunk *chunk);
 u64 mlx5dr_icm_pool_get_chunk_icm_addr(struct mlx5dr_icm_chunk *chunk);
+u32 mlx5dr_icm_pool_get_chunk_num_of_entries(struct mlx5dr_icm_chunk *chunk);
+u32 mlx5dr_icm_pool_get_chunk_byte_size(struct mlx5dr_icm_chunk *chunk);
 
 static inline int
 mlx5dr_icm_pool_dm_type_to_entry_size(enum mlx5dr_icm_type icm_type)
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [net-next 11/15] net/mlx5: DR, Remove 4 members from mlx5dr_ste_htbl to reduce memory
  2022-03-17 18:54 [pull request][net-next 00/15] mlx5 updates 2022-03-17 Saeed Mahameed
                   ` (9 preceding siblings ...)
  2022-03-17 18:54 ` [net-next 10/15] net/mlx5: DR, Remove num_of_entries byte_size from struct mlx5_dr_icm_chunk Saeed Mahameed
@ 2022-03-17 18:54 ` Saeed Mahameed
  2022-03-17 18:54 ` [net-next 12/15] net/mlx5: DR, Remove hw_ste from mlx5dr_ste " Saeed Mahameed
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2022-03-17 18:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Rongwei Liu, Shun Hao, Yevgeny Kliteynik, Saeed Mahameed

From: Rongwei Liu <rongweil@nvidia.com>

Remove chunk_size from struct mlx5dr_ste_htbl and use
htbl->chunk->size instead.

Also remove ste_arr/hw_ste_arr/miss_list, since they can be accessed
through the htbl->chunk pointer; there is no need to keep a copy.

This commit reduces struct mlx5dr_ste_htbl by 28 bytes; its size is
now 32 bytes.
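
A minimal sketch of the resulting access pattern (illustrative only;
'htbl' and 'idx' are assumed):

        struct mlx5dr_ste *ste;

        /* before: through the copies cached in the hash table */
        ste = &htbl->ste_arr[idx];

        /* after: through the single copy kept in the ICM chunk */
        ste = &htbl->chunk->ste_arr[idx];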

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Reviewed-by: Shun Hao <shunh@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../mellanox/mlx5/core/steering/dr_matcher.c  | 16 ++++++-----
 .../mellanox/mlx5/core/steering/dr_rule.c     | 28 +++++++++----------
 .../mellanox/mlx5/core/steering/dr_send.c     | 10 +++----
 .../mellanox/mlx5/core/steering/dr_ste.c      | 18 +++++-------
 .../mellanox/mlx5/core/steering/dr_types.h    | 11 ++------
 5 files changed, 37 insertions(+), 46 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_matcher.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_matcher.c
index 35154ec9673a..0726848eb3ff 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_matcher.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_matcher.c
@@ -726,12 +726,14 @@ static int dr_nic_matcher_connect(struct mlx5dr_domain *dmn,
 		return ret;
 
 	/* Update the pointing ste and next hash table */
-	curr_nic_matcher->s_htbl->pointing_ste = prev_htbl->ste_arr;
-	prev_htbl->ste_arr[0].next_htbl = curr_nic_matcher->s_htbl;
+	curr_nic_matcher->s_htbl->pointing_ste = prev_htbl->chunk->ste_arr;
+	prev_htbl->chunk->ste_arr[0].next_htbl = curr_nic_matcher->s_htbl;
 
 	if (next_nic_matcher) {
-		next_nic_matcher->s_htbl->pointing_ste = curr_nic_matcher->e_anchor->ste_arr;
-		curr_nic_matcher->e_anchor->ste_arr[0].next_htbl = next_nic_matcher->s_htbl;
+		next_nic_matcher->s_htbl->pointing_ste =
+			curr_nic_matcher->e_anchor->chunk->ste_arr;
+		curr_nic_matcher->e_anchor->chunk->ste_arr[0].next_htbl =
+			next_nic_matcher->s_htbl;
 	}
 
 	return 0;
@@ -1043,12 +1045,12 @@ static int dr_matcher_disconnect_nic(struct mlx5dr_domain *dmn,
 	if (next_nic_matcher) {
 		info.type = CONNECT_HIT;
 		info.hit_next_htbl = next_nic_matcher->s_htbl;
-		next_nic_matcher->s_htbl->pointing_ste = prev_anchor->ste_arr;
-		prev_anchor->ste_arr[0].next_htbl = next_nic_matcher->s_htbl;
+		next_nic_matcher->s_htbl->pointing_ste = prev_anchor->chunk->ste_arr;
+		prev_anchor->chunk->ste_arr[0].next_htbl = next_nic_matcher->s_htbl;
 	} else {
 		info.type = CONNECT_MISS;
 		info.miss_icm_addr = nic_tbl->default_icm_addr;
-		prev_anchor->ste_arr[0].next_htbl = NULL;
+		prev_anchor->chunk->ste_arr[0].next_htbl = NULL;
 	}
 
 	return mlx5dr_ste_htbl_init_and_postsend(dmn, nic_dmn, prev_anchor,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c
index 91be9d9d95a8..698e1cfc9571 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c
@@ -54,7 +54,7 @@ dr_rule_create_collision_htbl(struct mlx5dr_matcher *matcher,
 	}
 
 	/* One and only entry, never grows */
-	ste = new_htbl->ste_arr;
+	ste = new_htbl->chunk->ste_arr;
 	icm_addr = mlx5dr_icm_pool_get_chunk_icm_addr(nic_matcher->e_anchor->chunk);
 	mlx5dr_ste_set_miss_addr(ste_ctx, hw_ste, icm_addr);
 	mlx5dr_htbl_get(new_htbl);
@@ -80,7 +80,7 @@ dr_rule_create_collision_entry(struct mlx5dr_matcher *matcher,
 	ste->htbl->pointing_ste = orig_ste->htbl->pointing_ste;
 
 	/* In collision entry, all members share the same miss_list_head */
-	ste->htbl->miss_list = mlx5dr_ste_get_miss_list(orig_ste);
+	ste->htbl->chunk->miss_list = mlx5dr_ste_get_miss_list(orig_ste);
 
 	/* Next table */
 	if (mlx5dr_ste_create_next_htbl(matcher, nic_matcher, ste, hw_ste,
@@ -186,7 +186,7 @@ dr_rule_rehash_handle_collision(struct mlx5dr_matcher *matcher,
 	new_ste->htbl->pointing_ste = col_ste->htbl->pointing_ste;
 
 	/* In collision entry, all members share the same miss_list_head */
-	new_ste->htbl->miss_list = mlx5dr_ste_get_miss_list(col_ste);
+	new_ste->htbl->chunk->miss_list = mlx5dr_ste_get_miss_list(col_ste);
 
 	/* Update the previous from the list */
 	ret = dr_rule_append_to_miss_list(dmn->ste_ctx, new_ste,
@@ -250,7 +250,7 @@ dr_rule_rehash_copy_ste(struct mlx5dr_matcher *matcher,
 	mlx5dr_ste_set_miss_addr(dmn->ste_ctx, hw_ste, icm_addr);
 
 	new_idx = mlx5dr_ste_calc_hash_index(hw_ste, new_htbl);
-	new_ste = &new_htbl->ste_arr[new_idx];
+	new_ste = &new_htbl->chunk->ste_arr[new_idx];
 
 	if (mlx5dr_ste_is_not_used(new_ste)) {
 		mlx5dr_htbl_get(new_htbl);
@@ -336,7 +336,7 @@ static int dr_rule_rehash_copy_htbl(struct mlx5dr_matcher *matcher,
 	int err = 0;
 	int i;
 
-	cur_entries = mlx5dr_icm_pool_chunk_size_to_entries(cur_htbl->chunk_size);
+	cur_entries = mlx5dr_icm_pool_chunk_size_to_entries(cur_htbl->chunk->size);
 
 	if (cur_entries < 1) {
 		mlx5dr_dbg(matcher->tbl->dmn, "Invalid number of entries\n");
@@ -344,7 +344,7 @@ static int dr_rule_rehash_copy_htbl(struct mlx5dr_matcher *matcher,
 	}
 
 	for (i = 0; i < cur_entries; i++) {
-		cur_ste = &cur_htbl->ste_arr[i];
+		cur_ste = &cur_htbl->chunk->ste_arr[i];
 		if (mlx5dr_ste_is_not_used(cur_ste)) /* Empty, nothing to copy */
 			continue;
 
@@ -448,11 +448,11 @@ dr_rule_rehash_htbl(struct mlx5dr_rule *rule,
 		 * (48B len) which works only on first 32B
 		 */
 		mlx5dr_ste_set_hit_addr(dmn->ste_ctx,
-					prev_htbl->ste_arr[0].hw_ste,
+					prev_htbl->chunk->ste_arr[0].hw_ste,
 					mlx5dr_icm_pool_get_chunk_icm_addr(new_htbl->chunk),
 					mlx5dr_icm_pool_get_chunk_num_of_entries(new_htbl->chunk));
 
-		ste_to_update = &prev_htbl->ste_arr[0];
+		ste_to_update = &prev_htbl->chunk->ste_arr[0];
 	} else {
 		mlx5dr_ste_set_hit_addr_by_next_htbl(dmn->ste_ctx,
 						     cur_htbl->pointing_ste->hw_ste,
@@ -491,10 +491,10 @@ static struct mlx5dr_ste_htbl *dr_rule_rehash(struct mlx5dr_rule *rule,
 	struct mlx5dr_domain *dmn = rule->matcher->tbl->dmn;
 	enum mlx5dr_icm_chunk_size new_size;
 
-	new_size = mlx5dr_icm_next_higher_chunk(cur_htbl->chunk_size);
+	new_size = mlx5dr_icm_next_higher_chunk(cur_htbl->chunk->size);
 	new_size = min_t(u32, new_size, dmn->info.max_log_sw_icm_sz);
 
-	if (new_size == cur_htbl->chunk_size)
+	if (new_size == cur_htbl->chunk->size)
 		return NULL; /* Skip rehash, we already at the max size */
 
 	return dr_rule_rehash_htbl(rule, nic_rule, cur_htbl, ste_location,
@@ -661,13 +661,13 @@ static bool dr_rule_need_enlarge_hash(struct mlx5dr_ste_htbl *htbl,
 	struct mlx5dr_ste_htbl_ctrl *ctrl = &htbl->ctrl;
 	int threshold;
 
-	if (dmn->info.max_log_sw_icm_sz <= htbl->chunk_size)
+	if (dmn->info.max_log_sw_icm_sz <= htbl->chunk->size)
 		return false;
 
 	if (!mlx5dr_ste_htbl_may_grow(htbl))
 		return false;
 
-	if (dr_get_bits_per_mask(htbl->byte_mask) * BITS_PER_BYTE <= htbl->chunk_size)
+	if (dr_get_bits_per_mask(htbl->byte_mask) * BITS_PER_BYTE <= htbl->chunk->size)
 		return false;
 
 	threshold = mlx5dr_ste_htbl_increase_threshold(htbl);
@@ -825,7 +825,7 @@ dr_rule_handle_ste_branch(struct mlx5dr_rule *rule,
 again:
 	index = mlx5dr_ste_calc_hash_index(hw_ste, cur_htbl);
 	miss_list = &cur_htbl->chunk->miss_list[index];
-	ste = &cur_htbl->ste_arr[index];
+	ste = &cur_htbl->chunk->ste_arr[index];
 
 	if (mlx5dr_ste_is_not_used(ste)) {
 		if (dr_rule_handle_empty_entry(matcher, nic_matcher, cur_htbl,
@@ -861,7 +861,7 @@ dr_rule_handle_ste_branch(struct mlx5dr_rule *rule,
 						  ste_location, send_ste_list);
 			if (!new_htbl) {
 				mlx5dr_err(dmn, "Failed creating rehash table, htbl-log_size: %d\n",
-					   cur_htbl->chunk_size);
+					   cur_htbl->chunk->size);
 				mlx5dr_htbl_put(cur_htbl);
 			} else {
 				cur_htbl = new_htbl;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c
index e0470dbd3116..26a91c4415c5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c
@@ -486,7 +486,7 @@ int mlx5dr_send_postsend_htbl(struct mlx5dr_domain *dmn,
 		 * need to add the bit_mask
 		 */
 		for (j = 0; j < num_stes_per_iter; j++) {
-			struct mlx5dr_ste *ste = &htbl->ste_arr[ste_index + j];
+			struct mlx5dr_ste *ste = &htbl->chunk->ste_arr[ste_index + j];
 			u32 ste_off = j * DR_STE_SIZE;
 
 			if (mlx5dr_ste_is_not_used(ste)) {
@@ -495,7 +495,7 @@ int mlx5dr_send_postsend_htbl(struct mlx5dr_domain *dmn,
 			} else {
 				/* Copy data */
 				memcpy(data + ste_off,
-				       htbl->ste_arr[ste_index + j].hw_ste,
+				       htbl->chunk->ste_arr[ste_index + j].hw_ste,
 				       DR_STE_SIZE_REDUCED);
 				/* Copy bit_mask */
 				memcpy(data + ste_off + DR_STE_SIZE_REDUCED,
@@ -511,7 +511,7 @@ int mlx5dr_send_postsend_htbl(struct mlx5dr_domain *dmn,
 		send_info.write.length = byte_size;
 		send_info.write.lkey = 0;
 		send_info.remote_addr =
-			mlx5dr_ste_get_mr_addr(htbl->ste_arr + ste_index);
+			mlx5dr_ste_get_mr_addr(htbl->chunk->ste_arr + ste_index);
 		send_info.rkey = mlx5dr_icm_pool_get_chunk_rkey(htbl->chunk);
 
 		ret = dr_postsend_icm_data(dmn, &send_info);
@@ -546,7 +546,7 @@ int mlx5dr_send_postsend_formatted_htbl(struct mlx5dr_domain *dmn,
 	if (update_hw_ste) {
 		/* Copy the reduced STE to hash table ste_arr */
 		for (i = 0; i < num_stes; i++) {
-			copy_dst = htbl->hw_ste_arr + i * DR_STE_SIZE_REDUCED;
+			copy_dst = htbl->chunk->hw_ste_arr + i * DR_STE_SIZE_REDUCED;
 			memcpy(copy_dst, ste_init_data, DR_STE_SIZE_REDUCED);
 		}
 	}
@@ -568,7 +568,7 @@ int mlx5dr_send_postsend_formatted_htbl(struct mlx5dr_domain *dmn,
 		send_info.write.length = byte_size;
 		send_info.write.lkey = 0;
 		send_info.remote_addr =
-			mlx5dr_ste_get_mr_addr(htbl->ste_arr + ste_index);
+			mlx5dr_ste_get_mr_addr(htbl->chunk->ste_arr + ste_index);
 		send_info.rkey = mlx5dr_icm_pool_get_chunk_rkey(htbl->chunk);
 
 		ret = dr_postsend_icm_data(dmn, &send_info);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c
index 3ff568e80e0e..3ab155feba5e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c
@@ -115,23 +115,23 @@ void mlx5dr_ste_set_hit_addr(struct mlx5dr_ste_ctx *ste_ctx,
 u64 mlx5dr_ste_get_icm_addr(struct mlx5dr_ste *ste)
 {
 	u64 base_icm_addr = mlx5dr_icm_pool_get_chunk_icm_addr(ste->htbl->chunk);
-	u32 index = ste - ste->htbl->ste_arr;
+	u32 index = ste - ste->htbl->chunk->ste_arr;
 
 	return base_icm_addr + DR_STE_SIZE * index;
 }
 
 u64 mlx5dr_ste_get_mr_addr(struct mlx5dr_ste *ste)
 {
-	u32 index = ste - ste->htbl->ste_arr;
+	u32 index = ste - ste->htbl->chunk->ste_arr;
 
 	return mlx5dr_icm_pool_get_chunk_mr_addr(ste->htbl->chunk) + DR_STE_SIZE * index;
 }
 
 struct list_head *mlx5dr_ste_get_miss_list(struct mlx5dr_ste *ste)
 {
-	u32 index = ste - ste->htbl->ste_arr;
+	u32 index = ste - ste->htbl->chunk->ste_arr;
 
-	return &ste->htbl->miss_list[index];
+	return &ste->htbl->chunk->miss_list[index];
 }
 
 static void dr_ste_always_hit_htbl(struct mlx5dr_ste_ctx *ste_ctx,
@@ -490,23 +490,19 @@ struct mlx5dr_ste_htbl *mlx5dr_ste_htbl_alloc(struct mlx5dr_icm_pool *pool,
 	htbl->chunk = chunk;
 	htbl->lu_type = lu_type;
 	htbl->byte_mask = byte_mask;
-	htbl->ste_arr = chunk->ste_arr;
-	htbl->hw_ste_arr = chunk->hw_ste_arr;
-	htbl->miss_list = chunk->miss_list;
 	htbl->refcount = 0;
 	num_entries = mlx5dr_icm_pool_get_chunk_num_of_entries(chunk);
 
 	for (i = 0; i < num_entries; i++) {
-		struct mlx5dr_ste *ste = &htbl->ste_arr[i];
+		struct mlx5dr_ste *ste = &chunk->ste_arr[i];
 
-		ste->hw_ste = htbl->hw_ste_arr + i * DR_STE_SIZE_REDUCED;
+		ste->hw_ste = chunk->hw_ste_arr + i * DR_STE_SIZE_REDUCED;
 		ste->htbl = htbl;
 		ste->refcount = 0;
 		INIT_LIST_HEAD(&ste->miss_list_node);
-		INIT_LIST_HEAD(&htbl->miss_list[i]);
+		INIT_LIST_HEAD(&chunk->miss_list[i]);
 	}
 
-	htbl->chunk_size = chunk_size;
 	return htbl;
 
 out_free_htbl:
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
index 9660296d36aa..1294c12ceb10 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
@@ -181,14 +181,7 @@ struct mlx5dr_ste_htbl {
 	u16 byte_mask;
 	u32 refcount;
 	struct mlx5dr_icm_chunk *chunk;
-	struct mlx5dr_ste *ste_arr;
-	u8 *hw_ste_arr;
-
-	struct list_head *miss_list;
-
-	enum mlx5dr_icm_chunk_size chunk_size;
 	struct mlx5dr_ste *pointing_ste;
-
 	struct mlx5dr_ste_htbl_ctrl ctrl;
 };
 
@@ -1180,7 +1173,7 @@ static inline int
 mlx5dr_ste_htbl_increase_threshold(struct mlx5dr_ste_htbl *htbl)
 {
 	int num_of_entries =
-		mlx5dr_icm_pool_chunk_size_to_entries(htbl->chunk_size);
+		mlx5dr_icm_pool_chunk_size_to_entries(htbl->chunk->size);
 
 	/* Threshold is 50%, one is added to table of size 1 */
 	return (num_of_entries + 1) / 2;
@@ -1189,7 +1182,7 @@ mlx5dr_ste_htbl_increase_threshold(struct mlx5dr_ste_htbl *htbl)
 static inline bool
 mlx5dr_ste_htbl_may_grow(struct mlx5dr_ste_htbl *htbl)
 {
-	if (htbl->chunk_size == DR_CHUNK_SIZE_MAX - 1 || !htbl->byte_mask)
+	if (htbl->chunk->size == DR_CHUNK_SIZE_MAX - 1 || !htbl->byte_mask)
 		return false;
 
 	return true;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [net-next 12/15] net/mlx5: DR, Remove hw_ste from mlx5dr_ste to reduce memory
  2022-03-17 18:54 [pull request][net-next 00/15] mlx5 updates 2022-03-17 Saeed Mahameed
                   ` (10 preceding siblings ...)
  2022-03-17 18:54 ` [net-next 11/15] net/mlx5: DR, Remove 4 members from mlx5dr_ste_htbl to reduce memory Saeed Mahameed
@ 2022-03-17 18:54 ` Saeed Mahameed
  2022-03-17 18:54 ` [net-next 13/15] net/mlx5: CT: Remove extra rhashtable remove on tuple entries Saeed Mahameed
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2022-03-17 18:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Rongwei Liu, Shun Hao, Yevgeny Kliteynik, Saeed Mahameed

From: Rongwei Liu <rongweil@nvidia.com>

The hw_ste pointer can be calculated via the new helper
mlx5dr_ste_get_hw_ste(). The calculation is simple and lightweight,
so there is no need for a dedicated member.

This reduces struct mlx5dr_ste by 8 bytes; its size is now 48 bytes.
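
The derivation is plain pointer arithmetic over the two parallel arrays
kept in the chunk; a sketch mirroring the new helper:

        /* position of this STE within its hash table's chunk */
        u64 index = ste - ste->htbl->chunk->ste_arr;

        /* matching entry in the parallel reduced-size hw_ste array */
        u8 *hw_ste = ste->htbl->chunk->hw_ste_arr + DR_STE_SIZE_REDUCED * index;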

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Reviewed-by: Shun Hao <shunh@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../mellanox/mlx5/core/steering/dr_dbg.c      |  3 +-
 .../mellanox/mlx5/core/steering/dr_rule.c     | 24 +++----
 .../mellanox/mlx5/core/steering/dr_send.c     |  3 +-
 .../mellanox/mlx5/core/steering/dr_ste.c      | 63 +++++++++++--------
 .../mellanox/mlx5/core/steering/dr_types.h    |  2 +-
 5 files changed, 55 insertions(+), 40 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c
index 8fd98b628740..d5998ef59be4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c
@@ -217,7 +217,8 @@ dr_dump_rule_mem(struct seq_file *file, struct mlx5dr_ste *ste,
 				       DR_DUMP_REC_TYPE_RULE_TX_ENTRY_V1;
 	}
 
-	dr_dump_hex_print(hw_ste_dump, (char *)ste->hw_ste, DR_STE_SIZE_REDUCED);
+	dr_dump_hex_print(hw_ste_dump, (char *)mlx5dr_ste_get_hw_ste(ste),
+			  DR_STE_SIZE_REDUCED);
 
 	seq_printf(file, "%d,0x%llx,0x%llx,%s\n", mem_rec_type,
 		   dr_dump_icm_to_idx(mlx5dr_ste_get_icm_addr(ste)), rule_id,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c
index 698e1cfc9571..ddfaf7891188 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c
@@ -21,12 +21,12 @@ static int dr_rule_append_to_miss_list(struct mlx5dr_ste_ctx *ste_ctx,
 	if (!ste_info_last)
 		return -ENOMEM;
 
-	mlx5dr_ste_set_miss_addr(ste_ctx, last_ste->hw_ste,
+	mlx5dr_ste_set_miss_addr(ste_ctx, mlx5dr_ste_get_hw_ste(last_ste),
 				 mlx5dr_ste_get_icm_addr(new_last_ste));
 	list_add_tail(&new_last_ste->miss_list_node, miss_list);
 
 	mlx5dr_send_fill_and_append_ste_send_info(last_ste, DR_STE_SIZE_CTRL,
-						  0, last_ste->hw_ste,
+						  0, mlx5dr_ste_get_hw_ste(last_ste),
 						  ste_info_last, send_list, true);
 
 	return 0;
@@ -108,9 +108,11 @@ dr_rule_handle_one_ste_in_update_list(struct mlx5dr_ste_send_info *ste_info,
 	 * is already written to the hw.
 	 */
 	if (ste_info->size == DR_STE_SIZE_CTRL)
-		memcpy(ste_info->ste->hw_ste, ste_info->data, DR_STE_SIZE_CTRL);
+		memcpy(mlx5dr_ste_get_hw_ste(ste_info->ste),
+		       ste_info->data, DR_STE_SIZE_CTRL);
 	else
-		memcpy(ste_info->ste->hw_ste, ste_info->data, DR_STE_SIZE_REDUCED);
+		memcpy(mlx5dr_ste_get_hw_ste(ste_info->ste),
+		       ste_info->data, DR_STE_SIZE_REDUCED);
 
 	ret = mlx5dr_send_postsend_ste(dmn, ste_info->ste, ste_info->data,
 				       ste_info->size, ste_info->offset);
@@ -160,7 +162,7 @@ dr_rule_find_ste_in_miss_list(struct list_head *miss_list, u8 *hw_ste)
 
 	/* Check if hw_ste is present in the list */
 	list_for_each_entry(ste, miss_list, miss_list_node) {
-		if (mlx5dr_ste_equal_tag(ste->hw_ste, hw_ste))
+		if (mlx5dr_ste_equal_tag(mlx5dr_ste_get_hw_ste(ste), hw_ste))
 			return ste;
 	}
 
@@ -246,7 +248,7 @@ dr_rule_rehash_copy_ste(struct mlx5dr_matcher *matcher,
 
 	/* Copy STE control and tag */
 	icm_addr = mlx5dr_icm_pool_get_chunk_icm_addr(nic_matcher->e_anchor->chunk);
-	memcpy(hw_ste, cur_ste->hw_ste, DR_STE_SIZE_REDUCED);
+	memcpy(hw_ste, mlx5dr_ste_get_hw_ste(cur_ste), DR_STE_SIZE_REDUCED);
 	mlx5dr_ste_set_miss_addr(dmn->ste_ctx, hw_ste, icm_addr);
 
 	new_idx = mlx5dr_ste_calc_hash_index(hw_ste, new_htbl);
@@ -271,7 +273,7 @@ dr_rule_rehash_copy_ste(struct mlx5dr_matcher *matcher,
 		use_update_list = true;
 	}
 
-	memcpy(new_ste->hw_ste, hw_ste, DR_STE_SIZE_REDUCED);
+	memcpy(mlx5dr_ste_get_hw_ste(new_ste), hw_ste, DR_STE_SIZE_REDUCED);
 
 	new_htbl->ctrl.num_of_valid_entries++;
 
@@ -448,21 +450,21 @@ dr_rule_rehash_htbl(struct mlx5dr_rule *rule,
 		 * (48B len) which works only on first 32B
 		 */
 		mlx5dr_ste_set_hit_addr(dmn->ste_ctx,
-					prev_htbl->chunk->ste_arr[0].hw_ste,
+					prev_htbl->chunk->hw_ste_arr,
 					mlx5dr_icm_pool_get_chunk_icm_addr(new_htbl->chunk),
 					mlx5dr_icm_pool_get_chunk_num_of_entries(new_htbl->chunk));
 
 		ste_to_update = &prev_htbl->chunk->ste_arr[0];
 	} else {
 		mlx5dr_ste_set_hit_addr_by_next_htbl(dmn->ste_ctx,
-						     cur_htbl->pointing_ste->hw_ste,
+						     mlx5dr_ste_get_hw_ste(cur_htbl->pointing_ste),
 						     new_htbl);
 		ste_to_update = cur_htbl->pointing_ste;
 	}
 
 	mlx5dr_send_fill_and_append_ste_send_info(ste_to_update, DR_STE_SIZE_CTRL,
-						  0, ste_to_update->hw_ste, ste_info,
-						  update_list, false);
+						  0, mlx5dr_ste_get_hw_ste(ste_to_update),
+						  ste_info, update_list, false);
 
 	return new_htbl;
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c
index 26a91c4415c5..ef19a66f5233 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c
@@ -495,7 +495,8 @@ int mlx5dr_send_postsend_htbl(struct mlx5dr_domain *dmn,
 			} else {
 				/* Copy data */
 				memcpy(data + ste_off,
-				       htbl->chunk->ste_arr[ste_index + j].hw_ste,
+				       htbl->chunk->hw_ste_arr +
+				       DR_STE_SIZE_REDUCED * (ste_index + j),
 				       DR_STE_SIZE_REDUCED);
 				/* Copy bit_mask */
 				memcpy(data + ste_off + DR_STE_SIZE_REDUCED,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c
index 3ab155feba5e..09ebd3088857 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c
@@ -97,13 +97,11 @@ void mlx5dr_ste_set_miss_addr(struct mlx5dr_ste_ctx *ste_ctx,
 }
 
 static void dr_ste_always_miss_addr(struct mlx5dr_ste_ctx *ste_ctx,
-				    struct mlx5dr_ste *ste, u64 miss_addr)
+				    u8 *hw_ste, u64 miss_addr)
 {
-	u8 *hw_ste_p = ste->hw_ste;
-
-	ste_ctx->set_next_lu_type(hw_ste_p, MLX5DR_STE_LU_TYPE_DONT_CARE);
-	ste_ctx->set_miss_addr(hw_ste_p, miss_addr);
-	dr_ste_set_always_miss((struct dr_hw_ste_format *)ste->hw_ste);
+	ste_ctx->set_next_lu_type(hw_ste, MLX5DR_STE_LU_TYPE_DONT_CARE);
+	ste_ctx->set_miss_addr(hw_ste, miss_addr);
+	dr_ste_set_always_miss((struct dr_hw_ste_format *)hw_ste);
 }
 
 void mlx5dr_ste_set_hit_addr(struct mlx5dr_ste_ctx *ste_ctx,
@@ -127,6 +125,13 @@ u64 mlx5dr_ste_get_mr_addr(struct mlx5dr_ste *ste)
 	return mlx5dr_icm_pool_get_chunk_mr_addr(ste->htbl->chunk) + DR_STE_SIZE * index;
 }
 
+u8 *mlx5dr_ste_get_hw_ste(struct mlx5dr_ste *ste)
+{
+	u64 index = ste - ste->htbl->chunk->ste_arr;
+
+	return ste->htbl->chunk->hw_ste_arr + DR_STE_SIZE_REDUCED * index;
+}
+
 struct list_head *mlx5dr_ste_get_miss_list(struct mlx5dr_ste *ste)
 {
 	u32 index = ste - ste->htbl->chunk->ste_arr;
@@ -135,18 +140,17 @@ struct list_head *mlx5dr_ste_get_miss_list(struct mlx5dr_ste *ste)
 }
 
 static void dr_ste_always_hit_htbl(struct mlx5dr_ste_ctx *ste_ctx,
-				   struct mlx5dr_ste *ste,
+				   u8 *hw_ste,
 				   struct mlx5dr_ste_htbl *next_htbl)
 {
 	struct mlx5dr_icm_chunk *chunk = next_htbl->chunk;
-	u8 *hw_ste = ste->hw_ste;
 
 	ste_ctx->set_byte_mask(hw_ste, next_htbl->byte_mask);
 	ste_ctx->set_next_lu_type(hw_ste, next_htbl->lu_type);
 	ste_ctx->set_hit_addr(hw_ste, mlx5dr_icm_pool_get_chunk_icm_addr(chunk),
 			      mlx5dr_icm_pool_get_chunk_num_of_entries(chunk));
 
-	dr_ste_set_always_hit((struct dr_hw_ste_format *)ste->hw_ste);
+	dr_ste_set_always_hit((struct dr_hw_ste_format *)hw_ste);
 }
 
 bool mlx5dr_ste_is_last_in_rule(struct mlx5dr_matcher_rx_tx *nic_matcher,
@@ -169,7 +173,8 @@ bool mlx5dr_ste_is_last_in_rule(struct mlx5dr_matcher_rx_tx *nic_matcher,
  */
 static void dr_ste_replace(struct mlx5dr_ste *dst, struct mlx5dr_ste *src)
 {
-	memcpy(dst->hw_ste, src->hw_ste, DR_STE_SIZE_REDUCED);
+	memcpy(mlx5dr_ste_get_hw_ste(dst), mlx5dr_ste_get_hw_ste(src),
+	       DR_STE_SIZE_REDUCED);
 	dst->next_htbl = src->next_htbl;
 	if (dst->next_htbl)
 		dst->next_htbl->pointing_ste = dst;
@@ -187,18 +192,17 @@ dr_ste_remove_head_ste(struct mlx5dr_ste_ctx *ste_ctx,
 		       struct mlx5dr_ste_htbl *stats_tbl)
 {
 	u8 tmp_data_ste[DR_STE_SIZE] = {};
-	struct mlx5dr_ste tmp_ste = {};
 	u64 miss_addr;
 
-	tmp_ste.hw_ste = tmp_data_ste;
+	miss_addr = mlx5dr_icm_pool_get_chunk_icm_addr(nic_matcher->e_anchor->chunk);
 
 	/* Use temp ste because dr_ste_always_miss_addr
 	 * touches bit_mask area which doesn't exist at ste->hw_ste.
+	 * Need to use a full-sized (DR_STE_SIZE) hw_ste.
 	 */
-	memcpy(tmp_ste.hw_ste, ste->hw_ste, DR_STE_SIZE_REDUCED);
-	miss_addr = mlx5dr_icm_pool_get_chunk_icm_addr(nic_matcher->e_anchor->chunk);
-	dr_ste_always_miss_addr(ste_ctx, &tmp_ste, miss_addr);
-	memcpy(ste->hw_ste, tmp_ste.hw_ste, DR_STE_SIZE_REDUCED);
+	memcpy(tmp_data_ste, mlx5dr_ste_get_hw_ste(ste), DR_STE_SIZE_REDUCED);
+	dr_ste_always_miss_addr(ste_ctx, tmp_data_ste, miss_addr);
+	memcpy(mlx5dr_ste_get_hw_ste(ste), tmp_data_ste, DR_STE_SIZE_REDUCED);
 
 	list_del_init(&ste->miss_list_node);
 
@@ -240,7 +244,7 @@ dr_ste_replace_head_ste(struct mlx5dr_matcher_rx_tx *nic_matcher,
 	mlx5dr_rule_set_last_member(next_ste->rule_rx_tx, ste, false);
 
 	/* Copy all 64 hw_ste bytes */
-	memcpy(hw_ste, ste->hw_ste, DR_STE_SIZE_REDUCED);
+	memcpy(hw_ste, mlx5dr_ste_get_hw_ste(ste), DR_STE_SIZE_REDUCED);
 	sb_idx = ste->ste_chain_location - 1;
 	mlx5dr_ste_set_bit_mask(hw_ste,
 				nic_matcher->ste_builder[sb_idx].bit_mask);
@@ -276,12 +280,13 @@ static void dr_ste_remove_middle_ste(struct mlx5dr_ste_ctx *ste_ctx,
 	if (WARN_ON(!prev_ste))
 		return;
 
-	miss_addr = ste_ctx->get_miss_addr(ste->hw_ste);
-	ste_ctx->set_miss_addr(prev_ste->hw_ste, miss_addr);
+	miss_addr = ste_ctx->get_miss_addr(mlx5dr_ste_get_hw_ste(ste));
+	ste_ctx->set_miss_addr(mlx5dr_ste_get_hw_ste(prev_ste), miss_addr);
 
 	mlx5dr_send_fill_and_append_ste_send_info(prev_ste, DR_STE_SIZE_CTRL, 0,
-						  prev_ste->hw_ste, ste_info,
-						  send_ste_list, true /* Copy data*/);
+						  mlx5dr_ste_get_hw_ste(prev_ste),
+						  ste_info, send_ste_list,
+						  true /* Copy data*/);
 
 	list_del_init(&ste->miss_list_node);
 
@@ -390,15 +395,22 @@ void mlx5dr_ste_set_formatted_ste(struct mlx5dr_ste_ctx *ste_ctx,
 				  struct mlx5dr_htbl_connect_info *connect_info)
 {
 	bool is_rx = nic_type == DR_DOMAIN_NIC_TYPE_RX;
-	struct mlx5dr_ste ste = {};
+	u8 tmp_hw_ste[DR_STE_SIZE] = {0};
 
 	ste_ctx->ste_init(formatted_ste, htbl->lu_type, is_rx, gvmi);
-	ste.hw_ste = formatted_ste;
 
+	/* Use temp ste because dr_ste_always_miss_addr/hit_htbl
+	 * touches bit_mask area which doesn't exist at ste->hw_ste.
+	 * Need to use a full-sized (DR_STE_SIZE) hw_ste.
+	 */
+	memcpy(tmp_hw_ste, formatted_ste, DR_STE_SIZE_REDUCED);
 	if (connect_info->type == CONNECT_HIT)
-		dr_ste_always_hit_htbl(ste_ctx, &ste, connect_info->hit_next_htbl);
+		dr_ste_always_hit_htbl(ste_ctx, tmp_hw_ste,
+				       connect_info->hit_next_htbl);
 	else
-		dr_ste_always_miss_addr(ste_ctx, &ste, connect_info->miss_icm_addr);
+		dr_ste_always_miss_addr(ste_ctx, tmp_hw_ste,
+					connect_info->miss_icm_addr);
+	memcpy(formatted_ste, tmp_hw_ste, DR_STE_SIZE_REDUCED);
 }
 
 int mlx5dr_ste_htbl_init_and_postsend(struct mlx5dr_domain *dmn,
@@ -496,7 +508,6 @@ struct mlx5dr_ste_htbl *mlx5dr_ste_htbl_alloc(struct mlx5dr_icm_pool *pool,
 	for (i = 0; i < num_entries; i++) {
 		struct mlx5dr_ste *ste = &chunk->ste_arr[i];
 
-		ste->hw_ste = chunk->hw_ste_arr + i * DR_STE_SIZE_REDUCED;
 		ste->htbl = htbl;
 		ste->refcount = 0;
 		INIT_LIST_HEAD(&ste->miss_list_node);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
index 1294c12ceb10..46866a5fc5ca 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
@@ -147,7 +147,6 @@ struct mlx5dr_matcher_rx_tx;
 struct mlx5dr_ste_ctx;
 
 struct mlx5dr_ste {
-	u8 *hw_ste;
 	/* refcount: indicates the num of rules that using this ste */
 	u32 refcount;
 
@@ -1140,6 +1139,7 @@ u32 mlx5dr_icm_pool_get_chunk_rkey(struct mlx5dr_icm_chunk *chunk);
 u64 mlx5dr_icm_pool_get_chunk_icm_addr(struct mlx5dr_icm_chunk *chunk);
 u32 mlx5dr_icm_pool_get_chunk_num_of_entries(struct mlx5dr_icm_chunk *chunk);
 u32 mlx5dr_icm_pool_get_chunk_byte_size(struct mlx5dr_icm_chunk *chunk);
+u8 *mlx5dr_ste_get_hw_ste(struct mlx5dr_ste *ste);
 
 static inline int
 mlx5dr_icm_pool_dm_type_to_entry_size(enum mlx5dr_icm_type icm_type)
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [net-next 13/15] net/mlx5: CT: Remove extra rhashtable remove on tuple entries
  2022-03-17 18:54 [pull request][net-next 00/15] mlx5 updates 2022-03-17 Saeed Mahameed
                   ` (11 preceding siblings ...)
  2022-03-17 18:54 ` [net-next 12/15] net/mlx5: DR, Remove hw_ste from mlx5dr_ste " Saeed Mahameed
@ 2022-03-17 18:54 ` Saeed Mahameed
  2022-03-17 18:54 ` [net-next 14/15] net/mlx5: Remove unused exported contiguous coherent buffer allocation API Saeed Mahameed
  2022-03-17 18:54 ` [net-next 15/15] net/mlx5: Remove unused fill page array API function Saeed Mahameed
  14 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2022-03-17 18:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Paul Blakey, Maor Dickman, Saeed Mahameed

From: Paul Blakey <paulb@nvidia.com>

On a tuple offload del command, the driver tries to remove each tuple
from the hashtable twice: once directly via
mlx5_tc_ct_entry_remove_from_tuples(), and a second time in the
subsequent mlx5_tc_ct_entry_put()->mlx5_tc_ct_entry_del()->
mlx5_tc_ct_entry_remove_from_tuples() call chain.

This doesn't cause any issue, since rhashtable first checks whether
the object being removed exists in the hashtable.

Remove the extra mlx5_tc_ct_entry_remove_from_tuples() call.
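
A simplified sketch of the pre-patch del flow (locking and refcounting
omitted; only the duplicated call is relevant):

        /* mlx5_tc_ct_block_flow_offload_del() */
        rhashtable_remove_fast(&ft->ct_entries_ht, &entry->node, cts_ht_params);
        mlx5_tc_ct_entry_remove_from_tuples(entry);     /* first removal */

        mlx5_tc_ct_entry_put(entry);
        /* -> mlx5_tc_ct_entry_del(entry)
         *    -> mlx5_tc_ct_entry_remove_from_tuples(entry)  -- second removal
         */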

Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
index 7db9d8ee1304..e49f51124c74 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
@@ -1161,7 +1161,6 @@ mlx5_tc_ct_block_flow_offload_del(struct mlx5_ct_ft *ft,
 	}
 
 	rhashtable_remove_fast(&ft->ct_entries_ht, &entry->node, cts_ht_params);
-	mlx5_tc_ct_entry_remove_from_tuples(entry);
 	spin_unlock_bh(&ct_priv->ht_lock);
 
 	mlx5_tc_ct_entry_put(entry);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [net-next 14/15] net/mlx5: Remove unused exported contiguous coherent buffer allocation API
  2022-03-17 18:54 [pull request][net-next 00/15] mlx5 updates 2022-03-17 Saeed Mahameed
                   ` (12 preceding siblings ...)
  2022-03-17 18:54 ` [net-next 13/15] net/mlx5: CT: Remove extra rhashtable remove on tuple entries Saeed Mahameed
@ 2022-03-17 18:54 ` Saeed Mahameed
  2022-03-17 18:54 ` [net-next 15/15] net/mlx5: Remove unused fill page array API function Saeed Mahameed
  14 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2022-03-17 18:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Tariq Toukan, Moshe Shemesh, Saeed Mahameed

From: Tariq Toukan <tariqt@nvidia.com>

All WQ types have moved to the fragmented allocation API for coherent
memory, so the contiguous API is no longer used. Remove it to reduce
the number of exported functions.
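
A minimal sketch of the surviving fragmented allocation path
(illustrative; 'dev' and 'size' are assumed):

        struct mlx5_frag_buf buf;
        int err;

        err = mlx5_frag_buf_alloc_node(dev, size, &buf, dev->priv.numa_node);
        if (err)
                return err;

        /* ... use buf.frags[i].buf / buf.frags[i].map ... */

        mlx5_frag_buf_free(dev, &buf);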

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/alloc.c   | 47 -------------------
 include/linux/mlx5/driver.h                   |  3 --
 2 files changed, 50 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/alloc.c b/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
index d5408f6ce5a7..1762c5c36042 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
@@ -71,53 +71,6 @@ static void *mlx5_dma_zalloc_coherent_node(struct mlx5_core_dev *dev,
 	return cpu_handle;
 }
 
-static int mlx5_buf_alloc_node(struct mlx5_core_dev *dev, int size,
-			       struct mlx5_frag_buf *buf, int node)
-{
-	dma_addr_t t;
-
-	buf->size = size;
-	buf->npages       = 1;
-	buf->page_shift   = (u8)get_order(size) + PAGE_SHIFT;
-
-	buf->frags = kzalloc(sizeof(*buf->frags), GFP_KERNEL);
-	if (!buf->frags)
-		return -ENOMEM;
-
-	buf->frags->buf   = mlx5_dma_zalloc_coherent_node(dev, size,
-							  &t, node);
-	if (!buf->frags->buf)
-		goto err_out;
-
-	buf->frags->map = t;
-
-	while (t & ((1 << buf->page_shift) - 1)) {
-		--buf->page_shift;
-		buf->npages *= 2;
-	}
-
-	return 0;
-err_out:
-	kfree(buf->frags);
-	return -ENOMEM;
-}
-
-int mlx5_buf_alloc(struct mlx5_core_dev *dev,
-		   int size, struct mlx5_frag_buf *buf)
-{
-	return mlx5_buf_alloc_node(dev, size, buf, dev->priv.numa_node);
-}
-EXPORT_SYMBOL(mlx5_buf_alloc);
-
-void mlx5_buf_free(struct mlx5_core_dev *dev, struct mlx5_frag_buf *buf)
-{
-	dma_free_coherent(mlx5_core_dma_dev(dev), buf->size, buf->frags->buf,
-			  buf->frags->map);
-
-	kfree(buf->frags);
-}
-EXPORT_SYMBOL_GPL(mlx5_buf_free);
-
 int mlx5_frag_buf_alloc_node(struct mlx5_core_dev *dev, int size,
 			     struct mlx5_frag_buf *buf, int node)
 {
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 00a914b0716e..a386aec1eb65 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -1009,9 +1009,6 @@ void mlx5_start_health_poll(struct mlx5_core_dev *dev);
 void mlx5_stop_health_poll(struct mlx5_core_dev *dev, bool disable_health);
 void mlx5_drain_health_wq(struct mlx5_core_dev *dev);
 void mlx5_trigger_health_work(struct mlx5_core_dev *dev);
-int mlx5_buf_alloc(struct mlx5_core_dev *dev,
-		   int size, struct mlx5_frag_buf *buf);
-void mlx5_buf_free(struct mlx5_core_dev *dev, struct mlx5_frag_buf *buf);
 int mlx5_frag_buf_alloc_node(struct mlx5_core_dev *dev, int size,
 			     struct mlx5_frag_buf *buf, int node);
 void mlx5_frag_buf_free(struct mlx5_core_dev *dev, struct mlx5_frag_buf *buf);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [net-next 15/15] net/mlx5: Remove unused fill page array API function
  2022-03-17 18:54 [pull request][net-next 00/15] mlx5 updates 2022-03-17 Saeed Mahameed
                   ` (13 preceding siblings ...)
  2022-03-17 18:54 ` [net-next 14/15] net/mlx5: Remove unused exported contiguous coherent buffer allocation API Saeed Mahameed
@ 2022-03-17 18:54 ` Saeed Mahameed
  14 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2022-03-17 18:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Tariq Toukan, Moshe Shemesh, Saeed Mahameed

From: Tariq Toukan <tariqt@nvidia.com>

The mlx5_fill_page_array() API function is no longer used. Remove it
to reduce the number of exported functions.
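
The frag-based variant remains for this purpose; a one-line sketch
(illustrative; 'frag_buf' and the 'pas' array are assumed):

        /* fill 'pas' with the DMA address of every page in 'frag_buf' */
        mlx5_fill_page_frag_array(frag_buf, pas);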

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/alloc.c | 13 -------------
 include/linux/mlx5/driver.h                     |  1 -
 2 files changed, 14 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/alloc.c b/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
index 1762c5c36042..e52b0bac09da 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
@@ -239,19 +239,6 @@ void mlx5_db_free(struct mlx5_core_dev *dev, struct mlx5_db *db)
 }
 EXPORT_SYMBOL_GPL(mlx5_db_free);
 
-void mlx5_fill_page_array(struct mlx5_frag_buf *buf, __be64 *pas)
-{
-	u64 addr;
-	int i;
-
-	for (i = 0; i < buf->npages; i++) {
-		addr = buf->frags->map + (i << buf->page_shift);
-
-		pas[i] = cpu_to_be64(addr);
-	}
-}
-EXPORT_SYMBOL_GPL(mlx5_fill_page_array);
-
 void mlx5_fill_page_frag_array_perm(struct mlx5_frag_buf *buf, __be64 *pas, u8 perm)
 {
 	int i;
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index a386aec1eb65..96cd740d94a3 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -1036,7 +1036,6 @@ int mlx5_reclaim_startup_pages(struct mlx5_core_dev *dev);
 void mlx5_register_debugfs(void);
 void mlx5_unregister_debugfs(void);
 
-void mlx5_fill_page_array(struct mlx5_frag_buf *buf, __be64 *pas);
 void mlx5_fill_page_frag_array_perm(struct mlx5_frag_buf *buf, __be64 *pas, u8 perm);
 void mlx5_fill_page_frag_array(struct mlx5_frag_buf *frag_buf, __be64 *pas);
 int mlx5_vector2eqn(struct mlx5_core_dev *dev, int vector, int *eqn);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [net-next 01/15] net/mlx5e: Validate MTU when building non-linear legacy RQ fragments info
  2022-03-17 18:54 ` [net-next 01/15] net/mlx5e: Validate MTU when building non-linear legacy RQ fragments info Saeed Mahameed
@ 2022-03-18 11:00   ` patchwork-bot+netdevbpf
  0 siblings, 0 replies; 17+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-03-18 11:00 UTC (permalink / raw)
  To: Saeed Mahameed; +Cc: davem, kuba, netdev, maximmi, tariqt, saeedm

Hello:

This series was applied to netdev/net-next.git (master)
by Saeed Mahameed <saeedm@nvidia.com>:

On Thu, 17 Mar 2022 11:54:10 -0700 you wrote:
> From: Maxim Mikityanskiy <maximmi@nvidia.com>
> 
> mlx5e_build_rq_frags_info() assumes that MTU is not bigger than
> PAGE_SIZE * MLX5E_MAX_RX_FRAGS, which is 16K for 4K pages. Currently,
> the firmware limits MTU to 10K, so the assumption doesn't lead to a bug.
> 
> This commit adds an additional driver check for reliability, since the
> firmware boundary might change.
> 
> [...]

Here is the summary with links:
  - [net-next,01/15] net/mlx5e: Validate MTU when building non-linear legacy RQ fragments info
    https://git.kernel.org/netdev/net-next/c/7c3b4df594b6
  - [net-next,02/15] net/mlx5e: Add headroom only to the first fragment in legacy RQ
    https://git.kernel.org/netdev/net-next/c/c3cce0fff3a3
  - [net-next,03/15] net/mlx5e: Build SKB in place over the first fragment in non-linear legacy RQ
    https://git.kernel.org/netdev/net-next/c/8d35fb57fd90
  - [net-next,04/15] net/mlx5e: RX, Test the XDP program existence out of the handler
    https://git.kernel.org/netdev/net-next/c/e26eceb90b01
  - [net-next,05/15] net/mlx5e: Drop the len output parameter from mlx5e_xdp_handle
    https://git.kernel.org/netdev/net-next/c/064990d0b65f
  - [net-next,06/15] net/mlx5e: Drop cqe_bcnt32 from mlx5e_skb_from_cqe_mpwrq_linear
    https://git.kernel.org/netdev/net-next/c/998923932f13
  - [net-next,07/15] net/mlx5: DR, Adjust structure member to reduce memory hole
    https://git.kernel.org/netdev/net-next/c/8f8533650325
  - [net-next,08/15] net/mlx5: DR, Remove mr_addr rkey from struct mlx5dr_icm_chunk
    https://git.kernel.org/netdev/net-next/c/003f4f9acb05
  - [net-next,09/15] net/mlx5: DR, Remove icm_addr from mlx5dr_icm_chunk to reduce memory
    https://git.kernel.org/netdev/net-next/c/5c4f9b6e91e8
  - [net-next,10/15] net/mlx5: DR, Remove num_of_entries byte_size from struct mlx5_dr_icm_chunk
    https://git.kernel.org/netdev/net-next/c/f51bb5179300
  - [net-next,11/15] net/mlx5: DR, Remove 4 members from mlx5dr_ste_htbl to reduce memory
    https://git.kernel.org/netdev/net-next/c/597534bd5633
  - [net-next,12/15] net/mlx5: DR, Remove hw_ste from mlx5dr_ste to reduce memory
    https://git.kernel.org/netdev/net-next/c/0d7f1595bb96
  - [net-next,13/15] net/mlx5: CT: Remove extra rhashtable remove on tuple entries
    https://git.kernel.org/netdev/net-next/c/ebf04231cf14
  - [net-next,14/15] net/mlx5: Remove unused exported contiguous coherent buffer allocation API
    https://git.kernel.org/netdev/net-next/c/4206fe40b2c0
  - [net-next,15/15] net/mlx5: Remove unused fill page array API function
    https://git.kernel.org/netdev/net-next/c/770c9a3a01af

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2022-03-18 11:00 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-17 18:54 [pull request][net-next 00/15] mlx5 updates 2022-03-17 Saeed Mahameed
2022-03-17 18:54 ` [net-next 01/15] net/mlx5e: Validate MTU when building non-linear legacy RQ fragments info Saeed Mahameed
2022-03-18 11:00   ` patchwork-bot+netdevbpf
2022-03-17 18:54 ` [net-next 02/15] net/mlx5e: Add headroom only to the first fragment in legacy RQ Saeed Mahameed
2022-03-17 18:54 ` [net-next 03/15] net/mlx5e: Build SKB in place over the first fragment in non-linear " Saeed Mahameed
2022-03-17 18:54 ` [net-next 04/15] net/mlx5e: RX, Test the XDP program existence out of the handler Saeed Mahameed
2022-03-17 18:54 ` [net-next 05/15] net/mlx5e: Drop the len output parameter from mlx5e_xdp_handle Saeed Mahameed
2022-03-17 18:54 ` [net-next 06/15] net/mlx5e: Drop cqe_bcnt32 from mlx5e_skb_from_cqe_mpwrq_linear Saeed Mahameed
2022-03-17 18:54 ` [net-next 07/15] net/mlx5: DR, Adjust structure member to reduce memory hole Saeed Mahameed
2022-03-17 18:54 ` [net-next 08/15] net/mlx5: DR, Remove mr_addr rkey from struct mlx5dr_icm_chunk Saeed Mahameed
2022-03-17 18:54 ` [net-next 09/15] net/mlx5: DR, Remove icm_addr from mlx5dr_icm_chunk to reduce memory Saeed Mahameed
2022-03-17 18:54 ` [net-next 10/15] net/mlx5: DR, Remove num_of_entries byte_size from struct mlx5_dr_icm_chunk Saeed Mahameed
2022-03-17 18:54 ` [net-next 11/15] net/mlx5: DR, Remove 4 members from mlx5dr_ste_htbl to reduce memory Saeed Mahameed
2022-03-17 18:54 ` [net-next 12/15] net/mlx5: DR, Remove hw_ste from mlx5dr_ste " Saeed Mahameed
2022-03-17 18:54 ` [net-next 13/15] net/mlx5: CT: Remove extra rhashtable remove on tuple entries Saeed Mahameed
2022-03-17 18:54 ` [net-next 14/15] net/mlx5: Remove unused exported contiguous coherent buffer allocation API Saeed Mahameed
2022-03-17 18:54 ` [net-next 15/15] net/mlx5: Remove unused fill page array API function Saeed Mahameed
