* [pull request][net-next 00/17] mlx5 updates 2021-08-16
From: Saeed Mahameed @ 2021-08-16 21:18 UTC
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Tariq Toukan, Leon Romanovsky, Saeed Mahameed

From: Saeed Mahameed <saeedm@nvidia.com>

Hi Dave and Jakub,

This series adds support for the TC MQPRIO channel mode and for LAG
mode in mlx5 bridge offloads.

For more information, please see the tag log below.

Please pull and let me know if there is any problem.

Thanks,
Saeed.

---
The following changes since commit 1b3f78df6a80932d7deb0155d8b0871e8d3e4bca:

  bonding: improve nl error msg when device can't be enslaved because of IFF_MASTER (2021-08-16 14:03:30 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-updates-2021-08-16

for you to fetch changes up to 833239e1dfb194cb2c4f2085c1c06af570843796:

  net/mlx5: Bridge, support LAG (2021-08-16 14:16:36 -0700)

----------------------------------------------------------------
mlx5-updates-2021-08-16

The following patchset provides two separate mlx5 updates:
1) Ethtool RSS context and MQPRIO channel mode support:
  1.1) Enable the mlx5e netdev driver to create Transport Interface RX
       (TIR) objects on the fly, to be used for ethtool RSS contexts
       and the TX MQPRIO channel mode.
  1.2) Introduce the mlx5e_rss object to manage such TIRs.
  1.3) Add ethtool support for RSS contexts.
  1.4) Add support for the MQPRIO channel mode.

2) Bridge offloads LAG support:
   allows adding bond net devices to the mlx5 bridge.
  2.1) Address bridge ports by the (vport_num, esw_owner_vhca_id) pair,
       since vport_num is unique only per eswitch and in LAG mode we
       need to manage ports from both eswitches.
  2.2) Allow connectivity between representors of different eswitch
       instances that are attached to the same bridge.
  2.3) Bridge LAG: require representors to be in shared FDB mode,
       introduce local and peer port representors, match on paired
       eswitch metadata in peer FDB entries, and finally support
       addition/deletion and aging of peer flows.

----------------------------------------------------------------
Tariq Toukan (11):
      net/mlx5e: Do not try enable RSS when resetting indir table
      net/mlx5e: Introduce TIR create/destroy API in rx_res
      net/mlx5e: Introduce abstraction of RSS context
      net/mlx5e: Convert RSS to a dedicated object
      net/mlx5e: Dynamically allocate TIRs in RSS contexts
      net/mlx5e: Support multiple RSS contexts
      net/mlx5e: Support flow classification into RSS contexts
      net/mlx5e: Abstract MQPRIO params
      net/mlx5e: Maintain MQPRIO mode parameter
      net/mlx5e: Handle errors of netdev_set_num_tc()
      net/mlx5e: Support MQPRIO channel mode

Vlad Buslov (6):
      net/mlx5: Bridge, release bridge in same function where it is taken
      net/mlx5: Bridge, obtain core device from eswitch instead of priv
      net/mlx5: Bridge, identify port by vport_num+esw_owner_vhca_id pair
      net/mlx5: Bridge, extract FDB delete notification to function
      net/mlx5: Bridge, allow merged eswitch connectivity
      net/mlx5: Bridge, support LAG

 drivers/net/ethernet/mellanox/mlx5/core/Makefile   |   6 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |  12 +-
 drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c   |  18 +-
 drivers/net/ethernet/mellanox/mlx5/core/en/qos.c   |   2 +-
 .../ethernet/mellanox/mlx5/core/en/rep/bridge.c    | 327 +++++++----
 .../ethernet/mellanox/mlx5/core/en/reporter_tx.c   |   8 +-
 drivers/net/ethernet/mellanox/mlx5/core/en/rss.c   | 588 ++++++++++++++++++++
 drivers/net/ethernet/mellanox/mlx5/core/en/rss.h   |  49 ++
 .../net/ethernet/mellanox/mlx5/core/en/rx_res.c    | 603 ++++++++-------------
 .../net/ethernet/mellanox/mlx5/core/en/rx_res.h    |  20 +-
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   |  71 ++-
 .../ethernet/mellanox/mlx5/core/en_fs_ethtool.c    |  99 +++-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 176 ++++--
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c   |   5 +-
 .../net/ethernet/mellanox/mlx5/core/esw/bridge.c   | 359 +++++++-----
 .../net/ethernet/mellanox/mlx5/core/esw/bridge.h   |  46 +-
 .../ethernet/mellanox/mlx5/core/esw/bridge_priv.h  |   9 +
 .../mlx5/core/esw/diag/bridge_tracepoint.h         |   9 +-
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.h  |   3 -
 19 files changed, 1696 insertions(+), 714 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/rss.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/rss.h


* [net-next 01/17] net/mlx5e: Do not try enable RSS when resetting indir table
From: Saeed Mahameed @ 2021-08-16 21:18 UTC
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Tariq Toukan, Leon Romanovsky, Maxim Mikityanskiy,
	Saeed Mahameed

From: Tariq Toukan <tariqt@nvidia.com>

All calls to mlx5e_rx_res_rss_set_indir_uniform() occur while the RSS
state is inactive, i.e. the RQT is pointing to the drop RQ, not to the
channels' RQs.
This means that the "apply" part of the function is never reached.
Remove this part from the function and document the change. This will
be useful for the next patches in the series, as it allows code
simplifications when multiple RSS contexts are introduced.
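
Illustration (not part of this patch): a hedged caller-side sketch of
the contract that the new WARN_ON_ONCE() documents, assuming that a
later channels activation is what pushes the shadow table to hardware.
The wrapper below and its exact call site are hypothetical; only the
two mlx5e_rx_res_* calls are real.

  /* Hypothetical wrapper: resize the RSS indirection shadow while RSS
   * is inactive (RQT -> drop RQ); activation applies it to HW later.
   */
  static void example_resize_rss(struct mlx5e_priv *priv, unsigned int new_nch)
  {
          /* SW shadow only; would trip the new WARN_ON_ONCE() if RSS were active */
          mlx5e_rx_res_rss_set_indir_uniform(priv->rx_res, new_nch);

          /* The updated table reaches the HW RQT when channels are activated */
          mlx5e_rx_res_channels_activate(priv->rx_res, &priv->channels);
  }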

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
index e2a8fe13f29d..2d0e8c809936 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
@@ -617,14 +617,11 @@ mlx5e_rx_res_rss_get_current_tt_config(struct mlx5e_rx_res *res, enum mlx5_traff
 	return rss_tt;
 }
 
+/* Updates the indirection table SW shadow, does not update the HW resources yet */
 void mlx5e_rx_res_rss_set_indir_uniform(struct mlx5e_rx_res *res, unsigned int nch)
 {
+	WARN_ON_ONCE(res->rss_active);
 	mlx5e_rss_params_indir_init_uniform(&res->rss_params.indir, nch);
-
-	if (!res->rss_active)
-		return;
-
-	mlx5e_rx_res_rss_enable(res);
 }
 
 void mlx5e_rx_res_rss_get_rxfh(struct mlx5e_rx_res *res, u32 *indir, u8 *key, u8 *hfunc)
-- 
2.31.1



* [net-next 02/17] net/mlx5e: Introduce TIR create/destroy API in rx_res
From: Saeed Mahameed @ 2021-08-16 21:18 UTC
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Tariq Toukan, Leon Romanovsky, Maxim Mikityanskiy,
	Saeed Mahameed

From: Tariq Toukan <tariqt@nvidia.com>

Factor the TIR control operations in rx_res out into dedicated
functions. This is in preparation for supporting on-demand TIR
operations in downstream patches.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../ethernet/mellanox/mlx5/core/en/rx_res.c   | 140 +++++++++++-------
 1 file changed, 83 insertions(+), 57 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
index 2d0e8c809936..dfa492a14928 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
@@ -117,84 +117,114 @@ static void mlx5e_rx_res_rss_params_init(struct mlx5e_rx_res *res, unsigned int
 			mlx5e_rss_get_default_tt_config(tt).rx_hash_fields;
 }
 
-static int mlx5e_rx_res_rss_init(struct mlx5e_rx_res *res,
-				 const struct mlx5e_lro_param *init_lro_param)
+static void mlx5e_rx_res_rss_destroy_tir(struct mlx5e_rx_res *res,
+					 enum mlx5_traffic_types tt,
+					 bool inner)
+{
+	struct mlx5e_tir *tir;
+
+	tir = inner ? &res->rss[tt].inner_indir_tir : &res->rss[tt].indir_tir;
+	mlx5e_tir_destroy(tir);
+}
+
+static int mlx5e_rx_res_rss_create_tir(struct mlx5e_rx_res *res,
+				       struct mlx5e_tir_builder *builder,
+				       enum mlx5_traffic_types tt,
+				       const struct mlx5e_lro_param *init_lro_param,
+				       bool inner)
 {
 	bool inner_ft_support = res->features & MLX5E_RX_RES_FEATURE_INNER_FT;
+	struct mlx5e_rss_params_traffic_type rss_tt;
+	struct mlx5e_tir *tir;
+	u32 rqtn;
+	int err;
+
+	tir = inner ? &res->rss[tt].inner_indir_tir : &res->rss[tt].indir_tir;
+
+	rqtn = mlx5e_rqt_get_rqtn(&res->indir_rqt);
+	mlx5e_tir_builder_build_rqt(builder, res->mdev->mlx5e_res.hw_objs.td.tdn,
+				    rqtn, inner_ft_support);
+	mlx5e_tir_builder_build_lro(builder, init_lro_param);
+	rss_tt = mlx5e_rx_res_rss_get_current_tt_config(res, tt);
+	mlx5e_tir_builder_build_rss(builder, &res->rss_params.hash, &rss_tt, inner);
+
+	err = mlx5e_tir_init(tir, builder, res->mdev, true);
+	if (err) {
+		mlx5_core_warn(res->mdev, "Failed to create %sindirect TIR: err = %d, tt = %d\n",
+			       inner ? "inner " : "", err, tt);
+		return err;
+	}
+
+	return 0;
+}
+
+static int mlx5e_rx_res_rss_create_tirs(struct mlx5e_rx_res *res,
+					const struct mlx5e_lro_param *init_lro_param,
+					bool inner)
+{
 	enum mlx5_traffic_types tt, max_tt;
 	struct mlx5e_tir_builder *builder;
-	u32 indir_rqtn;
 	int err;
 
 	builder = mlx5e_tir_builder_alloc(false);
 	if (!builder)
 		return -ENOMEM;
 
-	err = mlx5e_rqt_init_direct(&res->indir_rqt, res->mdev, true, res->drop_rqn);
-	if (err)
-		goto out;
-
-	indir_rqtn = mlx5e_rqt_get_rqtn(&res->indir_rqt);
-
 	for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++) {
-		struct mlx5e_rss_params_traffic_type rss_tt;
-
-		mlx5e_tir_builder_build_rqt(builder, res->mdev->mlx5e_res.hw_objs.td.tdn,
-					    indir_rqtn, inner_ft_support);
-		mlx5e_tir_builder_build_lro(builder, init_lro_param);
-		rss_tt = mlx5e_rx_res_rss_get_current_tt_config(res, tt);
-		mlx5e_tir_builder_build_rss(builder, &res->rss_params.hash, &rss_tt, false);
-
-		err = mlx5e_tir_init(&res->rss[tt].indir_tir, builder, res->mdev, true);
-		if (err) {
-			mlx5_core_warn(res->mdev, "Failed to create an indirect TIR: err = %d, tt = %d\n",
-				       err, tt);
+		err = mlx5e_rx_res_rss_create_tir(res, builder, tt, init_lro_param, inner);
+		if (err)
 			goto err_destroy_tirs;
-		}
 
 		mlx5e_tir_builder_clear(builder);
 	}
 
-	if (!inner_ft_support)
-		goto out;
+out:
+	mlx5e_tir_builder_free(builder);
+	return err;
 
-	for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++) {
-		struct mlx5e_rss_params_traffic_type rss_tt;
+err_destroy_tirs:
+	max_tt = tt;
+	for (tt = 0; tt < max_tt; tt++)
+		mlx5e_rx_res_rss_destroy_tir(res, tt, inner);
+	goto out;
+}
 
-		mlx5e_tir_builder_build_rqt(builder, res->mdev->mlx5e_res.hw_objs.td.tdn,
-					    indir_rqtn, inner_ft_support);
-		mlx5e_tir_builder_build_lro(builder, init_lro_param);
-		rss_tt = mlx5e_rx_res_rss_get_current_tt_config(res, tt);
-		mlx5e_tir_builder_build_rss(builder, &res->rss_params.hash, &rss_tt, true);
+static void mlx5e_rx_res_rss_destroy_tirs(struct mlx5e_rx_res *res, bool inner)
+{
+	enum mlx5_traffic_types tt;
 
-		err = mlx5e_tir_init(&res->rss[tt].inner_indir_tir, builder, res->mdev, true);
-		if (err) {
-			mlx5_core_warn(res->mdev, "Failed to create an inner indirect TIR: err = %d, tt = %d\n",
-				       err, tt);
-			goto err_destroy_inner_tirs;
-		}
+	for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++)
+		mlx5e_rx_res_rss_destroy_tir(res, tt, inner);
+}
 
-		mlx5e_tir_builder_clear(builder);
-	}
+static int mlx5e_rx_res_rss_init(struct mlx5e_rx_res *res,
+				 const struct mlx5e_lro_param *init_lro_param)
+{
+	bool inner_ft_support = res->features & MLX5E_RX_RES_FEATURE_INNER_FT;
+	int err;
 
-	goto out;
+	err = mlx5e_rqt_init_direct(&res->indir_rqt, res->mdev, true, res->drop_rqn);
+	if (err)
+		return err;
 
-err_destroy_inner_tirs:
-	max_tt = tt;
-	for (tt = 0; tt < max_tt; tt++)
-		mlx5e_tir_destroy(&res->rss[tt].inner_indir_tir);
+	err = mlx5e_rx_res_rss_create_tirs(res, init_lro_param, false);
+	if (err)
+		goto err_destroy_rqt;
+
+	if (inner_ft_support) {
+		err = mlx5e_rx_res_rss_create_tirs(res, init_lro_param, true);
+		if (err)
+			goto err_destroy_tirs;
+	}
+
+	return 0;
 
-	tt = MLX5E_NUM_INDIR_TIRS;
 err_destroy_tirs:
-	max_tt = tt;
-	for (tt = 0; tt < max_tt; tt++)
-		mlx5e_tir_destroy(&res->rss[tt].indir_tir);
+	mlx5e_rx_res_rss_destroy_tirs(res, false);
 
+err_destroy_rqt:
 	mlx5e_rqt_destroy(&res->indir_rqt);
 
-out:
-	mlx5e_tir_builder_free(builder);
-
 	return err;
 }
 
@@ -337,14 +367,10 @@ static int mlx5e_rx_res_ptp_init(struct mlx5e_rx_res *res)
 
 static void mlx5e_rx_res_rss_destroy(struct mlx5e_rx_res *res)
 {
-	enum mlx5_traffic_types tt;
-
-	for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++)
-		mlx5e_tir_destroy(&res->rss[tt].indir_tir);
+	mlx5e_rx_res_rss_destroy_tirs(res, false);
 
 	if (res->features & MLX5E_RX_RES_FEATURE_INNER_FT)
-		for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++)
-			mlx5e_tir_destroy(&res->rss[tt].inner_indir_tir);
+		mlx5e_rx_res_rss_destroy_tirs(res, true);
 
 	mlx5e_rqt_destroy(&res->indir_rqt);
 }
-- 
2.31.1



* [net-next 03/17] net/mlx5e: Introduce abstraction of RSS context
From: Saeed Mahameed @ 2021-08-16 21:18 UTC
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Tariq Toukan, Leon Romanovsky, Maxim Mikityanskiy,
	Saeed Mahameed

From: Tariq Toukan <tariqt@nvidia.com>

Bring all fields that define and maintain RSS behavior together into
a new structure, align all usages with it, and keep it hidden within
rx_res.c. This helps support multiple RSS contexts in a downstream
patch.

Use dynamic allocation for the RSS context.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../ethernet/mellanox/mlx5/core/en/rx_res.c   | 170 +++++++++++-------
 .../ethernet/mellanox/mlx5/core/en/rx_res.h   |   2 +-
 .../ethernet/mellanox/mlx5/core/en_ethtool.c  |   6 +-
 3 files changed, 105 insertions(+), 73 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
index dfa492a14928..336930cfd632 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
@@ -64,24 +64,22 @@ mlx5e_rss_get_default_tt_config(enum mlx5_traffic_types tt)
 	return rss_default_config[tt];
 }
 
+struct mlx5e_rss {
+	struct mlx5e_rss_params_hash hash;
+	struct mlx5e_rss_params_indir indir;
+	u32 rx_hash_fields[MLX5E_NUM_INDIR_TIRS];
+	struct mlx5e_tir tir[MLX5E_NUM_INDIR_TIRS];
+	struct mlx5e_tir inner_tir[MLX5E_NUM_INDIR_TIRS];
+	struct mlx5e_rqt rqt;
+};
+
 struct mlx5e_rx_res {
 	struct mlx5_core_dev *mdev;
 	enum mlx5e_rx_res_features features;
 	unsigned int max_nch;
 	u32 drop_rqn;
 
-	struct {
-		struct mlx5e_rss_params_hash hash;
-		struct mlx5e_rss_params_indir indir;
-		u32 rx_hash_fields[MLX5E_NUM_INDIR_TIRS];
-	} rss_params;
-
-	struct mlx5e_rqt indir_rqt;
-	struct {
-		struct mlx5e_tir indir_tir;
-		struct mlx5e_tir inner_indir_tir;
-	} rss[MLX5E_NUM_INDIR_TIRS];
-
+	struct mlx5e_rss *rss;
 	bool rss_active;
 	u32 rss_rqns[MLX5E_INDIR_RQT_SIZE];
 	unsigned int rss_nch;
@@ -106,14 +104,15 @@ struct mlx5e_rx_res *mlx5e_rx_res_alloc(void)
 
 static void mlx5e_rx_res_rss_params_init(struct mlx5e_rx_res *res, unsigned int init_nch)
 {
+	struct mlx5e_rss *rss = res->rss;
 	enum mlx5_traffic_types tt;
 
-	res->rss_params.hash.hfunc = ETH_RSS_HASH_TOP;
-	netdev_rss_key_fill(res->rss_params.hash.toeplitz_hash_key,
-			    sizeof(res->rss_params.hash.toeplitz_hash_key));
-	mlx5e_rss_params_indir_init_uniform(&res->rss_params.indir, init_nch);
+	rss->hash.hfunc = ETH_RSS_HASH_TOP;
+	netdev_rss_key_fill(rss->hash.toeplitz_hash_key,
+			    sizeof(rss->hash.toeplitz_hash_key));
+	mlx5e_rss_params_indir_init_uniform(&rss->indir, init_nch);
 	for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++)
-		res->rss_params.rx_hash_fields[tt] =
+		rss->rx_hash_fields[tt] =
 			mlx5e_rss_get_default_tt_config(tt).rx_hash_fields;
 }
 
@@ -121,9 +120,10 @@ static void mlx5e_rx_res_rss_destroy_tir(struct mlx5e_rx_res *res,
 					 enum mlx5_traffic_types tt,
 					 bool inner)
 {
+	struct mlx5e_rss *rss = res->rss;
 	struct mlx5e_tir *tir;
 
-	tir = inner ? &res->rss[tt].inner_indir_tir : &res->rss[tt].indir_tir;
+	tir = inner ? &rss->inner_tir[tt] : &rss->tir[tt];
 	mlx5e_tir_destroy(tir);
 }
 
@@ -135,18 +135,19 @@ static int mlx5e_rx_res_rss_create_tir(struct mlx5e_rx_res *res,
 {
 	bool inner_ft_support = res->features & MLX5E_RX_RES_FEATURE_INNER_FT;
 	struct mlx5e_rss_params_traffic_type rss_tt;
+	struct mlx5e_rss *rss = res->rss;
 	struct mlx5e_tir *tir;
 	u32 rqtn;
 	int err;
 
-	tir = inner ? &res->rss[tt].inner_indir_tir : &res->rss[tt].indir_tir;
+	tir = inner ? &rss->inner_tir[tt] : &rss->tir[tt];
 
-	rqtn = mlx5e_rqt_get_rqtn(&res->indir_rqt);
+	rqtn = mlx5e_rqt_get_rqtn(&rss->rqt);
 	mlx5e_tir_builder_build_rqt(builder, res->mdev->mlx5e_res.hw_objs.td.tdn,
 				    rqtn, inner_ft_support);
 	mlx5e_tir_builder_build_lro(builder, init_lro_param);
 	rss_tt = mlx5e_rx_res_rss_get_current_tt_config(res, tt);
-	mlx5e_tir_builder_build_rss(builder, &res->rss_params.hash, &rss_tt, inner);
+	mlx5e_tir_builder_build_rss(builder, &rss->hash, &rss_tt, inner);
 
 	err = mlx5e_tir_init(tir, builder, res->mdev, true);
 	if (err) {
@@ -198,14 +199,24 @@ static void mlx5e_rx_res_rss_destroy_tirs(struct mlx5e_rx_res *res, bool inner)
 }
 
 static int mlx5e_rx_res_rss_init(struct mlx5e_rx_res *res,
-				 const struct mlx5e_lro_param *init_lro_param)
+				 const struct mlx5e_lro_param *init_lro_param,
+				 unsigned int init_nch)
 {
 	bool inner_ft_support = res->features & MLX5E_RX_RES_FEATURE_INNER_FT;
+	struct mlx5e_rss *rss;
 	int err;
 
-	err = mlx5e_rqt_init_direct(&res->indir_rqt, res->mdev, true, res->drop_rqn);
+	rss = kvzalloc(sizeof(*rss), GFP_KERNEL);
+	if (!rss)
+		return -ENOMEM;
+
+	res->rss = rss;
+
+	mlx5e_rx_res_rss_params_init(res, init_nch);
+
+	err = mlx5e_rqt_init_direct(&rss->rqt, res->mdev, true, res->drop_rqn);
 	if (err)
-		return err;
+		goto err_free_rss;
 
 	err = mlx5e_rx_res_rss_create_tirs(res, init_lro_param, false);
 	if (err)
@@ -223,8 +234,11 @@ static int mlx5e_rx_res_rss_init(struct mlx5e_rx_res *res,
 	mlx5e_rx_res_rss_destroy_tirs(res, false);
 
 err_destroy_rqt:
-	mlx5e_rqt_destroy(&res->indir_rqt);
+	mlx5e_rqt_destroy(&rss->rqt);
 
+err_free_rss:
+	kvfree(rss);
+	res->rss = NULL;
 	return err;
 }
 
@@ -367,12 +381,16 @@ static int mlx5e_rx_res_ptp_init(struct mlx5e_rx_res *res)
 
 static void mlx5e_rx_res_rss_destroy(struct mlx5e_rx_res *res)
 {
+	struct mlx5e_rss *rss = res->rss;
+
 	mlx5e_rx_res_rss_destroy_tirs(res, false);
 
 	if (res->features & MLX5E_RX_RES_FEATURE_INNER_FT)
 		mlx5e_rx_res_rss_destroy_tirs(res, true);
 
-	mlx5e_rqt_destroy(&res->indir_rqt);
+	mlx5e_rqt_destroy(&rss->rqt);
+	kvfree(rss);
+	res->rss = NULL;
 }
 
 static void mlx5e_rx_res_channels_destroy(struct mlx5e_rx_res *res)
@@ -411,9 +429,7 @@ int mlx5e_rx_res_init(struct mlx5e_rx_res *res, struct mlx5_core_dev *mdev,
 	res->max_nch = max_nch;
 	res->drop_rqn = drop_rqn;
 
-	mlx5e_rx_res_rss_params_init(res, init_nch);
-
-	err = mlx5e_rx_res_rss_init(res, init_lro_param);
+	err = mlx5e_rx_res_rss_init(res, init_lro_param, init_nch);
 	if (err)
 		return err;
 
@@ -460,13 +476,17 @@ u32 mlx5e_rx_res_get_tirn_xsk(struct mlx5e_rx_res *res, unsigned int ix)
 
 u32 mlx5e_rx_res_get_tirn_rss(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt)
 {
-	return mlx5e_tir_get_tirn(&res->rss[tt].indir_tir);
+	struct mlx5e_rss *rss = res->rss;
+
+	return mlx5e_tir_get_tirn(&rss->tir[tt]);
 }
 
 u32 mlx5e_rx_res_get_tirn_rss_inner(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt)
 {
+	struct mlx5e_rss *rss = res->rss;
+
 	WARN_ON(!(res->features & MLX5E_RX_RES_FEATURE_INNER_FT));
-	return mlx5e_tir_get_tirn(&res->rss[tt].inner_indir_tir);
+	return mlx5e_tir_get_tirn(&rss->inner_tir[tt]);
 }
 
 u32 mlx5e_rx_res_get_tirn_ptp(struct mlx5e_rx_res *res)
@@ -482,28 +502,30 @@ u32 mlx5e_rx_res_get_rqtn_direct(struct mlx5e_rx_res *res, unsigned int ix)
 
 static void mlx5e_rx_res_rss_enable(struct mlx5e_rx_res *res)
 {
+	struct mlx5e_rss *rss = res->rss;
 	int err;
 
 	res->rss_active = true;
 
-	err = mlx5e_rqt_redirect_indir(&res->indir_rqt, res->rss_rqns, res->rss_nch,
-				       res->rss_params.hash.hfunc,
-				       &res->rss_params.indir);
+	err = mlx5e_rqt_redirect_indir(&rss->rqt, res->rss_rqns, res->rss_nch,
+				       rss->hash.hfunc,
+				       &rss->indir);
 	if (err)
-		mlx5_core_warn(res->mdev, "Failed to redirect indirect RQT %#x to channels: err = %d\n",
-			       mlx5e_rqt_get_rqtn(&res->indir_rqt), err);
+		mlx5_core_warn(res->mdev, "Failed to redirect RQT %#x to channels: err = %d\n",
+			       mlx5e_rqt_get_rqtn(&rss->rqt), err);
 }
 
 static void mlx5e_rx_res_rss_disable(struct mlx5e_rx_res *res)
 {
+	struct mlx5e_rss *rss = res->rss;
 	int err;
 
 	res->rss_active = false;
 
-	err = mlx5e_rqt_redirect_direct(&res->indir_rqt, res->drop_rqn);
+	err = mlx5e_rqt_redirect_direct(&rss->rqt, res->drop_rqn);
 	if (err)
-		mlx5_core_warn(res->mdev, "Failed to redirect indirect RQT %#x to drop RQ %#x: err = %d\n",
-			       mlx5e_rqt_get_rqtn(&res->indir_rqt), res->drop_rqn, err);
+		mlx5_core_warn(res->mdev, "Failed to redirect RQT %#x to drop RQ %#x: err = %d\n",
+			       mlx5e_rqt_get_rqtn(&rss->rqt), res->drop_rqn, err);
 }
 
 void mlx5e_rx_res_channels_activate(struct mlx5e_rx_res *res, struct mlx5e_channels *chs)
@@ -637,9 +659,10 @@ struct mlx5e_rss_params_traffic_type
 mlx5e_rx_res_rss_get_current_tt_config(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt)
 {
 	struct mlx5e_rss_params_traffic_type rss_tt;
+	struct mlx5e_rss *rss = res->rss;
 
 	rss_tt = mlx5e_rss_get_default_tt_config(tt);
-	rss_tt.rx_hash_fields = res->rss_params.rx_hash_fields[tt];
+	rss_tt.rx_hash_fields = rss->rx_hash_fields[tt];
 	return rss_tt;
 }
 
@@ -647,23 +670,26 @@ mlx5e_rx_res_rss_get_current_tt_config(struct mlx5e_rx_res *res, enum mlx5_traff
 void mlx5e_rx_res_rss_set_indir_uniform(struct mlx5e_rx_res *res, unsigned int nch)
 {
 	WARN_ON_ONCE(res->rss_active);
-	mlx5e_rss_params_indir_init_uniform(&res->rss_params.indir, nch);
+	mlx5e_rss_params_indir_init_uniform(&res->rss->indir, nch);
 }
 
-void mlx5e_rx_res_rss_get_rxfh(struct mlx5e_rx_res *res, u32 *indir, u8 *key, u8 *hfunc)
+int mlx5e_rx_res_rss_get_rxfh(struct mlx5e_rx_res *res, u32 *indir, u8 *key, u8 *hfunc)
 {
+	struct mlx5e_rss *rss = res->rss;
 	unsigned int i;
 
 	if (indir)
 		for (i = 0; i < MLX5E_INDIR_RQT_SIZE; i++)
-			indir[i] = res->rss_params.indir.table[i];
+			indir[i] = rss->indir.table[i];
 
 	if (key)
-		memcpy(key, res->rss_params.hash.toeplitz_hash_key,
-		       sizeof(res->rss_params.hash.toeplitz_hash_key));
+		memcpy(key, rss->hash.toeplitz_hash_key,
+		       sizeof(rss->hash.toeplitz_hash_key));
 
 	if (hfunc)
-		*hfunc = res->rss_params.hash.hfunc;
+		*hfunc = rss->hash.hfunc;
+
+	return 0;
 }
 
 static int mlx5e_rx_res_rss_update_tir(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt,
@@ -671,6 +697,7 @@ static int mlx5e_rx_res_rss_update_tir(struct mlx5e_rx_res *res, enum mlx5_traff
 {
 	struct mlx5e_rss_params_traffic_type rss_tt;
 	struct mlx5e_tir_builder *builder;
+	struct mlx5e_rss *rss = res->rss;
 	struct mlx5e_tir *tir;
 	int err;
 
@@ -680,8 +707,8 @@ static int mlx5e_rx_res_rss_update_tir(struct mlx5e_rx_res *res, enum mlx5_traff
 
 	rss_tt = mlx5e_rx_res_rss_get_current_tt_config(res, tt);
 
-	mlx5e_tir_builder_build_rss(builder, &res->rss_params.hash, &rss_tt, inner);
-	tir = inner ? &res->rss[tt].inner_indir_tir : &res->rss[tt].indir_tir;
+	mlx5e_tir_builder_build_rss(builder, &rss->hash, &rss_tt, inner);
+	tir = inner ? &rss->inner_tir[tt] : &rss->tir[tt];
 	err = mlx5e_tir_modify(tir, builder);
 
 	mlx5e_tir_builder_free(builder);
@@ -691,12 +718,13 @@ static int mlx5e_rx_res_rss_update_tir(struct mlx5e_rx_res *res, enum mlx5_traff
 int mlx5e_rx_res_rss_set_rxfh(struct mlx5e_rx_res *res, const u32 *indir,
 			      const u8 *key, const u8 *hfunc)
 {
+	struct mlx5e_rss *rss = res->rss;
 	enum mlx5_traffic_types tt;
 	bool changed_indir = false;
 	bool changed_hash = false;
 	int err;
 
-	if (hfunc && *hfunc != res->rss_params.hash.hfunc) {
+	if (hfunc && *hfunc != rss->hash.hfunc) {
 		switch (*hfunc) {
 		case ETH_RSS_HASH_XOR:
 		case ETH_RSS_HASH_TOP:
@@ -706,14 +734,14 @@ int mlx5e_rx_res_rss_set_rxfh(struct mlx5e_rx_res *res, const u32 *indir,
 		}
 		changed_hash = true;
 		changed_indir = true;
-		res->rss_params.hash.hfunc = *hfunc;
+		rss->hash.hfunc = *hfunc;
 	}
 
 	if (key) {
-		if (res->rss_params.hash.hfunc == ETH_RSS_HASH_TOP)
+		if (rss->hash.hfunc == ETH_RSS_HASH_TOP)
 			changed_hash = true;
-		memcpy(res->rss_params.hash.toeplitz_hash_key, key,
-		       sizeof(res->rss_params.hash.toeplitz_hash_key));
+		memcpy(rss->hash.toeplitz_hash_key, key,
+		       sizeof(rss->hash.toeplitz_hash_key));
 	}
 
 	if (indir) {
@@ -722,16 +750,15 @@ int mlx5e_rx_res_rss_set_rxfh(struct mlx5e_rx_res *res, const u32 *indir,
 		changed_indir = true;
 
 		for (i = 0; i < MLX5E_INDIR_RQT_SIZE; i++)
-			res->rss_params.indir.table[i] = indir[i];
+			rss->indir.table[i] = indir[i];
 	}
 
 	if (changed_indir && res->rss_active) {
-		err = mlx5e_rqt_redirect_indir(&res->indir_rqt, res->rss_rqns, res->rss_nch,
-					       res->rss_params.hash.hfunc,
-					       &res->rss_params.indir);
+		err = mlx5e_rqt_redirect_indir(&rss->rqt, res->rss_rqns, res->rss_nch,
+					       rss->hash.hfunc, &rss->indir);
 		if (err)
 			mlx5_core_warn(res->mdev, "Failed to redirect indirect RQT %#x to channels: err = %d\n",
-				       mlx5e_rqt_get_rqtn(&res->indir_rqt), err);
+				       mlx5e_rqt_get_rqtn(&rss->rqt), err);
 	}
 
 	if (changed_hash)
@@ -755,25 +782,28 @@ int mlx5e_rx_res_rss_set_rxfh(struct mlx5e_rx_res *res, const u32 *indir,
 
 u8 mlx5e_rx_res_rss_get_hash_fields(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt)
 {
-	return res->rss_params.rx_hash_fields[tt];
+	struct mlx5e_rss *rss = res->rss;
+
+	return rss->rx_hash_fields[tt];
 }
 
 int mlx5e_rx_res_rss_set_hash_fields(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt,
 				     u8 rx_hash_fields)
 {
+	struct mlx5e_rss *rss = res->rss;
 	u8 old_rx_hash_fields;
 	int err;
 
-	old_rx_hash_fields = res->rss_params.rx_hash_fields[tt];
+	old_rx_hash_fields = rss->rx_hash_fields[tt];
 
 	if (old_rx_hash_fields == rx_hash_fields)
 		return 0;
 
-	res->rss_params.rx_hash_fields[tt] = rx_hash_fields;
+	rss->rx_hash_fields[tt] = rx_hash_fields;
 
 	err = mlx5e_rx_res_rss_update_tir(res, tt, false);
 	if (err) {
-		res->rss_params.rx_hash_fields[tt] = old_rx_hash_fields;
+		rss->rx_hash_fields[tt] = old_rx_hash_fields;
 		mlx5_core_warn(res->mdev, "Failed to update RSS hash fields of indirect TIR for traffic type %d: err = %d\n",
 			       tt, err);
 		return err;
@@ -787,11 +817,12 @@ int mlx5e_rx_res_rss_set_hash_fields(struct mlx5e_rx_res *res, enum mlx5_traffic
 		/* Partial update happened. Try to revert - it may fail too, but
 		 * there is nothing more we can do.
 		 */
-		res->rss_params.rx_hash_fields[tt] = old_rx_hash_fields;
+		rss->rx_hash_fields[tt] = old_rx_hash_fields;
 		mlx5_core_warn(res->mdev, "Failed to update RSS hash fields of inner indirect TIR for traffic type %d: err = %d\n",
 			       tt, err);
 		if (mlx5e_rx_res_rss_update_tir(res, tt, false))
-			mlx5_core_warn(res->mdev, "Partial update of RSS hash fields happened: failed to revert indirect TIR for traffic type %d to the old values\n",
+			mlx5_core_warn(res->mdev,
+				       "Partial update of RSS hash fields happened: failed to revert indirect TIR for traffic type %d to the old values\n",
 				       tt);
 	}
 
@@ -800,6 +831,7 @@ int mlx5e_rx_res_rss_set_hash_fields(struct mlx5e_rx_res *res, enum mlx5_traffic
 
 int mlx5e_rx_res_lro_set_param(struct mlx5e_rx_res *res, struct mlx5e_lro_param *lro_param)
 {
+	struct mlx5e_rss *rss = res->rss;
 	struct mlx5e_tir_builder *builder;
 	enum mlx5_traffic_types tt;
 	int err, final_err;
@@ -814,10 +846,10 @@ int mlx5e_rx_res_lro_set_param(struct mlx5e_rx_res *res, struct mlx5e_lro_param
 	final_err = 0;
 
 	for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++) {
-		err = mlx5e_tir_modify(&res->rss[tt].indir_tir, builder);
+		err = mlx5e_tir_modify(&rss->tir[tt], builder);
 		if (err) {
 			mlx5_core_warn(res->mdev, "Failed to update LRO state of indirect TIR %#x for traffic type %d: err = %d\n",
-				       mlx5e_tir_get_tirn(&res->rss[tt].indir_tir), tt, err);
+				       mlx5e_tir_get_tirn(&rss->tir[tt]), tt, err);
 			if (!final_err)
 				final_err = err;
 		}
@@ -825,10 +857,10 @@ int mlx5e_rx_res_lro_set_param(struct mlx5e_rx_res *res, struct mlx5e_lro_param
 		if (!(res->features & MLX5E_RX_RES_FEATURE_INNER_FT))
 			continue;
 
-		err = mlx5e_tir_modify(&res->rss[tt].inner_indir_tir, builder);
+		err = mlx5e_tir_modify(&rss->inner_tir[tt], builder);
 		if (err) {
 			mlx5_core_warn(res->mdev, "Failed to update LRO state of inner indirect TIR %#x for traffic type %d: err = %d\n",
-				       mlx5e_tir_get_tirn(&res->rss[tt].inner_indir_tir), tt, err);
+				       mlx5e_tir_get_tirn(&rss->inner_tir[tt]), tt, err);
 			if (!final_err)
 				final_err = err;
 		}
@@ -850,5 +882,5 @@ int mlx5e_rx_res_lro_set_param(struct mlx5e_rx_res *res, struct mlx5e_lro_param
 
 struct mlx5e_rss_params_hash mlx5e_rx_res_get_current_hash(struct mlx5e_rx_res *res)
 {
-	return res->rss_params.hash;
+	return res->rss->hash;
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h
index 1baeec5158a3..1703fb981d6d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h
@@ -53,7 +53,7 @@ int mlx5e_rx_res_xsk_deactivate(struct mlx5e_rx_res *res, unsigned int ix);
 struct mlx5e_rss_params_traffic_type
 mlx5e_rx_res_rss_get_current_tt_config(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt);
 void mlx5e_rx_res_rss_set_indir_uniform(struct mlx5e_rx_res *res, unsigned int nch);
-void mlx5e_rx_res_rss_get_rxfh(struct mlx5e_rx_res *res, u32 *indir, u8 *key, u8 *hfunc);
+int mlx5e_rx_res_rss_get_rxfh(struct mlx5e_rx_res *res, u32 *indir, u8 *key, u8 *hfunc);
 int mlx5e_rx_res_rss_set_rxfh(struct mlx5e_rx_res *res, const u32 *indir,
 			      const u8 *key, const u8 *hfunc);
 u8 mlx5e_rx_res_rss_get_hash_fields(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 2cf59bb5f898..62eef3e7f993 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -1198,12 +1198,12 @@ int mlx5e_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key,
 		   u8 *hfunc)
 {
 	struct mlx5e_priv *priv = netdev_priv(netdev);
+	int err;
 
 	mutex_lock(&priv->state_lock);
-	mlx5e_rx_res_rss_get_rxfh(priv->rx_res, indir, key, hfunc);
+	err = mlx5e_rx_res_rss_get_rxfh(priv->rx_res, indir, key, hfunc);
 	mutex_unlock(&priv->state_lock);
-
-	return 0;
+	return err;
 }
 
 int mlx5e_set_rxfh(struct net_device *dev, const u32 *indir,
-- 
2.31.1



* [net-next 04/17] net/mlx5e: Convert RSS to a dedicated object
From: Saeed Mahameed @ 2021-08-16 21:18 UTC
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Tariq Toukan, Leon Romanovsky, Maxim Mikityanskiy,
	Saeed Mahameed

From: Tariq Toukan <tariqt@nvidia.com>

RSS-related code is now encapsulated in a dedicated object and moved
into the new files en/rss.{c,h}. All usages are converted.
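
Illustration (not part of this patch): a minimal consumer sketch of the
new en/rss.h lifecycle, as rx_res.c uses it in the diff below. The
example_* function and its parameters are hypothetical; the mlx5e_rss_*
calls and their signatures come from the new header.

  static int example_rss_lifecycle(struct mlx5_core_dev *mdev, u32 drop_rqn,
                                   const struct mlx5e_lro_param *lro_param,
                                   u32 *rqns, unsigned int num_rqns)
  {
          struct mlx5e_rss *rss;
          int err;

          rss = mlx5e_rss_alloc();                /* opaque, kvzalloc'ed object */
          if (!rss)
                  return -ENOMEM;

          /* Create the RQT (initially -> drop RQ) and the indirect TIRs */
          err = mlx5e_rss_init(rss, mdev, false, drop_rqn, lro_param);
          if (err)
                  goto err_free;

          mlx5e_rss_set_indir_uniform(rss, num_rqns); /* SW shadow only */
          mlx5e_rss_enable(rss, rqns, num_rqns);      /* RQT -> channel RQs */

          /* ... TIRNs are looked up via mlx5e_rss_get_tirn(rss, tt, false) ... */

          mlx5e_rss_disable(rss);                 /* RQT back to the drop RQ */
          mlx5e_rss_cleanup(rss);                 /* destroy TIRs and the RQT */
  err_free:
          mlx5e_rss_free(rss);
          return err;
  }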

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/Makefile  |   6 +-
 .../net/ethernet/mellanox/mlx5/core/en/rss.c  | 488 +++++++++++++++++
 .../net/ethernet/mellanox/mlx5/core/en/rss.h  |  38 ++
 .../ethernet/mellanox/mlx5/core/en/rx_res.c   | 494 +++---------------
 .../ethernet/mellanox/mlx5/core/en/rx_res.h   |   6 +-
 5 files changed, 604 insertions(+), 428 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/rss.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/rss.h

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index 33e550d77fa6..4fccc9bc0328 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -22,13 +22,13 @@ mlx5_core-y :=	main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
 #
 # Netdev basic
 #
-mlx5_core-$(CONFIG_MLX5_CORE_EN) += en_main.o en_common.o en_fs.o en_ethtool.o \
+mlx5_core-$(CONFIG_MLX5_CORE_EN) += en/rqt.o en/tir.o en/rss.o en/rx_res.o \
+		en/channels.o en_main.o en_common.o en_fs.o en_ethtool.o \
 		en_tx.o en_rx.o en_dim.o en_txrx.o en/xdp.o en_stats.o \
 		en_selftest.o en/port.o en/monitor_stats.o en/health.o \
 		en/reporter_tx.o en/reporter_rx.o en/params.o en/xsk/pool.o \
 		en/xsk/setup.o en/xsk/rx.o en/xsk/tx.o en/devlink.o en/ptp.o \
-		en/qos.o en/trap.o en/fs_tt_redirect.o en/rqt.o en/tir.o \
-		en/rx_res.o en/channels.o
+		en/qos.o en/trap.o en/fs_tt_redirect.o
 
 #
 # Netdev extra
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c
new file mode 100644
index 000000000000..f4a72b6b8a02
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c
@@ -0,0 +1,488 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+// Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES.
+
+#include "rss.h"
+
+#define mlx5e_rss_warn(__dev, format, ...)			\
+	dev_warn((__dev)->device, "%s:%d:(pid %d): " format,	\
+		 __func__, __LINE__, current->pid,		\
+		 ##__VA_ARGS__)
+
+static const struct mlx5e_rss_params_traffic_type rss_default_config[MLX5E_NUM_INDIR_TIRS] = {
+	[MLX5_TT_IPV4_TCP] = {
+		.l3_prot_type = MLX5_L3_PROT_TYPE_IPV4,
+		.l4_prot_type = MLX5_L4_PROT_TYPE_TCP,
+		.rx_hash_fields = MLX5_HASH_IP_L4PORTS,
+	},
+	[MLX5_TT_IPV6_TCP] = {
+		.l3_prot_type = MLX5_L3_PROT_TYPE_IPV6,
+		.l4_prot_type = MLX5_L4_PROT_TYPE_TCP,
+		.rx_hash_fields = MLX5_HASH_IP_L4PORTS,
+	},
+	[MLX5_TT_IPV4_UDP] = {
+		.l3_prot_type = MLX5_L3_PROT_TYPE_IPV4,
+		.l4_prot_type = MLX5_L4_PROT_TYPE_UDP,
+		.rx_hash_fields = MLX5_HASH_IP_L4PORTS,
+	},
+	[MLX5_TT_IPV6_UDP] = {
+		.l3_prot_type = MLX5_L3_PROT_TYPE_IPV6,
+		.l4_prot_type = MLX5_L4_PROT_TYPE_UDP,
+		.rx_hash_fields = MLX5_HASH_IP_L4PORTS,
+	},
+	[MLX5_TT_IPV4_IPSEC_AH] = {
+		.l3_prot_type = MLX5_L3_PROT_TYPE_IPV4,
+		.l4_prot_type = 0,
+		.rx_hash_fields = MLX5_HASH_IP_IPSEC_SPI,
+	},
+	[MLX5_TT_IPV6_IPSEC_AH] = {
+		.l3_prot_type = MLX5_L3_PROT_TYPE_IPV6,
+		.l4_prot_type = 0,
+		.rx_hash_fields = MLX5_HASH_IP_IPSEC_SPI,
+	},
+	[MLX5_TT_IPV4_IPSEC_ESP] = {
+		.l3_prot_type = MLX5_L3_PROT_TYPE_IPV4,
+		.l4_prot_type = 0,
+		.rx_hash_fields = MLX5_HASH_IP_IPSEC_SPI,
+	},
+	[MLX5_TT_IPV6_IPSEC_ESP] = {
+		.l3_prot_type = MLX5_L3_PROT_TYPE_IPV6,
+		.l4_prot_type = 0,
+		.rx_hash_fields = MLX5_HASH_IP_IPSEC_SPI,
+	},
+	[MLX5_TT_IPV4] = {
+		.l3_prot_type = MLX5_L3_PROT_TYPE_IPV4,
+		.l4_prot_type = 0,
+		.rx_hash_fields = MLX5_HASH_IP,
+	},
+	[MLX5_TT_IPV6] = {
+		.l3_prot_type = MLX5_L3_PROT_TYPE_IPV6,
+		.l4_prot_type = 0,
+		.rx_hash_fields = MLX5_HASH_IP,
+	},
+};
+
+struct mlx5e_rss_params_traffic_type
+mlx5e_rss_get_default_tt_config(enum mlx5_traffic_types tt)
+{
+	return rss_default_config[tt];
+}
+
+struct mlx5e_rss {
+	struct mlx5e_rss_params_hash hash;
+	struct mlx5e_rss_params_indir indir;
+	u32 rx_hash_fields[MLX5E_NUM_INDIR_TIRS];
+	struct mlx5e_tir tir[MLX5E_NUM_INDIR_TIRS];
+	struct mlx5e_tir inner_tir[MLX5E_NUM_INDIR_TIRS];
+	struct mlx5e_rqt rqt;
+	struct mlx5_core_dev *mdev;
+	u32 drop_rqn;
+	bool inner_ft_support;
+	bool enabled;
+};
+
+struct mlx5e_rss *mlx5e_rss_alloc(void)
+{
+	return kvzalloc(sizeof(struct mlx5e_rss), GFP_KERNEL);
+}
+
+void mlx5e_rss_free(struct mlx5e_rss *rss)
+{
+	kvfree(rss);
+}
+
+static void mlx5e_rss_params_init(struct mlx5e_rss *rss)
+{
+	enum mlx5_traffic_types tt;
+
+	rss->hash.hfunc = ETH_RSS_HASH_TOP;
+	netdev_rss_key_fill(rss->hash.toeplitz_hash_key,
+			    sizeof(rss->hash.toeplitz_hash_key));
+	for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++)
+		rss->rx_hash_fields[tt] =
+			mlx5e_rss_get_default_tt_config(tt).rx_hash_fields;
+}
+
+static struct mlx5e_rss_params_traffic_type
+mlx5e_rss_get_tt_config(struct mlx5e_rss *rss, enum mlx5_traffic_types tt)
+{
+	struct mlx5e_rss_params_traffic_type rss_tt;
+
+	rss_tt = mlx5e_rss_get_default_tt_config(tt);
+	rss_tt.rx_hash_fields = rss->rx_hash_fields[tt];
+	return rss_tt;
+}
+
+static int mlx5e_rss_create_tir(struct mlx5e_rss *rss,
+				enum mlx5_traffic_types tt,
+				const struct mlx5e_lro_param *init_lro_param,
+				bool inner)
+{
+	struct mlx5e_rss_params_traffic_type rss_tt;
+	struct mlx5e_tir_builder *builder;
+	struct mlx5e_tir *tir;
+	u32 rqtn;
+	int err;
+
+	if (inner && !rss->inner_ft_support) {
+		mlx5e_rss_warn(rss->mdev,
+			       "Cannot create inner indirect TIR[%d], RSS inner FT is not supported.\n",
+			       tt);
+		return -EINVAL;
+	}
+
+	tir = inner ? &rss->inner_tir[tt] : &rss->tir[tt];
+
+	builder = mlx5e_tir_builder_alloc(false);
+	if (!builder)
+		return -ENOMEM;
+
+	rqtn = mlx5e_rqt_get_rqtn(&rss->rqt);
+	mlx5e_tir_builder_build_rqt(builder, rss->mdev->mlx5e_res.hw_objs.td.tdn,
+				    rqtn, rss->inner_ft_support);
+	mlx5e_tir_builder_build_lro(builder, init_lro_param);
+	rss_tt = mlx5e_rss_get_tt_config(rss, tt);
+	mlx5e_tir_builder_build_rss(builder, &rss->hash, &rss_tt, inner);
+
+	err = mlx5e_tir_init(tir, builder, rss->mdev, true);
+	mlx5e_tir_builder_free(builder);
+	if (err)
+		mlx5e_rss_warn(rss->mdev, "Failed to create %sindirect TIR: err = %d, tt = %d\n",
+			       inner ? "inner " : "", err, tt);
+	return err;
+}
+
+static void mlx5e_rss_destroy_tir(struct mlx5e_rss *rss, enum mlx5_traffic_types tt,
+				  bool inner)
+{
+	struct mlx5e_tir *tir;
+
+	tir = inner ? &rss->inner_tir[tt] : &rss->tir[tt];
+	mlx5e_tir_destroy(tir);
+}
+
+static int mlx5e_rss_create_tirs(struct mlx5e_rss *rss,
+				 const struct mlx5e_lro_param *init_lro_param,
+				 bool inner)
+{
+	enum mlx5_traffic_types tt, max_tt;
+	int err;
+
+	for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++) {
+		err = mlx5e_rss_create_tir(rss, tt, init_lro_param, inner);
+		if (err)
+			goto err_destroy_tirs;
+	}
+
+	return 0;
+
+err_destroy_tirs:
+	max_tt = tt;
+	for (tt = 0; tt < max_tt; tt++)
+		mlx5e_rss_destroy_tir(rss, tt, inner);
+	return err;
+}
+
+static void mlx5e_rss_destroy_tirs(struct mlx5e_rss *rss, bool inner)
+{
+	enum mlx5_traffic_types tt;
+
+	for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++)
+		mlx5e_rss_destroy_tir(rss, tt, inner);
+}
+
+static int mlx5e_rss_update_tir(struct mlx5e_rss *rss, enum mlx5_traffic_types tt,
+				bool inner)
+{
+	struct mlx5e_rss_params_traffic_type rss_tt;
+	struct mlx5e_tir_builder *builder;
+	struct mlx5e_tir *tir;
+	int err;
+
+	tir = inner ? &rss->inner_tir[tt] : &rss->tir[tt];
+
+	builder = mlx5e_tir_builder_alloc(true);
+	if (!builder)
+		return -ENOMEM;
+
+	rss_tt = mlx5e_rss_get_tt_config(rss, tt);
+
+	mlx5e_tir_builder_build_rss(builder, &rss->hash, &rss_tt, inner);
+	err = mlx5e_tir_modify(tir, builder);
+
+	mlx5e_tir_builder_free(builder);
+	return err;
+}
+
+static int mlx5e_rss_update_tirs(struct mlx5e_rss *rss)
+{
+	enum mlx5_traffic_types tt;
+	int err, retval;
+
+	retval = 0;
+
+	for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++) {
+		err = mlx5e_rss_update_tir(rss, tt, false);
+		if (err) {
+			retval = retval ? : err;
+			mlx5e_rss_warn(rss->mdev,
+				       "Failed to update RSS hash of indirect TIR for traffic type %d: err = %d\n",
+				       tt, err);
+		}
+
+		if (!rss->inner_ft_support)
+			continue;
+
+		err = mlx5e_rss_update_tir(rss, tt, true);
+		if (err) {
+			retval = retval ? : err;
+			mlx5e_rss_warn(rss->mdev,
+				       "Failed to update RSS hash of inner indirect TIR for traffic type %d: err = %d\n",
+				       tt, err);
+		}
+	}
+	return retval;
+}
+
+int mlx5e_rss_init(struct mlx5e_rss *rss, struct mlx5_core_dev *mdev,
+		   bool inner_ft_support, u32 drop_rqn,
+		   const struct mlx5e_lro_param *init_lro_param)
+{
+	int err;
+
+	rss->mdev = mdev;
+	rss->inner_ft_support = inner_ft_support;
+	rss->drop_rqn = drop_rqn;
+
+	mlx5e_rss_params_init(rss);
+
+	err = mlx5e_rqt_init_direct(&rss->rqt, mdev, true, drop_rqn);
+	if (err)
+		goto err_out;
+
+	err = mlx5e_rss_create_tirs(rss, init_lro_param, false);
+	if (err)
+		goto err_destroy_rqt;
+
+	if (inner_ft_support) {
+		err = mlx5e_rss_create_tirs(rss, init_lro_param, true);
+		if (err)
+			goto err_destroy_tirs;
+	}
+
+	return 0;
+
+err_destroy_tirs:
+	mlx5e_rss_destroy_tirs(rss, false);
+err_destroy_rqt:
+	mlx5e_rqt_destroy(&rss->rqt);
+err_out:
+	return err;
+}
+
+void mlx5e_rss_cleanup(struct mlx5e_rss *rss)
+{
+	mlx5e_rss_destroy_tirs(rss, false);
+
+	if (rss->inner_ft_support)
+		mlx5e_rss_destroy_tirs(rss, true);
+
+	mlx5e_rqt_destroy(&rss->rqt);
+}
+
+u32 mlx5e_rss_get_tirn(struct mlx5e_rss *rss, enum mlx5_traffic_types tt,
+		       bool inner)
+{
+	struct mlx5e_tir *tir;
+
+	WARN_ON(inner && !rss->inner_ft_support);
+	tir = inner ? &rss->inner_tir[tt] : &rss->tir[tt];
+
+	return mlx5e_tir_get_tirn(tir);
+}
+
+static void mlx5e_rss_apply(struct mlx5e_rss *rss, u32 *rqns, unsigned int num_rqns)
+{
+	int err;
+
+	err = mlx5e_rqt_redirect_indir(&rss->rqt, rqns, num_rqns, rss->hash.hfunc, &rss->indir);
+	if (err)
+		mlx5e_rss_warn(rss->mdev, "Failed to redirect RQT %#x to channels: err = %d\n",
+			       mlx5e_rqt_get_rqtn(&rss->rqt), err);
+}
+
+void mlx5e_rss_enable(struct mlx5e_rss *rss, u32 *rqns, unsigned int num_rqns)
+{
+	rss->enabled = true;
+	mlx5e_rss_apply(rss, rqns, num_rqns);
+}
+
+void mlx5e_rss_disable(struct mlx5e_rss *rss)
+{
+	int err;
+
+	rss->enabled = false;
+	err = mlx5e_rqt_redirect_direct(&rss->rqt, rss->drop_rqn);
+	if (err)
+		mlx5e_rss_warn(rss->mdev, "Failed to redirect RQT %#x to drop RQ %#x: err = %d\n",
+			       mlx5e_rqt_get_rqtn(&rss->rqt), rss->drop_rqn, err);
+}
+
+int mlx5e_rss_lro_set_param(struct mlx5e_rss *rss, struct mlx5e_lro_param *lro_param)
+{
+	struct mlx5e_tir_builder *builder;
+	enum mlx5_traffic_types tt;
+	int err, final_err;
+
+	builder = mlx5e_tir_builder_alloc(true);
+	if (!builder)
+		return -ENOMEM;
+
+	mlx5e_tir_builder_build_lro(builder, lro_param);
+
+	final_err = 0;
+
+	for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++) {
+		err = mlx5e_tir_modify(&rss->tir[tt], builder);
+		if (err) {
+			mlx5e_rss_warn(rss->mdev, "Failed to update LRO state of indirect TIR %#x for traffic type %d: err = %d\n",
+				       mlx5e_tir_get_tirn(&rss->tir[tt]), tt, err);
+			if (!final_err)
+				final_err = err;
+		}
+
+		if (!rss->inner_ft_support)
+			continue;
+
+		err = mlx5e_tir_modify(&rss->inner_tir[tt], builder);
+		if (err) {
+			mlx5e_rss_warn(rss->mdev, "Failed to update LRO state of inner indirect TIR %#x for traffic type %d: err = %d\n",
+				       mlx5e_tir_get_tirn(&rss->inner_tir[tt]), tt, err);
+			if (!final_err)
+				final_err = err;
+		}
+	}
+
+	mlx5e_tir_builder_free(builder);
+	return final_err;
+}
+
+int mlx5e_rss_get_rxfh(struct mlx5e_rss *rss, u32 *indir, u8 *key, u8 *hfunc)
+{
+	unsigned int i;
+
+	if (indir)
+		for (i = 0; i < MLX5E_INDIR_RQT_SIZE; i++)
+			indir[i] = rss->indir.table[i];
+
+	if (key)
+		memcpy(key, rss->hash.toeplitz_hash_key,
+		       sizeof(rss->hash.toeplitz_hash_key));
+
+	if (hfunc)
+		*hfunc = rss->hash.hfunc;
+
+	return 0;
+}
+
+int mlx5e_rss_set_rxfh(struct mlx5e_rss *rss, const u32 *indir,
+		       const u8 *key, const u8 *hfunc,
+		       u32 *rqns, unsigned int num_rqns)
+{
+	bool changed_indir = false;
+	bool changed_hash = false;
+
+	if (hfunc && *hfunc != rss->hash.hfunc) {
+		switch (*hfunc) {
+		case ETH_RSS_HASH_XOR:
+		case ETH_RSS_HASH_TOP:
+			break;
+		default:
+			return -EINVAL;
+		}
+		changed_hash = true;
+		changed_indir = true;
+		rss->hash.hfunc = *hfunc;
+	}
+
+	if (key) {
+		if (rss->hash.hfunc == ETH_RSS_HASH_TOP)
+			changed_hash = true;
+		memcpy(rss->hash.toeplitz_hash_key, key,
+		       sizeof(rss->hash.toeplitz_hash_key));
+	}
+
+	if (indir) {
+		unsigned int i;
+
+		changed_indir = true;
+
+		for (i = 0; i < MLX5E_INDIR_RQT_SIZE; i++)
+			rss->indir.table[i] = indir[i];
+	}
+
+	if (changed_indir && rss->enabled)
+		mlx5e_rss_apply(rss, rqns, num_rqns);
+
+	if (changed_hash)
+		mlx5e_rss_update_tirs(rss);
+
+	return 0;
+}
+
+struct mlx5e_rss_params_hash mlx5e_rss_get_hash(struct mlx5e_rss *rss)
+{
+	return rss->hash;
+}
+
+u8 mlx5e_rss_get_hash_fields(struct mlx5e_rss *rss, enum mlx5_traffic_types tt)
+{
+	return rss->rx_hash_fields[tt];
+}
+
+int mlx5e_rss_set_hash_fields(struct mlx5e_rss *rss, enum mlx5_traffic_types tt,
+			      u8 rx_hash_fields)
+{
+	u8 old_rx_hash_fields;
+	int err;
+
+	old_rx_hash_fields = rss->rx_hash_fields[tt];
+
+	if (old_rx_hash_fields == rx_hash_fields)
+		return 0;
+
+	rss->rx_hash_fields[tt] = rx_hash_fields;
+
+	err = mlx5e_rss_update_tir(rss, tt, false);
+	if (err) {
+		rss->rx_hash_fields[tt] = old_rx_hash_fields;
+		mlx5e_rss_warn(rss->mdev,
+			       "Failed to update RSS hash fields of indirect TIR for traffic type %d: err = %d\n",
+			       tt, err);
+		return err;
+	}
+
+	if (!(rss->inner_ft_support))
+		return 0;
+
+	err = mlx5e_rss_update_tir(rss, tt, true);
+	if (err) {
+		/* Partial update happened. Try to revert - it may fail too, but
+		 * there is nothing more we can do.
+		 */
+		rss->rx_hash_fields[tt] = old_rx_hash_fields;
+		mlx5e_rss_warn(rss->mdev,
+			       "Failed to update RSS hash fields of inner indirect TIR for traffic type %d: err = %d\n",
+			       tt, err);
+		if (mlx5e_rss_update_tir(rss, tt, false))
+			mlx5e_rss_warn(rss->mdev,
+				       "Partial update of RSS hash fields happened: failed to revert indirect TIR for traffic type %d to the old values\n",
+				       tt);
+	}
+
+	return err;
+}
+
+void mlx5e_rss_set_indir_uniform(struct mlx5e_rss *rss, unsigned int nch)
+{
+	mlx5e_rss_params_indir_init_uniform(&rss->indir, nch);
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rss.h b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.h
new file mode 100644
index 000000000000..e71e712ed842
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.h
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. */
+
+#ifndef __MLX5_EN_RSS_H__
+#define __MLX5_EN_RSS_H__
+
+#include "rqt.h"
+#include "tir.h"
+#include "fs.h"
+
+struct mlx5e_rss_params_traffic_type
+mlx5e_rss_get_default_tt_config(enum mlx5_traffic_types tt);
+
+struct mlx5e_rss;
+
+struct mlx5e_rss *mlx5e_rss_alloc(void);
+void mlx5e_rss_free(struct mlx5e_rss *rss);
+int mlx5e_rss_init(struct mlx5e_rss *rss, struct mlx5_core_dev *mdev,
+		   bool inner_ft_support, u32 drop_rqn,
+		   const struct mlx5e_lro_param *init_lro_param);
+void mlx5e_rss_cleanup(struct mlx5e_rss *rss);
+
+u32 mlx5e_rss_get_tirn(struct mlx5e_rss *rss, enum mlx5_traffic_types tt,
+		       bool inner);
+void mlx5e_rss_enable(struct mlx5e_rss *rss, u32 *rqns, unsigned int num_rqns);
+void mlx5e_rss_disable(struct mlx5e_rss *rss);
+
+int mlx5e_rss_lro_set_param(struct mlx5e_rss *rss, struct mlx5e_lro_param *lro_param);
+int mlx5e_rss_get_rxfh(struct mlx5e_rss *rss, u32 *indir, u8 *key, u8 *hfunc);
+int mlx5e_rss_set_rxfh(struct mlx5e_rss *rss, const u32 *indir,
+		       const u8 *key, const u8 *hfunc,
+		       u32 *rqns, unsigned int num_rqns);
+struct mlx5e_rss_params_hash mlx5e_rss_get_hash(struct mlx5e_rss *rss);
+u8 mlx5e_rss_get_hash_fields(struct mlx5e_rss *rss, enum mlx5_traffic_types tt);
+int mlx5e_rss_set_hash_fields(struct mlx5e_rss *rss, enum mlx5_traffic_types tt,
+			      u8 rx_hash_fields);
+void mlx5e_rss_set_indir_uniform(struct mlx5e_rss *rss, unsigned int nch);
+#endif /* __MLX5_EN_RSS_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
index 336930cfd632..590d94196370 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
@@ -5,74 +5,6 @@
 #include "channels.h"
 #include "params.h"
 
-static const struct mlx5e_rss_params_traffic_type rss_default_config[MLX5E_NUM_INDIR_TIRS] = {
-	[MLX5_TT_IPV4_TCP] = {
-		.l3_prot_type = MLX5_L3_PROT_TYPE_IPV4,
-		.l4_prot_type = MLX5_L4_PROT_TYPE_TCP,
-		.rx_hash_fields = MLX5_HASH_IP_L4PORTS,
-	},
-	[MLX5_TT_IPV6_TCP] = {
-		.l3_prot_type = MLX5_L3_PROT_TYPE_IPV6,
-		.l4_prot_type = MLX5_L4_PROT_TYPE_TCP,
-		.rx_hash_fields = MLX5_HASH_IP_L4PORTS,
-	},
-	[MLX5_TT_IPV4_UDP] = {
-		.l3_prot_type = MLX5_L3_PROT_TYPE_IPV4,
-		.l4_prot_type = MLX5_L4_PROT_TYPE_UDP,
-		.rx_hash_fields = MLX5_HASH_IP_L4PORTS,
-	},
-	[MLX5_TT_IPV6_UDP] = {
-		.l3_prot_type = MLX5_L3_PROT_TYPE_IPV6,
-		.l4_prot_type = MLX5_L4_PROT_TYPE_UDP,
-		.rx_hash_fields = MLX5_HASH_IP_L4PORTS,
-	},
-	[MLX5_TT_IPV4_IPSEC_AH] = {
-		.l3_prot_type = MLX5_L3_PROT_TYPE_IPV4,
-		.l4_prot_type = 0,
-		.rx_hash_fields = MLX5_HASH_IP_IPSEC_SPI,
-	},
-	[MLX5_TT_IPV6_IPSEC_AH] = {
-		.l3_prot_type = MLX5_L3_PROT_TYPE_IPV6,
-		.l4_prot_type = 0,
-		.rx_hash_fields = MLX5_HASH_IP_IPSEC_SPI,
-	},
-	[MLX5_TT_IPV4_IPSEC_ESP] = {
-		.l3_prot_type = MLX5_L3_PROT_TYPE_IPV4,
-		.l4_prot_type = 0,
-		.rx_hash_fields = MLX5_HASH_IP_IPSEC_SPI,
-	},
-	[MLX5_TT_IPV6_IPSEC_ESP] = {
-		.l3_prot_type = MLX5_L3_PROT_TYPE_IPV6,
-		.l4_prot_type = 0,
-		.rx_hash_fields = MLX5_HASH_IP_IPSEC_SPI,
-	},
-	[MLX5_TT_IPV4] = {
-		.l3_prot_type = MLX5_L3_PROT_TYPE_IPV4,
-		.l4_prot_type = 0,
-		.rx_hash_fields = MLX5_HASH_IP,
-	},
-	[MLX5_TT_IPV6] = {
-		.l3_prot_type = MLX5_L3_PROT_TYPE_IPV6,
-		.l4_prot_type = 0,
-		.rx_hash_fields = MLX5_HASH_IP,
-	},
-};
-
-struct mlx5e_rss_params_traffic_type
-mlx5e_rss_get_default_tt_config(enum mlx5_traffic_types tt)
-{
-	return rss_default_config[tt];
-}
-
-struct mlx5e_rss {
-	struct mlx5e_rss_params_hash hash;
-	struct mlx5e_rss_params_indir indir;
-	u32 rx_hash_fields[MLX5E_NUM_INDIR_TIRS];
-	struct mlx5e_tir tir[MLX5E_NUM_INDIR_TIRS];
-	struct mlx5e_tir inner_tir[MLX5E_NUM_INDIR_TIRS];
-	struct mlx5e_rqt rqt;
-};
-
 struct mlx5e_rx_res {
 	struct mlx5_core_dev *mdev;
 	enum mlx5e_rx_res_features features;
@@ -97,149 +29,105 @@ struct mlx5e_rx_res {
 	} ptp;
 };
 
-struct mlx5e_rx_res *mlx5e_rx_res_alloc(void)
-{
-	return kvzalloc(sizeof(struct mlx5e_rx_res), GFP_KERNEL);
-}
+/* API for rx_res_rss_* */
 
-static void mlx5e_rx_res_rss_params_init(struct mlx5e_rx_res *res, unsigned int init_nch)
+static int mlx5e_rx_res_rss_init(struct mlx5e_rx_res *res,
+				 const struct mlx5e_lro_param *init_lro_param,
+				 unsigned int init_nch)
 {
-	struct mlx5e_rss *rss = res->rss;
-	enum mlx5_traffic_types tt;
-
-	rss->hash.hfunc = ETH_RSS_HASH_TOP;
-	netdev_rss_key_fill(rss->hash.toeplitz_hash_key,
-			    sizeof(rss->hash.toeplitz_hash_key));
-	mlx5e_rss_params_indir_init_uniform(&rss->indir, init_nch);
-	for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++)
-		rss->rx_hash_fields[tt] =
-			mlx5e_rss_get_default_tt_config(tt).rx_hash_fields;
+	bool inner_ft_support = res->features & MLX5E_RX_RES_FEATURE_INNER_FT;
+	struct mlx5e_rss *rss;
+	int err;
+
+	rss = mlx5e_rss_alloc();
+	if (!rss)
+		return -ENOMEM;
+
+	res->rss = rss;
+
+	err = mlx5e_rss_init(rss, res->mdev, inner_ft_support, res->drop_rqn, init_lro_param);
+	if (err)
+		goto err_rss_free;
+
+	mlx5e_rss_set_indir_uniform(rss, init_nch);
+
+	return 0;
+
+err_rss_free:
+	mlx5e_rss_free(rss);
+	res->rss = NULL;
+	return err;
 }
 
-static void mlx5e_rx_res_rss_destroy_tir(struct mlx5e_rx_res *res,
-					 enum mlx5_traffic_types tt,
-					 bool inner)
+static void mlx5e_rx_res_rss_destroy(struct mlx5e_rx_res *res)
 {
 	struct mlx5e_rss *rss = res->rss;
-	struct mlx5e_tir *tir;
 
-	tir = inner ? &rss->inner_tir[tt] : &rss->tir[tt];
-	mlx5e_tir_destroy(tir);
+	mlx5e_rss_cleanup(rss);
+	mlx5e_rss_free(rss);
+	res->rss = NULL;
 }
 
-static int mlx5e_rx_res_rss_create_tir(struct mlx5e_rx_res *res,
-				       struct mlx5e_tir_builder *builder,
-				       enum mlx5_traffic_types tt,
-				       const struct mlx5e_lro_param *init_lro_param,
-				       bool inner)
+static void mlx5e_rx_res_rss_enable(struct mlx5e_rx_res *res)
 {
-	bool inner_ft_support = res->features & MLX5E_RX_RES_FEATURE_INNER_FT;
-	struct mlx5e_rss_params_traffic_type rss_tt;
 	struct mlx5e_rss *rss = res->rss;
-	struct mlx5e_tir *tir;
-	u32 rqtn;
-	int err;
 
-	tir = inner ? &rss->inner_tir[tt] : &rss->tir[tt];
-
-	rqtn = mlx5e_rqt_get_rqtn(&rss->rqt);
-	mlx5e_tir_builder_build_rqt(builder, res->mdev->mlx5e_res.hw_objs.td.tdn,
-				    rqtn, inner_ft_support);
-	mlx5e_tir_builder_build_lro(builder, init_lro_param);
-	rss_tt = mlx5e_rx_res_rss_get_current_tt_config(res, tt);
-	mlx5e_tir_builder_build_rss(builder, &rss->hash, &rss_tt, inner);
-
-	err = mlx5e_tir_init(tir, builder, res->mdev, true);
-	if (err) {
-		mlx5_core_warn(res->mdev, "Failed to create %sindirect TIR: err = %d, tt = %d\n",
-			       inner ? "inner " : "", err, tt);
-		return err;
-	}
+	res->rss_active = true;
 
-	return 0;
+	mlx5e_rss_enable(rss, res->rss_rqns, res->rss_nch);
 }
 
-static int mlx5e_rx_res_rss_create_tirs(struct mlx5e_rx_res *res,
-					const struct mlx5e_lro_param *init_lro_param,
-					bool inner)
+static void mlx5e_rx_res_rss_disable(struct mlx5e_rx_res *res)
 {
-	enum mlx5_traffic_types tt, max_tt;
-	struct mlx5e_tir_builder *builder;
-	int err;
-
-	builder = mlx5e_tir_builder_alloc(false);
-	if (!builder)
-		return -ENOMEM;
-
-	for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++) {
-		err = mlx5e_rx_res_rss_create_tir(res, builder, tt, init_lro_param, inner);
-		if (err)
-			goto err_destroy_tirs;
-
-		mlx5e_tir_builder_clear(builder);
-	}
+	struct mlx5e_rss *rss = res->rss;
 
-out:
-	mlx5e_tir_builder_free(builder);
-	return err;
+	res->rss_active = false;
 
-err_destroy_tirs:
-	max_tt = tt;
-	for (tt = 0; tt < max_tt; tt++)
-		mlx5e_rx_res_rss_destroy_tir(res, tt, inner);
-	goto out;
+	mlx5e_rss_disable(rss);
 }
 
-static void mlx5e_rx_res_rss_destroy_tirs(struct mlx5e_rx_res *res, bool inner)
+/* Updates the indirection table SW shadow, does not update the HW resources yet */
+void mlx5e_rx_res_rss_set_indir_uniform(struct mlx5e_rx_res *res, unsigned int nch)
 {
-	enum mlx5_traffic_types tt;
-
-	for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++)
-		mlx5e_rx_res_rss_destroy_tir(res, tt, inner);
+	WARN_ON_ONCE(res->rss_active);
+	mlx5e_rss_set_indir_uniform(res->rss, nch);
 }
 
-static int mlx5e_rx_res_rss_init(struct mlx5e_rx_res *res,
-				 const struct mlx5e_lro_param *init_lro_param,
-				 unsigned int init_nch)
+int mlx5e_rx_res_rss_get_rxfh(struct mlx5e_rx_res *res, u32 *indir, u8 *key, u8 *hfunc)
 {
-	bool inner_ft_support = res->features & MLX5E_RX_RES_FEATURE_INNER_FT;
-	struct mlx5e_rss *rss;
-	int err;
-
-	rss = kvzalloc(sizeof(*rss), GFP_KERNEL);
-	if (!rss)
-		return -ENOMEM;
+	struct mlx5e_rss *rss = res->rss;
 
-	res->rss = rss;
+	return mlx5e_rss_get_rxfh(rss, indir, key, hfunc);
+}
 
-	mlx5e_rx_res_rss_params_init(res, init_nch);
+int mlx5e_rx_res_rss_set_rxfh(struct mlx5e_rx_res *res, const u32 *indir,
+			      const u8 *key, const u8 *hfunc)
+{
+	struct mlx5e_rss *rss = res->rss;
 
-	err = mlx5e_rqt_init_direct(&rss->rqt, res->mdev, true, res->drop_rqn);
-	if (err)
-		goto err_free_rss;
+	return mlx5e_rss_set_rxfh(rss, indir, key, hfunc, res->rss_rqns, res->rss_nch);
+}
 
-	err = mlx5e_rx_res_rss_create_tirs(res, init_lro_param, false);
-	if (err)
-		goto err_destroy_rqt;
+u8 mlx5e_rx_res_rss_get_hash_fields(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt)
+{
+	struct mlx5e_rss *rss = res->rss;
 
-	if (inner_ft_support) {
-		err = mlx5e_rx_res_rss_create_tirs(res, init_lro_param, true);
-		if (err)
-			goto err_destroy_tirs;
-	}
+	return mlx5e_rss_get_hash_fields(rss, tt);
+}
 
-	return 0;
+int mlx5e_rx_res_rss_set_hash_fields(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt,
+				     u8 rx_hash_fields)
+{
+	struct mlx5e_rss *rss = res->rss;
 
-err_destroy_tirs:
-	mlx5e_rx_res_rss_destroy_tirs(res, false);
+	return mlx5e_rss_set_hash_fields(rss, tt, rx_hash_fields);
+}
 
-err_destroy_rqt:
-	mlx5e_rqt_destroy(&rss->rqt);
+/* End of API rx_res_rss_* */
 
-err_free_rss:
-	kvfree(rss);
-	res->rss = NULL;
-	return err;
+struct mlx5e_rx_res *mlx5e_rx_res_alloc(void)
+{
+	return kvzalloc(sizeof(struct mlx5e_rx_res), GFP_KERNEL);
 }
 
 static int mlx5e_rx_res_channels_init(struct mlx5e_rx_res *res,
@@ -379,20 +267,6 @@ static int mlx5e_rx_res_ptp_init(struct mlx5e_rx_res *res)
 	return err;
 }
 
-static void mlx5e_rx_res_rss_destroy(struct mlx5e_rx_res *res)
-{
-	struct mlx5e_rss *rss = res->rss;
-
-	mlx5e_rx_res_rss_destroy_tirs(res, false);
-
-	if (res->features & MLX5E_RX_RES_FEATURE_INNER_FT)
-		mlx5e_rx_res_rss_destroy_tirs(res, true);
-
-	mlx5e_rqt_destroy(&rss->rqt);
-	kvfree(rss);
-	res->rss = NULL;
-}
-
 static void mlx5e_rx_res_channels_destroy(struct mlx5e_rx_res *res)
 {
 	unsigned int ix;
@@ -431,7 +305,7 @@ int mlx5e_rx_res_init(struct mlx5e_rx_res *res, struct mlx5_core_dev *mdev,
 
 	err = mlx5e_rx_res_rss_init(res, init_lro_param, init_nch);
 	if (err)
-		return err;
+		goto err_out;
 
 	err = mlx5e_rx_res_channels_init(res, init_lro_param);
 	if (err)
@@ -447,6 +321,7 @@ int mlx5e_rx_res_init(struct mlx5e_rx_res *res, struct mlx5_core_dev *mdev,
 	mlx5e_rx_res_channels_destroy(res);
 err_rss_destroy:
 	mlx5e_rx_res_rss_destroy(res);
+err_out:
 	return err;
 }
 
@@ -478,15 +353,14 @@ u32 mlx5e_rx_res_get_tirn_rss(struct mlx5e_rx_res *res, enum mlx5_traffic_types
 {
 	struct mlx5e_rss *rss = res->rss;
 
-	return mlx5e_tir_get_tirn(&rss->tir[tt]);
+	return mlx5e_rss_get_tirn(rss, tt, false);
 }
 
 u32 mlx5e_rx_res_get_tirn_rss_inner(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt)
 {
 	struct mlx5e_rss *rss = res->rss;
 
-	WARN_ON(!(res->features & MLX5E_RX_RES_FEATURE_INNER_FT));
-	return mlx5e_tir_get_tirn(&rss->inner_tir[tt]);
+	return mlx5e_rss_get_tirn(rss, tt, true);
 }
 
 u32 mlx5e_rx_res_get_tirn_ptp(struct mlx5e_rx_res *res)
@@ -500,34 +374,6 @@ u32 mlx5e_rx_res_get_rqtn_direct(struct mlx5e_rx_res *res, unsigned int ix)
 	return mlx5e_rqt_get_rqtn(&res->channels[ix].direct_rqt);
 }
 
-static void mlx5e_rx_res_rss_enable(struct mlx5e_rx_res *res)
-{
-	struct mlx5e_rss *rss = res->rss;
-	int err;
-
-	res->rss_active = true;
-
-	err = mlx5e_rqt_redirect_indir(&rss->rqt, res->rss_rqns, res->rss_nch,
-				       rss->hash.hfunc,
-				       &rss->indir);
-	if (err)
-		mlx5_core_warn(res->mdev, "Failed to redirect RQT %#x to channels: err = %d\n",
-			       mlx5e_rqt_get_rqtn(&rss->rqt), err);
-}
-
-static void mlx5e_rx_res_rss_disable(struct mlx5e_rx_res *res)
-{
-	struct mlx5e_rss *rss = res->rss;
-	int err;
-
-	res->rss_active = false;
-
-	err = mlx5e_rqt_redirect_direct(&rss->rqt, res->drop_rqn);
-	if (err)
-		mlx5_core_warn(res->mdev, "Failed to redirect RQT %#x to drop RQ %#x: err = %d\n",
-			       mlx5e_rqt_get_rqtn(&rss->rqt), res->drop_rqn, err);
-}
-
 void mlx5e_rx_res_channels_activate(struct mlx5e_rx_res *res, struct mlx5e_channels *chs)
 {
 	unsigned int nch, ix;
@@ -655,185 +501,10 @@ int mlx5e_rx_res_xsk_deactivate(struct mlx5e_rx_res *res, unsigned int ix)
 	return err;
 }
 
-struct mlx5e_rss_params_traffic_type
-mlx5e_rx_res_rss_get_current_tt_config(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt)
-{
-	struct mlx5e_rss_params_traffic_type rss_tt;
-	struct mlx5e_rss *rss = res->rss;
-
-	rss_tt = mlx5e_rss_get_default_tt_config(tt);
-	rss_tt.rx_hash_fields = rss->rx_hash_fields[tt];
-	return rss_tt;
-}
-
-/* Updates the indirection table SW shadow, does not update the HW resources yet */
-void mlx5e_rx_res_rss_set_indir_uniform(struct mlx5e_rx_res *res, unsigned int nch)
-{
-	WARN_ON_ONCE(res->rss_active);
-	mlx5e_rss_params_indir_init_uniform(&res->rss->indir, nch);
-}
-
-int mlx5e_rx_res_rss_get_rxfh(struct mlx5e_rx_res *res, u32 *indir, u8 *key, u8 *hfunc)
-{
-	struct mlx5e_rss *rss = res->rss;
-	unsigned int i;
-
-	if (indir)
-		for (i = 0; i < MLX5E_INDIR_RQT_SIZE; i++)
-			indir[i] = rss->indir.table[i];
-
-	if (key)
-		memcpy(key, rss->hash.toeplitz_hash_key,
-		       sizeof(rss->hash.toeplitz_hash_key));
-
-	if (hfunc)
-		*hfunc = rss->hash.hfunc;
-
-	return 0;
-}
-
-static int mlx5e_rx_res_rss_update_tir(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt,
-				       bool inner)
-{
-	struct mlx5e_rss_params_traffic_type rss_tt;
-	struct mlx5e_tir_builder *builder;
-	struct mlx5e_rss *rss = res->rss;
-	struct mlx5e_tir *tir;
-	int err;
-
-	builder = mlx5e_tir_builder_alloc(true);
-	if (!builder)
-		return -ENOMEM;
-
-	rss_tt = mlx5e_rx_res_rss_get_current_tt_config(res, tt);
-
-	mlx5e_tir_builder_build_rss(builder, &rss->hash, &rss_tt, inner);
-	tir = inner ? &rss->inner_tir[tt] : &rss->tir[tt];
-	err = mlx5e_tir_modify(tir, builder);
-
-	mlx5e_tir_builder_free(builder);
-	return err;
-}
-
-int mlx5e_rx_res_rss_set_rxfh(struct mlx5e_rx_res *res, const u32 *indir,
-			      const u8 *key, const u8 *hfunc)
-{
-	struct mlx5e_rss *rss = res->rss;
-	enum mlx5_traffic_types tt;
-	bool changed_indir = false;
-	bool changed_hash = false;
-	int err;
-
-	if (hfunc && *hfunc != rss->hash.hfunc) {
-		switch (*hfunc) {
-		case ETH_RSS_HASH_XOR:
-		case ETH_RSS_HASH_TOP:
-			break;
-		default:
-			return -EINVAL;
-		}
-		changed_hash = true;
-		changed_indir = true;
-		rss->hash.hfunc = *hfunc;
-	}
-
-	if (key) {
-		if (rss->hash.hfunc == ETH_RSS_HASH_TOP)
-			changed_hash = true;
-		memcpy(rss->hash.toeplitz_hash_key, key,
-		       sizeof(rss->hash.toeplitz_hash_key));
-	}
-
-	if (indir) {
-		unsigned int i;
-
-		changed_indir = true;
-
-		for (i = 0; i < MLX5E_INDIR_RQT_SIZE; i++)
-			rss->indir.table[i] = indir[i];
-	}
-
-	if (changed_indir && res->rss_active) {
-		err = mlx5e_rqt_redirect_indir(&rss->rqt, res->rss_rqns, res->rss_nch,
-					       rss->hash.hfunc, &rss->indir);
-		if (err)
-			mlx5_core_warn(res->mdev, "Failed to redirect indirect RQT %#x to channels: err = %d\n",
-				       mlx5e_rqt_get_rqtn(&rss->rqt), err);
-	}
-
-	if (changed_hash)
-		for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++) {
-			err = mlx5e_rx_res_rss_update_tir(res, tt, false);
-			if (err)
-				mlx5_core_warn(res->mdev, "Failed to update RSS hash of indirect TIR for traffic type %d: err = %d\n",
-					       tt, err);
-
-			if (!(res->features & MLX5E_RX_RES_FEATURE_INNER_FT))
-				continue;
-
-			err = mlx5e_rx_res_rss_update_tir(res, tt, true);
-			if (err)
-				mlx5_core_warn(res->mdev, "Failed to update RSS hash of inner indirect TIR for traffic type %d: err = %d\n",
-					       tt, err);
-		}
-
-	return 0;
-}
-
-u8 mlx5e_rx_res_rss_get_hash_fields(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt)
-{
-	struct mlx5e_rss *rss = res->rss;
-
-	return rss->rx_hash_fields[tt];
-}
-
-int mlx5e_rx_res_rss_set_hash_fields(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt,
-				     u8 rx_hash_fields)
-{
-	struct mlx5e_rss *rss = res->rss;
-	u8 old_rx_hash_fields;
-	int err;
-
-	old_rx_hash_fields = rss->rx_hash_fields[tt];
-
-	if (old_rx_hash_fields == rx_hash_fields)
-		return 0;
-
-	rss->rx_hash_fields[tt] = rx_hash_fields;
-
-	err = mlx5e_rx_res_rss_update_tir(res, tt, false);
-	if (err) {
-		rss->rx_hash_fields[tt] = old_rx_hash_fields;
-		mlx5_core_warn(res->mdev, "Failed to update RSS hash fields of indirect TIR for traffic type %d: err = %d\n",
-			       tt, err);
-		return err;
-	}
-
-	if (!(res->features & MLX5E_RX_RES_FEATURE_INNER_FT))
-		return 0;
-
-	err = mlx5e_rx_res_rss_update_tir(res, tt, true);
-	if (err) {
-		/* Partial update happened. Try to revert - it may fail too, but
-		 * there is nothing more we can do.
-		 */
-		rss->rx_hash_fields[tt] = old_rx_hash_fields;
-		mlx5_core_warn(res->mdev, "Failed to update RSS hash fields of inner indirect TIR for traffic type %d: err = %d\n",
-			       tt, err);
-		if (mlx5e_rx_res_rss_update_tir(res, tt, false))
-			mlx5_core_warn(res->mdev,
-				       "Partial update of RSS hash fields happened: failed to revert indirect TIR for traffic type %d to the old values\n",
-				       tt);
-	}
-
-	return err;
-}
-
 int mlx5e_rx_res_lro_set_param(struct mlx5e_rx_res *res, struct mlx5e_lro_param *lro_param)
 {
 	struct mlx5e_rss *rss = res->rss;
 	struct mlx5e_tir_builder *builder;
-	enum mlx5_traffic_types tt;
 	int err, final_err;
 	unsigned int ix;
 
@@ -845,26 +516,9 @@ int mlx5e_rx_res_lro_set_param(struct mlx5e_rx_res *res, struct mlx5e_lro_param
 
 	final_err = 0;
 
-	for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++) {
-		err = mlx5e_tir_modify(&rss->tir[tt], builder);
-		if (err) {
-			mlx5_core_warn(res->mdev, "Failed to update LRO state of indirect TIR %#x for traffic type %d: err = %d\n",
-				       mlx5e_tir_get_tirn(&rss->tir[tt]), tt, err);
-			if (!final_err)
-				final_err = err;
-		}
-
-		if (!(res->features & MLX5E_RX_RES_FEATURE_INNER_FT))
-			continue;
-
-		err = mlx5e_tir_modify(&rss->inner_tir[tt], builder);
-		if (err) {
-			mlx5_core_warn(res->mdev, "Failed to update LRO state of inner indirect TIR %#x for traffic type %d: err = %d\n",
-				       mlx5e_tir_get_tirn(&rss->inner_tir[tt]), tt, err);
-			if (!final_err)
-				final_err = err;
-		}
-	}
+	err = mlx5e_rss_lro_set_param(rss, lro_param);
+	if (err)
+		final_err = final_err ? : err;
 
 	for (ix = 0; ix < res->max_nch; ix++) {
 		err = mlx5e_tir_modify(&res->channels[ix].direct_tir, builder);
@@ -882,5 +536,5 @@ int mlx5e_rx_res_lro_set_param(struct mlx5e_rx_res *res, struct mlx5e_lro_param
 
 struct mlx5e_rss_params_hash mlx5e_rx_res_get_current_hash(struct mlx5e_rx_res *res)
 {
-	return res->rss->hash;
+	return mlx5e_rss_get_hash(res->rss);
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h
index 1703fb981d6d..af017f516f4a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h
@@ -8,6 +8,7 @@
 #include "rqt.h"
 #include "tir.h"
 #include "fs.h"
+#include "rss.h"
 
 struct mlx5e_rx_res;
 
@@ -20,9 +21,6 @@ enum mlx5e_rx_res_features {
 	MLX5E_RX_RES_FEATURE_PTP = BIT(2),
 };
 
-struct mlx5e_rss_params_traffic_type
-mlx5e_rss_get_default_tt_config(enum mlx5_traffic_types tt);
-
 /* Setup */
 struct mlx5e_rx_res *mlx5e_rx_res_alloc(void);
 int mlx5e_rx_res_init(struct mlx5e_rx_res *res, struct mlx5_core_dev *mdev,
@@ -50,8 +48,6 @@ int mlx5e_rx_res_xsk_activate(struct mlx5e_rx_res *res, struct mlx5e_channels *c
 int mlx5e_rx_res_xsk_deactivate(struct mlx5e_rx_res *res, unsigned int ix);
 
 /* Configuration API */
-struct mlx5e_rss_params_traffic_type
-mlx5e_rx_res_rss_get_current_tt_config(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt);
 void mlx5e_rx_res_rss_set_indir_uniform(struct mlx5e_rx_res *res, unsigned int nch);
 int mlx5e_rx_res_rss_get_rxfh(struct mlx5e_rx_res *res, u32 *indir, u8 *key, u8 *hfunc);
 int mlx5e_rx_res_rss_set_rxfh(struct mlx5e_rx_res *res, const u32 *indir,
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [net-next 05/17] net/mlx5e: Dynamically allocate TIRs in RSS contexts
  2021-08-16 21:18 [pull request][net-next 00/17] mlx5 updates 2021-08-16 Saeed Mahameed
                   ` (3 preceding siblings ...)
  2021-08-16 21:18 ` [net-next 04/17] net/mlx5e: Convert RSS to a dedicated object Saeed Mahameed
@ 2021-08-16 21:18 ` Saeed Mahameed
  2021-08-16 21:18 ` [net-next 06/17] net/mlx5e: Support multiple " Saeed Mahameed
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2021-08-16 21:18 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Tariq Toukan, Leon Romanovsky, Maxim Mikityanskiy,
	Saeed Mahameed

From: Tariq Toukan <tariqt@nvidia.com>

Move from static to dynamic memory allocation for TIRs.
This is in preparation for supporting on-demand TIR operations in
downstream patches, where every RSS context will be initialized with
an empty set of TIRs.
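
For illustration only, a minimal userspace C sketch of the allocation
pattern applied in the diff below (hypothetical, simplified names; not
the driver code): each TIR slot becomes a pointer that starts out NULL,
is allocated on first creation, and is freed and reset to NULL on
destroy.

#include <stdlib.h>

#define NUM_INDIR_TIRS 8

struct tir { int tirn; };

struct rss_sketch {
        struct tir *tir[NUM_INDIR_TIRS]; /* NULL until created on demand */
};

/* Create the TIR for traffic type tt only if it does not exist yet. */
static int rss_create_tir(struct rss_sketch *rss, int tt)
{
        if (rss->tir[tt])
                return -1; /* already created */
        rss->tir[tt] = calloc(1, sizeof(*rss->tir[tt]));
        return rss->tir[tt] ? 0 : -1;
}

/* Destroying a TIR that was never created is a no-op. */
static void rss_destroy_tir(struct rss_sketch *rss, int tt)
{
        if (!rss->tir[tt])
                return;
        free(rss->tir[tt]);
        rss->tir[tt] = NULL;
}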

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/en/rss.c  | 69 +++++++++++++++----
 1 file changed, 56 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c
index f4a72b6b8a02..34c5b8f0d100 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c
@@ -71,8 +71,8 @@ struct mlx5e_rss {
 	struct mlx5e_rss_params_hash hash;
 	struct mlx5e_rss_params_indir indir;
 	u32 rx_hash_fields[MLX5E_NUM_INDIR_TIRS];
-	struct mlx5e_tir tir[MLX5E_NUM_INDIR_TIRS];
-	struct mlx5e_tir inner_tir[MLX5E_NUM_INDIR_TIRS];
+	struct mlx5e_tir *tir[MLX5E_NUM_INDIR_TIRS];
+	struct mlx5e_tir *inner_tir[MLX5E_NUM_INDIR_TIRS];
 	struct mlx5e_rqt rqt;
 	struct mlx5_core_dev *mdev;
 	u32 drop_rqn;
@@ -102,6 +102,18 @@ static void mlx5e_rss_params_init(struct mlx5e_rss *rss)
 			mlx5e_rss_get_default_tt_config(tt).rx_hash_fields;
 }
 
+static struct mlx5e_tir **rss_get_tirp(struct mlx5e_rss *rss, enum mlx5_traffic_types tt,
+				       bool inner)
+{
+	return inner ? &rss->inner_tir[tt] : &rss->tir[tt];
+}
+
+static struct mlx5e_tir *rss_get_tir(struct mlx5e_rss *rss, enum mlx5_traffic_types tt,
+				     bool inner)
+{
+	return *rss_get_tirp(rss, tt, inner);
+}
+
 static struct mlx5e_rss_params_traffic_type
 mlx5e_rss_get_tt_config(struct mlx5e_rss *rss, enum mlx5_traffic_types tt)
 {
@@ -119,6 +131,7 @@ static int mlx5e_rss_create_tir(struct mlx5e_rss *rss,
 {
 	struct mlx5e_rss_params_traffic_type rss_tt;
 	struct mlx5e_tir_builder *builder;
+	struct mlx5e_tir **tir_p;
 	struct mlx5e_tir *tir;
 	u32 rqtn;
 	int err;
@@ -130,12 +143,20 @@ static int mlx5e_rss_create_tir(struct mlx5e_rss *rss,
 		return -EINVAL;
 	}
 
-	tir = inner ? &rss->inner_tir[tt] : &rss->tir[tt];
+	tir_p = rss_get_tirp(rss, tt, inner);
+	if (*tir_p)
+		return -EINVAL;
 
-	builder = mlx5e_tir_builder_alloc(false);
-	if (!builder)
+	tir = kvzalloc(sizeof(*tir), GFP_KERNEL);
+	if (!tir)
 		return -ENOMEM;
 
+	builder = mlx5e_tir_builder_alloc(false);
+	if (!builder) {
+		err = -ENOMEM;
+		goto free_tir;
+	}
+
 	rqtn = mlx5e_rqt_get_rqtn(&rss->rqt);
 	mlx5e_tir_builder_build_rqt(builder, rss->mdev->mlx5e_res.hw_objs.td.tdn,
 				    rqtn, rss->inner_ft_support);
@@ -145,19 +166,34 @@ static int mlx5e_rss_create_tir(struct mlx5e_rss *rss,
 
 	err = mlx5e_tir_init(tir, builder, rss->mdev, true);
 	mlx5e_tir_builder_free(builder);
-	if (err)
+	if (err) {
 		mlx5e_rss_warn(rss->mdev, "Failed to create %sindirect TIR: err = %d, tt = %d\n",
 			       inner ? "inner " : "", err, tt);
+		goto free_tir;
+	}
+
+	*tir_p = tir;
+	return 0;
+
+free_tir:
+	kvfree(tir);
 	return err;
 }
 
 static void mlx5e_rss_destroy_tir(struct mlx5e_rss *rss, enum mlx5_traffic_types tt,
 				  bool inner)
 {
+	struct mlx5e_tir **tir_p;
 	struct mlx5e_tir *tir;
 
-	tir = inner ? &rss->inner_tir[tt] : &rss->tir[tt];
+	tir_p = rss_get_tirp(rss, tt, inner);
+	if (!*tir_p)
+		return;
+
+	tir = *tir_p;
 	mlx5e_tir_destroy(tir);
+	kvfree(tir);
+	*tir_p = NULL;
 }
 
 static int mlx5e_rss_create_tirs(struct mlx5e_rss *rss,
@@ -198,7 +234,9 @@ static int mlx5e_rss_update_tir(struct mlx5e_rss *rss, enum mlx5_traffic_types t
 	struct mlx5e_tir *tir;
 	int err;
 
-	tir = inner ? &rss->inner_tir[tt] : &rss->tir[tt];
+	tir = rss_get_tir(rss, tt, inner);
+	if (!tir)
+		return 0;
 
 	builder = mlx5e_tir_builder_alloc(true);
 	if (!builder)
@@ -295,7 +333,8 @@ u32 mlx5e_rss_get_tirn(struct mlx5e_rss *rss, enum mlx5_traffic_types tt,
 	struct mlx5e_tir *tir;
 
 	WARN_ON(inner && !rss->inner_ft_support);
-	tir = inner ? &rss->inner_tir[tt] : &rss->tir[tt];
+	tir = rss_get_tir(rss, tt, inner);
+	WARN_ON(!tir);
 
 	return mlx5e_tir_get_tirn(tir);
 }
@@ -342,10 +381,13 @@ int mlx5e_rss_lro_set_param(struct mlx5e_rss *rss, struct mlx5e_lro_param *lro_p
 	final_err = 0;
 
 	for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++) {
-		err = mlx5e_tir_modify(&rss->tir[tt], builder);
+		struct mlx5e_tir *tir;
+
+		tir = rss_get_tir(rss, tt, false);
+		err = mlx5e_tir_modify(tir, builder);
 		if (err) {
 			mlx5e_rss_warn(rss->mdev, "Failed to update LRO state of indirect TIR %#x for traffic type %d: err = %d\n",
-				       mlx5e_tir_get_tirn(&rss->tir[tt]), tt, err);
+				       mlx5e_tir_get_tirn(rss->tir[tt]), tt, err);
 			if (!final_err)
 				final_err = err;
 		}
@@ -353,10 +395,11 @@ int mlx5e_rss_lro_set_param(struct mlx5e_rss *rss, struct mlx5e_lro_param *lro_p
 		if (!rss->inner_ft_support)
 			continue;
 
-		err = mlx5e_tir_modify(&rss->inner_tir[tt], builder);
+		tir = rss_get_tir(rss, tt, true);
+		err = mlx5e_tir_modify(tir, builder);
 		if (err) {
 			mlx5e_rss_warn(rss->mdev, "Failed to update LRO state of inner indirect TIR %#x for traffic type %d: err = %d\n",
-				       mlx5e_tir_get_tirn(&rss->inner_tir[tt]), tt, err);
+				       mlx5e_tir_get_tirn(rss->inner_tir[tt]), tt, err);
 			if (!final_err)
 				final_err = err;
 		}
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [net-next 06/17] net/mlx5e: Support multiple RSS contexts
  2021-08-16 21:18 [pull request][net-next 00/17] mlx5 updates 2021-08-16 Saeed Mahameed
                   ` (4 preceding siblings ...)
  2021-08-16 21:18 ` [net-next 05/17] net/mlx5e: Dynamically allocate TIRs in RSS contexts Saeed Mahameed
@ 2021-08-16 21:18 ` Saeed Mahameed
  2021-08-16 21:18 ` [net-next 07/17] net/mlx5e: Support flow classification into " Saeed Mahameed
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2021-08-16 21:18 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Tariq Toukan, Leon Romanovsky, Maxim Mikityanskiy,
	Saeed Mahameed

From: Tariq Toukan <tariqt@nvidia.com>

Add support for multiple RSS contexts. Resources of the non-default
RSS contexts are allocated and created on demand. Each RSS context
can be controlled and configured separately via the implemented
ethtool ops. The total number of contexts is limited to 16.

We do not enforce any new limitation on the indirection table
content. More specifically, two separate contexts can be configured
to fully or partially point to the same set of receive rings.

The default RSS context (index 0) is created with its full set of TIRs.
All other contexts are created with an empty set, then TIRs are added
upon first usage when steering rules are added.
We use a reference counting mechanism to make sure an RSS context is
not removed while rules still point to it.

Block ethtool set_channels operations when multiple RSS contexts
exist, as the kernel currently doesn't protect against inconsistent
channel configurations that break non-default RSS contexts.
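
As an aside, a self-contained userspace C sketch of the bookkeeping
described above (hypothetical, simplified names; not the driver code):
a fixed array of 16 context slots where slot 0 is the default,
non-default contexts take the first free slot, and destruction is
refused while the reference count shows outstanding users.

#include <stdlib.h>

#define MAX_NUM_RSS 16

struct rss_ctx { unsigned int refcnt; };

struct rx_res_sketch {
        struct rss_ctx *rss[MAX_NUM_RSS]; /* slot 0 is the default context */
};

/* Allocate a non-default context in the first free slot (1..15). */
static int rss_ctx_create(struct rx_res_sketch *res, unsigned int *rss_idx)
{
        unsigned int i;

        for (i = 1; i < MAX_NUM_RSS; i++)
                if (!res->rss[i])
                        break;
        if (i == MAX_NUM_RSS)
                return -1; /* no free slot */

        res->rss[i] = calloc(1, sizeof(*res->rss[i]));
        if (!res->rss[i])
                return -1;
        res->rss[i]->refcnt = 1;
        *rss_idx = i;
        return 0;
}

/* Refuse to destroy a context that steering rules still reference. */
static int rss_ctx_destroy(struct rx_res_sketch *res, unsigned int rss_idx)
{
        struct rss_ctx *ctx = res->rss[rss_idx];

        if (!ctx || ctx->refcnt != 1)
                return -1; /* unknown index or still in use */
        free(ctx);
        res->rss[rss_idx] = NULL;
        return 0;
}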

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/en/rss.c  |  51 ++++-
 .../net/ethernet/mellanox/mlx5/core/en/rss.h  |   8 +-
 .../ethernet/mellanox/mlx5/core/en/rx_res.c   | 194 +++++++++++++++---
 .../ethernet/mellanox/mlx5/core/en/rx_res.h   |  12 +-
 .../ethernet/mellanox/mlx5/core/en_ethtool.c  |  59 +++++-
 5 files changed, 273 insertions(+), 51 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c
index 34c5b8f0d100..d2c4ace7c8ba 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c
@@ -78,6 +78,7 @@ struct mlx5e_rss {
 	u32 drop_rqn;
 	bool inner_ft_support;
 	bool enabled;
+	refcount_t refcnt;
 };
 
 struct mlx5e_rss *mlx5e_rss_alloc(void)
@@ -281,19 +282,26 @@ static int mlx5e_rss_update_tirs(struct mlx5e_rss *rss)
 	return retval;
 }
 
-int mlx5e_rss_init(struct mlx5e_rss *rss, struct mlx5_core_dev *mdev,
-		   bool inner_ft_support, u32 drop_rqn,
-		   const struct mlx5e_lro_param *init_lro_param)
+int mlx5e_rss_init_no_tirs(struct mlx5e_rss *rss, struct mlx5_core_dev *mdev,
+			   bool inner_ft_support, u32 drop_rqn)
 {
-	int err;
-
 	rss->mdev = mdev;
 	rss->inner_ft_support = inner_ft_support;
 	rss->drop_rqn = drop_rqn;
 
 	mlx5e_rss_params_init(rss);
+	refcount_set(&rss->refcnt, 1);
+
+	return mlx5e_rqt_init_direct(&rss->rqt, mdev, true, drop_rqn);
+}
+
+int mlx5e_rss_init(struct mlx5e_rss *rss, struct mlx5_core_dev *mdev,
+		   bool inner_ft_support, u32 drop_rqn,
+		   const struct mlx5e_lro_param *init_lro_param)
+{
+	int err;
 
-	err = mlx5e_rqt_init_direct(&rss->rqt, mdev, true, drop_rqn);
+	err = mlx5e_rss_init_no_tirs(rss, mdev, inner_ft_support, drop_rqn);
 	if (err)
 		goto err_out;
 
@@ -317,14 +325,34 @@ int mlx5e_rss_init(struct mlx5e_rss *rss, struct mlx5_core_dev *mdev,
 	return err;
 }
 
-void mlx5e_rss_cleanup(struct mlx5e_rss *rss)
+int mlx5e_rss_cleanup(struct mlx5e_rss *rss)
 {
+	if (!refcount_dec_if_one(&rss->refcnt))
+		return -EBUSY;
+
 	mlx5e_rss_destroy_tirs(rss, false);
 
 	if (rss->inner_ft_support)
 		mlx5e_rss_destroy_tirs(rss, true);
 
 	mlx5e_rqt_destroy(&rss->rqt);
+
+	return 0;
+}
+
+void mlx5e_rss_refcnt_inc(struct mlx5e_rss *rss)
+{
+	refcount_inc(&rss->refcnt);
+}
+
+void mlx5e_rss_refcnt_dec(struct mlx5e_rss *rss)
+{
+	refcount_dec(&rss->refcnt);
+}
+
+unsigned int mlx5e_rss_refcnt_read(struct mlx5e_rss *rss)
+{
+	return refcount_read(&rss->refcnt);
 }
 
 u32 mlx5e_rss_get_tirn(struct mlx5e_rss *rss, enum mlx5_traffic_types tt,
@@ -384,22 +412,27 @@ int mlx5e_rss_lro_set_param(struct mlx5e_rss *rss, struct mlx5e_lro_param *lro_p
 		struct mlx5e_tir *tir;
 
 		tir = rss_get_tir(rss, tt, false);
+		if (!tir)
+			goto inner_tir;
 		err = mlx5e_tir_modify(tir, builder);
 		if (err) {
 			mlx5e_rss_warn(rss->mdev, "Failed to update LRO state of indirect TIR %#x for traffic type %d: err = %d\n",
-				       mlx5e_tir_get_tirn(rss->tir[tt]), tt, err);
+				       mlx5e_tir_get_tirn(tir), tt, err);
 			if (!final_err)
 				final_err = err;
 		}
 
+inner_tir:
 		if (!rss->inner_ft_support)
 			continue;
 
 		tir = rss_get_tir(rss, tt, true);
+		if (!tir)
+			continue;
 		err = mlx5e_tir_modify(tir, builder);
 		if (err) {
 			mlx5e_rss_warn(rss->mdev, "Failed to update LRO state of inner indirect TIR %#x for traffic type %d: err = %d\n",
-				       mlx5e_tir_get_tirn(rss->inner_tir[tt]), tt, err);
+				       mlx5e_tir_get_tirn(tir), tt, err);
 			if (!final_err)
 				final_err = err;
 		}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rss.h b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.h
index e71e712ed842..6f52d78a36da 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rss.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.h
@@ -18,7 +18,13 @@ void mlx5e_rss_free(struct mlx5e_rss *rss);
 int mlx5e_rss_init(struct mlx5e_rss *rss, struct mlx5_core_dev *mdev,
 		   bool inner_ft_support, u32 drop_rqn,
 		   const struct mlx5e_lro_param *init_lro_param);
-void mlx5e_rss_cleanup(struct mlx5e_rss *rss);
+int mlx5e_rss_init_no_tirs(struct mlx5e_rss *rss, struct mlx5_core_dev *mdev,
+			   bool inner_ft_support, u32 drop_rqn);
+int mlx5e_rss_cleanup(struct mlx5e_rss *rss);
+
+void mlx5e_rss_refcnt_inc(struct mlx5e_rss *rss);
+void mlx5e_rss_refcnt_dec(struct mlx5e_rss *rss);
+unsigned int mlx5e_rss_refcnt_read(struct mlx5e_rss *rss);
 
 u32 mlx5e_rss_get_tirn(struct mlx5e_rss *rss, enum mlx5_traffic_types tt,
 		       bool inner);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
index 590d94196370..432963594b8e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
@@ -5,13 +5,15 @@
 #include "channels.h"
 #include "params.h"
 
+#define MLX5E_MAX_NUM_RSS 16
+
 struct mlx5e_rx_res {
 	struct mlx5_core_dev *mdev;
 	enum mlx5e_rx_res_features features;
 	unsigned int max_nch;
 	u32 drop_rqn;
 
-	struct mlx5e_rss *rss;
+	struct mlx5e_rss *rss[MLX5E_MAX_NUM_RSS];
 	bool rss_active;
 	u32 rss_rqns[MLX5E_INDIR_RQT_SIZE];
 	unsigned int rss_nch;
@@ -31,86 +33,194 @@ struct mlx5e_rx_res {
 
 /* API for rx_res_rss_* */
 
-static int mlx5e_rx_res_rss_init(struct mlx5e_rx_res *res,
-				 const struct mlx5e_lro_param *init_lro_param,
-				 unsigned int init_nch)
+static int mlx5e_rx_res_rss_init_def(struct mlx5e_rx_res *res,
+				     const struct mlx5e_lro_param *init_lro_param,
+				     unsigned int init_nch)
 {
 	bool inner_ft_support = res->features & MLX5E_RX_RES_FEATURE_INNER_FT;
 	struct mlx5e_rss *rss;
 	int err;
 
+	if (WARN_ON(res->rss[0]))
+		return -EINVAL;
+
 	rss = mlx5e_rss_alloc();
 	if (!rss)
 		return -ENOMEM;
 
-	res->rss = rss;
+	err = mlx5e_rss_init(rss, res->mdev, inner_ft_support, res->drop_rqn,
+			     init_lro_param);
+	if (err)
+		goto err_rss_free;
+
+	mlx5e_rss_set_indir_uniform(rss, init_nch);
+
+	res->rss[0] = rss;
+
+	return 0;
+
+err_rss_free:
+	mlx5e_rss_free(rss);
+	return err;
+}
+
+int mlx5e_rx_res_rss_init(struct mlx5e_rx_res *res, u32 *rss_idx, unsigned int init_nch)
+{
+	bool inner_ft_support = res->features & MLX5E_RX_RES_FEATURE_INNER_FT;
+	struct mlx5e_rss *rss;
+	int err, i;
+
+	for (i = 1; i < MLX5E_MAX_NUM_RSS; i++)
+		if (!res->rss[i])
+			break;
+
+	if (i == MLX5E_MAX_NUM_RSS)
+		return -ENOSPC;
+
+	rss = mlx5e_rss_alloc();
+	if (!rss)
+		return -ENOMEM;
 
-	err = mlx5e_rss_init(rss, res->mdev, inner_ft_support, res->drop_rqn, init_lro_param);
+	err = mlx5e_rss_init_no_tirs(rss, res->mdev, inner_ft_support, res->drop_rqn);
 	if (err)
 		goto err_rss_free;
 
 	mlx5e_rss_set_indir_uniform(rss, init_nch);
+	if (res->rss_active)
+		mlx5e_rss_enable(rss, res->rss_rqns, res->rss_nch);
+
+	res->rss[i] = rss;
+	*rss_idx = i;
 
 	return 0;
 
 err_rss_free:
 	mlx5e_rss_free(rss);
-	res->rss = NULL;
 	return err;
 }
 
-static void mlx5e_rx_res_rss_destroy(struct mlx5e_rx_res *res)
+static int __mlx5e_rx_res_rss_destroy(struct mlx5e_rx_res *res, u32 rss_idx)
 {
-	struct mlx5e_rss *rss = res->rss;
+	struct mlx5e_rss *rss = res->rss[rss_idx];
+	int err;
+
+	err = mlx5e_rss_cleanup(rss);
+	if (err)
+		return err;
 
-	mlx5e_rss_cleanup(rss);
 	mlx5e_rss_free(rss);
-	res->rss = NULL;
+	res->rss[rss_idx] = NULL;
+
+	return 0;
+}
+
+int mlx5e_rx_res_rss_destroy(struct mlx5e_rx_res *res, u32 rss_idx)
+{
+	struct mlx5e_rss *rss;
+
+	if (rss_idx >= MLX5E_MAX_NUM_RSS)
+		return -EINVAL;
+
+	rss = res->rss[rss_idx];
+	if (!rss)
+		return -EINVAL;
+
+	return __mlx5e_rx_res_rss_destroy(res, rss_idx);
+}
+
+static void mlx5e_rx_res_rss_destroy_all(struct mlx5e_rx_res *res)
+{
+	int i;
+
+	for (i = 0; i < MLX5E_MAX_NUM_RSS; i++) {
+		struct mlx5e_rss *rss = res->rss[i];
+		int err;
+
+		if (!rss)
+			continue;
+
+		err = __mlx5e_rx_res_rss_destroy(res, i);
+		if (err) {
+			unsigned int refcount;
+
+			refcount = mlx5e_rss_refcnt_read(rss);
+			mlx5_core_warn(res->mdev,
+				       "Failed to destroy RSS context %d, refcount = %u, err = %d\n",
+				       i, refcount, err);
+		}
+	}
 }
 
 static void mlx5e_rx_res_rss_enable(struct mlx5e_rx_res *res)
 {
-	struct mlx5e_rss *rss = res->rss;
+	int i;
 
 	res->rss_active = true;
 
-	mlx5e_rss_enable(rss, res->rss_rqns, res->rss_nch);
+	for (i = 0; i < MLX5E_MAX_NUM_RSS; i++) {
+		struct mlx5e_rss *rss = res->rss[i];
+
+		if (!rss)
+			continue;
+		mlx5e_rss_enable(rss, res->rss_rqns, res->rss_nch);
+	}
 }
 
 static void mlx5e_rx_res_rss_disable(struct mlx5e_rx_res *res)
 {
-	struct mlx5e_rss *rss = res->rss;
+	int i;
 
 	res->rss_active = false;
 
-	mlx5e_rss_disable(rss);
+	for (i = 0; i < MLX5E_MAX_NUM_RSS; i++) {
+		struct mlx5e_rss *rss = res->rss[i];
+
+		if (!rss)
+			continue;
+		mlx5e_rss_disable(rss);
+	}
 }
 
 /* Updates the indirection table SW shadow, does not update the HW resources yet */
 void mlx5e_rx_res_rss_set_indir_uniform(struct mlx5e_rx_res *res, unsigned int nch)
 {
 	WARN_ON_ONCE(res->rss_active);
-	mlx5e_rss_set_indir_uniform(res->rss, nch);
+	mlx5e_rss_set_indir_uniform(res->rss[0], nch);
 }
 
-int mlx5e_rx_res_rss_get_rxfh(struct mlx5e_rx_res *res, u32 *indir, u8 *key, u8 *hfunc)
+int mlx5e_rx_res_rss_get_rxfh(struct mlx5e_rx_res *res, u32 rss_idx,
+			      u32 *indir, u8 *key, u8 *hfunc)
 {
-	struct mlx5e_rss *rss = res->rss;
+	struct mlx5e_rss *rss;
+
+	if (rss_idx >= MLX5E_MAX_NUM_RSS)
+		return -EINVAL;
+
+	rss = res->rss[rss_idx];
+	if (!rss)
+		return -ENOENT;
 
 	return mlx5e_rss_get_rxfh(rss, indir, key, hfunc);
 }
 
-int mlx5e_rx_res_rss_set_rxfh(struct mlx5e_rx_res *res, const u32 *indir,
-			      const u8 *key, const u8 *hfunc)
+int mlx5e_rx_res_rss_set_rxfh(struct mlx5e_rx_res *res, u32 rss_idx,
+			      const u32 *indir, const u8 *key, const u8 *hfunc)
 {
-	struct mlx5e_rss *rss = res->rss;
+	struct mlx5e_rss *rss;
+
+	if (rss_idx >= MLX5E_MAX_NUM_RSS)
+		return -EINVAL;
+
+	rss = res->rss[rss_idx];
+	if (!rss)
+		return -ENOENT;
 
 	return mlx5e_rss_set_rxfh(rss, indir, key, hfunc, res->rss_rqns, res->rss_nch);
 }
 
 u8 mlx5e_rx_res_rss_get_hash_fields(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt)
 {
-	struct mlx5e_rss *rss = res->rss;
+	struct mlx5e_rss *rss = res->rss[0];
 
 	return mlx5e_rss_get_hash_fields(rss, tt);
 }
@@ -118,11 +228,23 @@ u8 mlx5e_rx_res_rss_get_hash_fields(struct mlx5e_rx_res *res, enum mlx5_traffic_
 int mlx5e_rx_res_rss_set_hash_fields(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt,
 				     u8 rx_hash_fields)
 {
-	struct mlx5e_rss *rss = res->rss;
+	struct mlx5e_rss *rss = res->rss[0];
 
 	return mlx5e_rss_set_hash_fields(rss, tt, rx_hash_fields);
 }
 
+int mlx5e_rx_res_rss_cnt(struct mlx5e_rx_res *res)
+{
+	int i, cnt;
+
+	cnt = 0;
+	for (i = 0; i < MLX5E_MAX_NUM_RSS; i++)
+		if (res->rss[i])
+			cnt++;
+
+	return cnt;
+}
+
 /* End of API rx_res_rss_* */
 
 struct mlx5e_rx_res *mlx5e_rx_res_alloc(void)
@@ -303,7 +425,7 @@ int mlx5e_rx_res_init(struct mlx5e_rx_res *res, struct mlx5_core_dev *mdev,
 	res->max_nch = max_nch;
 	res->drop_rqn = drop_rqn;
 
-	err = mlx5e_rx_res_rss_init(res, init_lro_param, init_nch);
+	err = mlx5e_rx_res_rss_init_def(res, init_lro_param, init_nch);
 	if (err)
 		goto err_out;
 
@@ -320,7 +442,7 @@ int mlx5e_rx_res_init(struct mlx5e_rx_res *res, struct mlx5_core_dev *mdev,
 err_channels_destroy:
 	mlx5e_rx_res_channels_destroy(res);
 err_rss_destroy:
-	mlx5e_rx_res_rss_destroy(res);
+	__mlx5e_rx_res_rss_destroy(res, 0);
 err_out:
 	return err;
 }
@@ -329,7 +451,7 @@ void mlx5e_rx_res_destroy(struct mlx5e_rx_res *res)
 {
 	mlx5e_rx_res_ptp_destroy(res);
 	mlx5e_rx_res_channels_destroy(res);
-	mlx5e_rx_res_rss_destroy(res);
+	mlx5e_rx_res_rss_destroy_all(res);
 }
 
 void mlx5e_rx_res_free(struct mlx5e_rx_res *res)
@@ -351,14 +473,14 @@ u32 mlx5e_rx_res_get_tirn_xsk(struct mlx5e_rx_res *res, unsigned int ix)
 
 u32 mlx5e_rx_res_get_tirn_rss(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt)
 {
-	struct mlx5e_rss *rss = res->rss;
+	struct mlx5e_rss *rss = res->rss[0];
 
 	return mlx5e_rss_get_tirn(rss, tt, false);
 }
 
 u32 mlx5e_rx_res_get_tirn_rss_inner(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt)
 {
-	struct mlx5e_rss *rss = res->rss;
+	struct mlx5e_rss *rss = res->rss[0];
 
 	return mlx5e_rss_get_tirn(rss, tt, true);
 }
@@ -503,7 +625,6 @@ int mlx5e_rx_res_xsk_deactivate(struct mlx5e_rx_res *res, unsigned int ix)
 
 int mlx5e_rx_res_lro_set_param(struct mlx5e_rx_res *res, struct mlx5e_lro_param *lro_param)
 {
-	struct mlx5e_rss *rss = res->rss;
 	struct mlx5e_tir_builder *builder;
 	int err, final_err;
 	unsigned int ix;
@@ -516,9 +637,16 @@ int mlx5e_rx_res_lro_set_param(struct mlx5e_rx_res *res, struct mlx5e_lro_param
 
 	final_err = 0;
 
-	err = mlx5e_rss_lro_set_param(rss, lro_param);
-	if (err)
-		final_err = final_err ? : err;
+	for (ix = 0; ix < MLX5E_MAX_NUM_RSS; ix++) {
+		struct mlx5e_rss *rss = res->rss[ix];
+
+		if (!rss)
+			continue;
+
+		err = mlx5e_rss_lro_set_param(rss, lro_param);
+		if (err)
+			final_err = final_err ? : err;
+	}
 
 	for (ix = 0; ix < res->max_nch; ix++) {
 		err = mlx5e_tir_modify(&res->channels[ix].direct_tir, builder);
@@ -536,5 +664,5 @@ int mlx5e_rx_res_lro_set_param(struct mlx5e_rx_res *res, struct mlx5e_lro_param
 
 struct mlx5e_rss_params_hash mlx5e_rx_res_get_current_hash(struct mlx5e_rx_res *res)
 {
-	return mlx5e_rss_get_hash(res->rss);
+	return mlx5e_rss_get_hash(res->rss[0]);
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h
index af017f516f4a..8248caa36995 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h
@@ -49,14 +49,20 @@ int mlx5e_rx_res_xsk_deactivate(struct mlx5e_rx_res *res, unsigned int ix);
 
 /* Configuration API */
 void mlx5e_rx_res_rss_set_indir_uniform(struct mlx5e_rx_res *res, unsigned int nch);
-int mlx5e_rx_res_rss_get_rxfh(struct mlx5e_rx_res *res, u32 *indir, u8 *key, u8 *hfunc);
-int mlx5e_rx_res_rss_set_rxfh(struct mlx5e_rx_res *res, const u32 *indir,
-			      const u8 *key, const u8 *hfunc);
+int mlx5e_rx_res_rss_get_rxfh(struct mlx5e_rx_res *res, u32 rss_idx,
+			      u32 *indir, u8 *key, u8 *hfunc);
+int mlx5e_rx_res_rss_set_rxfh(struct mlx5e_rx_res *res, u32 rss_idx,
+			      const u32 *indir, const u8 *key, const u8 *hfunc);
+
 u8 mlx5e_rx_res_rss_get_hash_fields(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt);
 int mlx5e_rx_res_rss_set_hash_fields(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt,
 				     u8 rx_hash_fields);
 int mlx5e_rx_res_lro_set_param(struct mlx5e_rx_res *res, struct mlx5e_lro_param *lro_param);
 
+int mlx5e_rx_res_rss_init(struct mlx5e_rx_res *res, u32 *rss_idx, unsigned int init_nch);
+int mlx5e_rx_res_rss_destroy(struct mlx5e_rx_res *res, u32 rss_idx);
+int mlx5e_rx_res_rss_cnt(struct mlx5e_rx_res *res);
+
 /* Workaround for hairpin */
 struct mlx5e_rss_params_hash mlx5e_rx_res_get_current_hash(struct mlx5e_rx_res *res);
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 62eef3e7f993..839a753fda32 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -420,6 +420,7 @@ int mlx5e_ethtool_set_channels(struct mlx5e_priv *priv,
 	unsigned int count = ch->combined_count;
 	struct mlx5e_params new_params;
 	bool arfs_enabled;
+	int rss_cnt;
 	bool opened;
 	int err = 0;
 
@@ -455,6 +456,17 @@ int mlx5e_ethtool_set_channels(struct mlx5e_priv *priv,
 		goto out;
 	}
 
+	/* Don't allow changing the number of channels if non-default RSS contexts exist,
+	 * the kernel doesn't protect against set_channels operations that break them.
+	 */
+	rss_cnt = mlx5e_rx_res_rss_cnt(priv->rx_res) - 1;
+	if (rss_cnt) {
+		err = -EINVAL;
+		netdev_err(priv->netdev, "%s: Non-default RSS contexts exist (%d), cannot change the number of channels\n",
+			   __func__, rss_cnt);
+		goto out;
+	}
+
 	new_params = *cur_params;
 	new_params.num_channels = count;
 
@@ -1194,18 +1206,53 @@ static u32 mlx5e_get_rxfh_indir_size(struct net_device *netdev)
 	return mlx5e_ethtool_get_rxfh_indir_size(priv);
 }
 
-int mlx5e_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key,
-		   u8 *hfunc)
+static int mlx5e_get_rxfh_context(struct net_device *dev, u32 *indir,
+				  u8 *key, u8 *hfunc, u32 rss_context)
 {
-	struct mlx5e_priv *priv = netdev_priv(netdev);
+	struct mlx5e_priv *priv = netdev_priv(dev);
 	int err;
 
 	mutex_lock(&priv->state_lock);
-	err = mlx5e_rx_res_rss_get_rxfh(priv->rx_res, indir, key, hfunc);
+	err = mlx5e_rx_res_rss_get_rxfh(priv->rx_res, rss_context, indir, key, hfunc);
 	mutex_unlock(&priv->state_lock);
 	return err;
 }
 
+static int mlx5e_set_rxfh_context(struct net_device *dev, const u32 *indir,
+				  const u8 *key, const u8 hfunc,
+				  u32 *rss_context, bool delete)
+{
+	struct mlx5e_priv *priv = netdev_priv(dev);
+	int err;
+
+	mutex_lock(&priv->state_lock);
+	if (delete) {
+		err = mlx5e_rx_res_rss_destroy(priv->rx_res, *rss_context);
+		goto unlock;
+	}
+
+	if (*rss_context == ETH_RXFH_CONTEXT_ALLOC) {
+		unsigned int count = priv->channels.params.num_channels;
+
+		err = mlx5e_rx_res_rss_init(priv->rx_res, rss_context, count);
+		if (err)
+			goto unlock;
+	}
+
+	err = mlx5e_rx_res_rss_set_rxfh(priv->rx_res, *rss_context, indir, key,
+					hfunc == ETH_RSS_HASH_NO_CHANGE ? NULL : &hfunc);
+
+unlock:
+	mutex_unlock(&priv->state_lock);
+	return err;
+}
+
+int mlx5e_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key,
+		   u8 *hfunc)
+{
+	return mlx5e_get_rxfh_context(netdev, indir, key, hfunc, 0);
+}
+
 int mlx5e_set_rxfh(struct net_device *dev, const u32 *indir,
 		   const u8 *key, const u8 hfunc)
 {
@@ -1213,7 +1260,7 @@ int mlx5e_set_rxfh(struct net_device *dev, const u32 *indir,
 	int err;
 
 	mutex_lock(&priv->state_lock);
-	err = mlx5e_rx_res_rss_set_rxfh(priv->rx_res, indir, key,
+	err = mlx5e_rx_res_rss_set_rxfh(priv->rx_res, 0, indir, key,
 					hfunc == ETH_RSS_HASH_NO_CHANGE ? NULL : &hfunc);
 	mutex_unlock(&priv->state_lock);
 	return err;
@@ -2299,6 +2346,8 @@ const struct ethtool_ops mlx5e_ethtool_ops = {
 	.get_rxfh_indir_size = mlx5e_get_rxfh_indir_size,
 	.get_rxfh          = mlx5e_get_rxfh,
 	.set_rxfh          = mlx5e_set_rxfh,
+	.get_rxfh_context  = mlx5e_get_rxfh_context,
+	.set_rxfh_context  = mlx5e_set_rxfh_context,
 	.get_rxnfc         = mlx5e_get_rxnfc,
 	.set_rxnfc         = mlx5e_set_rxnfc,
 	.get_tunable       = mlx5e_get_tunable,
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [net-next 07/17] net/mlx5e: Support flow classification into RSS contexts
  2021-08-16 21:18 [pull request][net-next 00/17] mlx5 updates 2021-08-16 Saeed Mahameed
                   ` (5 preceding siblings ...)
  2021-08-16 21:18 ` [net-next 06/17] net/mlx5e: Support multiple " Saeed Mahameed
@ 2021-08-16 21:18 ` Saeed Mahameed
  2021-08-16 21:18 ` [net-next 08/17] net/mlx5e: Abstract MQPRIO params Saeed Mahameed
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2021-08-16 21:18 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Tariq Toukan, Leon Romanovsky, Maxim Mikityanskiy,
	Saeed Mahameed

From: Tariq Toukan <tariqt@nvidia.com>

Extend the existing flow classification support to steer flows not
only directly to a receive ring, but also into the new RSS contexts.

Create the needed TIR objects on demand, and hold a reference on the
RSS context.
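
A minimal userspace C sketch of the rule-to-TIR resolution described
above, with hypothetical simplified names (the real logic lives in
flow_get_tirn() in the diff below): the per-traffic-type TIR is created
on first use and the context reference count is bumped so the context
outlives the rule.

#include <stdlib.h>

#define NUM_INDIR_TIRS 8

struct tir { int tirn; };

struct rss_ctx {
        unsigned int refcnt;
        struct tir *tir[NUM_INDIR_TIRS]; /* created lazily, per traffic type */
};

/* Resolve the destination TIR for a FLOW_RSS rule: create the
 * traffic-type TIR on first use, then pin the context with a
 * reference that is dropped when the rule is deleted. */
static int flow_get_rss_tirn(struct rss_ctx *ctx, int tt, int *tirn)
{
        if (!ctx->tir[tt]) {
                ctx->tir[tt] = calloc(1, sizeof(*ctx->tir[tt]));
                if (!ctx->tir[tt])
                        return -1;
                ctx->tir[tt]->tirn = 0x100 + tt; /* placeholder HW id */
        }
        *tirn = ctx->tir[tt]->tirn;
        ctx->refcnt++;
        return 0;
}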

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/en/rss.c  | 24 +++++
 .../net/ethernet/mellanox/mlx5/core/en/rss.h  |  5 +
 .../ethernet/mellanox/mlx5/core/en/rx_res.c   | 22 +++++
 .../ethernet/mellanox/mlx5/core/en/rx_res.h   |  2 +
 .../mellanox/mlx5/core/en_fs_ethtool.c        | 99 +++++++++++++++----
 5 files changed, 131 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c
index d2c4ace7c8ba..625cd49ef96c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c
@@ -367,6 +367,30 @@ u32 mlx5e_rss_get_tirn(struct mlx5e_rss *rss, enum mlx5_traffic_types tt,
 	return mlx5e_tir_get_tirn(tir);
 }
 
+/* Fill the "tirn" output parameter.
+ * Create the requested TIR if it's its first usage.
+ */
+int mlx5e_rss_obtain_tirn(struct mlx5e_rss *rss,
+			  enum mlx5_traffic_types tt,
+			  const struct mlx5e_lro_param *init_lro_param,
+			  bool inner, u32 *tirn)
+{
+	struct mlx5e_tir *tir;
+
+	tir = rss_get_tir(rss, tt, inner);
+	if (!tir) { /* TIR doesn't exist, create one */
+		int err;
+
+		err = mlx5e_rss_create_tir(rss, tt, init_lro_param, inner);
+		if (err)
+			return err;
+		tir = rss_get_tir(rss, tt, inner);
+	}
+
+	*tirn = mlx5e_tir_get_tirn(tir);
+	return 0;
+}
+
 static void mlx5e_rss_apply(struct mlx5e_rss *rss, u32 *rqns, unsigned int num_rqns)
 {
 	int err;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rss.h b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.h
index 6f52d78a36da..d522a10dadf3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rss.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.h
@@ -28,6 +28,11 @@ unsigned int mlx5e_rss_refcnt_read(struct mlx5e_rss *rss);
 
 u32 mlx5e_rss_get_tirn(struct mlx5e_rss *rss, enum mlx5_traffic_types tt,
 		       bool inner);
+int mlx5e_rss_obtain_tirn(struct mlx5e_rss *rss,
+			  enum mlx5_traffic_types tt,
+			  const struct mlx5e_lro_param *init_lro_param,
+			  bool inner, u32 *tirn);
+
 void mlx5e_rss_enable(struct mlx5e_rss *rss, u32 *rqns, unsigned int num_rqns);
 void mlx5e_rss_disable(struct mlx5e_rss *rss);
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
index 432963594b8e..bf0313e2682b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
@@ -245,6 +245,28 @@ int mlx5e_rx_res_rss_cnt(struct mlx5e_rx_res *res)
 	return cnt;
 }
 
+int mlx5e_rx_res_rss_index(struct mlx5e_rx_res *res, struct mlx5e_rss *rss)
+{
+	int i;
+
+	if (!rss)
+		return -EINVAL;
+
+	for (i = 0; i < MLX5E_MAX_NUM_RSS; i++)
+		if (rss == res->rss[i])
+			return i;
+
+	return -ENOENT;
+}
+
+struct mlx5e_rss *mlx5e_rx_res_rss_get(struct mlx5e_rx_res *res, u32 rss_idx)
+{
+	if (rss_idx >= MLX5E_MAX_NUM_RSS)
+		return NULL;
+
+	return res->rss[rss_idx];
+}
+
 /* End of API rx_res_rss_* */
 
 struct mlx5e_rx_res *mlx5e_rx_res_alloc(void)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h
index 8248caa36995..4a15942d79f7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h
@@ -62,6 +62,8 @@ int mlx5e_rx_res_lro_set_param(struct mlx5e_rx_res *res, struct mlx5e_lro_param
 int mlx5e_rx_res_rss_init(struct mlx5e_rx_res *res, u32 *rss_idx, unsigned int init_nch);
 int mlx5e_rx_res_rss_destroy(struct mlx5e_rx_res *res, u32 rss_idx);
 int mlx5e_rx_res_rss_cnt(struct mlx5e_rx_res *res);
+int mlx5e_rx_res_rss_index(struct mlx5e_rx_res *res, struct mlx5e_rss *rss);
+struct mlx5e_rss *mlx5e_rx_res_rss_get(struct mlx5e_rx_res *res, u32 rss_idx);
 
 /* Workaround for hairpin */
 struct mlx5e_rss_params_hash mlx5e_rx_res_get_current_hash(struct mlx5e_rx_res *res);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_fs_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_fs_ethtool.c
index 3d8918f9399e..03693fa74a70 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_fs_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_fs_ethtool.c
@@ -35,11 +35,19 @@
 #include "en/params.h"
 #include "en/xsk/pool.h"
 
+static int flow_type_to_traffic_type(u32 flow_type);
+
+static u32 flow_type_mask(u32 flow_type)
+{
+	return flow_type & ~(FLOW_EXT | FLOW_MAC_EXT | FLOW_RSS);
+}
+
 struct mlx5e_ethtool_rule {
 	struct list_head             list;
 	struct ethtool_rx_flow_spec  flow_spec;
 	struct mlx5_flow_handle	     *rule;
 	struct mlx5e_ethtool_table   *eth_ft;
+	struct mlx5e_rss             *rss;
 };
 
 static void put_flow_table(struct mlx5e_ethtool_table *eth_ft)
@@ -66,7 +74,7 @@ static struct mlx5e_ethtool_table *get_flow_table(struct mlx5e_priv *priv,
 	int table_size;
 	int prio;
 
-	switch (fs->flow_type & ~(FLOW_EXT | FLOW_MAC_EXT)) {
+	switch (flow_type_mask(fs->flow_type)) {
 	case TCP_V4_FLOW:
 	case UDP_V4_FLOW:
 	case TCP_V6_FLOW:
@@ -329,7 +337,7 @@ static int set_flow_attrs(u32 *match_c, u32 *match_v,
 					     outer_headers);
 	void *outer_headers_v = MLX5_ADDR_OF(fte_match_param, match_v,
 					     outer_headers);
-	u32 flow_type = fs->flow_type & ~(FLOW_EXT | FLOW_MAC_EXT);
+	u32 flow_type = flow_type_mask(fs->flow_type);
 
 	switch (flow_type) {
 	case TCP_V4_FLOW:
@@ -397,10 +405,53 @@ static bool outer_header_zero(u32 *match_criteria)
 						  size - 1);
 }
 
+static int flow_get_tirn(struct mlx5e_priv *priv,
+			 struct mlx5e_ethtool_rule *eth_rule,
+			 struct ethtool_rx_flow_spec *fs,
+			 u32 rss_context, u32 *tirn)
+{
+	if (fs->flow_type & FLOW_RSS) {
+		struct mlx5e_lro_param lro_param;
+		struct mlx5e_rss *rss;
+		u32 flow_type;
+		int err;
+		int tt;
+
+		rss = mlx5e_rx_res_rss_get(priv->rx_res, rss_context);
+		if (!rss)
+			return -ENOENT;
+
+		flow_type = flow_type_mask(fs->flow_type);
+		tt = flow_type_to_traffic_type(flow_type);
+		if (tt < 0)
+			return -EINVAL;
+
+		lro_param = mlx5e_get_lro_param(&priv->channels.params);
+		err = mlx5e_rss_obtain_tirn(rss, tt, &lro_param, false, tirn);
+		if (err)
+			return err;
+		eth_rule->rss = rss;
+		mlx5e_rss_refcnt_inc(eth_rule->rss);
+	} else {
+		struct mlx5e_params *params = &priv->channels.params;
+		enum mlx5e_rq_group group;
+		u16 ix;
+
+		mlx5e_qid_get_ch_and_group(params, fs->ring_cookie, &ix, &group);
+
+		*tirn = group == MLX5E_RQ_GROUP_XSK ?
+			mlx5e_rx_res_get_tirn_xsk(priv->rx_res, ix) :
+			mlx5e_rx_res_get_tirn_direct(priv->rx_res, ix);
+	}
+
+	return 0;
+}
+
 static struct mlx5_flow_handle *
 add_ethtool_flow_rule(struct mlx5e_priv *priv,
+		      struct mlx5e_ethtool_rule *eth_rule,
 		      struct mlx5_flow_table *ft,
-		      struct ethtool_rx_flow_spec *fs)
+		      struct ethtool_rx_flow_spec *fs, u32 rss_context)
 {
 	struct mlx5_flow_act flow_act = { .flags = FLOW_ACT_NO_APPEND };
 	struct mlx5_flow_destination *dst = NULL;
@@ -419,23 +470,17 @@ add_ethtool_flow_rule(struct mlx5e_priv *priv,
 	if (fs->ring_cookie == RX_CLS_FLOW_DISC) {
 		flow_act.action = MLX5_FLOW_CONTEXT_ACTION_DROP;
 	} else {
-		struct mlx5e_params *params = &priv->channels.params;
-		enum mlx5e_rq_group group;
-		u16 ix;
-
-		mlx5e_qid_get_ch_and_group(params, fs->ring_cookie, &ix, &group);
-
 		dst = kzalloc(sizeof(*dst), GFP_KERNEL);
 		if (!dst) {
 			err = -ENOMEM;
 			goto free;
 		}
 
+		err = flow_get_tirn(priv, eth_rule, fs, rss_context, &dst->tir_num);
+		if (err)
+			goto free;
+
 		dst->type = MLX5_FLOW_DESTINATION_TYPE_TIR;
-		if (group == MLX5E_RQ_GROUP_XSK)
-			dst->tir_num = mlx5e_rx_res_get_tirn_xsk(priv->rx_res, ix);
-		else
-			dst->tir_num = mlx5e_rx_res_get_tirn_direct(priv->rx_res, ix);
 		flow_act.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
 	}
 
@@ -459,6 +504,8 @@ static void del_ethtool_rule(struct mlx5e_priv *priv,
 {
 	if (eth_rule->rule)
 		mlx5_del_flow_rules(eth_rule->rule);
+	if (eth_rule->rss)
+		mlx5e_rss_refcnt_dec(eth_rule->rss);
 	list_del(&eth_rule->list);
 	priv->fs.ethtool.tot_num_rules--;
 	put_flow_table(eth_rule->eth_ft);
@@ -619,7 +666,7 @@ static int validate_flow(struct mlx5e_priv *priv,
 					fs->ring_cookie))
 			return -EINVAL;
 
-	switch (fs->flow_type & ~(FLOW_EXT | FLOW_MAC_EXT)) {
+	switch (flow_type_mask(fs->flow_type)) {
 	case ETHER_FLOW:
 		num_tuples += validate_ethter(fs);
 		break;
@@ -668,7 +715,7 @@ static int validate_flow(struct mlx5e_priv *priv,
 
 static int
 mlx5e_ethtool_flow_replace(struct mlx5e_priv *priv,
-			   struct ethtool_rx_flow_spec *fs)
+			   struct ethtool_rx_flow_spec *fs, u32 rss_context)
 {
 	struct mlx5e_ethtool_table *eth_ft;
 	struct mlx5e_ethtool_rule *eth_rule;
@@ -699,7 +746,7 @@ mlx5e_ethtool_flow_replace(struct mlx5e_priv *priv,
 		err = -EINVAL;
 		goto del_ethtool_rule;
 	}
-	rule = add_ethtool_flow_rule(priv, eth_ft->ft, fs);
+	rule = add_ethtool_flow_rule(priv, eth_rule, eth_ft->ft, fs, rss_context);
 	if (IS_ERR(rule)) {
 		err = PTR_ERR(rule);
 		goto del_ethtool_rule;
@@ -745,10 +792,20 @@ mlx5e_ethtool_get_flow(struct mlx5e_priv *priv,
 		return -EINVAL;
 
 	list_for_each_entry(eth_rule, &priv->fs.ethtool.rules, list) {
-		if (eth_rule->flow_spec.location == location) {
-			info->fs = eth_rule->flow_spec;
+		int index;
+
+		if (eth_rule->flow_spec.location != location)
+			continue;
+		if (!info)
 			return 0;
-		}
+		info->fs = eth_rule->flow_spec;
+		if (!eth_rule->rss)
+			return 0;
+		index = mlx5e_rx_res_rss_index(priv->rx_res, eth_rule->rss);
+		if (index < 0)
+			return index;
+		info->rss_context = index;
+		return 0;
 	}
 
 	return -ENOENT;
@@ -764,7 +821,7 @@ mlx5e_ethtool_get_all_flows(struct mlx5e_priv *priv,
 
 	info->data = MAX_NUM_OF_ETHTOOL_RULES;
 	while ((!err || err == -ENOENT) && idx < info->rule_cnt) {
-		err = mlx5e_ethtool_get_flow(priv, info, location);
+		err = mlx5e_ethtool_get_flow(priv, NULL, location);
 		if (!err)
 			rule_locs[idx++] = location;
 		location++;
@@ -887,7 +944,7 @@ int mlx5e_ethtool_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd)
 
 	switch (cmd->cmd) {
 	case ETHTOOL_SRXCLSRLINS:
-		err = mlx5e_ethtool_flow_replace(priv, &cmd->fs);
+		err = mlx5e_ethtool_flow_replace(priv, &cmd->fs, cmd->rss_context);
 		break;
 	case ETHTOOL_SRXCLSRLDEL:
 		err = mlx5e_ethtool_flow_remove(priv, cmd->fs.location);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [net-next 08/17] net/mlx5e: Abstract MQPRIO params
  2021-08-16 21:18 [pull request][net-next 00/17] mlx5 updates 2021-08-16 Saeed Mahameed
                   ` (6 preceding siblings ...)
  2021-08-16 21:18 ` [net-next 07/17] net/mlx5e: Support flow classification into " Saeed Mahameed
@ 2021-08-16 21:18 ` Saeed Mahameed
  2021-08-16 21:18 ` [net-next 09/17] net/mlx5e: Maintain MQPRIO mode parameter Saeed Mahameed
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2021-08-16 21:18 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Tariq Toukan, Leon Romanovsky, Maxim Mikityanskiy,
	Saeed Mahameed

From: Tariq Toukan <tariqt@nvidia.com>

Abstract the MQPRIO params into a struct.
Use a getter for DCB mode num_tcs.
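
For reference, the shape of the refactor in a few lines of standalone
C (simplified, hypothetical names; the actual change is in the diff
below):

#include <stdint.h>

struct params_sketch {
        uint16_t num_channels;
        struct {
                uint8_t num_tc;
        } mqprio; /* MQPRIO knobs grouped in their own sub-struct */
};

/* DCB-mode code reads the TC count through a getter instead of
 * touching the field directly. */
static inline uint8_t get_dcb_num_tc(const struct params_sketch *params)
{
        return params->mqprio.num_tc;
}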

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  9 ++++++++-
 .../net/ethernet/mellanox/mlx5/core/en/ptp.c  | 18 ++++++++++-------
 .../net/ethernet/mellanox/mlx5/core/en/qos.c  |  2 +-
 .../mellanox/mlx5/core/en/reporter_tx.c       |  8 ++++----
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 20 +++++++++----------
 .../net/ethernet/mellanox/mlx5/core/en_rep.c  |  5 +++--
 6 files changed, 37 insertions(+), 25 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 4f6897c1ea8d..1ddf320af831 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -248,7 +248,9 @@ struct mlx5e_params {
 	u8  rq_wq_type;
 	u8  log_rq_mtu_frames;
 	u16 num_channels;
-	u8  num_tc;
+	struct {
+		u8 num_tc;
+	} mqprio;
 	bool rx_cqe_compress_def;
 	bool tunneled_offload_en;
 	struct dim_cq_moder rx_cq_moderation;
@@ -268,6 +270,11 @@ struct mlx5e_params {
 	bool ptp_rx;
 };
 
+static inline u8 mlx5e_get_dcb_num_tc(struct mlx5e_params *params)
+{
+	return params->mqprio.num_tc;
+}
+
 enum {
 	MLX5E_RQ_STATE_ENABLED,
 	MLX5E_RQ_STATE_RECOVERING,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
index f479ef31ca40..ee688dec67a9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
@@ -326,13 +326,14 @@ static int mlx5e_ptp_open_txqsqs(struct mlx5e_ptp *c,
 				 struct mlx5e_ptp_params *cparams)
 {
 	struct mlx5e_params *params = &cparams->params;
+	u8 num_tc = mlx5e_get_dcb_num_tc(params);
 	int ix_base;
 	int err;
 	int tc;
 
-	ix_base = params->num_tc * params->num_channels;
+	ix_base = num_tc * params->num_channels;
 
-	for (tc = 0; tc < params->num_tc; tc++) {
+	for (tc = 0; tc < num_tc; tc++) {
 		int txq_ix = ix_base + tc;
 
 		err = mlx5e_ptp_open_txqsq(c, c->priv->tisn[c->lag_port][tc], txq_ix,
@@ -365,9 +366,12 @@ static int mlx5e_ptp_open_tx_cqs(struct mlx5e_ptp *c,
 	struct mlx5e_create_cq_param ccp = {};
 	struct dim_cq_moder ptp_moder = {};
 	struct mlx5e_cq_param *cq_param;
+	u8 num_tc;
 	int err;
 	int tc;
 
+	num_tc = mlx5e_get_dcb_num_tc(params);
+
 	ccp.node     = dev_to_node(mlx5_core_dma_dev(c->mdev));
 	ccp.ch_stats = c->stats;
 	ccp.napi     = &c->napi;
@@ -375,7 +379,7 @@ static int mlx5e_ptp_open_tx_cqs(struct mlx5e_ptp *c,
 
 	cq_param = &cparams->txq_sq_param.cqp;
 
-	for (tc = 0; tc < params->num_tc; tc++) {
+	for (tc = 0; tc < num_tc; tc++) {
 		struct mlx5e_cq *cq = &c->ptpsq[tc].txqsq.cq;
 
 		err = mlx5e_open_cq(c->priv, ptp_moder, cq_param, &ccp, cq);
@@ -383,7 +387,7 @@ static int mlx5e_ptp_open_tx_cqs(struct mlx5e_ptp *c,
 			goto out_err_txqsq_cq;
 	}
 
-	for (tc = 0; tc < params->num_tc; tc++) {
+	for (tc = 0; tc < num_tc; tc++) {
 		struct mlx5e_cq *cq = &c->ptpsq[tc].ts_cq;
 		struct mlx5e_ptpsq *ptpsq = &c->ptpsq[tc];
 
@@ -399,7 +403,7 @@ static int mlx5e_ptp_open_tx_cqs(struct mlx5e_ptp *c,
 out_err_ts_cq:
 	for (--tc; tc >= 0; tc--)
 		mlx5e_close_cq(&c->ptpsq[tc].ts_cq);
-	tc = params->num_tc;
+	tc = num_tc;
 out_err_txqsq_cq:
 	for (--tc; tc >= 0; tc--)
 		mlx5e_close_cq(&c->ptpsq[tc].txqsq.cq);
@@ -475,7 +479,7 @@ static void mlx5e_ptp_build_params(struct mlx5e_ptp *c,
 	params->num_channels = orig->num_channels;
 	params->hard_mtu = orig->hard_mtu;
 	params->sw_mtu = orig->sw_mtu;
-	params->num_tc = orig->num_tc;
+	params->mqprio = orig->mqprio;
 
 	/* SQ */
 	if (test_bit(MLX5E_PTP_STATE_TX, c->state)) {
@@ -680,7 +684,7 @@ int mlx5e_ptp_open(struct mlx5e_priv *priv, struct mlx5e_params *params,
 	c->pdev     = mlx5_core_dma_dev(priv->mdev);
 	c->netdev   = priv->netdev;
 	c->mkey_be  = cpu_to_be32(priv->mdev->mlx5e_res.hw_objs.mkey.key);
-	c->num_tc   = params->num_tc;
+	c->num_tc   = mlx5e_get_dcb_num_tc(params);
 	c->stats    = &priv->ptp_stats.ch;
 	c->lag_port = lag_port;
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c b/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c
index 5efe3278b0f6..c9ac69f62f21 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c
@@ -132,7 +132,7 @@ static u16 mlx5e_qid_from_qos(struct mlx5e_channels *chs, u16 qid)
 	 */
 	bool is_ptp = MLX5E_GET_PFLAG(&chs->params, MLX5E_PFLAG_TX_PORT_TS);
 
-	return (chs->params.num_channels + is_ptp) * chs->params.num_tc + qid;
+	return (chs->params.num_channels + is_ptp) * mlx5e_get_dcb_num_tc(&chs->params) + qid;
 }
 
 int mlx5e_get_txq_by_classid(struct mlx5e_priv *priv, u16 classid)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
index 9d361efd5ff7..bb682fd751c9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
@@ -372,7 +372,7 @@ static int mlx5e_tx_reporter_diagnose(struct devlink_health_reporter *reporter,
 	for (i = 0; i < priv->channels.num; i++) {
 		struct mlx5e_channel *c = priv->channels.c[i];
 
-		for (tc = 0; tc < priv->channels.params.num_tc; tc++) {
+		for (tc = 0; tc < mlx5e_get_dcb_num_tc(&priv->channels.params); tc++) {
 			struct mlx5e_txqsq *sq = &c->sq[tc];
 
 			err = mlx5e_tx_reporter_build_diagnose_output(fmsg, sq, tc);
@@ -384,7 +384,7 @@ static int mlx5e_tx_reporter_diagnose(struct devlink_health_reporter *reporter,
 	if (!ptp_ch || !test_bit(MLX5E_PTP_STATE_TX, ptp_ch->state))
 		goto close_sqs_nest;
 
-	for (tc = 0; tc < priv->channels.params.num_tc; tc++) {
+	for (tc = 0; tc < mlx5e_get_dcb_num_tc(&priv->channels.params); tc++) {
 		err = mlx5e_tx_reporter_build_diagnose_output_ptpsq(fmsg,
 								    &ptp_ch->ptpsq[tc],
 								    tc);
@@ -494,7 +494,7 @@ static int mlx5e_tx_reporter_dump_all_sqs(struct mlx5e_priv *priv,
 	for (i = 0; i < priv->channels.num; i++) {
 		struct mlx5e_channel *c = priv->channels.c[i];
 
-		for (tc = 0; tc < priv->channels.params.num_tc; tc++) {
+		for (tc = 0; tc < mlx5e_get_dcb_num_tc(&priv->channels.params); tc++) {
 			struct mlx5e_txqsq *sq = &c->sq[tc];
 
 			err = mlx5e_health_queue_dump(priv, fmsg, sq->sqn, "SQ");
@@ -504,7 +504,7 @@ static int mlx5e_tx_reporter_dump_all_sqs(struct mlx5e_priv *priv,
 	}
 
 	if (ptp_ch && test_bit(MLX5E_PTP_STATE_TX, ptp_ch->state)) {
-		for (tc = 0; tc < priv->channels.params.num_tc; tc++) {
+		for (tc = 0; tc < mlx5e_get_dcb_num_tc(&priv->channels.params); tc++) {
 			struct mlx5e_txqsq *sq = &ptp_ch->ptpsq[tc].txqsq;
 
 			err = mlx5e_health_queue_dump(priv, fmsg, sq->sqn, "PTP SQ");
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index e559afc70bff..b2f95cd34622 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1711,7 +1711,7 @@ static int mlx5e_open_sqs(struct mlx5e_channel *c,
 {
 	int err, tc;
 
-	for (tc = 0; tc < params->num_tc; tc++) {
+	for (tc = 0; tc < mlx5e_get_dcb_num_tc(params); tc++) {
 		int txq_ix = c->ix + tc * params->num_channels;
 
 		err = mlx5e_open_txqsq(c, c->priv->tisn[c->lag_port][tc], txq_ix,
@@ -1992,7 +1992,7 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
 	c->pdev     = mlx5_core_dma_dev(priv->mdev);
 	c->netdev   = priv->netdev;
 	c->mkey_be  = cpu_to_be32(priv->mdev->mlx5e_res.hw_objs.mkey.key);
-	c->num_tc   = params->num_tc;
+	c->num_tc   = mlx5e_get_dcb_num_tc(params);
 	c->xdp      = !!params->xdp_prog;
 	c->stats    = &priv->channel_stats[ix].ch;
 	c->aff_mask = irq_get_effective_affinity_mask(irq);
@@ -2288,7 +2288,7 @@ int mlx5e_update_tx_netdev_queues(struct mlx5e_priv *priv)
 	qos_queues = mlx5e_qos_cur_leaf_nodes(priv);
 
 	nch = priv->channels.params.num_channels;
-	ntc = priv->channels.params.num_tc;
+	ntc = mlx5e_get_dcb_num_tc(&priv->channels.params);
 	num_txqs = nch * ntc + qos_queues;
 	if (MLX5E_GET_PFLAG(&priv->channels.params, MLX5E_PFLAG_TX_PORT_TS))
 		num_txqs += ntc;
@@ -2312,7 +2312,7 @@ static int mlx5e_update_netdev_queues(struct mlx5e_priv *priv)
 	old_ntc = netdev->num_tc ? : 1;
 
 	nch = priv->channels.params.num_channels;
-	ntc = priv->channels.params.num_tc;
+	ntc = mlx5e_get_dcb_num_tc(&priv->channels.params);
 	num_rxqs = nch * priv->profile->rq_groups;
 
 	mlx5e_netdev_set_tcs(netdev, nch, ntc);
@@ -2387,7 +2387,7 @@ static void mlx5e_build_txq_maps(struct mlx5e_priv *priv)
 	int i, ch, tc, num_tc;
 
 	ch = priv->channels.num;
-	num_tc = priv->channels.params.num_tc;
+	num_tc = mlx5e_get_dcb_num_tc(&priv->channels.params);
 
 	for (i = 0; i < ch; i++) {
 		for (tc = 0; tc < num_tc; tc++) {
@@ -2418,7 +2418,7 @@ static void mlx5e_update_num_tc_x_num_ch(struct mlx5e_priv *priv)
 {
 	/* Sync with mlx5e_select_queue. */
 	WRITE_ONCE(priv->num_tc_x_num_ch,
-		   priv->channels.params.num_tc * priv->channels.num);
+		   mlx5e_get_dcb_num_tc(&priv->channels.params) * priv->channels.num);
 }
 
 void mlx5e_activate_priv_channels(struct mlx5e_priv *priv)
@@ -2870,14 +2870,14 @@ static int mlx5e_setup_tc_mqprio(struct mlx5e_priv *priv,
 	}
 
 	new_params = priv->channels.params;
-	new_params.num_tc = tc ? tc : 1;
+	new_params.mqprio.num_tc = tc ? tc : 1;
 
 	err = mlx5e_safe_switch_params(priv, &new_params,
 				       mlx5e_num_channels_changed_ctx, NULL, true);
 
 out:
 	priv->max_opened_tc = max_t(u8, priv->max_opened_tc,
-				    priv->channels.params.num_tc);
+				    mlx5e_get_dcb_num_tc(&priv->channels.params));
 	mutex_unlock(&priv->state_lock);
 	return err;
 }
@@ -4093,12 +4093,12 @@ void mlx5e_build_nic_params(struct mlx5e_priv *priv, struct mlx5e_xsk *xsk, u16
 	params->hard_mtu = MLX5E_ETH_HARD_MTU;
 	params->num_channels = min_t(unsigned int, MLX5E_MAX_NUM_CHANNELS / 2,
 				     priv->max_nch);
-	params->num_tc       = 1;
+	params->mqprio.num_tc = 1;
 
 	/* Set an initial non-zero value, so that mlx5e_select_queue won't
 	 * divide by zero if called before first activating channels.
 	 */
-	priv->num_tc_x_num_ch = params->num_channels * params->num_tc;
+	priv->num_tc_x_num_ch = params->num_channels * params->mqprio.num_tc;
 
 	/* SQ */
 	params->log_sq_size = is_kdump_kernel() ?
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index c54aaef521b7..eb83f27850c7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -394,7 +394,8 @@ int mlx5e_add_sqs_fwd_rules(struct mlx5e_priv *priv)
 	int err = -ENOMEM;
 	u32 *sqs;
 
-	sqs = kcalloc(priv->channels.num * priv->channels.params.num_tc, sizeof(*sqs), GFP_KERNEL);
+	sqs = kcalloc(priv->channels.num * mlx5e_get_dcb_num_tc(&priv->channels.params),
+		      sizeof(*sqs), GFP_KERNEL);
 	if (!sqs)
 		goto out;
 
@@ -611,7 +612,7 @@ static void mlx5e_build_rep_params(struct net_device *netdev)
 	params->rx_dim_enabled = MLX5_CAP_GEN(mdev, cq_moderation);
 	mlx5e_set_rx_cq_mode_params(params, cq_period_mode);
 
-	params->num_tc                = 1;
+	params->mqprio.num_tc       = 1;
 	params->tunneled_offload_en = false;
 
 	mlx5_query_min_inline(mdev, &params->tx_min_inline_mode);
-- 
2.31.1



* [net-next 09/17] net/mlx5e: Maintain MQPRIO mode parameter
  2021-08-16 21:18 [pull request][net-next 00/17] mlx5 updates 2021-08-16 Saeed Mahameed
                   ` (7 preceding siblings ...)
  2021-08-16 21:18 ` [net-next 08/17] net/mlx5e: Abstract MQPRIO params Saeed Mahameed
@ 2021-08-16 21:18 ` Saeed Mahameed
  2021-08-16 21:18 ` [net-next 10/17] net/mlx5e: Handle errors of netdev_set_num_tc() Saeed Mahameed
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2021-08-16 21:18 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Tariq Toukan, Leon Romanovsky, Maxim Mikityanskiy,
	Saeed Mahameed

From: Tariq Toukan <tariqt@nvidia.com>

This is in preparation for supporting MQPRIO CHANNEL mode in a
downstream patch, in addition to the DCB mode that is supported today.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  4 +-
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 41 +++++++++++--------
 2 files changed, 28 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 1ddf320af831..3dbcb2cf2ff8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -249,6 +249,7 @@ struct mlx5e_params {
 	u8  log_rq_mtu_frames;
 	u16 num_channels;
 	struct {
+		u16 mode;
 		u8 num_tc;
 	} mqprio;
 	bool rx_cqe_compress_def;
@@ -272,7 +273,8 @@ struct mlx5e_params {
 
 static inline u8 mlx5e_get_dcb_num_tc(struct mlx5e_params *params)
 {
-	return params->mqprio.num_tc;
+	return params->mqprio.mode == TC_MQPRIO_MODE_DCB ?
+		params->mqprio.num_tc : 1;
 }
 
 enum {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index b2f95cd34622..0d84eb17707e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2847,41 +2847,47 @@ static int mlx5e_modify_channels_vsd(struct mlx5e_channels *chs, bool vsd)
 	return 0;
 }
 
-static int mlx5e_setup_tc_mqprio(struct mlx5e_priv *priv,
-				 struct tc_mqprio_qopt *mqprio)
+static int mlx5e_setup_tc_mqprio_dcb(struct mlx5e_priv *priv,
+				     struct tc_mqprio_qopt *mqprio)
 {
 	struct mlx5e_params new_params;
 	u8 tc = mqprio->num_tc;
-	int err = 0;
+	int err;
 
 	mqprio->hw = TC_MQPRIO_HW_OFFLOAD_TCS;
 
 	if (tc && tc != MLX5E_MAX_NUM_TC)
 		return -EINVAL;
 
-	mutex_lock(&priv->state_lock);
-
-	/* MQPRIO is another toplevel qdisc that can't be attached
-	 * simultaneously with the offloaded HTB.
-	 */
-	if (WARN_ON(priv->htb.maj_id)) {
-		err = -EINVAL;
-		goto out;
-	}
-
 	new_params = priv->channels.params;
+	new_params.mqprio.mode = TC_MQPRIO_MODE_DCB;
 	new_params.mqprio.num_tc = tc ? tc : 1;
 
 	err = mlx5e_safe_switch_params(priv, &new_params,
 				       mlx5e_num_channels_changed_ctx, NULL, true);
 
-out:
 	priv->max_opened_tc = max_t(u8, priv->max_opened_tc,
 				    mlx5e_get_dcb_num_tc(&priv->channels.params));
-	mutex_unlock(&priv->state_lock);
 	return err;
 }
 
+static int mlx5e_setup_tc_mqprio(struct mlx5e_priv *priv,
+				 struct tc_mqprio_qopt_offload *mqprio)
+{
+	/* MQPRIO is another toplevel qdisc that can't be attached
+	 * simultaneously with the offloaded HTB.
+	 */
+	if (WARN_ON(priv->htb.maj_id))
+		return -EINVAL;
+
+	switch (mqprio->mode) {
+	case TC_MQPRIO_MODE_DCB:
+		return mlx5e_setup_tc_mqprio_dcb(priv, &mqprio->qopt);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
 static int mlx5e_setup_tc_htb(struct mlx5e_priv *priv, struct tc_htb_qopt_offload *htb)
 {
 	int res;
@@ -2951,7 +2957,10 @@ static int mlx5e_setup_tc(struct net_device *dev, enum tc_setup_type type,
 						  priv, priv, true);
 	}
 	case TC_SETUP_QDISC_MQPRIO:
-		return mlx5e_setup_tc_mqprio(priv, type_data);
+		mutex_lock(&priv->state_lock);
+		err = mlx5e_setup_tc_mqprio(priv, type_data);
+		mutex_unlock(&priv->state_lock);
+		return err;
 	case TC_SETUP_QDISC_HTB:
 		mutex_lock(&priv->state_lock);
 		err = mlx5e_setup_tc_htb(priv, type_data);
-- 
2.31.1



* [net-next 10/17] net/mlx5e: Handle errors of netdev_set_num_tc()
  2021-08-16 21:18 [pull request][net-next 00/17] mlx5 updates 2021-08-16 Saeed Mahameed
                   ` (8 preceding siblings ...)
  2021-08-16 21:18 ` [net-next 09/17] net/mlx5e: Maintain MQPRIO mode parameter Saeed Mahameed
@ 2021-08-16 21:18 ` Saeed Mahameed
  2021-08-16 21:18 ` [net-next 11/17] net/mlx5e: Support MQPRIO channel mode Saeed Mahameed
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2021-08-16 21:18 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Tariq Toukan, Leon Romanovsky, Maxim Mikityanskiy,
	Saeed Mahameed

From: Tariq Toukan <tariqt@nvidia.com>

Add handling for failures in netdev_set_num_tc().
Let mlx5e_netdev_set_tcs return an int.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 20 +++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 0d84eb17707e..f5c89a00214d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2263,22 +2263,28 @@ void mlx5e_set_netdev_mtu_boundaries(struct mlx5e_priv *priv)
 				ETH_MAX_MTU);
 }
 
-static void mlx5e_netdev_set_tcs(struct net_device *netdev, u16 nch, u8 ntc)
+static int mlx5e_netdev_set_tcs(struct net_device *netdev, u16 nch, u8 ntc)
 {
-	int tc;
+	int tc, err;
 
 	netdev_reset_tc(netdev);
 
 	if (ntc == 1)
-		return;
+		return 0;
 
-	netdev_set_num_tc(netdev, ntc);
+	err = netdev_set_num_tc(netdev, ntc);
+	if (err) {
+		netdev_WARN(netdev, "netdev_set_num_tc failed (%d), ntc = %d\n", err, ntc);
+		return err;
+	}
 
 	/* Map netdev TCs to offset 0
 	 * We have our own UP to TXQ mapping for QoS
 	 */
 	for (tc = 0; tc < ntc; tc++)
 		netdev_set_tc_queue(netdev, tc, nch, 0);
+
+	return 0;
 }
 
 int mlx5e_update_tx_netdev_queues(struct mlx5e_priv *priv)
@@ -2315,8 +2321,9 @@ static int mlx5e_update_netdev_queues(struct mlx5e_priv *priv)
 	ntc = mlx5e_get_dcb_num_tc(&priv->channels.params);
 	num_rxqs = nch * priv->profile->rq_groups;
 
-	mlx5e_netdev_set_tcs(netdev, nch, ntc);
-
+	err = mlx5e_netdev_set_tcs(netdev, nch, ntc);
+	if (err)
+		goto err_out;
 	err = mlx5e_update_tx_netdev_queues(priv);
 	if (err)
 		goto err_tcs;
@@ -2338,6 +2345,7 @@ static int mlx5e_update_netdev_queues(struct mlx5e_priv *priv)
 
 err_tcs:
 	mlx5e_netdev_set_tcs(netdev, old_num_txqs / old_ntc, old_ntc);
+err_out:
 	return err;
 }
 
-- 
2.31.1



* [net-next 11/17] net/mlx5e: Support MQPRIO channel mode
  2021-08-16 21:18 [pull request][net-next 00/17] mlx5 updates 2021-08-16 Saeed Mahameed
                   ` (9 preceding siblings ...)
  2021-08-16 21:18 ` [net-next 10/17] net/mlx5e: Handle errors of netdev_set_num_tc() Saeed Mahameed
@ 2021-08-16 21:18 ` Saeed Mahameed
  2021-08-16 21:18 ` [net-next 12/17] net/mlx5: Bridge, release bridge in same function where it is taken Saeed Mahameed
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2021-08-16 21:18 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Tariq Toukan, Leon Romanovsky, Maxim Mikityanskiy,
	Saeed Mahameed

From: Tariq Toukan <tariqt@nvidia.com>

Add support for MQPRIO channel mode, in which a partition of the
channels into TCs is defined. We allow only partitions with contiguous
queue indices and no holes within them. We do not allow modifying the
number of channels while this MQPRIO mode is active.
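
The contiguity rule above can be illustrated with a minimal userspace
sketch (not driver code; the queue counts, offsets and channel number
below are made-up example values):

/* mqprio_contig_check.c - illustration only, not driver code */
#include <stdio.h>

#define NUM_CHANNELS 8	/* example value; the driver reads it from params */

static int validate_partition(const unsigned short *count,
			      const unsigned short *offset, int num_tc)
{
	int agg_count = 0;
	int i;

	for (i = 0; i < num_tc; i++) {
		if (!count[i])
			return -1;	/* zero-size queue group */
		if (offset[i] != agg_count)
			return -1;	/* hole or overlap between groups */
		agg_count += count[i];
	}
	return agg_count <= NUM_CHANNELS ? 0 : -1;	/* must fit into channels */
}

int main(void)
{
	unsigned short count[]  = { 4, 4 };	/* two TCs, 4 queues each       */
	unsigned short offset[] = { 0, 4 };	/* contiguous ranges 0..3, 4..7 */

	printf("partition valid: %s\n",
	       validate_partition(count, offset, 2) ? "no" : "yes");
	return 0;
}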

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  1 +
 .../ethernet/mellanox/mlx5/core/en_ethtool.c  | 10 ++
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 99 +++++++++++++++++--
 3 files changed, 102 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 3dbcb2cf2ff8..669a75f3537a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -72,6 +72,7 @@ struct page_pool;
 #define MLX5E_SW2HW_MTU(params, swmtu) ((swmtu) + ((params)->hard_mtu))
 
 #define MLX5E_MAX_NUM_TC	8
+#define MLX5E_MAX_NUM_MQPRIO_CH_TC TC_QOPT_MAX_QUEUE
 
 #define MLX5_RX_HEADROOM NET_SKB_PAD
 #define MLX5_SKB_FRAG_SZ(len)	(SKB_DATA_ALIGN(len) +	\
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 839a753fda32..5696d3f1baaf 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -467,6 +467,16 @@ int mlx5e_ethtool_set_channels(struct mlx5e_priv *priv,
 		goto out;
 	}
 
+	/* Don't allow changing the number of channels if MQPRIO mode channel offload is active,
+	 * because it defines a partition over the channels queues.
+	 */
+	if (cur_params->mqprio.mode == TC_MQPRIO_MODE_CHANNEL) {
+		err = -EINVAL;
+		netdev_err(priv->netdev, "%s: MQPRIO mode channel offload is active, cannot change the number of channels\n",
+			   __func__);
+		goto out;
+	}
+
 	new_params = *cur_params;
 	new_params.num_channels = count;
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index f5c89a00214d..26d2f78c7706 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2263,7 +2263,8 @@ void mlx5e_set_netdev_mtu_boundaries(struct mlx5e_priv *priv)
 				ETH_MAX_MTU);
 }
 
-static int mlx5e_netdev_set_tcs(struct net_device *netdev, u16 nch, u8 ntc)
+static int mlx5e_netdev_set_tcs(struct net_device *netdev, u16 nch, u8 ntc,
+				struct tc_mqprio_qopt_offload *mqprio)
 {
 	int tc, err;
 
@@ -2278,11 +2279,16 @@ static int mlx5e_netdev_set_tcs(struct net_device *netdev, u16 nch, u8 ntc)
 		return err;
 	}
 
-	/* Map netdev TCs to offset 0
-	 * We have our own UP to TXQ mapping for QoS
-	 */
-	for (tc = 0; tc < ntc; tc++)
-		netdev_set_tc_queue(netdev, tc, nch, 0);
+	for (tc = 0; tc < ntc; tc++) {
+		u16 count, offset;
+
+		/* For DCB mode, map netdev TCs to offset 0
+		 * We have our own UP to TXQ mapping for QoS
+		 */
+		count = mqprio ? mqprio->qopt.count[tc] : nch;
+		offset = mqprio ? mqprio->qopt.offset[tc] : 0;
+		netdev_set_tc_queue(netdev, tc, count, offset);
+	}
 
 	return 0;
 }
@@ -2321,7 +2327,7 @@ static int mlx5e_update_netdev_queues(struct mlx5e_priv *priv)
 	ntc = mlx5e_get_dcb_num_tc(&priv->channels.params);
 	num_rxqs = nch * priv->profile->rq_groups;
 
-	err = mlx5e_netdev_set_tcs(netdev, nch, ntc);
+	err = mlx5e_netdev_set_tcs(netdev, nch, ntc, NULL);
 	if (err)
 		goto err_out;
 	err = mlx5e_update_tx_netdev_queues(priv);
@@ -2344,7 +2350,7 @@ static int mlx5e_update_netdev_queues(struct mlx5e_priv *priv)
 	WARN_ON_ONCE(netif_set_real_num_tx_queues(netdev, old_num_txqs));
 
 err_tcs:
-	mlx5e_netdev_set_tcs(netdev, old_num_txqs / old_ntc, old_ntc);
+	mlx5e_netdev_set_tcs(netdev, old_num_txqs / old_ntc, old_ntc, NULL);
 err_out:
 	return err;
 }
@@ -2879,6 +2885,81 @@ static int mlx5e_setup_tc_mqprio_dcb(struct mlx5e_priv *priv,
 	return err;
 }
 
+static int mlx5e_mqprio_channel_validate(struct mlx5e_priv *priv,
+					 struct tc_mqprio_qopt_offload *mqprio)
+{
+	struct net_device *netdev = priv->netdev;
+	int agg_count = 0;
+	int i;
+
+	if (mqprio->qopt.offset[0] != 0 || mqprio->qopt.num_tc < 1 ||
+	    mqprio->qopt.num_tc > MLX5E_MAX_NUM_MQPRIO_CH_TC)
+		return -EINVAL;
+
+	for (i = 0; i < mqprio->qopt.num_tc; i++) {
+		if (!mqprio->qopt.count[i]) {
+			netdev_err(netdev, "Zero size for queue-group (%d) is not supported\n", i);
+			return -EINVAL;
+		}
+		if (mqprio->min_rate[i]) {
+			netdev_err(netdev, "Min tx rate is not supported\n");
+			return -EINVAL;
+		}
+		if (mqprio->max_rate[i]) {
+			netdev_err(netdev, "Max tx rate is not supported\n");
+			return -EINVAL;
+		}
+
+		if (mqprio->qopt.offset[i] != agg_count) {
+			netdev_err(netdev, "Discontinuous queues config is not supported\n");
+			return -EINVAL;
+		}
+		agg_count += mqprio->qopt.count[i];
+	}
+
+	if (priv->channels.params.num_channels < agg_count) {
+		netdev_err(netdev, "Num of queues (%d) exceeds available (%d)\n",
+			   agg_count, priv->channels.params.num_channels);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int mlx5e_mqprio_channel_set_tcs_ctx(struct mlx5e_priv *priv, void *ctx)
+{
+	struct tc_mqprio_qopt_offload *mqprio = (struct tc_mqprio_qopt_offload *)ctx;
+	struct net_device *netdev = priv->netdev;
+	u8 num_tc;
+
+	if (priv->channels.params.mqprio.mode != TC_MQPRIO_MODE_CHANNEL)
+		return -EINVAL;
+
+	num_tc = priv->channels.params.mqprio.num_tc;
+	mlx5e_netdev_set_tcs(netdev, 0, num_tc, mqprio);
+
+	return 0;
+}
+
+static int mlx5e_setup_tc_mqprio_channel(struct mlx5e_priv *priv,
+					 struct tc_mqprio_qopt_offload *mqprio)
+{
+	struct mlx5e_params new_params;
+	int err;
+
+	err = mlx5e_mqprio_channel_validate(priv, mqprio);
+	if (err)
+		return err;
+
+	new_params = priv->channels.params;
+	new_params.mqprio.mode = TC_MQPRIO_MODE_CHANNEL;
+	new_params.mqprio.num_tc = mqprio->qopt.num_tc;
+	err = mlx5e_safe_switch_params(priv, &new_params,
+				       mlx5e_mqprio_channel_set_tcs_ctx, mqprio, true);
+
+	return err;
+}
+
 static int mlx5e_setup_tc_mqprio(struct mlx5e_priv *priv,
 				 struct tc_mqprio_qopt_offload *mqprio)
 {
@@ -2891,6 +2972,8 @@ static int mlx5e_setup_tc_mqprio(struct mlx5e_priv *priv,
 	switch (mqprio->mode) {
 	case TC_MQPRIO_MODE_DCB:
 		return mlx5e_setup_tc_mqprio_dcb(priv, &mqprio->qopt);
+	case TC_MQPRIO_MODE_CHANNEL:
+		return mlx5e_setup_tc_mqprio_channel(priv, mqprio);
 	default:
 		return -EOPNOTSUPP;
 	}
-- 
2.31.1



* [net-next 12/17] net/mlx5: Bridge, release bridge in same function where it is taken
  2021-08-16 21:18 [pull request][net-next 00/17] mlx5 updates 2021-08-16 Saeed Mahameed
                   ` (10 preceding siblings ...)
  2021-08-16 21:18 ` [net-next 11/17] net/mlx5e: Support MQPRIO channel mode Saeed Mahameed
@ 2021-08-16 21:18 ` Saeed Mahameed
  2021-08-16 21:18 ` [net-next 13/17] net/mlx5: Bridge, obtain core device from eswitch instead of priv Saeed Mahameed
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2021-08-16 21:18 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Tariq Toukan, Leon Romanovsky, Vlad Buslov, Roi Dayan,
	Mark Bloch, Saeed Mahameed

From: Vlad Buslov <vladbu@nvidia.com>

Refactor mlx5_esw_bridge_vport_link() to release the bridge instance
when mlx5_esw_bridge_vport_init() returns an error, instead of relying
on the latter to release the bridge. This improves the design because
the object instance is taken and released in the same layer, and it
simplifies the following patches that add more logic to
mlx5_esw_bridge_vport_link().
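
A minimal userspace sketch of the resulting pattern (all names below
are invented for illustration): the bridge reference is taken and, on
error, released by the same function, while the callee only
initializes the port:

/* same_layer_release.c - illustration only; names are invented */
#include <stdio.h>
#include <stdlib.h>

struct bridge {
	int refcnt;
};

static struct bridge *bridge_get(void)
{
	struct bridge *bridge = calloc(1, sizeof(*bridge));

	if (bridge)
		bridge->refcnt = 1;
	return bridge;
}

static void bridge_put(struct bridge *bridge)
{
	if (bridge && --bridge->refcnt == 0)
		free(bridge);
}

/* The callee only initializes the port; it no longer releases the
 * bridge on failure - that is the caller's job now.
 */
static int port_init(struct bridge *bridge, int simulate_failure)
{
	return simulate_failure ? -1 : 0;
}

static int vport_link(int simulate_failure)
{
	struct bridge *bridge = bridge_get();
	int err;

	if (!bridge)
		return -1;

	err = port_init(bridge, simulate_failure);
	if (err)
		goto err_vport;
	/* On success the reference is kept until unlink (not shown here). */
	return 0;

err_vport:
	bridge_put(bridge);	/* released in the same layer that took it */
	return err;
}

int main(void)
{
	printf("success path: %d, error path: %d\n",
	       vport_link(0), vport_link(1));
	return 0;
}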

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/esw/bridge.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
index 69a3630818d7..4bca480e3e7d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
@@ -1042,10 +1042,8 @@ static int mlx5_esw_bridge_vport_init(struct mlx5_esw_bridge_offloads *br_offloa
 	int err;
 
 	port = kvzalloc(sizeof(*port), GFP_KERNEL);
-	if (!port) {
-		err = -ENOMEM;
-		goto err_port_alloc;
-	}
+	if (!port)
+		return -ENOMEM;
 
 	port->vport_num = vport->vport;
 	xa_init(&port->vlans);
@@ -1062,8 +1060,6 @@ static int mlx5_esw_bridge_vport_init(struct mlx5_esw_bridge_offloads *br_offloa
 
 err_port_insert:
 	kvfree(port);
-err_port_alloc:
-	mlx5_esw_bridge_put(br_offloads, bridge);
 	return err;
 }
 
@@ -1108,8 +1104,14 @@ int mlx5_esw_bridge_vport_link(int ifindex, struct mlx5_esw_bridge_offloads *br_
 	}
 
 	err = mlx5_esw_bridge_vport_init(br_offloads, bridge, vport);
-	if (err)
+	if (err) {
 		NL_SET_ERR_MSG_MOD(extack, "Error initializing port");
+		goto err_vport;
+	}
+	return 0;
+
+err_vport:
+	mlx5_esw_bridge_put(br_offloads, bridge);
 	return err;
 }
 
-- 
2.31.1



* [net-next 13/17] net/mlx5: Bridge, obtain core device from eswitch instead of priv
  2021-08-16 21:18 [pull request][net-next 00/17] mlx5 updates 2021-08-16 Saeed Mahameed
                   ` (11 preceding siblings ...)
  2021-08-16 21:18 ` [net-next 12/17] net/mlx5: Bridge, release bridge in same function where it is taken Saeed Mahameed
@ 2021-08-16 21:18 ` Saeed Mahameed
  2021-08-16 21:18 ` [net-next 14/17] net/mlx5: Bridge, identify port by vport_num+esw_owner_vhca_id pair Saeed Mahameed
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2021-08-16 21:18 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Tariq Toukan, Leon Romanovsky, Vlad Buslov, Roi Dayan,
	Mark Bloch, Saeed Mahameed

From: Vlad Buslov <vladbu@nvidia.com>

Following patches in the series will pass a bond device to the bridge
code, which means the code can't assume the device is an mlx5
representor. Moreover, the core device can easily be obtained from the
eswitch instance, so there is no reason for the more complex code that
obtains struct mlx5_priv from the net_device in order to use its mdev.
Refactor the code to use esw->dev instead of priv->mdev.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
index 4bca480e3e7d..e2963d8d5302 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
@@ -912,7 +912,6 @@ mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, const unsi
 	struct mlx5_esw_bridge_fdb_entry *entry;
 	struct mlx5_flow_handle *handle;
 	struct mlx5_fc *counter;
-	struct mlx5e_priv *priv;
 	int err;
 
 	if (bridge->flags & MLX5_ESW_BRIDGE_VLAN_FILTERING_FLAG && vid) {
@@ -921,7 +920,6 @@ mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, const unsi
 			return ERR_CAST(vlan);
 	}
 
-	priv = netdev_priv(dev);
 	entry = kvzalloc(sizeof(*entry), GFP_KERNEL);
 	if (!entry)
 		return ERR_PTR(-ENOMEM);
@@ -934,7 +932,7 @@ mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, const unsi
 	if (added_by_user)
 		entry->flags |= MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER;
 
-	counter = mlx5_fc_create(priv->mdev, true);
+	counter = mlx5_fc_create(esw->dev, true);
 	if (IS_ERR(counter)) {
 		err = PTR_ERR(counter);
 		goto err_ingress_fc_create;
@@ -994,7 +992,7 @@ mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, const unsi
 err_ingress_filter_flow_create:
 	mlx5_del_flow_rules(entry->ingress_handle);
 err_ingress_flow_create:
-	mlx5_fc_destroy(priv->mdev, entry->ingress_counter);
+	mlx5_fc_destroy(esw->dev, entry->ingress_counter);
 err_ingress_fc_create:
 	kvfree(entry);
 	return ERR_PTR(err);
-- 
2.31.1



* [net-next 14/17] net/mlx5: Bridge, identify port by vport_num+esw_owner_vhca_id pair
  2021-08-16 21:18 [pull request][net-next 00/17] mlx5 updates 2021-08-16 Saeed Mahameed
                   ` (12 preceding siblings ...)
  2021-08-16 21:18 ` [net-next 13/17] net/mlx5: Bridge, obtain core device from eswitch instead of priv Saeed Mahameed
@ 2021-08-16 21:18 ` Saeed Mahameed
  2021-08-16 21:18 ` [net-next 15/17] net/mlx5: Bridge, extract FDB delete notification to function Saeed Mahameed
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2021-08-16 21:18 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Tariq Toukan, Leon Romanovsky, Vlad Buslov, Roi Dayan,
	Mark Bloch, Saeed Mahameed

From: Vlad Buslov <vladbu@nvidia.com>

Following patches in the series allow traffic between vports of
different eswitch instances, which requires addressing a bridge port by
the vport_num+esw_owner_vhca_id pair, since vport_num is only unique
per eswitch. As a preparation, extend struct mlx5_esw_bridge_port with
an 'esw_owner_vhca_id' field and use it as part of the key for the
mlx5_esw_bridge->vports xarray.
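
A minimal userspace sketch of such a composite key (illustration only;
in the driver the packing uses BITS_PER_BYTE and the resulting key
indexes an xarray):

/* bridge_port_key.c - illustration only */
#include <stdio.h>
#include <limits.h>	/* CHAR_BIT */

typedef unsigned short u16;

/* Pack the per-eswitch vport number and the owning eswitch vhca id into
 * a single unsigned long key, unique across eswitch instances.
 */
static unsigned long bridge_port_key(u16 vport_num, u16 esw_owner_vhca_id)
{
	return vport_num |
	       (unsigned long)esw_owner_vhca_id << (sizeof(vport_num) * CHAR_BIT);
}

int main(void)
{
	/* The same vport number on two different eswitches gives two keys. */
	printf("0x%lx 0x%lx\n", bridge_port_key(3, 1), bridge_port_key(3, 2));
	return 0;
}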

With this change we can't rely on the switchdev_handle_port_obj_add()
helper to get the mlx5 representor from a stacked device, because we
specifically need the representor from the parent eswitch that
registered the callback in order to obtain the correct
esw_owner_vhca_id. The helper doesn't allow passing additional
parameters to the predicate function and doesn't provide access to the
notifier block to obtain the eswitch through br_offloads. Implement
custom helpers to obtain the mlx5 representor and use them in the
mlx5_esw_bridge_port_obj_{add|del|attr_set}() implementations.

Remove the direct pointer to the parent bridge from struct mlx5_vport,
as it is no longer needed.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../mellanox/mlx5/core/en/rep/bridge.c        | 238 ++++++++++--------
 .../ethernet/mellanox/mlx5/core/esw/bridge.c  | 184 +++++++-------
 .../ethernet/mellanox/mlx5/core/esw/bridge.h  |  37 +--
 .../mellanox/mlx5/core/esw/bridge_priv.h      |   3 +
 .../mlx5/core/esw/diag/bridge_tracepoint.h    |   6 +-
 .../net/ethernet/mellanox/mlx5/core/eswitch.h |   3 -
 6 files changed, 263 insertions(+), 208 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
index 3c0032c9647c..f21b0beae395 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
@@ -18,6 +18,55 @@ struct mlx5_bridge_switchdev_fdb_work {
 	bool add;
 };
 
+static bool mlx5_esw_bridge_dev_same_esw(struct net_device *dev, struct mlx5_eswitch *esw)
+{
+	struct mlx5e_priv *priv = netdev_priv(dev);
+
+	return esw == priv->mdev->priv.eswitch;
+}
+
+static int mlx5_esw_bridge_vport_num_vhca_id_get(struct net_device *dev, struct mlx5_eswitch *esw,
+						 u16 *vport_num, u16 *esw_owner_vhca_id)
+{
+	struct mlx5e_rep_priv *rpriv;
+	struct mlx5e_priv *priv;
+
+	if (!mlx5e_eswitch_rep(dev) || !mlx5_esw_bridge_dev_same_esw(dev, esw))
+		return -ENODEV;
+
+	priv = netdev_priv(dev);
+	rpriv = priv->ppriv;
+	*vport_num = rpriv->rep->vport;
+	*esw_owner_vhca_id = MLX5_CAP_GEN(priv->mdev, vhca_id);
+	return 0;
+}
+
+static int
+mlx5_esw_bridge_lower_rep_vport_num_vhca_id_get(struct net_device *dev, struct mlx5_eswitch *esw,
+						u16 *vport_num, u16 *esw_owner_vhca_id)
+{
+	struct net_device *lower_dev;
+	struct list_head *iter;
+
+	if (mlx5e_eswitch_rep(dev) && mlx5_esw_bridge_dev_same_esw(dev, esw))
+		return mlx5_esw_bridge_vport_num_vhca_id_get(dev, esw, vport_num,
+							     esw_owner_vhca_id);
+
+	netdev_for_each_lower_dev(dev, lower_dev, iter) {
+		int err;
+
+		if (netif_is_bridge_master(lower_dev))
+			continue;
+
+		err = mlx5_esw_bridge_lower_rep_vport_num_vhca_id_get(lower_dev, esw, vport_num,
+								      esw_owner_vhca_id);
+		if (!err)
+			return 0;
+	}
+
+	return -ENODEV;
+}
+
 static int mlx5_esw_bridge_port_changeupper(struct notifier_block *nb, void *ptr)
 {
 	struct mlx5_esw_bridge_offloads *br_offloads = container_of(nb,
@@ -25,37 +74,27 @@ static int mlx5_esw_bridge_port_changeupper(struct notifier_block *nb, void *ptr
 								    netdev_nb);
 	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
 	struct netdev_notifier_changeupper_info *info = ptr;
+	struct net_device *upper = info->upper_dev;
+	u16 vport_num, esw_owner_vhca_id;
 	struct netlink_ext_ack *extack;
-	struct mlx5e_rep_priv *rpriv;
-	struct mlx5_eswitch *esw;
-	struct mlx5_vport *vport;
-	struct net_device *upper;
-	struct mlx5e_priv *priv;
-	u16 vport_num;
-
-	if (!mlx5e_eswitch_rep(dev))
-		return 0;
+	int ifindex = upper->ifindex;
+	int err;
 
-	upper = info->upper_dev;
 	if (!netif_is_bridge_master(upper))
 		return 0;
 
-	esw = br_offloads->esw;
-	priv = netdev_priv(dev);
-	if (esw != priv->mdev->priv.eswitch)
+	err = mlx5_esw_bridge_vport_num_vhca_id_get(dev, br_offloads->esw, &vport_num,
+						    &esw_owner_vhca_id);
+	if (err)
 		return 0;
 
-	rpriv = priv->ppriv;
-	vport_num = rpriv->rep->vport;
-	vport = mlx5_eswitch_get_vport(esw, vport_num);
-	if (IS_ERR(vport))
-		return PTR_ERR(vport);
-
 	extack = netdev_notifier_info_to_extack(&info->info);
 
 	return info->linking ?
-		mlx5_esw_bridge_vport_link(upper->ifindex, br_offloads, vport, extack) :
-		mlx5_esw_bridge_vport_unlink(upper->ifindex, br_offloads, vport, extack);
+		mlx5_esw_bridge_vport_link(ifindex, vport_num, esw_owner_vhca_id, br_offloads,
+					   extack) :
+		mlx5_esw_bridge_vport_unlink(ifindex, vport_num, esw_owner_vhca_id, br_offloads,
+					     extack);
 }
 
 static int mlx5_esw_bridge_switchdev_port_event(struct notifier_block *nb,
@@ -75,31 +114,29 @@ static int mlx5_esw_bridge_switchdev_port_event(struct notifier_block *nb,
 	return notifier_from_errno(err);
 }
 
-static int mlx5_esw_bridge_port_obj_add(struct net_device *dev,
-					const void *ctx,
-					const struct switchdev_obj *obj,
-					struct netlink_ext_ack *extack)
+static int
+mlx5_esw_bridge_port_obj_add(struct net_device *dev,
+			     struct switchdev_notifier_port_obj_info *port_obj_info,
+			     struct mlx5_esw_bridge_offloads *br_offloads)
 {
+	struct netlink_ext_ack *extack = switchdev_notifier_info_to_extack(&port_obj_info->info);
+	const struct switchdev_obj *obj = port_obj_info->obj;
 	const struct switchdev_obj_port_vlan *vlan;
-	struct mlx5e_rep_priv *rpriv;
-	struct mlx5_eswitch *esw;
-	struct mlx5_vport *vport;
-	struct mlx5e_priv *priv;
-	u16 vport_num;
-	int err = 0;
+	u16 vport_num, esw_owner_vhca_id;
+	int err;
 
-	priv = netdev_priv(dev);
-	rpriv = priv->ppriv;
-	vport_num = rpriv->rep->vport;
-	esw = priv->mdev->priv.eswitch;
-	vport = mlx5_eswitch_get_vport(esw, vport_num);
-	if (IS_ERR(vport))
-		return PTR_ERR(vport);
+	err = mlx5_esw_bridge_vport_num_vhca_id_get(dev, br_offloads->esw, &vport_num,
+						    &esw_owner_vhca_id);
+	if (err)
+		return 0;
+
+	port_obj_info->handled = true;
 
 	switch (obj->id) {
 	case SWITCHDEV_OBJ_ID_PORT_VLAN:
 		vlan = SWITCHDEV_OBJ_PORT_VLAN(obj);
-		err = mlx5_esw_bridge_port_vlan_add(vlan->vid, vlan->flags, esw, vport, extack);
+		err = mlx5_esw_bridge_port_vlan_add(vport_num, esw_owner_vhca_id, vlan->vid,
+						    vlan->flags, br_offloads, extack);
 		break;
 	default:
 		return -EOPNOTSUPP;
@@ -107,29 +144,27 @@ static int mlx5_esw_bridge_port_obj_add(struct net_device *dev,
 	return err;
 }
 
-static int mlx5_esw_bridge_port_obj_del(struct net_device *dev,
-					const void *ctx,
-					const struct switchdev_obj *obj)
+static int
+mlx5_esw_bridge_port_obj_del(struct net_device *dev,
+			     struct switchdev_notifier_port_obj_info *port_obj_info,
+			     struct mlx5_esw_bridge_offloads *br_offloads)
 {
+	const struct switchdev_obj *obj = port_obj_info->obj;
 	const struct switchdev_obj_port_vlan *vlan;
-	struct mlx5e_rep_priv *rpriv;
-	struct mlx5_eswitch *esw;
-	struct mlx5_vport *vport;
-	struct mlx5e_priv *priv;
-	u16 vport_num;
+	u16 vport_num, esw_owner_vhca_id;
+	int err;
 
-	priv = netdev_priv(dev);
-	rpriv = priv->ppriv;
-	vport_num = rpriv->rep->vport;
-	esw = priv->mdev->priv.eswitch;
-	vport = mlx5_eswitch_get_vport(esw, vport_num);
-	if (IS_ERR(vport))
-		return PTR_ERR(vport);
+	err = mlx5_esw_bridge_vport_num_vhca_id_get(dev, br_offloads->esw, &vport_num,
+						    &esw_owner_vhca_id);
+	if (err)
+		return 0;
+
+	port_obj_info->handled = true;
 
 	switch (obj->id) {
 	case SWITCHDEV_OBJ_ID_PORT_VLAN:
 		vlan = SWITCHDEV_OBJ_PORT_VLAN(obj);
-		mlx5_esw_bridge_port_vlan_del(vlan->vid, esw, vport);
+		mlx5_esw_bridge_port_vlan_del(vport_num, esw_owner_vhca_id, vlan->vid, br_offloads);
 		break;
 	default:
 		return -EOPNOTSUPP;
@@ -137,25 +172,22 @@ static int mlx5_esw_bridge_port_obj_del(struct net_device *dev,
 	return 0;
 }
 
-static int mlx5_esw_bridge_port_obj_attr_set(struct net_device *dev,
-					     const void *ctx,
-					     const struct switchdev_attr *attr,
-					     struct netlink_ext_ack *extack)
+static int
+mlx5_esw_bridge_port_obj_attr_set(struct net_device *dev,
+				  struct switchdev_notifier_port_attr_info *port_attr_info,
+				  struct mlx5_esw_bridge_offloads *br_offloads)
 {
-	struct mlx5e_rep_priv *rpriv;
-	struct mlx5_eswitch *esw;
-	struct mlx5_vport *vport;
-	struct mlx5e_priv *priv;
-	u16 vport_num;
-	int err = 0;
+	struct netlink_ext_ack *extack = switchdev_notifier_info_to_extack(&port_attr_info->info);
+	const struct switchdev_attr *attr = port_attr_info->attr;
+	u16 vport_num, esw_owner_vhca_id;
+	int err;
 
-	priv = netdev_priv(dev);
-	rpriv = priv->ppriv;
-	vport_num = rpriv->rep->vport;
-	esw = priv->mdev->priv.eswitch;
-	vport = mlx5_eswitch_get_vport(esw, vport_num);
-	if (IS_ERR(vport))
-		return PTR_ERR(vport);
+	err = mlx5_esw_bridge_lower_rep_vport_num_vhca_id_get(dev, br_offloads->esw, &vport_num,
+							      &esw_owner_vhca_id);
+	if (err)
+		return 0;
+
+	port_attr_info->handled = true;
 
 	switch (attr->id) {
 	case SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS:
@@ -167,10 +199,12 @@ static int mlx5_esw_bridge_port_obj_attr_set(struct net_device *dev,
 	case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS:
 		break;
 	case SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME:
-		err = mlx5_esw_bridge_ageing_time_set(attr->u.ageing_time, esw, vport);
+		err = mlx5_esw_bridge_ageing_time_set(vport_num, esw_owner_vhca_id,
+						      attr->u.ageing_time, br_offloads);
 		break;
 	case SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING:
-		err = mlx5_esw_bridge_vlan_filtering_set(attr->u.vlan_filtering, esw, vport);
+		err = mlx5_esw_bridge_vlan_filtering_set(vport_num, esw_owner_vhca_id,
+							 attr->u.vlan_filtering, br_offloads);
 		break;
 	default:
 		err = -EOPNOTSUPP;
@@ -179,27 +213,24 @@ static int mlx5_esw_bridge_port_obj_attr_set(struct net_device *dev,
 	return err;
 }
 
-static int mlx5_esw_bridge_event_blocking(struct notifier_block *unused,
+static int mlx5_esw_bridge_event_blocking(struct notifier_block *nb,
 					  unsigned long event, void *ptr)
 {
+	struct mlx5_esw_bridge_offloads *br_offloads = container_of(nb,
+								    struct mlx5_esw_bridge_offloads,
+								    nb_blk);
 	struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
 	int err;
 
 	switch (event) {
 	case SWITCHDEV_PORT_OBJ_ADD:
-		err = switchdev_handle_port_obj_add(dev, ptr,
-						    mlx5e_eswitch_rep,
-						    mlx5_esw_bridge_port_obj_add);
+		err = mlx5_esw_bridge_port_obj_add(dev, ptr, br_offloads);
 		break;
 	case SWITCHDEV_PORT_OBJ_DEL:
-		err = switchdev_handle_port_obj_del(dev, ptr,
-						    mlx5e_eswitch_rep,
-						    mlx5_esw_bridge_port_obj_del);
+		err = mlx5_esw_bridge_port_obj_del(dev, ptr, br_offloads);
 		break;
 	case SWITCHDEV_PORT_ATTR_SET:
-		err = switchdev_handle_port_attr_set(dev, ptr,
-						     mlx5e_eswitch_rep,
-						     mlx5_esw_bridge_port_obj_attr_set);
+		err = mlx5_esw_bridge_port_obj_attr_set(dev, ptr, br_offloads);
 		break;
 	default:
 		err = 0;
@@ -222,27 +253,27 @@ static void mlx5_esw_bridge_switchdev_fdb_event_work(struct work_struct *work)
 		container_of(work, struct mlx5_bridge_switchdev_fdb_work, work);
 	struct switchdev_notifier_fdb_info *fdb_info =
 		&fdb_work->fdb_info;
+	struct mlx5_esw_bridge_offloads *br_offloads;
 	struct net_device *dev = fdb_work->dev;
-	struct mlx5e_rep_priv *rpriv;
-	struct mlx5_eswitch *esw;
-	struct mlx5_vport *vport;
+	u16 vport_num, esw_owner_vhca_id;
 	struct mlx5e_priv *priv;
-	u16 vport_num;
+	int err;
 
 	rtnl_lock();
 
 	priv = netdev_priv(dev);
-	rpriv = priv->ppriv;
-	vport_num = rpriv->rep->vport;
-	esw = priv->mdev->priv.eswitch;
-	vport = mlx5_eswitch_get_vport(esw, vport_num);
-	if (IS_ERR(vport))
+	br_offloads = priv->mdev->priv.eswitch->br_offloads;
+	err = mlx5_esw_bridge_vport_num_vhca_id_get(dev, br_offloads->esw, &vport_num,
+						    &esw_owner_vhca_id);
+	if (err)
 		goto out;
 
 	if (fdb_work->add)
-		mlx5_esw_bridge_fdb_create(dev, esw, vport, fdb_info);
+		mlx5_esw_bridge_fdb_create(dev, vport_num, esw_owner_vhca_id, br_offloads,
+					   fdb_info);
 	else
-		mlx5_esw_bridge_fdb_remove(dev, esw, vport, fdb_info);
+		mlx5_esw_bridge_fdb_remove(dev, vport_num, esw_owner_vhca_id, br_offloads,
+					   fdb_info);
 
 out:
 	rtnl_unlock();
@@ -288,18 +319,10 @@ static int mlx5_esw_bridge_switchdev_event(struct notifier_block *nb,
 	struct mlx5_bridge_switchdev_fdb_work *work;
 	struct switchdev_notifier_info *info = ptr;
 	struct net_device *upper;
-	struct mlx5e_priv *priv;
-
-	if (!mlx5e_eswitch_rep(dev))
-		return NOTIFY_DONE;
-	priv = netdev_priv(dev);
-	if (priv->mdev->priv.eswitch != br_offloads->esw)
-		return NOTIFY_DONE;
 
 	if (event == SWITCHDEV_PORT_ATTR_SET) {
-		int err = switchdev_handle_port_attr_set(dev, ptr,
-							 mlx5e_eswitch_rep,
-							 mlx5_esw_bridge_port_obj_attr_set);
+		int err = mlx5_esw_bridge_port_obj_attr_set(dev, ptr, br_offloads);
+
 		return notifier_from_errno(err);
 	}
 
@@ -309,6 +332,11 @@ static int mlx5_esw_bridge_switchdev_event(struct notifier_block *nb,
 	if (!netif_is_bridge_master(upper))
 		return NOTIFY_DONE;
 
+	if (!mlx5e_eswitch_rep(dev))
+		return NOTIFY_DONE;
+	if (!mlx5_esw_bridge_dev_same_esw(dev, br_offloads->esw))
+		return NOTIFY_DONE;
+
 	switch (event) {
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
index e2963d8d5302..65173db2a2f4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
@@ -56,7 +56,6 @@ struct mlx5_esw_bridge {
 
 	struct list_head fdb_list;
 	struct rhashtable fdb_ht;
-	struct xarray vports;
 
 	struct mlx5_flow_table *egress_ft;
 	struct mlx5_flow_group *egress_vlan_fg;
@@ -576,7 +575,6 @@ static struct mlx5_esw_bridge *mlx5_esw_bridge_create(int ifindex,
 		goto err_fdb_ht;
 
 	INIT_LIST_HEAD(&bridge->fdb_list);
-	xa_init(&bridge->vports);
 	bridge->ifindex = ifindex;
 	bridge->refcnt = 1;
 	bridge->ageing_time = clock_t_to_jiffies(BR_DEFAULT_AGEING_TIME);
@@ -603,7 +601,6 @@ static void mlx5_esw_bridge_put(struct mlx5_esw_bridge_offloads *br_offloads,
 		return;
 
 	mlx5_esw_bridge_egress_table_cleanup(bridge);
-	WARN_ON(!xa_empty(&bridge->vports));
 	list_del(&bridge->list);
 	rhashtable_destroy(&bridge->fdb_ht);
 	kvfree(bridge);
@@ -639,22 +636,34 @@ mlx5_esw_bridge_lookup(int ifindex, struct mlx5_esw_bridge_offloads *br_offloads
 	return bridge;
 }
 
+static unsigned long mlx5_esw_bridge_port_key_from_data(u16 vport_num, u16 esw_owner_vhca_id)
+{
+	return vport_num | (unsigned long)esw_owner_vhca_id << sizeof(vport_num) * BITS_PER_BYTE;
+}
+
+static unsigned long mlx5_esw_bridge_port_key(struct mlx5_esw_bridge_port *port)
+{
+	return mlx5_esw_bridge_port_key_from_data(port->vport_num, port->esw_owner_vhca_id);
+}
+
 static int mlx5_esw_bridge_port_insert(struct mlx5_esw_bridge_port *port,
-				       struct mlx5_esw_bridge *bridge)
+				       struct mlx5_esw_bridge_offloads *br_offloads)
 {
-	return xa_insert(&bridge->vports, port->vport_num, port, GFP_KERNEL);
+	return xa_insert(&br_offloads->ports, mlx5_esw_bridge_port_key(port), port, GFP_KERNEL);
 }
 
 static struct mlx5_esw_bridge_port *
-mlx5_esw_bridge_port_lookup(u16 vport_num, struct mlx5_esw_bridge *bridge)
+mlx5_esw_bridge_port_lookup(u16 vport_num, u16 esw_owner_vhca_id,
+			    struct mlx5_esw_bridge_offloads *br_offloads)
 {
-	return xa_load(&bridge->vports, vport_num);
+	return xa_load(&br_offloads->ports, mlx5_esw_bridge_port_key_from_data(vport_num,
+									       esw_owner_vhca_id));
 }
 
 static void mlx5_esw_bridge_port_erase(struct mlx5_esw_bridge_port *port,
-				       struct mlx5_esw_bridge *bridge)
+				       struct mlx5_esw_bridge_offloads *br_offloads)
 {
-	xa_erase(&bridge->vports, port->vport_num);
+	xa_erase(&br_offloads->ports, mlx5_esw_bridge_port_key(port));
 }
 
 static void mlx5_esw_bridge_fdb_entry_refresh(unsigned long lastuse,
@@ -875,13 +884,13 @@ static void mlx5_esw_bridge_port_vlans_flush(struct mlx5_esw_bridge_port *port,
 }
 
 static struct mlx5_esw_bridge_vlan *
-mlx5_esw_bridge_port_vlan_lookup(u16 vid, u16 vport_num, struct mlx5_esw_bridge *bridge,
-				 struct mlx5_eswitch *esw)
+mlx5_esw_bridge_port_vlan_lookup(u16 vid, u16 vport_num, u16 esw_owner_vhca_id,
+				 struct mlx5_esw_bridge *bridge, struct mlx5_eswitch *esw)
 {
 	struct mlx5_esw_bridge_port *port;
 	struct mlx5_esw_bridge_vlan *vlan;
 
-	port = mlx5_esw_bridge_port_lookup(vport_num, bridge);
+	port = mlx5_esw_bridge_port_lookup(vport_num, esw_owner_vhca_id, bridge->br_offloads);
 	if (!port) {
 		/* FDB is added asynchronously on wq while port might have been deleted
 		 * concurrently. Report on 'info' logging level and skip the FDB offload.
@@ -904,9 +913,9 @@ mlx5_esw_bridge_port_vlan_lookup(u16 vid, u16 vport_num, struct mlx5_esw_bridge
 }
 
 static struct mlx5_esw_bridge_fdb_entry *
-mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, const unsigned char *addr,
-			       u16 vid, bool added_by_user, struct mlx5_eswitch *esw,
-			       struct mlx5_esw_bridge *bridge)
+mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, u16 esw_owner_vhca_id,
+			       const unsigned char *addr, u16 vid, bool added_by_user,
+			       struct mlx5_eswitch *esw, struct mlx5_esw_bridge *bridge)
 {
 	struct mlx5_esw_bridge_vlan *vlan = NULL;
 	struct mlx5_esw_bridge_fdb_entry *entry;
@@ -915,7 +924,8 @@ mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, const unsi
 	int err;
 
 	if (bridge->flags & MLX5_ESW_BRIDGE_VLAN_FILTERING_FLAG && vid) {
-		vlan = mlx5_esw_bridge_port_vlan_lookup(vid, vport_num, bridge, esw);
+		vlan = mlx5_esw_bridge_port_vlan_lookup(vid, vport_num, esw_owner_vhca_id, bridge,
+							esw);
 		if (IS_ERR(vlan))
 			return ERR_CAST(vlan);
 	}
@@ -928,6 +938,7 @@ mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, const unsi
 	entry->key.vid = vid;
 	entry->dev = dev;
 	entry->vport_num = vport_num;
+	entry->esw_owner_vhca_id = esw_owner_vhca_id;
 	entry->lastuse = jiffies;
 	if (added_by_user)
 		entry->flags |= MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER;
@@ -998,26 +1009,31 @@ mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, const unsi
 	return ERR_PTR(err);
 }
 
-int mlx5_esw_bridge_ageing_time_set(unsigned long ageing_time, struct mlx5_eswitch *esw,
-				    struct mlx5_vport *vport)
+int mlx5_esw_bridge_ageing_time_set(u16 vport_num, u16 esw_owner_vhca_id, unsigned long ageing_time,
+				    struct mlx5_esw_bridge_offloads *br_offloads)
 {
-	if (!vport->bridge)
+	struct mlx5_esw_bridge_port *port;
+
+	port = mlx5_esw_bridge_port_lookup(vport_num, esw_owner_vhca_id, br_offloads);
+	if (!port)
 		return -EINVAL;
 
-	vport->bridge->ageing_time = clock_t_to_jiffies(ageing_time);
+	port->bridge->ageing_time = clock_t_to_jiffies(ageing_time);
 	return 0;
 }
 
-int mlx5_esw_bridge_vlan_filtering_set(bool enable, struct mlx5_eswitch *esw,
-				       struct mlx5_vport *vport)
+int mlx5_esw_bridge_vlan_filtering_set(u16 vport_num, u16 esw_owner_vhca_id, bool enable,
+				       struct mlx5_esw_bridge_offloads *br_offloads)
 {
+	struct mlx5_esw_bridge_port *port;
 	struct mlx5_esw_bridge *bridge;
 	bool filtering;
 
-	if (!vport->bridge)
+	port = mlx5_esw_bridge_port_lookup(vport_num, esw_owner_vhca_id, br_offloads);
+	if (!port)
 		return -EINVAL;
 
-	bridge = vport->bridge;
+	bridge = port->bridge;
 	filtering = bridge->flags & MLX5_ESW_BRIDGE_VLAN_FILTERING_FLAG;
 	if (filtering == enable)
 		return 0;
@@ -1031,9 +1047,9 @@ int mlx5_esw_bridge_vlan_filtering_set(bool enable, struct mlx5_eswitch *esw,
 	return 0;
 }
 
-static int mlx5_esw_bridge_vport_init(struct mlx5_esw_bridge_offloads *br_offloads,
-				      struct mlx5_esw_bridge *bridge,
-				      struct mlx5_vport *vport)
+static int mlx5_esw_bridge_vport_init(u16 vport_num, u16 esw_owner_vhca_id,
+				      struct mlx5_esw_bridge_offloads *br_offloads,
+				      struct mlx5_esw_bridge *bridge)
 {
 	struct mlx5_eswitch *esw = br_offloads->esw;
 	struct mlx5_esw_bridge_port *port;
@@ -1043,17 +1059,19 @@ static int mlx5_esw_bridge_vport_init(struct mlx5_esw_bridge_offloads *br_offloa
 	if (!port)
 		return -ENOMEM;
 
-	port->vport_num = vport->vport;
+	port->vport_num = vport_num;
+	port->esw_owner_vhca_id = esw_owner_vhca_id;
+	port->bridge = bridge;
 	xa_init(&port->vlans);
-	err = mlx5_esw_bridge_port_insert(port, bridge);
+	err = mlx5_esw_bridge_port_insert(port, br_offloads);
 	if (err) {
-		esw_warn(esw->dev, "Failed to insert port metadata (vport=%u,err=%d)\n",
-			 vport->vport, err);
+		esw_warn(esw->dev,
+			 "Failed to insert port metadata (vport=%u,esw_owner_vhca_id=%u,err=%d)\n",
+			 port->vport_num, port->esw_owner_vhca_id, err);
 		goto err_port_insert;
 	}
 	trace_mlx5_esw_bridge_vport_init(port);
 
-	vport->bridge = bridge;
 	return 0;
 
 err_port_insert:
@@ -1062,46 +1080,38 @@ static int mlx5_esw_bridge_vport_init(struct mlx5_esw_bridge_offloads *br_offloa
 }
 
 static int mlx5_esw_bridge_vport_cleanup(struct mlx5_esw_bridge_offloads *br_offloads,
-					 struct mlx5_vport *vport)
+					 struct mlx5_esw_bridge_port *port)
 {
-	struct mlx5_esw_bridge *bridge = vport->bridge;
+	u16 vport_num = port->vport_num, esw_owner_vhca_id = port->esw_owner_vhca_id;
+	struct mlx5_esw_bridge *bridge = port->bridge;
 	struct mlx5_esw_bridge_fdb_entry *entry, *tmp;
-	struct mlx5_esw_bridge_port *port;
 
 	list_for_each_entry_safe(entry, tmp, &bridge->fdb_list, list)
-		if (entry->vport_num == vport->vport)
+		if (entry->vport_num == vport_num && entry->esw_owner_vhca_id == esw_owner_vhca_id)
 			mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
 
-	port = mlx5_esw_bridge_port_lookup(vport->vport, bridge);
-	if (!port) {
-		WARN(1, "Vport %u metadata not found on bridge", vport->vport);
-		return -EINVAL;
-	}
-
 	trace_mlx5_esw_bridge_vport_cleanup(port);
 	mlx5_esw_bridge_port_vlans_flush(port, bridge);
-	mlx5_esw_bridge_port_erase(port, bridge);
+	mlx5_esw_bridge_port_erase(port, br_offloads);
 	kvfree(port);
 	mlx5_esw_bridge_put(br_offloads, bridge);
-	vport->bridge = NULL;
 	return 0;
 }
 
-int mlx5_esw_bridge_vport_link(int ifindex, struct mlx5_esw_bridge_offloads *br_offloads,
-			       struct mlx5_vport *vport, struct netlink_ext_ack *extack)
+int mlx5_esw_bridge_vport_link(int ifindex, u16 vport_num, u16 esw_owner_vhca_id,
+			       struct mlx5_esw_bridge_offloads *br_offloads,
+			       struct netlink_ext_ack *extack)
 {
 	struct mlx5_esw_bridge *bridge;
 	int err;
 
-	WARN_ON(vport->bridge);
-
 	bridge = mlx5_esw_bridge_lookup(ifindex, br_offloads);
 	if (IS_ERR(bridge)) {
 		NL_SET_ERR_MSG_MOD(extack, "Error checking for existing bridge with same ifindex");
 		return PTR_ERR(bridge);
 	}
 
-	err = mlx5_esw_bridge_vport_init(br_offloads, bridge, vport);
+	err = mlx5_esw_bridge_vport_init(vport_num, esw_owner_vhca_id, br_offloads, bridge);
 	if (err) {
 		NL_SET_ERR_MSG_MOD(extack, "Error initializing port");
 		goto err_vport;
@@ -1113,34 +1123,37 @@ int mlx5_esw_bridge_vport_link(int ifindex, struct mlx5_esw_bridge_offloads *br_
 	return err;
 }
 
-int mlx5_esw_bridge_vport_unlink(int ifindex, struct mlx5_esw_bridge_offloads *br_offloads,
-				 struct mlx5_vport *vport, struct netlink_ext_ack *extack)
+int mlx5_esw_bridge_vport_unlink(int ifindex, u16 vport_num, u16 esw_owner_vhca_id,
+				 struct mlx5_esw_bridge_offloads *br_offloads,
+				 struct netlink_ext_ack *extack)
 {
-	struct mlx5_esw_bridge *bridge = vport->bridge;
+	struct mlx5_esw_bridge_port *port;
 	int err;
 
-	if (!bridge) {
+	port = mlx5_esw_bridge_port_lookup(vport_num, esw_owner_vhca_id, br_offloads);
+	if (!port) {
 		NL_SET_ERR_MSG_MOD(extack, "Port is not attached to any bridge");
 		return -EINVAL;
 	}
-	if (bridge->ifindex != ifindex) {
+	if (port->bridge->ifindex != ifindex) {
 		NL_SET_ERR_MSG_MOD(extack, "Port is attached to another bridge");
 		return -EINVAL;
 	}
 
-	err = mlx5_esw_bridge_vport_cleanup(br_offloads, vport);
+	err = mlx5_esw_bridge_vport_cleanup(br_offloads, port);
 	if (err)
 		NL_SET_ERR_MSG_MOD(extack, "Port cleanup failed");
 	return err;
 }
 
-int mlx5_esw_bridge_port_vlan_add(u16 vid, u16 flags, struct mlx5_eswitch *esw,
-				  struct mlx5_vport *vport, struct netlink_ext_ack *extack)
+int mlx5_esw_bridge_port_vlan_add(u16 vport_num, u16 esw_owner_vhca_id, u16 vid, u16 flags,
+				  struct mlx5_esw_bridge_offloads *br_offloads,
+				  struct netlink_ext_ack *extack)
 {
 	struct mlx5_esw_bridge_port *port;
 	struct mlx5_esw_bridge_vlan *vlan;
 
-	port = mlx5_esw_bridge_port_lookup(vport->vport, vport->bridge);
+	port = mlx5_esw_bridge_port_lookup(vport_num, esw_owner_vhca_id, br_offloads);
 	if (!port)
 		return -EINVAL;
 
@@ -1148,10 +1161,10 @@ int mlx5_esw_bridge_port_vlan_add(u16 vid, u16 flags, struct mlx5_eswitch *esw,
 	if (vlan) {
 		if (vlan->flags == flags)
 			return 0;
-		mlx5_esw_bridge_vlan_cleanup(port, vlan, vport->bridge);
+		mlx5_esw_bridge_vlan_cleanup(port, vlan, port->bridge);
 	}
 
-	vlan = mlx5_esw_bridge_vlan_create(vid, flags, port, esw);
+	vlan = mlx5_esw_bridge_vlan_create(vid, flags, port, br_offloads->esw);
 	if (IS_ERR(vlan)) {
 		NL_SET_ERR_MSG_MOD(extack, "Failed to create VLAN entry");
 		return PTR_ERR(vlan);
@@ -1159,36 +1172,38 @@ int mlx5_esw_bridge_port_vlan_add(u16 vid, u16 flags, struct mlx5_eswitch *esw,
 	return 0;
 }
 
-void mlx5_esw_bridge_port_vlan_del(u16 vid, struct mlx5_eswitch *esw, struct mlx5_vport *vport)
+void mlx5_esw_bridge_port_vlan_del(u16 vport_num, u16 esw_owner_vhca_id, u16 vid,
+				   struct mlx5_esw_bridge_offloads *br_offloads)
 {
 	struct mlx5_esw_bridge_port *port;
 	struct mlx5_esw_bridge_vlan *vlan;
 
-	port = mlx5_esw_bridge_port_lookup(vport->vport, vport->bridge);
+	port = mlx5_esw_bridge_port_lookup(vport_num, esw_owner_vhca_id, br_offloads);
 	if (!port)
 		return;
 
 	vlan = mlx5_esw_bridge_vlan_lookup(vid, port);
 	if (!vlan)
 		return;
-	mlx5_esw_bridge_vlan_cleanup(port, vlan, vport->bridge);
+	mlx5_esw_bridge_vlan_cleanup(port, vlan, port->bridge);
 }
 
-void mlx5_esw_bridge_fdb_create(struct net_device *dev, struct mlx5_eswitch *esw,
-				struct mlx5_vport *vport,
+void mlx5_esw_bridge_fdb_create(struct net_device *dev, u16 vport_num, u16 esw_owner_vhca_id,
+				struct mlx5_esw_bridge_offloads *br_offloads,
 				struct switchdev_notifier_fdb_info *fdb_info)
 {
-	struct mlx5_esw_bridge *bridge = vport->bridge;
 	struct mlx5_esw_bridge_fdb_entry *entry;
-	u16 vport_num = vport->vport;
+	struct mlx5_esw_bridge_port *port;
+	struct mlx5_esw_bridge *bridge;
 
-	if (!bridge) {
-		esw_info(esw->dev, "Vport is not assigned to bridge (vport=%u)\n", vport_num);
+	port = mlx5_esw_bridge_port_lookup(vport_num, esw_owner_vhca_id, br_offloads);
+	if (!port)
 		return;
-	}
 
-	entry = mlx5_esw_bridge_fdb_entry_init(dev, vport_num, fdb_info->addr, fdb_info->vid,
-					       fdb_info->added_by_user, esw, bridge);
+	bridge = port->bridge;
+	entry = mlx5_esw_bridge_fdb_entry_init(dev, vport_num, esw_owner_vhca_id, fdb_info->addr,
+					       fdb_info->vid, fdb_info->added_by_user,
+					       br_offloads->esw, bridge);
 	if (IS_ERR(entry))
 		return;
 
@@ -1201,20 +1216,21 @@ void mlx5_esw_bridge_fdb_create(struct net_device *dev, struct mlx5_eswitch *esw
 						   SWITCHDEV_FDB_ADD_TO_BRIDGE);
 }
 
-void mlx5_esw_bridge_fdb_remove(struct net_device *dev, struct mlx5_eswitch *esw,
-				struct mlx5_vport *vport,
+void mlx5_esw_bridge_fdb_remove(struct net_device *dev, u16 vport_num, u16 esw_owner_vhca_id,
+				struct mlx5_esw_bridge_offloads *br_offloads,
 				struct switchdev_notifier_fdb_info *fdb_info)
 {
-	struct mlx5_esw_bridge *bridge = vport->bridge;
+	struct mlx5_eswitch *esw = br_offloads->esw;
 	struct mlx5_esw_bridge_fdb_entry *entry;
 	struct mlx5_esw_bridge_fdb_key key;
-	u16 vport_num = vport->vport;
+	struct mlx5_esw_bridge_port *port;
+	struct mlx5_esw_bridge *bridge;
 
-	if (!bridge) {
-		esw_warn(esw->dev, "Vport is not assigned to bridge (vport=%u)\n", vport_num);
+	port = mlx5_esw_bridge_port_lookup(vport_num, esw_owner_vhca_id, br_offloads);
+	if (!port)
 		return;
-	}
 
+	bridge = port->bridge;
 	ether_addr_copy(key.addr, fdb_info->addr);
 	key.vid = fdb_info->vid;
 	entry = rhashtable_lookup_fast(&bridge->fdb_ht, &key, fdb_ht_params);
@@ -1258,13 +1274,11 @@ void mlx5_esw_bridge_update(struct mlx5_esw_bridge_offloads *br_offloads)
 
 static void mlx5_esw_bridge_flush(struct mlx5_esw_bridge_offloads *br_offloads)
 {
-	struct mlx5_eswitch *esw = br_offloads->esw;
-	struct mlx5_vport *vport;
+	struct mlx5_esw_bridge_port *port;
 	unsigned long i;
 
-	mlx5_esw_for_each_vport(esw, i, vport)
-		if (vport->bridge)
-			mlx5_esw_bridge_vport_cleanup(br_offloads, vport);
+	xa_for_each(&br_offloads->ports, i, port)
+		mlx5_esw_bridge_vport_cleanup(br_offloads, port);
 
 	WARN_ONCE(!list_empty(&br_offloads->bridges),
 		  "Cleaning up bridge offloads while still having bridges attached\n");
@@ -1279,6 +1293,7 @@ struct mlx5_esw_bridge_offloads *mlx5_esw_bridge_init(struct mlx5_eswitch *esw)
 		return ERR_PTR(-ENOMEM);
 
 	INIT_LIST_HEAD(&br_offloads->bridges);
+	xa_init(&br_offloads->ports);
 	br_offloads->esw = esw;
 	esw->br_offloads = br_offloads;
 
@@ -1293,6 +1308,7 @@ void mlx5_esw_bridge_cleanup(struct mlx5_eswitch *esw)
 		return;
 
 	mlx5_esw_bridge_flush(br_offloads);
+	WARN_ON(!xa_empty(&br_offloads->ports));
 
 	esw->br_offloads = NULL;
 	kvfree(br_offloads);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
index d826942b27fc..374f768db4cc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
@@ -7,6 +7,7 @@
 #include <linux/notifier.h>
 #include <linux/list.h>
 #include <linux/workqueue.h>
+#include <linux/xarray.h>
 #include "eswitch.h"
 
 struct mlx5_flow_table;
@@ -15,6 +16,8 @@ struct mlx5_flow_group;
 struct mlx5_esw_bridge_offloads {
 	struct mlx5_eswitch *esw;
 	struct list_head bridges;
+	struct xarray ports;
+
 	struct notifier_block netdev_nb;
 	struct notifier_block nb_blk;
 	struct notifier_block nb;
@@ -31,23 +34,27 @@ struct mlx5_esw_bridge_offloads {
 
 struct mlx5_esw_bridge_offloads *mlx5_esw_bridge_init(struct mlx5_eswitch *esw);
 void mlx5_esw_bridge_cleanup(struct mlx5_eswitch *esw);
-int mlx5_esw_bridge_vport_link(int ifindex, struct mlx5_esw_bridge_offloads *br_offloads,
-			       struct mlx5_vport *vport, struct netlink_ext_ack *extack);
-int mlx5_esw_bridge_vport_unlink(int ifindex, struct mlx5_esw_bridge_offloads *br_offloads,
-				 struct mlx5_vport *vport, struct netlink_ext_ack *extack);
-void mlx5_esw_bridge_fdb_create(struct net_device *dev, struct mlx5_eswitch *esw,
-				struct mlx5_vport *vport,
+int mlx5_esw_bridge_vport_link(int ifindex, u16 vport_num, u16 esw_owner_vhca_id,
+			       struct mlx5_esw_bridge_offloads *br_offloads,
+			       struct netlink_ext_ack *extack);
+int mlx5_esw_bridge_vport_unlink(int ifindex, u16 vport_num, u16 esw_owner_vhca_id,
+				 struct mlx5_esw_bridge_offloads *br_offloads,
+				 struct netlink_ext_ack *extack);
+void mlx5_esw_bridge_fdb_create(struct net_device *dev, u16 vport_num, u16 esw_owner_vhca_id,
+				struct mlx5_esw_bridge_offloads *br_offloads,
 				struct switchdev_notifier_fdb_info *fdb_info);
-void mlx5_esw_bridge_fdb_remove(struct net_device *dev, struct mlx5_eswitch *esw,
-				struct mlx5_vport *vport,
+void mlx5_esw_bridge_fdb_remove(struct net_device *dev, u16 vport_num, u16 esw_owner_vhca_id,
+				struct mlx5_esw_bridge_offloads *br_offloads,
 				struct switchdev_notifier_fdb_info *fdb_info);
 void mlx5_esw_bridge_update(struct mlx5_esw_bridge_offloads *br_offloads);
-int mlx5_esw_bridge_ageing_time_set(unsigned long ageing_time, struct mlx5_eswitch *esw,
-				    struct mlx5_vport *vport);
-int mlx5_esw_bridge_vlan_filtering_set(bool enable, struct mlx5_eswitch *esw,
-				       struct mlx5_vport *vport);
-int mlx5_esw_bridge_port_vlan_add(u16 vid, u16 flags, struct mlx5_eswitch *esw,
-				  struct mlx5_vport *vport, struct netlink_ext_ack *extack);
-void mlx5_esw_bridge_port_vlan_del(u16 vid, struct mlx5_eswitch *esw, struct mlx5_vport *vport);
+int mlx5_esw_bridge_ageing_time_set(u16 vport_num, u16 esw_owner_vhca_id, unsigned long ageing_time,
+				    struct mlx5_esw_bridge_offloads *br_offloads);
+int mlx5_esw_bridge_vlan_filtering_set(u16 vport_num, u16 esw_owner_vhca_id, bool enable,
+				       struct mlx5_esw_bridge_offloads *br_offloads);
+int mlx5_esw_bridge_port_vlan_add(u16 vport_num, u16 esw_owner_vhca_id, u16 vid, u16 flags,
+				  struct mlx5_esw_bridge_offloads *br_offloads,
+				  struct netlink_ext_ack *extack);
+void mlx5_esw_bridge_port_vlan_del(u16 vport_num, u16 esw_owner_vhca_id, u16 vid,
+				   struct mlx5_esw_bridge_offloads *br_offloads);
 
 #endif /* __MLX5_ESW_BRIDGE_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_priv.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_priv.h
index d9ab2e8bc2cb..7e1c5590aef8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_priv.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_priv.h
@@ -28,6 +28,7 @@ struct mlx5_esw_bridge_fdb_entry {
 	struct list_head list;
 	struct list_head vlan_list;
 	u16 vport_num;
+	u16 esw_owner_vhca_id;
 	u16 flags;
 
 	struct mlx5_flow_handle *ingress_handle;
@@ -47,6 +48,8 @@ struct mlx5_esw_bridge_vlan {
 
 struct mlx5_esw_bridge_port {
 	u16 vport_num;
+	u16 esw_owner_vhca_id;
+	struct mlx5_esw_bridge *bridge;
 	struct xarray vlans;
 };
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h
index 227964b7d3b9..28231584da81 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h
@@ -85,11 +85,15 @@ DECLARE_EVENT_CLASS(mlx5_esw_bridge_port_template,
 		    TP_ARGS(port),
 		    TP_STRUCT__entry(
 			    __field(u16, vport_num)
+			    __field(u16, esw_owner_vhca_id)
 			    ),
 		    TP_fast_assign(
 			    __entry->vport_num = port->vport_num;
+			    __entry->esw_owner_vhca_id = port->esw_owner_vhca_id;
 			    ),
-		    TP_printk("vport_num=%hu", __entry->vport_num)
+		    TP_printk("vport_num=%hu esw_owner_vhca_id=%hu",
+			      __entry->vport_num,
+			      __entry->esw_owner_vhca_id)
 	);
 
 DEFINE_EVENT(mlx5_esw_bridge_port_template,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index 01e8dfb994d4..d3a5ff4f6140 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -160,8 +160,6 @@ enum mlx5_eswitch_vport_event {
 	MLX5_VPORT_PROMISC_CHANGE = BIT(3),
 };
 
-struct mlx5_esw_bridge;
-
 struct mlx5_vport {
 	struct mlx5_core_dev    *dev;
 	struct hlist_head       uc_list[MLX5_L2_ADDR_HASH_SIZE];
@@ -190,7 +188,6 @@ struct mlx5_vport {
 	enum mlx5_eswitch_vport_event enabled_events;
 	int index;
 	struct devlink_port *dl_port;
-	struct mlx5_esw_bridge *bridge;
 };
 
 struct mlx5_esw_indir_table;
-- 
2.31.1



* [net-next 15/17] net/mlx5: Bridge, extract FDB delete notification to function
  2021-08-16 21:18 [pull request][net-next 00/17] mlx5 updates 2021-08-16 Saeed Mahameed
                   ` (13 preceding siblings ...)
  2021-08-16 21:18 ` [net-next 14/17] net/mlx5: Bridge, identify port by vport_num+esw_owner_vhca_id pair Saeed Mahameed
@ 2021-08-16 21:18 ` Saeed Mahameed
  2021-08-16 21:18 ` [net-next 16/17] net/mlx5: Bridge, allow merged eswitch connectivity Saeed Mahameed
  2021-08-16 21:18 ` [net-next 17/17] net/mlx5: Bridge, support LAG Saeed Mahameed
  16 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2021-08-16 21:18 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Tariq Toukan, Leon Romanovsky, Vlad Buslov, Roi Dayan,
	Mark Bloch, Saeed Mahameed

From: Vlad Buslov <vladbu@nvidia.com>

The SWITCHDEV_FDB_DEL_TO_BRIDGE notification is generated in multiple
places in the bridge code. The following patch in the series changes the
condition for the notification. Extract the notification into a dedicated
helper, mlx5_esw_bridge_fdb_del_notify(), so that future changes only need
to modify it in a single place.
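
For context, the follow-up patch only needs to extend the condition inside
the new helper (the MLX5_ESW_BRIDGE_FLAG_PEER flag is introduced there),
roughly:

-	if (!(entry->flags & MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER))
+	if (!(entry->flags & (MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER | MLX5_ESW_BRIDGE_FLAG_PEER)))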

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../ethernet/mellanox/mlx5/core/esw/bridge.c  | 27 +++++++++----------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
index 65173db2a2f4..5f5571190ffe 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
@@ -76,6 +76,15 @@ mlx5_esw_bridge_fdb_offload_notify(struct net_device *dev, const unsigned char *
 	call_switchdev_notifiers(val, dev, &send_info.info, NULL);
 }
 
+static void
+mlx5_esw_bridge_fdb_del_notify(struct mlx5_esw_bridge_fdb_entry *entry)
+{
+	if (!(entry->flags & MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER))
+		mlx5_esw_bridge_fdb_offload_notify(entry->dev, entry->key.addr,
+						   entry->key.vid,
+						   SWITCHDEV_FDB_DEL_TO_BRIDGE);
+}
+
 static struct mlx5_flow_table *
 mlx5_esw_bridge_table_create(int max_fte, u32 level, struct mlx5_eswitch *esw)
 {
@@ -699,10 +708,7 @@ static void mlx5_esw_bridge_fdb_flush(struct mlx5_esw_bridge *bridge)
 	struct mlx5_esw_bridge_fdb_entry *entry, *tmp;
 
 	list_for_each_entry_safe(entry, tmp, &bridge->fdb_list, list) {
-		if (!(entry->flags & MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER))
-			mlx5_esw_bridge_fdb_offload_notify(entry->dev, entry->key.addr,
-							   entry->key.vid,
-							   SWITCHDEV_FDB_DEL_TO_BRIDGE);
+		mlx5_esw_bridge_fdb_del_notify(entry);
 		mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
 	}
 }
@@ -850,10 +856,7 @@ static void mlx5_esw_bridge_vlan_flush(struct mlx5_esw_bridge_vlan *vlan,
 	struct mlx5_esw_bridge_fdb_entry *entry, *tmp;
 
 	list_for_each_entry_safe(entry, tmp, &vlan->fdb_list, vlan_list) {
-		if (!(entry->flags & MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER))
-			mlx5_esw_bridge_fdb_offload_notify(entry->dev, entry->key.addr,
-							   entry->key.vid,
-							   SWITCHDEV_FDB_DEL_TO_BRIDGE);
+		mlx5_esw_bridge_fdb_del_notify(entry);
 		mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
 	}
 
@@ -1241,9 +1244,7 @@ void mlx5_esw_bridge_fdb_remove(struct net_device *dev, u16 vport_num, u16 esw_o
 		return;
 	}
 
-	if (!(entry->flags & MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER))
-		mlx5_esw_bridge_fdb_offload_notify(dev, entry->key.addr, entry->key.vid,
-						   SWITCHDEV_FDB_DEL_TO_BRIDGE);
+	mlx5_esw_bridge_fdb_del_notify(entry);
 	mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
 }
 
@@ -1263,9 +1264,7 @@ void mlx5_esw_bridge_update(struct mlx5_esw_bridge_offloads *br_offloads)
 			if (time_after(lastuse, entry->lastuse)) {
 				mlx5_esw_bridge_fdb_entry_refresh(lastuse, entry);
 			} else if (time_is_before_jiffies(entry->lastuse + bridge->ageing_time)) {
-				mlx5_esw_bridge_fdb_offload_notify(entry->dev, entry->key.addr,
-								   entry->key.vid,
-								   SWITCHDEV_FDB_DEL_TO_BRIDGE);
+				mlx5_esw_bridge_fdb_del_notify(entry);
 				mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
 			}
 		}
-- 
2.31.1



* [net-next 16/17] net/mlx5: Bridge, allow merged eswitch connectivity
  2021-08-16 21:18 [pull request][net-next 00/17] mlx5 updates 2021-08-16 Saeed Mahameed
                   ` (14 preceding siblings ...)
  2021-08-16 21:18 ` [net-next 15/17] net/mlx5: Bridge, extract FDB delete notification to function Saeed Mahameed
@ 2021-08-16 21:18 ` Saeed Mahameed
  2021-08-16 22:38   ` Jakub Kicinski
  2021-08-16 21:18 ` [net-next 17/17] net/mlx5: Bridge, support LAG Saeed Mahameed
  16 siblings, 1 reply; 20+ messages in thread
From: Saeed Mahameed @ 2021-08-16 21:18 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Tariq Toukan, Leon Romanovsky, Vlad Buslov, Roi Dayan,
	Mark Bloch, Saeed Mahameed

From: Vlad Buslov <vladbu@nvidia.com>

Allow connectivity between representors of different eswitch instances
that are attached to the same bridge when the merged_eswitch capability is
enabled. Add ports of the peer eswitch to the bridge instance and mark
them with MLX5_ESW_BRIDGE_PORT_FLAG_PEER. Mark FDBs offloaded on peer
ports with the MLX5_ESW_BRIDGE_FLAG_PEER flag. Such FDBs can only be aged
out on their local eswitch instance, which then sends the
SWITCHDEV_FDB_DEL_TO_BRIDGE event. Listen for the event in the mlx5 bridge
implementation and delete peer FDBs in the event handler.
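
Condensed from the bridge.c hunk below, the resulting ageing decision in
mlx5_esw_bridge_update() looks roughly as follows; peer entries can still
be refreshed when used, but are never aged out (or notified) from this
eswitch instance:

	if (time_after(lastuse, entry->lastuse)) {
		mlx5_esw_bridge_fdb_entry_refresh(lastuse, entry);
	} else if (!(entry->flags & MLX5_ESW_BRIDGE_FLAG_PEER) &&
		   time_is_before_jiffies(entry->lastuse + bridge->ageing_time)) {
		mlx5_esw_bridge_fdb_del_notify(entry);
		mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
	}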

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../mellanox/mlx5/core/en/rep/bridge.c        | 58 ++++++++++++++----
 .../ethernet/mellanox/mlx5/core/esw/bridge.c  | 61 +++++++++++++++----
 .../ethernet/mellanox/mlx5/core/esw/bridge.h  |  6 ++
 .../mellanox/mlx5/core/esw/bridge_priv.h      |  6 ++
 .../mlx5/core/esw/diag/bridge_tracepoint.h    |  7 ++-
 5 files changed, 112 insertions(+), 26 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
index f21b0beae395..2e6f2bce9083 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
@@ -15,6 +15,7 @@ struct mlx5_bridge_switchdev_fdb_work {
 	struct work_struct work;
 	struct switchdev_notifier_fdb_info fdb_info;
 	struct net_device *dev;
+	struct mlx5_esw_bridge_offloads *br_offloads;
 	bool add;
 };
 
@@ -25,13 +26,28 @@ static bool mlx5_esw_bridge_dev_same_esw(struct net_device *dev, struct mlx5_esw
 	return esw == priv->mdev->priv.eswitch;
 }
 
+static bool mlx5_esw_bridge_dev_same_hw(struct net_device *dev, struct mlx5_eswitch *esw)
+{
+	struct mlx5e_priv *priv = netdev_priv(dev);
+	struct mlx5_core_dev *mdev, *esw_mdev;
+	u64 system_guid, esw_system_guid;
+
+	mdev = priv->mdev;
+	esw_mdev = esw->dev;
+
+	system_guid = mlx5_query_nic_system_image_guid(mdev);
+	esw_system_guid = mlx5_query_nic_system_image_guid(esw_mdev);
+
+	return system_guid == esw_system_guid;
+}
+
 static int mlx5_esw_bridge_vport_num_vhca_id_get(struct net_device *dev, struct mlx5_eswitch *esw,
 						 u16 *vport_num, u16 *esw_owner_vhca_id)
 {
 	struct mlx5e_rep_priv *rpriv;
 	struct mlx5e_priv *priv;
 
-	if (!mlx5e_eswitch_rep(dev) || !mlx5_esw_bridge_dev_same_esw(dev, esw))
+	if (!mlx5e_eswitch_rep(dev) || !mlx5_esw_bridge_dev_same_hw(dev, esw))
 		return -ENODEV;
 
 	priv = netdev_priv(dev);
@@ -48,7 +64,7 @@ mlx5_esw_bridge_lower_rep_vport_num_vhca_id_get(struct net_device *dev, struct m
 	struct net_device *lower_dev;
 	struct list_head *iter;
 
-	if (mlx5e_eswitch_rep(dev) && mlx5_esw_bridge_dev_same_esw(dev, esw))
+	if (mlx5e_eswitch_rep(dev))
 		return mlx5_esw_bridge_vport_num_vhca_id_get(dev, esw, vport_num,
 							     esw_owner_vhca_id);
 
@@ -74,6 +90,7 @@ static int mlx5_esw_bridge_port_changeupper(struct notifier_block *nb, void *ptr
 								    netdev_nb);
 	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
 	struct netdev_notifier_changeupper_info *info = ptr;
+	struct mlx5_eswitch *esw = br_offloads->esw;
 	struct net_device *upper = info->upper_dev;
 	u16 vport_num, esw_owner_vhca_id;
 	struct netlink_ext_ack *extack;
@@ -90,11 +107,20 @@ static int mlx5_esw_bridge_port_changeupper(struct notifier_block *nb, void *ptr
 
 	extack = netdev_notifier_info_to_extack(&info->info);
 
-	return info->linking ?
-		mlx5_esw_bridge_vport_link(ifindex, vport_num, esw_owner_vhca_id, br_offloads,
-					   extack) :
-		mlx5_esw_bridge_vport_unlink(ifindex, vport_num, esw_owner_vhca_id, br_offloads,
-					     extack);
+	if (mlx5_esw_bridge_dev_same_esw(dev, esw))
+		err = info->linking ?
+			mlx5_esw_bridge_vport_link(ifindex, vport_num, esw_owner_vhca_id,
+						   br_offloads, extack) :
+			mlx5_esw_bridge_vport_unlink(ifindex, vport_num, esw_owner_vhca_id,
+						     br_offloads, extack);
+	else if (mlx5_esw_bridge_dev_same_hw(dev, esw))
+		err = info->linking ?
+			mlx5_esw_bridge_vport_peer_link(ifindex, vport_num, esw_owner_vhca_id,
+							br_offloads, extack) :
+			mlx5_esw_bridge_vport_peer_unlink(ifindex, vport_num, esw_owner_vhca_id,
+							  br_offloads, extack);
+
+	return err;
 }
 
 static int mlx5_esw_bridge_switchdev_port_event(struct notifier_block *nb,
@@ -253,7 +279,8 @@ static void mlx5_esw_bridge_switchdev_fdb_event_work(struct work_struct *work)
 		container_of(work, struct mlx5_bridge_switchdev_fdb_work, work);
 	struct switchdev_notifier_fdb_info *fdb_info =
 		&fdb_work->fdb_info;
-	struct mlx5_esw_bridge_offloads *br_offloads;
+	struct mlx5_esw_bridge_offloads *br_offloads =
+		fdb_work->br_offloads;
 	struct net_device *dev = fdb_work->dev;
 	u16 vport_num, esw_owner_vhca_id;
 	struct mlx5e_priv *priv;
@@ -262,7 +289,6 @@ static void mlx5_esw_bridge_switchdev_fdb_event_work(struct work_struct *work)
 	rtnl_lock();
 
 	priv = netdev_priv(dev);
-	br_offloads = priv->mdev->priv.eswitch->br_offloads;
 	err = mlx5_esw_bridge_vport_num_vhca_id_get(dev, br_offloads->esw, &vport_num,
 						    &esw_owner_vhca_id);
 	if (err)
@@ -282,7 +308,8 @@ static void mlx5_esw_bridge_switchdev_fdb_event_work(struct work_struct *work)
 
 static struct mlx5_bridge_switchdev_fdb_work *
 mlx5_esw_bridge_init_switchdev_fdb_work(struct net_device *dev, bool add,
-					struct switchdev_notifier_fdb_info *fdb_info)
+					struct switchdev_notifier_fdb_info *fdb_info,
+					struct mlx5_esw_bridge_offloads *br_offloads)
 {
 	struct mlx5_bridge_switchdev_fdb_work *work;
 	u8 *addr;
@@ -304,6 +331,7 @@ mlx5_esw_bridge_init_switchdev_fdb_work(struct net_device *dev, bool add,
 
 	dev_hold(dev);
 	work->dev = dev;
+	work->br_offloads = br_offloads;
 	work->add = add;
 	return work;
 }
@@ -334,10 +362,13 @@ static int mlx5_esw_bridge_switchdev_event(struct notifier_block *nb,
 
 	if (!mlx5e_eswitch_rep(dev))
 		return NOTIFY_DONE;
-	if (!mlx5_esw_bridge_dev_same_esw(dev, br_offloads->esw))
-		return NOTIFY_DONE;
 
 	switch (event) {
+	case SWITCHDEV_FDB_DEL_TO_BRIDGE:
+		/* only handle the event when source is on another eswitch */
+		if (mlx5_esw_bridge_dev_same_esw(dev, br_offloads->esw))
+			break;
+		fallthrough;
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
 		fdb_info = container_of(info,
@@ -346,7 +377,8 @@ static int mlx5_esw_bridge_switchdev_event(struct notifier_block *nb,
 
 		work = mlx5_esw_bridge_init_switchdev_fdb_work(dev,
 							       event == SWITCHDEV_FDB_ADD_TO_DEVICE,
-							       fdb_info);
+							       fdb_info,
+							       br_offloads);
 		if (IS_ERR(work)) {
 			WARN_ONCE(1, "Failed to init switchdev work, err=%ld",
 				  PTR_ERR(work));
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
index 5f5571190ffe..20d44b0ae337 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
@@ -79,7 +79,7 @@ mlx5_esw_bridge_fdb_offload_notify(struct net_device *dev, const unsigned char *
 static void
 mlx5_esw_bridge_fdb_del_notify(struct mlx5_esw_bridge_fdb_entry *entry)
 {
-	if (!(entry->flags & MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER))
+	if (!(entry->flags & (MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER | MLX5_ESW_BRIDGE_FLAG_PEER)))
 		mlx5_esw_bridge_fdb_offload_notify(entry->dev, entry->key.addr,
 						   entry->key.vid,
 						   SWITCHDEV_FDB_DEL_TO_BRIDGE);
@@ -513,7 +513,7 @@ mlx5_esw_bridge_ingress_filter_flow_create(u16 vport_num, const unsigned char *a
 }
 
 static struct mlx5_flow_handle *
-mlx5_esw_bridge_egress_flow_create(u16 vport_num, const unsigned char *addr,
+mlx5_esw_bridge_egress_flow_create(u16 vport_num, u16 esw_owner_vhca_id, const unsigned char *addr,
 				   struct mlx5_esw_bridge_vlan *vlan,
 				   struct mlx5_esw_bridge *bridge)
 {
@@ -558,6 +558,10 @@ mlx5_esw_bridge_egress_flow_create(u16 vport_num, const unsigned char *addr,
 			 vlan->vid);
 	}
 
+	if (MLX5_CAP_ESW(bridge->br_offloads->esw->dev, merged_eswitch)) {
+		dest.vport.flags = MLX5_FLOW_DEST_VPORT_VHCA_ID;
+		dest.vport.vhca_id = esw_owner_vhca_id;
+	}
 	handle = mlx5_add_flow_rules(bridge->egress_ft, rule_spec, &flow_act, &dest, 1);
 
 	kvfree(rule_spec);
@@ -917,7 +921,7 @@ mlx5_esw_bridge_port_vlan_lookup(u16 vid, u16 vport_num, u16 esw_owner_vhca_id,
 
 static struct mlx5_esw_bridge_fdb_entry *
 mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, u16 esw_owner_vhca_id,
-			       const unsigned char *addr, u16 vid, bool added_by_user,
+			       const unsigned char *addr, u16 vid, bool added_by_user, bool peer,
 			       struct mlx5_eswitch *esw, struct mlx5_esw_bridge *bridge)
 {
 	struct mlx5_esw_bridge_vlan *vlan = NULL;
@@ -945,6 +949,8 @@ mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, u16 esw_ow
 	entry->lastuse = jiffies;
 	if (added_by_user)
 		entry->flags |= MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER;
+	if (peer)
+		entry->flags |= MLX5_ESW_BRIDGE_FLAG_PEER;
 
 	counter = mlx5_fc_create(esw->dev, true);
 	if (IS_ERR(counter)) {
@@ -974,7 +980,8 @@ mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, u16 esw_ow
 		entry->filter_handle = handle;
 	}
 
-	handle = mlx5_esw_bridge_egress_flow_create(vport_num, addr, vlan, bridge);
+	handle = mlx5_esw_bridge_egress_flow_create(vport_num, esw_owner_vhca_id, addr, vlan,
+						    bridge);
 	if (IS_ERR(handle)) {
 		err = PTR_ERR(handle);
 		esw_warn(esw->dev, "Failed to create egress flow(vport=%u,err=%d)\n",
@@ -1050,7 +1057,7 @@ int mlx5_esw_bridge_vlan_filtering_set(u16 vport_num, u16 esw_owner_vhca_id, boo
 	return 0;
 }
 
-static int mlx5_esw_bridge_vport_init(u16 vport_num, u16 esw_owner_vhca_id,
+static int mlx5_esw_bridge_vport_init(u16 vport_num, u16 esw_owner_vhca_id, u16 flags,
 				      struct mlx5_esw_bridge_offloads *br_offloads,
 				      struct mlx5_esw_bridge *bridge)
 {
@@ -1065,6 +1072,7 @@ static int mlx5_esw_bridge_vport_init(u16 vport_num, u16 esw_owner_vhca_id,
 	port->vport_num = vport_num;
 	port->esw_owner_vhca_id = esw_owner_vhca_id;
 	port->bridge = bridge;
+	port->flags |= flags;
 	xa_init(&port->vlans);
 	err = mlx5_esw_bridge_port_insert(port, br_offloads);
 	if (err) {
@@ -1101,9 +1109,10 @@ static int mlx5_esw_bridge_vport_cleanup(struct mlx5_esw_bridge_offloads *br_off
 	return 0;
 }
 
-int mlx5_esw_bridge_vport_link(int ifindex, u16 vport_num, u16 esw_owner_vhca_id,
-			       struct mlx5_esw_bridge_offloads *br_offloads,
-			       struct netlink_ext_ack *extack)
+static int mlx5_esw_bridge_vport_link_with_flags(int ifindex, u16 vport_num, u16 esw_owner_vhca_id,
+						 u16 flags,
+						 struct mlx5_esw_bridge_offloads *br_offloads,
+						 struct netlink_ext_ack *extack)
 {
 	struct mlx5_esw_bridge *bridge;
 	int err;
@@ -1114,7 +1123,7 @@ int mlx5_esw_bridge_vport_link(int ifindex, u16 vport_num, u16 esw_owner_vhca_id
 		return PTR_ERR(bridge);
 	}
 
-	err = mlx5_esw_bridge_vport_init(vport_num, esw_owner_vhca_id, br_offloads, bridge);
+	err = mlx5_esw_bridge_vport_init(vport_num, esw_owner_vhca_id, flags, br_offloads, bridge);
 	if (err) {
 		NL_SET_ERR_MSG_MOD(extack, "Error initializing port");
 		goto err_vport;
@@ -1126,6 +1135,14 @@ int mlx5_esw_bridge_vport_link(int ifindex, u16 vport_num, u16 esw_owner_vhca_id
 	return err;
 }
 
+int mlx5_esw_bridge_vport_link(int ifindex, u16 vport_num, u16 esw_owner_vhca_id,
+			       struct mlx5_esw_bridge_offloads *br_offloads,
+			       struct netlink_ext_ack *extack)
+{
+	return mlx5_esw_bridge_vport_link_with_flags(ifindex, vport_num, esw_owner_vhca_id, 0,
+						     br_offloads, extack);
+}
+
 int mlx5_esw_bridge_vport_unlink(int ifindex, u16 vport_num, u16 esw_owner_vhca_id,
 				 struct mlx5_esw_bridge_offloads *br_offloads,
 				 struct netlink_ext_ack *extack)
@@ -1149,6 +1166,26 @@ int mlx5_esw_bridge_vport_unlink(int ifindex, u16 vport_num, u16 esw_owner_vhca_
 	return err;
 }
 
+int mlx5_esw_bridge_vport_peer_link(int ifindex, u16 vport_num, u16 esw_owner_vhca_id,
+				    struct mlx5_esw_bridge_offloads *br_offloads,
+				    struct netlink_ext_ack *extack)
+{
+	if (!MLX5_CAP_ESW(br_offloads->esw->dev, merged_eswitch))
+		return 0;
+
+	return mlx5_esw_bridge_vport_link_with_flags(ifindex, vport_num, esw_owner_vhca_id,
+						     MLX5_ESW_BRIDGE_PORT_FLAG_PEER,
+						     br_offloads, extack);
+}
+
+int mlx5_esw_bridge_vport_peer_unlink(int ifindex, u16 vport_num, u16 esw_owner_vhca_id,
+				      struct mlx5_esw_bridge_offloads *br_offloads,
+				      struct netlink_ext_ack *extack)
+{
+	return mlx5_esw_bridge_vport_unlink(ifindex, vport_num, esw_owner_vhca_id, br_offloads,
+					    extack);
+}
+
 int mlx5_esw_bridge_port_vlan_add(u16 vport_num, u16 esw_owner_vhca_id, u16 vid, u16 flags,
 				  struct mlx5_esw_bridge_offloads *br_offloads,
 				  struct netlink_ext_ack *extack)
@@ -1206,6 +1243,7 @@ void mlx5_esw_bridge_fdb_create(struct net_device *dev, u16 vport_num, u16 esw_o
 	bridge = port->bridge;
 	entry = mlx5_esw_bridge_fdb_entry_init(dev, vport_num, esw_owner_vhca_id, fdb_info->addr,
 					       fdb_info->vid, fdb_info->added_by_user,
+					       port->flags & MLX5_ESW_BRIDGE_PORT_FLAG_PEER,
 					       br_offloads->esw, bridge);
 	if (IS_ERR(entry))
 		return;
@@ -1213,7 +1251,7 @@ void mlx5_esw_bridge_fdb_create(struct net_device *dev, u16 vport_num, u16 esw_o
 	if (entry->flags & MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER)
 		mlx5_esw_bridge_fdb_offload_notify(dev, entry->key.addr, entry->key.vid,
 						   SWITCHDEV_FDB_OFFLOADED);
-	else
+	else if (!(entry->flags & MLX5_ESW_BRIDGE_FLAG_PEER))
 		/* Take over dynamic entries to prevent kernel bridge from aging them out. */
 		mlx5_esw_bridge_fdb_offload_notify(dev, entry->key.addr, entry->key.vid,
 						   SWITCHDEV_FDB_ADD_TO_BRIDGE);
@@ -1263,7 +1301,8 @@ void mlx5_esw_bridge_update(struct mlx5_esw_bridge_offloads *br_offloads)
 
 			if (time_after(lastuse, entry->lastuse)) {
 				mlx5_esw_bridge_fdb_entry_refresh(lastuse, entry);
-			} else if (time_is_before_jiffies(entry->lastuse + bridge->ageing_time)) {
+			} else if (!(entry->flags & MLX5_ESW_BRIDGE_FLAG_PEER) &&
+				   time_is_before_jiffies(entry->lastuse + bridge->ageing_time)) {
 				mlx5_esw_bridge_fdb_del_notify(entry);
 				mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
 			}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
index 374f768db4cc..a4f04f3f5b11 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
@@ -40,6 +40,12 @@ int mlx5_esw_bridge_vport_link(int ifindex, u16 vport_num, u16 esw_owner_vhca_id
 int mlx5_esw_bridge_vport_unlink(int ifindex, u16 vport_num, u16 esw_owner_vhca_id,
 				 struct mlx5_esw_bridge_offloads *br_offloads,
 				 struct netlink_ext_ack *extack);
+int mlx5_esw_bridge_vport_peer_link(int ifindex, u16 vport_num, u16 esw_owner_vhca_id,
+				    struct mlx5_esw_bridge_offloads *br_offloads,
+				    struct netlink_ext_ack *extack);
+int mlx5_esw_bridge_vport_peer_unlink(int ifindex, u16 vport_num, u16 esw_owner_vhca_id,
+				      struct mlx5_esw_bridge_offloads *br_offloads,
+				      struct netlink_ext_ack *extack);
 void mlx5_esw_bridge_fdb_create(struct net_device *dev, u16 vport_num, u16 esw_owner_vhca_id,
 				struct mlx5_esw_bridge_offloads *br_offloads,
 				struct switchdev_notifier_fdb_info *fdb_info);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_priv.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_priv.h
index 7e1c5590aef8..52964a82d6a6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_priv.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_priv.h
@@ -19,6 +19,11 @@ struct mlx5_esw_bridge_fdb_key {
 
 enum {
 	MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER = BIT(0),
+	MLX5_ESW_BRIDGE_FLAG_PEER = BIT(1),
+};
+
+enum {
+	MLX5_ESW_BRIDGE_PORT_FLAG_PEER = BIT(0),
 };
 
 struct mlx5_esw_bridge_fdb_entry {
@@ -49,6 +54,7 @@ struct mlx5_esw_bridge_vlan {
 struct mlx5_esw_bridge_port {
 	u16 vport_num;
 	u16 esw_owner_vhca_id;
+	u16 flags;
 	struct mlx5_esw_bridge *bridge;
 	struct xarray vlans;
 };
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h
index 28231584da81..3401188e0a60 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h
@@ -86,14 +86,17 @@ DECLARE_EVENT_CLASS(mlx5_esw_bridge_port_template,
 		    TP_STRUCT__entry(
 			    __field(u16, vport_num)
 			    __field(u16, esw_owner_vhca_id)
+			    __field(u16, flags)
 			    ),
 		    TP_fast_assign(
 			    __entry->vport_num = port->vport_num;
 			    __entry->esw_owner_vhca_id = port->esw_owner_vhca_id;
+			    __entry->flags = port->flags;
 			    ),
-		    TP_printk("vport_num=%hu esw_owner_vhca_id=%hu",
+		    TP_printk("vport_num=%hu esw_owner_vhca_id=%hu flags=%hx",
 			      __entry->vport_num,
-			      __entry->esw_owner_vhca_id)
+			      __entry->esw_owner_vhca_id,
+			      __entry->flags)
 	);
 
 DEFINE_EVENT(mlx5_esw_bridge_port_template,
-- 
2.31.1



* [net-next 17/17] net/mlx5: Bridge, support LAG
  2021-08-16 21:18 [pull request][net-next 00/17] mlx5 updates 2021-08-16 Saeed Mahameed
                   ` (15 preceding siblings ...)
  2021-08-16 21:18 ` [net-next 16/17] net/mlx5: Bridge, allow merged eswitch connectivity Saeed Mahameed
@ 2021-08-16 21:18 ` Saeed Mahameed
  16 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2021-08-16 21:18 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Tariq Toukan, Leon Romanovsky, Vlad Buslov, Roi Dayan,
	Mark Bloch, Saeed Mahameed

From: Vlad Buslov <vladbu@nvidia.com>

Allow adding bond net devices to the mlx5 bridge with the following changes:

- Modify the bridge representor code to obtain the uplink representor that
belongs to the eswitch that is registered for notifications. Require the
representor to be in shared FDB mode. If the representor is the LAG master,
then consider its port as local, otherwise treat it as peer (see the sketch
after this list).

- Use devcom to match on the paired eswitch metadata in peer FDB entries.
This is necessary for shared FDB LAG to function since packets are always
received on the active eswitch instance as opposed to the parent eswitch of
the port.

- Support for deleting peer flows when receiving the
SWITCHDEV_FDB_DEL_TO_BRIDGE notification was implemented in one of the
previous patches in the series. Now also implement support for handling
SWITCHDEV_FDB_ADD_TO_BRIDGE, which can be generated on the peer by the
bridge update workqueue task in a LAG configuration. Refresh the flow
'lastuse' timestamp to the current jiffies when receiving such a
notification on the eswitch that manages the local FDB entry. This allows
peer entries to prevent ageing of the FDB.
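
As a condensed sketch of the local/peer classification above (the complete
version is in the en/rep/bridge.c hunk below), a port behind a bond is
considered local only on the shared FDB LAG master:

static bool mlx5_esw_bridge_is_local(struct net_device *dev, struct net_device *rep,
				     struct mlx5_eswitch *esw)
{
	struct mlx5e_priv *priv;

	if (!mlx5_esw_bridge_dev_same_esw(rep, esw))
		return false;

	priv = netdev_priv(rep);
	/* for a bond, only the shared-FDB LAG master owns the port locally;
	 * other representors on the same HW are linked as peer ports
	 */
	if (netif_is_lag_master(dev))
		return mlx5_lag_is_shared_fdb(priv->mdev) && mlx5_lag_is_master(priv->mdev);
	return true;
}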

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../mellanox/mlx5/core/en/rep/bridge.c        | 125 ++++++++++++------
 .../ethernet/mellanox/mlx5/core/esw/bridge.c  |  79 +++++++++--
 .../ethernet/mellanox/mlx5/core/esw/bridge.h  |   3 +
 3 files changed, 159 insertions(+), 48 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
index 2e6f2bce9083..6590ce5325e7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
@@ -41,46 +41,88 @@ static bool mlx5_esw_bridge_dev_same_hw(struct net_device *dev, struct mlx5_eswi
 	return system_guid == esw_system_guid;
 }
 
-static int mlx5_esw_bridge_vport_num_vhca_id_get(struct net_device *dev, struct mlx5_eswitch *esw,
-						 u16 *vport_num, u16 *esw_owner_vhca_id)
+static struct net_device *
+mlx5_esw_bridge_lag_rep_get(struct net_device *dev, struct mlx5_eswitch *esw)
+{
+	struct net_device *lower;
+	struct list_head *iter;
+
+	netdev_for_each_lower_dev(dev, lower, iter) {
+		struct mlx5_core_dev *mdev;
+		struct mlx5e_priv *priv;
+
+		if (!mlx5e_eswitch_rep(lower))
+			continue;
+
+		priv = netdev_priv(lower);
+		mdev = priv->mdev;
+		if (mlx5_lag_is_shared_fdb(mdev) && mlx5_esw_bridge_dev_same_esw(lower, esw))
+			return lower;
+	}
+
+	return NULL;
+}
+
+static struct net_device *
+mlx5_esw_bridge_rep_vport_num_vhca_id_get(struct net_device *dev, struct mlx5_eswitch *esw,
+					  u16 *vport_num, u16 *esw_owner_vhca_id)
 {
 	struct mlx5e_rep_priv *rpriv;
 	struct mlx5e_priv *priv;
 
-	if (!mlx5e_eswitch_rep(dev) || !mlx5_esw_bridge_dev_same_hw(dev, esw))
-		return -ENODEV;
+	if (netif_is_lag_master(dev))
+		dev = mlx5_esw_bridge_lag_rep_get(dev, esw);
+
+	if (!dev || !mlx5e_eswitch_rep(dev) || !mlx5_esw_bridge_dev_same_hw(dev, esw))
+		return NULL;
 
 	priv = netdev_priv(dev);
 	rpriv = priv->ppriv;
 	*vport_num = rpriv->rep->vport;
 	*esw_owner_vhca_id = MLX5_CAP_GEN(priv->mdev, vhca_id);
-	return 0;
+	return dev;
 }
 
-static int
+static struct net_device *
 mlx5_esw_bridge_lower_rep_vport_num_vhca_id_get(struct net_device *dev, struct mlx5_eswitch *esw,
 						u16 *vport_num, u16 *esw_owner_vhca_id)
 {
 	struct net_device *lower_dev;
 	struct list_head *iter;
 
-	if (mlx5e_eswitch_rep(dev))
-		return mlx5_esw_bridge_vport_num_vhca_id_get(dev, esw, vport_num,
-							     esw_owner_vhca_id);
+	if (netif_is_lag_master(dev) || mlx5e_eswitch_rep(dev))
+		return mlx5_esw_bridge_rep_vport_num_vhca_id_get(dev, esw, vport_num,
+								 esw_owner_vhca_id);
 
 	netdev_for_each_lower_dev(dev, lower_dev, iter) {
-		int err;
+		struct net_device *rep;
 
 		if (netif_is_bridge_master(lower_dev))
 			continue;
 
-		err = mlx5_esw_bridge_lower_rep_vport_num_vhca_id_get(lower_dev, esw, vport_num,
+		rep = mlx5_esw_bridge_lower_rep_vport_num_vhca_id_get(lower_dev, esw, vport_num,
 								      esw_owner_vhca_id);
-		if (!err)
-			return 0;
+		if (rep)
+			return rep;
 	}
 
-	return -ENODEV;
+	return NULL;
+}
+
+static bool mlx5_esw_bridge_is_local(struct net_device *dev, struct net_device *rep,
+				     struct mlx5_eswitch *esw)
+{
+	struct mlx5_core_dev *mdev;
+	struct mlx5e_priv *priv;
+
+	if (!mlx5_esw_bridge_dev_same_esw(rep, esw))
+		return false;
+
+	priv = netdev_priv(rep);
+	mdev = priv->mdev;
+	if (netif_is_lag_master(dev))
+		return mlx5_lag_is_shared_fdb(mdev) && mlx5_lag_is_master(mdev);
+	return true;
 }
 
 static int mlx5_esw_bridge_port_changeupper(struct notifier_block *nb, void *ptr)
@@ -90,8 +132,8 @@ static int mlx5_esw_bridge_port_changeupper(struct notifier_block *nb, void *ptr
 								    netdev_nb);
 	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
 	struct netdev_notifier_changeupper_info *info = ptr;
+	struct net_device *upper = info->upper_dev, *rep;
 	struct mlx5_eswitch *esw = br_offloads->esw;
-	struct net_device *upper = info->upper_dev;
 	u16 vport_num, esw_owner_vhca_id;
 	struct netlink_ext_ack *extack;
 	int ifindex = upper->ifindex;
@@ -100,20 +142,19 @@ static int mlx5_esw_bridge_port_changeupper(struct notifier_block *nb, void *ptr
 	if (!netif_is_bridge_master(upper))
 		return 0;
 
-	err = mlx5_esw_bridge_vport_num_vhca_id_get(dev, br_offloads->esw, &vport_num,
-						    &esw_owner_vhca_id);
-	if (err)
+	rep = mlx5_esw_bridge_rep_vport_num_vhca_id_get(dev, esw, &vport_num, &esw_owner_vhca_id);
+	if (!rep)
 		return 0;
 
 	extack = netdev_notifier_info_to_extack(&info->info);
 
-	if (mlx5_esw_bridge_dev_same_esw(dev, esw))
+	if (mlx5_esw_bridge_is_local(dev, rep, esw))
 		err = info->linking ?
 			mlx5_esw_bridge_vport_link(ifindex, vport_num, esw_owner_vhca_id,
 						   br_offloads, extack) :
 			mlx5_esw_bridge_vport_unlink(ifindex, vport_num, esw_owner_vhca_id,
 						     br_offloads, extack);
-	else if (mlx5_esw_bridge_dev_same_hw(dev, esw))
+	else if (mlx5_esw_bridge_dev_same_hw(rep, esw))
 		err = info->linking ?
 			mlx5_esw_bridge_vport_peer_link(ifindex, vport_num, esw_owner_vhca_id,
 							br_offloads, extack) :
@@ -151,9 +192,8 @@ mlx5_esw_bridge_port_obj_add(struct net_device *dev,
 	u16 vport_num, esw_owner_vhca_id;
 	int err;
 
-	err = mlx5_esw_bridge_vport_num_vhca_id_get(dev, br_offloads->esw, &vport_num,
-						    &esw_owner_vhca_id);
-	if (err)
+	if (!mlx5_esw_bridge_rep_vport_num_vhca_id_get(dev, br_offloads->esw, &vport_num,
+						       &esw_owner_vhca_id))
 		return 0;
 
 	port_obj_info->handled = true;
@@ -178,11 +218,9 @@ mlx5_esw_bridge_port_obj_del(struct net_device *dev,
 	const struct switchdev_obj *obj = port_obj_info->obj;
 	const struct switchdev_obj_port_vlan *vlan;
 	u16 vport_num, esw_owner_vhca_id;
-	int err;
 
-	err = mlx5_esw_bridge_vport_num_vhca_id_get(dev, br_offloads->esw, &vport_num,
-						    &esw_owner_vhca_id);
-	if (err)
+	if (!mlx5_esw_bridge_rep_vport_num_vhca_id_get(dev, br_offloads->esw, &vport_num,
+						       &esw_owner_vhca_id))
 		return 0;
 
 	port_obj_info->handled = true;
@@ -208,9 +246,8 @@ mlx5_esw_bridge_port_obj_attr_set(struct net_device *dev,
 	u16 vport_num, esw_owner_vhca_id;
 	int err;
 
-	err = mlx5_esw_bridge_lower_rep_vport_num_vhca_id_get(dev, br_offloads->esw, &vport_num,
-							      &esw_owner_vhca_id);
-	if (err)
+	if (!mlx5_esw_bridge_lower_rep_vport_num_vhca_id_get(dev, br_offloads->esw, &vport_num,
+							     &esw_owner_vhca_id))
 		return 0;
 
 	port_attr_info->handled = true;
@@ -284,14 +321,12 @@ static void mlx5_esw_bridge_switchdev_fdb_event_work(struct work_struct *work)
 	struct net_device *dev = fdb_work->dev;
 	u16 vport_num, esw_owner_vhca_id;
 	struct mlx5e_priv *priv;
-	int err;
 
 	rtnl_lock();
 
 	priv = netdev_priv(dev);
-	err = mlx5_esw_bridge_vport_num_vhca_id_get(dev, br_offloads->esw, &vport_num,
-						    &esw_owner_vhca_id);
-	if (err)
+	if (!mlx5_esw_bridge_rep_vport_num_vhca_id_get(dev, br_offloads->esw, &vport_num,
+						       &esw_owner_vhca_id))
 		goto out;
 
 	if (fdb_work->add)
@@ -345,8 +380,10 @@ static int mlx5_esw_bridge_switchdev_event(struct notifier_block *nb,
 	struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
 	struct switchdev_notifier_fdb_info *fdb_info;
 	struct mlx5_bridge_switchdev_fdb_work *work;
+	struct mlx5_eswitch *esw = br_offloads->esw;
 	struct switchdev_notifier_info *info = ptr;
-	struct net_device *upper;
+	u16 vport_num, esw_owner_vhca_id;
+	struct net_device *upper, *rep;
 
 	if (event == SWITCHDEV_PORT_ATTR_SET) {
 		int err = mlx5_esw_bridge_port_obj_attr_set(dev, ptr, br_offloads);
@@ -360,13 +397,25 @@ static int mlx5_esw_bridge_switchdev_event(struct notifier_block *nb,
 	if (!netif_is_bridge_master(upper))
 		return NOTIFY_DONE;
 
-	if (!mlx5e_eswitch_rep(dev))
+	rep = mlx5_esw_bridge_rep_vport_num_vhca_id_get(dev, esw, &vport_num, &esw_owner_vhca_id);
+	if (!rep)
 		return NOTIFY_DONE;
 
 	switch (event) {
+	case SWITCHDEV_FDB_ADD_TO_BRIDGE:
+		/* only handle the event on native eswitch of representor */
+		if (!mlx5_esw_bridge_is_local(dev, rep, esw))
+			break;
+
+		fdb_info = container_of(info,
+					struct switchdev_notifier_fdb_info,
+					info);
+		mlx5_esw_bridge_fdb_update_used(dev, vport_num, esw_owner_vhca_id, br_offloads,
+						fdb_info);
+		break;
 	case SWITCHDEV_FDB_DEL_TO_BRIDGE:
-		/* only handle the event when source is on another eswitch */
-		if (mlx5_esw_bridge_dev_same_esw(dev, br_offloads->esw))
+		/* only handle the event on peers */
+		if (mlx5_esw_bridge_is_local(dev, rep, esw))
 			break;
 		fallthrough;
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
index 20d44b0ae337..7e221038df8d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
@@ -5,6 +5,7 @@
 #include <linux/notifier.h>
 #include <net/netevent.h>
 #include <net/switchdev.h>
+#include "lib/devcom.h"
 #include "bridge.h"
 #include "eswitch.h"
 #include "bridge_priv.h"
@@ -408,9 +409,10 @@ mlx5_esw_bridge_egress_table_cleanup(struct mlx5_esw_bridge *bridge)
 }
 
 static struct mlx5_flow_handle *
-mlx5_esw_bridge_ingress_flow_create(u16 vport_num, const unsigned char *addr,
-				    struct mlx5_esw_bridge_vlan *vlan, u32 counter_id,
-				    struct mlx5_esw_bridge *bridge)
+mlx5_esw_bridge_ingress_flow_with_esw_create(u16 vport_num, const unsigned char *addr,
+					     struct mlx5_esw_bridge_vlan *vlan, u32 counter_id,
+					     struct mlx5_esw_bridge *bridge,
+					     struct mlx5_eswitch *esw)
 {
 	struct mlx5_esw_bridge_offloads *br_offloads = bridge->br_offloads;
 	struct mlx5_flow_act flow_act = {
@@ -438,7 +440,7 @@ mlx5_esw_bridge_ingress_flow_create(u16 vport_num, const unsigned char *addr,
 	MLX5_SET(fte_match_param, rule_spec->match_criteria,
 		 misc_parameters_2.metadata_reg_c_0, mlx5_eswitch_get_vport_metadata_mask());
 	MLX5_SET(fte_match_param, rule_spec->match_value, misc_parameters_2.metadata_reg_c_0,
-		 mlx5_eswitch_get_vport_metadata_for_match(br_offloads->esw, vport_num));
+		 mlx5_eswitch_get_vport_metadata_for_match(esw, vport_num));
 
 	if (vlan && vlan->pkt_reformat_push) {
 		flow_act.action |= MLX5_FLOW_CONTEXT_ACTION_PACKET_REFORMAT;
@@ -466,6 +468,35 @@ mlx5_esw_bridge_ingress_flow_create(u16 vport_num, const unsigned char *addr,
 	return handle;
 }
 
+static struct mlx5_flow_handle *
+mlx5_esw_bridge_ingress_flow_create(u16 vport_num, const unsigned char *addr,
+				    struct mlx5_esw_bridge_vlan *vlan, u32 counter_id,
+				    struct mlx5_esw_bridge *bridge)
+{
+	return mlx5_esw_bridge_ingress_flow_with_esw_create(vport_num, addr, vlan, counter_id,
+							    bridge, bridge->br_offloads->esw);
+}
+
+static struct mlx5_flow_handle *
+mlx5_esw_bridge_ingress_flow_peer_create(u16 vport_num, const unsigned char *addr,
+					 struct mlx5_esw_bridge_vlan *vlan, u32 counter_id,
+					 struct mlx5_esw_bridge *bridge)
+{
+	struct mlx5_devcom *devcom = bridge->br_offloads->esw->dev->priv.devcom;
+	static struct mlx5_flow_handle *handle;
+	struct mlx5_eswitch *peer_esw;
+
+	peer_esw = mlx5_devcom_get_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+	if (!peer_esw)
+		return ERR_PTR(-ENODEV);
+
+	handle = mlx5_esw_bridge_ingress_flow_with_esw_create(vport_num, addr, vlan, counter_id,
+							      bridge, peer_esw);
+
+	mlx5_devcom_release_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+	return handle;
+}
+
 static struct mlx5_flow_handle *
 mlx5_esw_bridge_ingress_filter_flow_create(u16 vport_num, const unsigned char *addr,
 					   struct mlx5_esw_bridge *bridge)
@@ -679,12 +710,10 @@ static void mlx5_esw_bridge_port_erase(struct mlx5_esw_bridge_port *port,
 	xa_erase(&br_offloads->ports, mlx5_esw_bridge_port_key(port));
 }
 
-static void mlx5_esw_bridge_fdb_entry_refresh(unsigned long lastuse,
-					      struct mlx5_esw_bridge_fdb_entry *entry)
+static void mlx5_esw_bridge_fdb_entry_refresh(struct mlx5_esw_bridge_fdb_entry *entry)
 {
 	trace_mlx5_esw_bridge_fdb_entry_refresh(entry);
 
-	entry->lastuse = lastuse;
 	mlx5_esw_bridge_fdb_offload_notify(entry->dev, entry->key.addr,
 					   entry->key.vid,
 					   SWITCHDEV_FDB_ADD_TO_BRIDGE);
@@ -959,8 +988,11 @@ mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, u16 esw_ow
 	}
 	entry->ingress_counter = counter;
 
-	handle = mlx5_esw_bridge_ingress_flow_create(vport_num, addr, vlan, mlx5_fc_id(counter),
-						     bridge);
+	handle = peer ?
+		mlx5_esw_bridge_ingress_flow_peer_create(vport_num, addr, vlan,
+							 mlx5_fc_id(counter), bridge) :
+		mlx5_esw_bridge_ingress_flow_create(vport_num, addr, vlan,
+						    mlx5_fc_id(counter), bridge);
 	if (IS_ERR(handle)) {
 		err = PTR_ERR(handle);
 		esw_warn(esw->dev, "Failed to create ingress flow(vport=%u,err=%d)\n",
@@ -1228,6 +1260,33 @@ void mlx5_esw_bridge_port_vlan_del(u16 vport_num, u16 esw_owner_vhca_id, u16 vid
 	mlx5_esw_bridge_vlan_cleanup(port, vlan, port->bridge);
 }
 
+void mlx5_esw_bridge_fdb_update_used(struct net_device *dev, u16 vport_num, u16 esw_owner_vhca_id,
+				     struct mlx5_esw_bridge_offloads *br_offloads,
+				     struct switchdev_notifier_fdb_info *fdb_info)
+{
+	struct mlx5_esw_bridge_fdb_entry *entry;
+	struct mlx5_esw_bridge_fdb_key key;
+	struct mlx5_esw_bridge_port *port;
+	struct mlx5_esw_bridge *bridge;
+
+	port = mlx5_esw_bridge_port_lookup(vport_num, esw_owner_vhca_id, br_offloads);
+	if (!port || port->flags & MLX5_ESW_BRIDGE_PORT_FLAG_PEER)
+		return;
+
+	bridge = port->bridge;
+	ether_addr_copy(key.addr, fdb_info->addr);
+	key.vid = fdb_info->vid;
+	entry = rhashtable_lookup_fast(&bridge->fdb_ht, &key, fdb_ht_params);
+	if (!entry) {
+		esw_debug(br_offloads->esw->dev,
+			  "FDB entry with specified key not found (MAC=%pM,vid=%u,vport=%u)\n",
+			  key.addr, key.vid, vport_num);
+		return;
+	}
+
+	entry->lastuse = jiffies;
+}
+
 void mlx5_esw_bridge_fdb_create(struct net_device *dev, u16 vport_num, u16 esw_owner_vhca_id,
 				struct mlx5_esw_bridge_offloads *br_offloads,
 				struct switchdev_notifier_fdb_info *fdb_info)
@@ -1300,7 +1359,7 @@ void mlx5_esw_bridge_update(struct mlx5_esw_bridge_offloads *br_offloads)
 				continue;
 
 			if (time_after(lastuse, entry->lastuse)) {
-				mlx5_esw_bridge_fdb_entry_refresh(lastuse, entry);
+				mlx5_esw_bridge_fdb_entry_refresh(entry);
 			} else if (!(entry->flags & MLX5_ESW_BRIDGE_FLAG_PEER) &&
 				   time_is_before_jiffies(entry->lastuse + bridge->ageing_time)) {
 				mlx5_esw_bridge_fdb_del_notify(entry);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
index a4f04f3f5b11..efc39975226e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
@@ -46,6 +46,9 @@ int mlx5_esw_bridge_vport_peer_link(int ifindex, u16 vport_num, u16 esw_owner_vh
 int mlx5_esw_bridge_vport_peer_unlink(int ifindex, u16 vport_num, u16 esw_owner_vhca_id,
 				      struct mlx5_esw_bridge_offloads *br_offloads,
 				      struct netlink_ext_ack *extack);
+void mlx5_esw_bridge_fdb_update_used(struct net_device *dev, u16 vport_num, u16 esw_owner_vhca_id,
+				     struct mlx5_esw_bridge_offloads *br_offloads,
+				     struct switchdev_notifier_fdb_info *fdb_info);
 void mlx5_esw_bridge_fdb_create(struct net_device *dev, u16 vport_num, u16 esw_owner_vhca_id,
 				struct mlx5_esw_bridge_offloads *br_offloads,
 				struct switchdev_notifier_fdb_info *fdb_info);
-- 
2.31.1



* Re: [net-next 16/17] net/mlx5: Bridge, allow merged eswitch connectivity
  2021-08-16 21:18 ` [net-next 16/17] net/mlx5: Bridge, allow merged eswitch connectivity Saeed Mahameed
@ 2021-08-16 22:38   ` Jakub Kicinski
  2021-08-16 23:23     ` Saeed Mahameed
  0 siblings, 1 reply; 20+ messages in thread
From: Jakub Kicinski @ 2021-08-16 22:38 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: David S. Miller, netdev, Tariq Toukan, Leon Romanovsky,
	Vlad Buslov, Roi Dayan, Mark Bloch, Saeed Mahameed

On Mon, 16 Aug 2021 14:18:46 -0700 Saeed Mahameed wrote:
> From: Vlad Buslov <vladbu@nvidia.com>
> 
> Allow connectivity between representors of different eswitch instances that
> are attached to same bridge when merged_eswitch capability is enabled. Add
> ports of peer eswitch to bridge instance and mark them with
> MLX5_ESW_BRIDGE_PORT_FLAG_PEER. Mark FDBs offloaded on peer ports with
> MLX5_ESW_BRIDGE_FLAG_PEER flag. Such FDBs can only be aged out on their
> local eswitch instance, which then sends SWITCHDEV_FDB_DEL_TO_BRIDGE event.
> Listen to the event on mlx5 bridge implementation and delete peer FDBs in
> event handler.
> 
> Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
> Reviewed-by: Roi Dayan <roid@nvidia.com>
> Reviewed-by: Mark Bloch <mbloch@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>


drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c: In function ‘mlx5_esw_bridge_switchdev_fdb_event_work’:
drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c:286:21: warning: variable ‘priv’ set but not used [-Wunused-but-set-variable]
  286 |  struct mlx5e_priv *priv;
      |                     ^~~~


* Re: [net-next 16/17] net/mlx5: Bridge, allow merged eswitch connectivity
  2021-08-16 22:38   ` Jakub Kicinski
@ 2021-08-16 23:23     ` Saeed Mahameed
  0 siblings, 0 replies; 20+ messages in thread
From: Saeed Mahameed @ 2021-08-16 23:23 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S. Miller, netdev, Tariq Toukan, Leon Romanovsky,
	Vlad Buslov, Roi Dayan, Mark Bloch

On Mon, 2021-08-16 at 15:38 -0700, Jakub Kicinski wrote:
> On Mon, 16 Aug 2021 14:18:46 -0700 Saeed Mahameed wrote:
> > From: Vlad Buslov <vladbu@nvidia.com>
> > 
> > Allow connectivity between representors of different eswitch
> > instances that
> > are attached to same bridge when merged_eswitch capability is
> > enabled. Add
> > ports of peer eswitch to bridge instance and mark them with
> > MLX5_ESW_BRIDGE_PORT_FLAG_PEER. Mark FDBs offloaded on peer ports
> > with
> > MLX5_ESW_BRIDGE_FLAG_PEER flag. Such FDBs can only be aged out on
> > their
> > local eswitch instance, which then sends
> > SWITCHDEV_FDB_DEL_TO_BRIDGE event.
> > Listen to the event on mlx5 bridge implementation and delete peer
> > FDBs in
> > event handler.
> > 
> > Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
> > Reviewed-by: Roi Dayan <roid@nvidia.com>
> > Reviewed-by: Mark Bloch <mbloch@nvidia.com>
> > Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> 
> 
> drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c: In function
> ‘mlx5_esw_bridge_switchdev_fdb_event_work’:
> drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c:286:21:
> warning: variable ‘priv’ set but not used [-Wunused-but-set-variable]
>   286 |  struct mlx5e_priv *priv;
>       |                     ^~~~

Missing kconfig in our CI.
Thanks for the report, handled in V2.
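
For reference, it boils down to dropping the now-unused local; rough
sketch only, not the actual v2 hunk:

 	struct net_device *dev = fdb_work->dev;
 	u16 vport_num, esw_owner_vhca_id;
-	struct mlx5e_priv *priv;
 
 	rtnl_lock();
 
-	priv = netdev_priv(dev);
 	if (!mlx5_esw_bridge_rep_vport_num_vhca_id_get(dev, br_offloads->esw, &vport_num,
 						       &esw_owner_vhca_id))
 		goto out;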






Thread overview: 20+ messages
2021-08-16 21:18 [pull request][net-next 00/17] mlx5 updates 2021-08-16 Saeed Mahameed
2021-08-16 21:18 ` [net-next 01/17] net/mlx5e: Do not try enable RSS when resetting indir table Saeed Mahameed
2021-08-16 21:18 ` [net-next 02/17] net/mlx5e: Introduce TIR create/destroy API in rx_res Saeed Mahameed
2021-08-16 21:18 ` [net-next 03/17] net/mlx5e: Introduce abstraction of RSS context Saeed Mahameed
2021-08-16 21:18 ` [net-next 04/17] net/mlx5e: Convert RSS to a dedicated object Saeed Mahameed
2021-08-16 21:18 ` [net-next 05/17] net/mlx5e: Dynamically allocate TIRs in RSS contexts Saeed Mahameed
2021-08-16 21:18 ` [net-next 06/17] net/mlx5e: Support multiple " Saeed Mahameed
2021-08-16 21:18 ` [net-next 07/17] net/mlx5e: Support flow classification into " Saeed Mahameed
2021-08-16 21:18 ` [net-next 08/17] net/mlx5e: Abstract MQPRIO params Saeed Mahameed
2021-08-16 21:18 ` [net-next 09/17] net/mlx5e: Maintain MQPRIO mode parameter Saeed Mahameed
2021-08-16 21:18 ` [net-next 10/17] net/mlx5e: Handle errors of netdev_set_num_tc() Saeed Mahameed
2021-08-16 21:18 ` [net-next 11/17] net/mlx5e: Support MQPRIO channel mode Saeed Mahameed
2021-08-16 21:18 ` [net-next 12/17] net/mlx5: Bridge, release bridge in same function where it is taken Saeed Mahameed
2021-08-16 21:18 ` [net-next 13/17] net/mlx5: Bridge, obtain core device from eswitch instead of priv Saeed Mahameed
2021-08-16 21:18 ` [net-next 14/17] net/mlx5: Bridge, identify port by vport_num+esw_owner_vhca_id pair Saeed Mahameed
2021-08-16 21:18 ` [net-next 15/17] net/mlx5: Bridge, extract FDB delete notification to function Saeed Mahameed
2021-08-16 21:18 ` [net-next 16/17] net/mlx5: Bridge, allow merged eswitch connectivity Saeed Mahameed
2021-08-16 22:38   ` Jakub Kicinski
2021-08-16 23:23     ` Saeed Mahameed
2021-08-16 21:18 ` [net-next 17/17] net/mlx5: Bridge, support LAG Saeed Mahameed
