netdev.vger.kernel.org archive mirror
* [pull request][net 0/7] mlx5 fixes 2022-05-31
@ 2022-05-31 20:54 Saeed Mahameed
  2022-05-31 20:54 ` [net 1/7] net/mlx5: Don't use already freed action pointer Saeed Mahameed
                   ` (6 more replies)
  0 siblings, 7 replies; 9+ messages in thread
From: Saeed Mahameed @ 2022-05-31 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni; +Cc: netdev, Saeed Mahameed

From: Saeed Mahameed <saeedm@nvidia.com>

This series provides bug fixes to the mlx5 driver.
Please pull and let me know if there is any problem.

Thanks,
Saeed.


The following changes since commit 09e545f7381459c015b6fa0cd0ac6f010ef8cc25:

  xen/netback: fix incorrect usage of RING_HAS_UNCONSUMED_REQUESTS() (2022-05-31 12:22:22 +0200)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-fixes-2022-05-31

for you to fetch changes up to 1c5de097bea31760c3f0467ac0c84ba0dc3525d5:

  net/mlx5: Fix mlx5_get_next_dev() peer device matching (2022-05-31 13:40:55 -0700)

----------------------------------------------------------------
mlx5-fixes-2022-05-31

----------------------------------------------------------------
Changcheng Liu (1):
      net/mlx5: correct ECE offset in query qp output

Leon Romanovsky (1):
      net/mlx5: Don't use already freed action pointer

Maor Dickman (1):
      net/mlx5e: TC NIC mode, fix tc chains miss table

Maxim Mikityanskiy (2):
      net/mlx5e: Disable softirq in mlx5e_activate_rq to avoid race condition
      net/mlx5e: Update netdev features after changing XDP state

Paul Blakey (1):
      net/mlx5: CT: Fix header-rewrite re-use for tupels

Saeed Mahameed (1):
      net/mlx5: Fix mlx5_get_next_dev() peer device matching

 drivers/net/ethernet/mellanox/mlx5/core/dev.c      | 34 ++++++++++++-------
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |  4 +++
 drivers/net/ethernet/mellanox/mlx5/core/en/fs.h    |  2 ++
 drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c   |  1 +
 .../ethernet/mellanox/mlx5/core/en/reporter_rx.c   |  6 ++++
 drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c | 19 ++++++-----
 drivers/net/ethernet/mellanox/mlx5/core/en/trap.c  |  1 +
 .../net/ethernet/mellanox/mlx5/core/en/xsk/pool.c  |  1 +
 .../net/ethernet/mellanox/mlx5/core/en/xsk/setup.c |  5 +--
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 29 +++++++++++++----
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c    | 38 ++++++++++++++++++++--
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c  |  2 +-
 .../ethernet/mellanox/mlx5/core/steering/fs_dr.c   |  9 +++--
 include/linux/mlx5/mlx5_ifc.h                      |  5 ++-
 14 files changed, 115 insertions(+), 41 deletions(-)


* [net 1/7] net/mlx5: Don't use already freed action pointer
  2022-05-31 20:54 [pull request][net 0/7] mlx5 fixes 2022-05-31 Saeed Mahameed
@ 2022-05-31 20:54 ` Saeed Mahameed
  2022-06-02  1:20   ` patchwork-bot+netdevbpf
  2022-05-31 20:54 ` [net 2/7] net/mlx5e: TC NIC mode, fix tc chains miss table Saeed Mahameed
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 9+ messages in thread
From: Saeed Mahameed @ 2022-05-31 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Leon Romanovsky, Dan Carpenter, Saeed Mahameed

From: Leon Romanovsky <leonro@nvidia.com>

The call to mlx5dr_action_destroy() releases "action" memory. That
pointer is set to miss_action later and generates the following smatch
error:

 drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c:53 set_miss_action()
 warn: 'action' was already freed.

Make sure that the pointer is always valid by setting NULL after destroy.
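
For illustration only, a minimal user-space sketch of the pattern described
above. The types and helpers here (struct action, struct table,
action_destroy) are hypothetical stand-ins and not the mlx5 API; the point
is just that the local pointer is cleared whether or not destroy reports an
error, so a freed pointer is never stored.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver objects; not the mlx5 API. */
struct action { int id; };
struct table { struct action *miss_action; };

/* Frees the action; can be told to report an error as well. */
static int action_destroy(struct action *a, int simulate_error)
{
	free(a);
	return simulate_error ? -1 : 0;
}

/*
 * Same shape as the fixed error path: if installing the action failed,
 * destroy it and clear the pointer unconditionally before it is stored.
 */
static int set_miss_action(struct table *t, struct action *a,
			   int install_err, int destroy_err)
{
	int err = install_err;

	assert(t != NULL);
	if (err && a) {
		err = action_destroy(a, destroy_err);
		a = NULL; /* the memory is gone either way */
	}
	t->miss_action = a;
	return err;
}
```

With the pre-fix ordering (clearing the pointer only inside the error
branch of destroy), a successful destroy would leave the table holding a
dangling pointer.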

Fixes: 6a48faeeca10 ("net/mlx5: Add direct rule fs_cmd implementation")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
index 728f81882589..6a9abba92df6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
@@ -44,11 +44,10 @@ static int set_miss_action(struct mlx5_flow_root_namespace *ns,
 	err = mlx5dr_table_set_miss_action(ft->fs_dr_table.dr_table, action);
 	if (err && action) {
 		err = mlx5dr_action_destroy(action);
-		if (err) {
-			action = NULL;
-			mlx5_core_err(ns->dev, "Failed to destroy action (%d)\n",
-				      err);
-		}
+		if (err)
+			mlx5_core_err(ns->dev,
+				      "Failed to destroy action (%d)\n", err);
+		action = NULL;
 	}
 	ft->fs_dr_table.miss_action = action;
 	if (old_miss_action) {
-- 
2.36.1



* [net 2/7] net/mlx5e: TC NIC mode, fix tc chains miss table
  2022-05-31 20:54 [pull request][net 0/7] mlx5 fixes 2022-05-31 Saeed Mahameed
  2022-05-31 20:54 ` [net 1/7] net/mlx5: Don't use already freed action pointer Saeed Mahameed
@ 2022-05-31 20:54 ` Saeed Mahameed
  2022-05-31 20:54 ` [net 3/7] net/mlx5: CT: Fix header-rewrite re-use for tupels Saeed Mahameed
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Saeed Mahameed @ 2022-05-31 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Maor Dickman, Paul Blakey, Ariel Levkovich, Saeed Mahameed

From: Maor Dickman <maord@nvidia.com>

The cited commit changed the promisc table to be created on demand with the
highest priority in the NIC table, replacing the vlan table. This caused
the tc NIC tables' miss flow to skip the promisc table, because it uses the
vlan table as its miss table.

OVS offload in NIC mode uses promisc by default, so any unicast packet
that is handled by the tc NIC tables' miss flow skips the promisc
rule and is dropped.

Fix this by adding a new empty table in a new tc level with low priority and
pointing the NIC tc chain miss to it. The new table is managed, so it
points to the vlan table if promisc is disabled and to the promisc table if
it is enabled.

Fixes: 1c46d7409f30 ("net/mlx5e: Optimize promiscuous mode")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/en/fs.h   |  2 +
 .../net/ethernet/mellanox/mlx5/core/en_tc.c   | 38 ++++++++++++++++++-
 .../net/ethernet/mellanox/mlx5/core/fs_core.c |  2 +-
 3 files changed, 39 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h b/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h
index 4130a871de61..6e3a90a959e9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h
@@ -12,6 +12,7 @@ struct mlx5e_post_act;
 enum {
 	MLX5E_TC_FT_LEVEL = 0,
 	MLX5E_TC_TTC_FT_LEVEL,
+	MLX5E_TC_MISS_LEVEL,
 };
 
 struct mlx5e_tc_table {
@@ -20,6 +21,7 @@ struct mlx5e_tc_table {
 	 */
 	struct mutex			t_lock;
 	struct mlx5_flow_table		*t;
+	struct mlx5_flow_table		*miss_t;
 	struct mlx5_fs_chains           *chains;
 	struct mlx5e_post_act		*post_act;
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 49dea02a12d2..34bf11cdf90f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -4714,6 +4714,33 @@ static int mlx5e_tc_nic_get_ft_size(struct mlx5_core_dev *dev)
 	return tc_tbl_size;
 }
 
+static int mlx5e_tc_nic_create_miss_table(struct mlx5e_priv *priv)
+{
+	struct mlx5_flow_table **ft = &priv->fs.tc.miss_t;
+	struct mlx5_flow_table_attr ft_attr = {};
+	struct mlx5_flow_namespace *ns;
+	int err = 0;
+
+	ft_attr.max_fte = 1;
+	ft_attr.autogroup.max_num_groups = 1;
+	ft_attr.level = MLX5E_TC_MISS_LEVEL;
+	ft_attr.prio = 0;
+	ns = mlx5_get_flow_namespace(priv->mdev, MLX5_FLOW_NAMESPACE_KERNEL);
+
+	*ft = mlx5_create_auto_grouped_flow_table(ns, &ft_attr);
+	if (IS_ERR(*ft)) {
+		err = PTR_ERR(*ft);
+		netdev_err(priv->netdev, "failed to create tc nic miss table err=%d\n", err);
+	}
+
+	return err;
+}
+
+static void mlx5e_tc_nic_destroy_miss_table(struct mlx5e_priv *priv)
+{
+	mlx5_destroy_flow_table(priv->fs.tc.miss_t);
+}
+
 int mlx5e_tc_nic_init(struct mlx5e_priv *priv)
 {
 	struct mlx5e_tc_table *tc = &priv->fs.tc;
@@ -4746,19 +4773,23 @@ int mlx5e_tc_nic_init(struct mlx5e_priv *priv)
 	}
 	tc->mapping = chains_mapping;
 
+	err = mlx5e_tc_nic_create_miss_table(priv);
+	if (err)
+		goto err_chains;
+
 	if (MLX5_CAP_FLOWTABLE_NIC_RX(priv->mdev, ignore_flow_level))
 		attr.flags = MLX5_CHAINS_AND_PRIOS_SUPPORTED |
 			MLX5_CHAINS_IGNORE_FLOW_LEVEL_SUPPORTED;
 	attr.ns = MLX5_FLOW_NAMESPACE_KERNEL;
 	attr.max_ft_sz = mlx5e_tc_nic_get_ft_size(dev);
 	attr.max_grp_num = MLX5E_TC_TABLE_NUM_GROUPS;
-	attr.default_ft = mlx5e_vlan_get_flowtable(priv->fs.vlan);
+	attr.default_ft = priv->fs.tc.miss_t;
 	attr.mapping = chains_mapping;
 
 	tc->chains = mlx5_chains_create(dev, &attr);
 	if (IS_ERR(tc->chains)) {
 		err = PTR_ERR(tc->chains);
-		goto err_chains;
+		goto err_miss;
 	}
 
 	tc->post_act = mlx5e_tc_post_act_init(priv, tc->chains, MLX5_FLOW_NAMESPACE_KERNEL);
@@ -4781,6 +4812,8 @@ int mlx5e_tc_nic_init(struct mlx5e_priv *priv)
 	mlx5_tc_ct_clean(tc->ct);
 	mlx5e_tc_post_act_destroy(tc->post_act);
 	mlx5_chains_destroy(tc->chains);
+err_miss:
+	mlx5e_tc_nic_destroy_miss_table(priv);
 err_chains:
 	mapping_destroy(chains_mapping);
 err_mapping:
@@ -4821,6 +4854,7 @@ void mlx5e_tc_nic_cleanup(struct mlx5e_priv *priv)
 	mlx5e_tc_post_act_destroy(tc->post_act);
 	mapping_destroy(tc->mapping);
 	mlx5_chains_destroy(tc->chains);
+	mlx5e_tc_nic_destroy_miss_table(priv);
 }
 
 int mlx5e_tc_ht_init(struct rhashtable *tc_ht)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index 84caffe4c278..fdcf7f529330 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -114,7 +114,7 @@
 #define KERNEL_MIN_LEVEL (KERNEL_NIC_PRIO_NUM_LEVELS + 1)
 
 #define KERNEL_NIC_TC_NUM_PRIOS  1
-#define KERNEL_NIC_TC_NUM_LEVELS 2
+#define KERNEL_NIC_TC_NUM_LEVELS 3
 
 #define ANCHOR_NUM_LEVELS 1
 #define ANCHOR_NUM_PRIOS 1
-- 
2.36.1



* [net 3/7] net/mlx5: CT: Fix header-rewrite re-use for tupels
  2022-05-31 20:54 [pull request][net 0/7] mlx5 fixes 2022-05-31 Saeed Mahameed
  2022-05-31 20:54 ` [net 1/7] net/mlx5: Don't use already freed action pointer Saeed Mahameed
  2022-05-31 20:54 ` [net 2/7] net/mlx5e: TC NIC mode, fix tc chains miss table Saeed Mahameed
@ 2022-05-31 20:54 ` Saeed Mahameed
  2022-05-31 20:54 ` [net 4/7] net/mlx5e: Disable softirq in mlx5e_activate_rq to avoid race condition Saeed Mahameed
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Saeed Mahameed @ 2022-05-31 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Paul Blakey, Ariel Levkovich, Saeed Mahameed

From: Paul Blakey <paulb@nvidia.com>

Tuple entries that don't have NAT configured for them
but are added to the ct nat table will always create
a new modify header, as we don't check for possible
re-use on them. The same holds for tuples that have NAT configured
for them but are added to the ct table.

Fix the above by skipping the wasteful re-use lookup
only for actually natted entries in the ct nat table.

Fixes: 7fac5c2eced3 ("net/mlx5: CT: Avoid reusing modify header context for natted entries")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../ethernet/mellanox/mlx5/core/en/tc_ct.c    | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
index bceea7a1589e..25f51f80a9b4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
@@ -715,7 +715,7 @@ mlx5_tc_ct_entry_create_mod_hdr(struct mlx5_tc_ct_priv *ct_priv,
 				struct mlx5_flow_attr *attr,
 				struct flow_rule *flow_rule,
 				struct mlx5e_mod_hdr_handle **mh,
-				u8 zone_restore_id, bool nat)
+				u8 zone_restore_id, bool nat_table, bool has_nat)
 {
 	DECLARE_MOD_HDR_ACTS_ACTIONS(actions_arr, MLX5_CT_MIN_MOD_ACTS);
 	DECLARE_MOD_HDR_ACTS(mod_acts, actions_arr);
@@ -731,11 +731,12 @@ mlx5_tc_ct_entry_create_mod_hdr(struct mlx5_tc_ct_priv *ct_priv,
 				     &attr->ct_attr.ct_labels_id);
 	if (err)
 		return -EOPNOTSUPP;
-	if (nat) {
-		err = mlx5_tc_ct_entry_create_nat(ct_priv, flow_rule,
-						  &mod_acts);
-		if (err)
-			goto err_mapping;
+	if (nat_table) {
+		if (has_nat) {
+			err = mlx5_tc_ct_entry_create_nat(ct_priv, flow_rule, &mod_acts);
+			if (err)
+				goto err_mapping;
+		}
 
 		ct_state |= MLX5_CT_STATE_NAT_BIT;
 	}
@@ -750,7 +751,7 @@ mlx5_tc_ct_entry_create_mod_hdr(struct mlx5_tc_ct_priv *ct_priv,
 	if (err)
 		goto err_mapping;
 
-	if (nat) {
+	if (nat_table && has_nat) {
 		attr->modify_hdr = mlx5_modify_header_alloc(ct_priv->dev, ct_priv->ns_type,
 							    mod_acts.num_actions,
 							    mod_acts.actions);
@@ -818,7 +819,9 @@ mlx5_tc_ct_entry_add_rule(struct mlx5_tc_ct_priv *ct_priv,
 
 	err = mlx5_tc_ct_entry_create_mod_hdr(ct_priv, attr, flow_rule,
 					      &zone_rule->mh,
-					      zone_restore_id, nat);
+					      zone_restore_id,
+					      nat,
+					      mlx5_tc_ct_entry_has_nat(entry));
 	if (err) {
 		ct_dbg("Failed to create ct entry mod hdr");
 		goto err_mod_hdr;
-- 
2.36.1



* [net 4/7] net/mlx5e: Disable softirq in mlx5e_activate_rq to avoid race condition
  2022-05-31 20:54 [pull request][net 0/7] mlx5 fixes 2022-05-31 Saeed Mahameed
                   ` (2 preceding siblings ...)
  2022-05-31 20:54 ` [net 3/7] net/mlx5: CT: Fix header-rewrite re-use for tupels Saeed Mahameed
@ 2022-05-31 20:54 ` Saeed Mahameed
  2022-05-31 20:54 ` [net 5/7] net/mlx5: correct ECE offset in query qp output Saeed Mahameed
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Saeed Mahameed @ 2022-05-31 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Maxim Mikityanskiy, Karsten Nielsen, Tariq Toukan,
	Gal Pressman, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@nvidia.com>

When the driver activates the channels, it assumes NAPI isn't running
yet. mlx5e_activate_rq posts a NOP WQE to ICOSQ to trigger a hardware
interrupt and start NAPI, which will run mlx5e_alloc_rx_mpwqe and post
UMR WQEs to ICOSQ to be able to receive packets with striding RQ.

Unfortunately, a race condition is possible if NAPI is triggered by
something else (for example, TX) at a bad timing, before
mlx5e_activate_rq finishes. In this case, mlx5e_alloc_rx_mpwqe may post
UMR WQEs to ICOSQ, and with the bad timing, the wqe_info of the first
UMR may be overwritten by the wqe_info of the NOP posted by
mlx5e_activate_rq.

The consequence is that icosq->db.wqe_info[0].num_wqebbs will be changed
from MLX5E_UMR_WQEBBS to 1, disrupting the integrity of the array-based
linked list in wqe_info[]. mlx5e_poll_ico_cq will hang in an infinite
loop after processing wqe_info[0], because after the corruption, the
next item to be processed will be wqe_info[1], which is filled with
zeros, and `sqcc += wi->num_wqebbs` will never move further.

This commit fixes this race condition by using async_icosq to post the
NOP and trigger the interrupt. async_icosq is always protected with a
spinlock, eliminating the race condition.

Fixes: bc77b240b3c5 ("net/mlx5e: Add fragmented memory support for RX multi packet WQE")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reported-by: Karsten Nielsen <karsten@foo-bar.dk>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  4 ++++
 .../net/ethernet/mellanox/mlx5/core/en/ptp.c  |  1 +
 .../mellanox/mlx5/core/en/reporter_rx.c       |  6 +++++
 .../net/ethernet/mellanox/mlx5/core/en/trap.c |  1 +
 .../ethernet/mellanox/mlx5/core/en/xsk/pool.c |  1 +
 .../mellanox/mlx5/core/en/xsk/setup.c         |  5 +---
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 24 +++++++++++++------
 7 files changed, 31 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 65d3c4865abf..b6c15efe92ad 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -764,6 +764,7 @@ struct mlx5e_rq {
 	u8                     wq_type;
 	u32                    rqn;
 	struct mlx5_core_dev  *mdev;
+	struct mlx5e_channel  *channel;
 	u32  umr_mkey;
 	struct mlx5e_dma_info  wqe_overflow;
 
@@ -1076,6 +1077,9 @@ void mlx5e_close_cq(struct mlx5e_cq *cq);
 int mlx5e_open_locked(struct net_device *netdev);
 int mlx5e_close_locked(struct net_device *netdev);
 
+void mlx5e_trigger_napi_icosq(struct mlx5e_channel *c);
+void mlx5e_trigger_napi_sched(struct napi_struct *napi);
+
 int mlx5e_open_channels(struct mlx5e_priv *priv,
 			struct mlx5e_channels *chs);
 void mlx5e_close_channels(struct mlx5e_channels *chs);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
index 335b20b6383b..047f88f09203 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
@@ -736,6 +736,7 @@ void mlx5e_ptp_activate_channel(struct mlx5e_ptp *c)
 	if (test_bit(MLX5E_PTP_STATE_RX, c->state)) {
 		mlx5e_ptp_rx_set_fs(c->priv);
 		mlx5e_activate_rq(&c->rq);
+		mlx5e_trigger_napi_sched(&c->napi);
 	}
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
index 2684e9da9f41..fc366e66d0b0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
@@ -123,6 +123,8 @@ static int mlx5e_rx_reporter_err_icosq_cqe_recover(void *ctx)
 		xskrq->stats->recover++;
 	}
 
+	mlx5e_trigger_napi_icosq(icosq->channel);
+
 	mutex_unlock(&icosq->channel->icosq_recovery_lock);
 
 	return 0;
@@ -166,6 +168,10 @@ static int mlx5e_rx_reporter_err_rq_cqe_recover(void *ctx)
 	clear_bit(MLX5E_RQ_STATE_RECOVERING, &rq->state);
 	mlx5e_activate_rq(rq);
 	rq->stats->recover++;
+	if (rq->channel)
+		mlx5e_trigger_napi_icosq(rq->channel);
+	else
+		mlx5e_trigger_napi_sched(rq->cq.napi);
 	return 0;
 out:
 	clear_bit(MLX5E_RQ_STATE_RECOVERING, &rq->state);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/trap.c b/drivers/net/ethernet/mellanox/mlx5/core/en/trap.c
index 857840ab1e91..11f2a7fb72a9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/trap.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/trap.c
@@ -179,6 +179,7 @@ static void mlx5e_activate_trap(struct mlx5e_trap *trap)
 {
 	napi_enable(&trap->napi);
 	mlx5e_activate_rq(&trap->rq);
+	mlx5e_trigger_napi_sched(&trap->napi);
 }
 
 void mlx5e_deactivate_trap(struct mlx5e_priv *priv)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/pool.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/pool.c
index 279cd8f4e79f..2c520394aa1d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/pool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/pool.c
@@ -117,6 +117,7 @@ static int mlx5e_xsk_enable_locked(struct mlx5e_priv *priv,
 		goto err_remove_pool;
 
 	mlx5e_activate_xsk(c);
+	mlx5e_trigger_napi_icosq(c);
 
 	/* Don't wait for WQEs, because the newer xdpsock sample doesn't provide
 	 * any Fill Ring entries at the setup stage.
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c
index 3ad7f1301fa8..98ed9ef3a6bd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c
@@ -64,6 +64,7 @@ static int mlx5e_init_xsk_rq(struct mlx5e_channel *c,
 	rq->clock        = &mdev->clock;
 	rq->icosq        = &c->icosq;
 	rq->ix           = c->ix;
+	rq->channel      = c;
 	rq->mdev         = mdev;
 	rq->hw_mtu       = MLX5E_SW2HW_MTU(params, params->sw_mtu);
 	rq->xdpsq        = &c->rq_xdpsq;
@@ -179,10 +180,6 @@ void mlx5e_activate_xsk(struct mlx5e_channel *c)
 	mlx5e_reporter_icosq_resume_recovery(c);
 
 	/* TX queue is created active. */
-
-	spin_lock_bh(&c->async_icosq_lock);
-	mlx5e_trigger_irq(&c->async_icosq);
-	spin_unlock_bh(&c->async_icosq_lock);
 }
 
 void mlx5e_deactivate_xsk(struct mlx5e_channel *c)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 05c015515cce..930a5402c817 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -475,6 +475,7 @@ static int mlx5e_init_rxq_rq(struct mlx5e_channel *c, struct mlx5e_params *param
 	rq->clock        = &mdev->clock;
 	rq->icosq        = &c->icosq;
 	rq->ix           = c->ix;
+	rq->channel      = c;
 	rq->mdev         = mdev;
 	rq->hw_mtu       = MLX5E_SW2HW_MTU(params, params->sw_mtu);
 	rq->xdpsq        = &c->rq_xdpsq;
@@ -1066,13 +1067,6 @@ int mlx5e_open_rq(struct mlx5e_params *params, struct mlx5e_rq_param *param,
 void mlx5e_activate_rq(struct mlx5e_rq *rq)
 {
 	set_bit(MLX5E_RQ_STATE_ENABLED, &rq->state);
-	if (rq->icosq) {
-		mlx5e_trigger_irq(rq->icosq);
-	} else {
-		local_bh_disable();
-		napi_schedule(rq->cq.napi);
-		local_bh_enable();
-	}
 }
 
 void mlx5e_deactivate_rq(struct mlx5e_rq *rq)
@@ -2227,6 +2221,20 @@ static int mlx5e_channel_stats_alloc(struct mlx5e_priv *priv, int ix, int cpu)
 	return 0;
 }
 
+void mlx5e_trigger_napi_icosq(struct mlx5e_channel *c)
+{
+	spin_lock_bh(&c->async_icosq_lock);
+	mlx5e_trigger_irq(&c->async_icosq);
+	spin_unlock_bh(&c->async_icosq_lock);
+}
+
+void mlx5e_trigger_napi_sched(struct napi_struct *napi)
+{
+	local_bh_disable();
+	napi_schedule(napi);
+	local_bh_enable();
+}
+
 static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
 			      struct mlx5e_params *params,
 			      struct mlx5e_channel_param *cparam,
@@ -2308,6 +2316,8 @@ static void mlx5e_activate_channel(struct mlx5e_channel *c)
 
 	if (test_bit(MLX5E_CHANNEL_STATE_XSK, c->state))
 		mlx5e_activate_xsk(c);
+
+	mlx5e_trigger_napi_icosq(c);
 }
 
 static void mlx5e_deactivate_channel(struct mlx5e_channel *c)
-- 
2.36.1



* [net 5/7] net/mlx5: correct ECE offset in query qp output
  2022-05-31 20:54 [pull request][net 0/7] mlx5 fixes 2022-05-31 Saeed Mahameed
                   ` (3 preceding siblings ...)
  2022-05-31 20:54 ` [net 4/7] net/mlx5e: Disable softirq in mlx5e_activate_rq to avoid race condition Saeed Mahameed
@ 2022-05-31 20:54 ` Saeed Mahameed
  2022-05-31 20:54 ` [net 6/7] net/mlx5e: Update netdev features after changing XDP state Saeed Mahameed
  2022-05-31 20:54 ` [net 7/7] net/mlx5: Fix mlx5_get_next_dev() peer device matching Saeed Mahameed
  6 siblings, 0 replies; 9+ messages in thread
From: Saeed Mahameed @ 2022-05-31 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Changcheng Liu, Saeed Mahameed

From: Changcheng Liu <jerrliu@nvidia.com>

The ECE field should be after opt_param_mask in query qp output.
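
A small sketch of the resulting bit offsets, assuming the mlx5_ifc
convention that each u8 array element stands for one bit, so offsetof()
yields a bit offset. The struct names and the "head" field (collapsing the
first 0x40 bits, as implied by the name reserved_at_40) are hypothetical
and only serve the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;

/* Layout before the fix; one array element models one bit. */
struct query_qp_out_old_bits {
	u8 head[0x40];           /* assumed: all fields up to syndrome */
	u8 reserved_at_40[0x20];
	u8 ece[0x20];
	u8 opt_param_mask[0x20];
	u8 reserved_at_a0[0x20];
};

/* Layout after the fix. */
struct query_qp_out_new_bits {
	u8 head[0x40];           /* assumed: all fields up to syndrome */
	u8 reserved_at_40[0x40];
	u8 opt_param_mask[0x20];
	u8 ece[0x20];
};

/* ece moves from bit 0x60 to bit 0xa0; opt_param_mask stays at 0x80. */
static_assert(offsetof(struct query_qp_out_old_bits, ece) == 0x60, "old ece");
static_assert(offsetof(struct query_qp_out_new_bits, ece) == 0xa0, "new ece");
static_assert(offsetof(struct query_qp_out_old_bits, opt_param_mask) ==
	      offsetof(struct query_qp_out_new_bits, opt_param_mask),
	      "opt_param_mask unchanged");
```

Both layouts have the same total size in this model, so whatever follows
the shown fields keeps its offset.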

Fixes: 6b646a7e4af6 ("net/mlx5: Add ability to read and write ECE options")
Signed-off-by: Changcheng Liu <jerrliu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 include/linux/mlx5/mlx5_ifc.h | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 78b3d3465dd7..2cd7d611e7b3 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -5176,12 +5176,11 @@ struct mlx5_ifc_query_qp_out_bits {
 
 	u8         syndrome[0x20];
 
-	u8         reserved_at_40[0x20];
-	u8         ece[0x20];
+	u8         reserved_at_40[0x40];
 
 	u8         opt_param_mask[0x20];
 
-	u8         reserved_at_a0[0x20];
+	u8         ece[0x20];
 
 	struct mlx5_ifc_qpc_bits qpc;
 
-- 
2.36.1



* [net 6/7] net/mlx5e: Update netdev features after changing XDP state
  2022-05-31 20:54 [pull request][net 0/7] mlx5 fixes 2022-05-31 Saeed Mahameed
                   ` (4 preceding siblings ...)
  2022-05-31 20:54 ` [net 5/7] net/mlx5: correct ECE offset in query qp output Saeed Mahameed
@ 2022-05-31 20:54 ` Saeed Mahameed
  2022-05-31 20:54 ` [net 7/7] net/mlx5: Fix mlx5_get_next_dev() peer device matching Saeed Mahameed
  6 siblings, 0 replies; 9+ messages in thread
From: Saeed Mahameed @ 2022-05-31 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@nvidia.com>

Some features (LRO, HW GRO) conflict with XDP. If there is an attempt to
enable such features while XDP is active, they will be set to `off
[requested on]`. In order to activate these features after XDP is turned
off, the driver needs to call netdev_update_features(). This commit adds
this missing call after XDP state changes.

Fixes: cf6e34c8c22f ("net/mlx5e: Properly block LRO when XDP is enabled")
Fixes: b0617e7b3500 ("net/mlx5e: Properly block HW GRO when XDP is enabled")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 930a5402c817..087952b84ccb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4569,6 +4569,11 @@ static int mlx5e_xdp_set(struct net_device *netdev, struct bpf_prog *prog)
 
 unlock:
 	mutex_unlock(&priv->state_lock);
+
+	/* Need to fix some features. */
+	if (!err)
+		netdev_update_features(netdev);
+
 	return err;
 }
 
-- 
2.36.1



* [net 7/7] net/mlx5: Fix mlx5_get_next_dev() peer device matching
  2022-05-31 20:54 [pull request][net 0/7] mlx5 fixes 2022-05-31 Saeed Mahameed
                   ` (5 preceding siblings ...)
  2022-05-31 20:54 ` [net 6/7] net/mlx5e: Update netdev features after changing XDP state Saeed Mahameed
@ 2022-05-31 20:54 ` Saeed Mahameed
  6 siblings, 0 replies; 9+ messages in thread
From: Saeed Mahameed @ 2022-05-31 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Saeed Mahameed, Alexander Lobakin, Maher Sanalla,
	Leon Romanovsky, Mark Bloch

From: Saeed Mahameed <saeedm@nvidia.com>

In some use-cases, mlx5 instances will need to search for their peer
device (the other port on the same HCA). For that, the mlx5 device matching
mechanism relied on auxiliary_find_device() to search, and used a bad
matching callback function.

This approach has two issues:

1) next_phys_dev(), the matching function, assumed all devices are
   of the type mlx5_adev (mlx5 auxiliary device), which is wrong and
   could lead to crashes. This worked for a while, since only lately have
   other drivers started registering auxiliary devices.

2) Using the auxiliary class bus (auxiliary_find_device) to search for
   mlx5_core_dev devices, which are actually PCIe device instances, is wrong.
   This works since mlx5_core always has at least one mlx5_adev instance
   hanging around in the aux bus.

As suggested by others, we can fix 1. by checking whether the device name
prefixes contain the string "mlx5_core", which is not a best practice.
But even with that fixed, 2. still needs fixing: we are trying to
match PCIe device peers, so we should look in the right bus (the PCI bus),
hence this fix.

The fix:
1) Search the PCI bus for mlx5 peer devices, instead of the aux bus.
2) To validate that devices are of the same type "mlx5_core_dev", compare
   whether they have the same driver, which is bulletproof.

   This wouldn't have worked with the aux bus, since the various mlx5 aux
   device types don't share the same driver, even if they share the same
   device wrapper struct (mlx5_adev), which helped to find the parent device.
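
The driver comparison in 2) can be sketched in user space roughly as
follows. struct driver, struct device and get_other_drvdata() below are
simplified, hypothetical stand-ins for the kernel objects; the sketch only
shows that private data is handed out when both devices are bound to the
same driver, and that any other device is rejected.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct driver { const char *name; };
struct device {
	const struct driver *driver;
	void *drvdata;
};

/*
 * Return the other device's private data only if it is bound to the same
 * driver as this one; otherwise its drvdata has an unknown type and must
 * not be interpreted.
 */
static void *get_other_drvdata(const struct device *this,
			       const struct device *other)
{
	assert(this != NULL && other != NULL);
	if (this->driver != other->driver)
		return NULL;

	return other->drvdata;
}
```

A match callback built on such a helper can return 0 (no match) whenever
NULL comes back, which skips unrelated devices on the bus.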

Fixes: a925b5e309c9 ("net/mlx5: Register mlx5 devices to auxiliary virtual bus")
Reported-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reported-by: Maher Sanalla <msanalla@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maher Sanalla <msanalla@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/dev.c | 34 +++++++++++++------
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/dev.c b/drivers/net/ethernet/mellanox/mlx5/core/dev.c
index 11f7c03ae81b..0eb9d74547f8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/dev.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/dev.c
@@ -571,18 +571,32 @@ static int _next_phys_dev(struct mlx5_core_dev *mdev,
 	return 1;
 }
 
+static void *pci_get_other_drvdata(struct device *this, struct device *other)
+{
+	if (this->driver != other->driver)
+		return NULL;
+
+	return pci_get_drvdata(to_pci_dev(other));
+}
+
 static int next_phys_dev(struct device *dev, const void *data)
 {
-	struct mlx5_adev *madev = container_of(dev, struct mlx5_adev, adev.dev);
-	struct mlx5_core_dev *mdev = madev->mdev;
+	struct mlx5_core_dev *mdev, *this = (struct mlx5_core_dev *)data;
+
+	mdev = pci_get_other_drvdata(this->device, dev);
+	if (!mdev)
+		return 0;
 
 	return _next_phys_dev(mdev, data);
 }
 
 static int next_phys_dev_lag(struct device *dev, const void *data)
 {
-	struct mlx5_adev *madev = container_of(dev, struct mlx5_adev, adev.dev);
-	struct mlx5_core_dev *mdev = madev->mdev;
+	struct mlx5_core_dev *mdev, *this = (struct mlx5_core_dev *)data;
+
+	mdev = pci_get_other_drvdata(this->device, dev);
+	if (!mdev)
+		return 0;
 
 	if (!MLX5_CAP_GEN(mdev, vport_group_manager) ||
 	    !MLX5_CAP_GEN(mdev, lag_master) ||
@@ -596,19 +610,17 @@ static int next_phys_dev_lag(struct device *dev, const void *data)
 static struct mlx5_core_dev *mlx5_get_next_dev(struct mlx5_core_dev *dev,
 					       int (*match)(struct device *dev, const void *data))
 {
-	struct auxiliary_device *adev;
-	struct mlx5_adev *madev;
+	struct device *next;
 
 	if (!mlx5_core_is_pf(dev))
 		return NULL;
 
-	adev = auxiliary_find_device(NULL, dev, match);
-	if (!adev)
+	next = bus_find_device(&pci_bus_type, NULL, dev, match);
+	if (!next)
 		return NULL;
 
-	madev = container_of(adev, struct mlx5_adev, adev);
-	put_device(&adev->dev);
-	return madev->mdev;
+	put_device(next);
+	return pci_get_drvdata(to_pci_dev(next));
 }
 
 /* Must be called with intf_mutex held */
-- 
2.36.1



* Re: [net 1/7] net/mlx5: Don't use already freed action pointer
  2022-05-31 20:54 ` [net 1/7] net/mlx5: Don't use already freed action pointer Saeed Mahameed
@ 2022-06-02  1:20   ` patchwork-bot+netdevbpf
  0 siblings, 0 replies; 9+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-06-02  1:20 UTC (permalink / raw)
  To: Saeed Mahameed; +Cc: davem, kuba, pabeni, netdev, leonro, dan.carpenter, saeedm

Hello:

This series was applied to netdev/net.git (master)
by Saeed Mahameed <saeedm@nvidia.com>:

On Tue, 31 May 2022 13:54:41 -0700 you wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
> 
> The call to mlx5dr_action_destroy() releases "action" memory. That
> pointer is set to miss_action later and generates the following smatch
> error:
> 
>  drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c:53 set_miss_action()
>  warn: 'action' was already freed.
> 
> [...]

Here is the summary with links:
  - [net,1/7] net/mlx5: Don't use already freed action pointer
    https://git.kernel.org/netdev/net/c/80b2bd737d0e
  - [net,2/7] net/mlx5e: TC NIC mode, fix tc chains miss table
    https://git.kernel.org/netdev/net/c/66cb64e292d2
  - [net,3/7] net/mlx5: CT: Fix header-rewrite re-use for tupels
    https://git.kernel.org/netdev/net/c/1f2856cde64b
  - [net,4/7] net/mlx5e: Disable softirq in mlx5e_activate_rq to avoid race condition
    https://git.kernel.org/netdev/net/c/2e642afb61b2
  - [net,5/7] net/mlx5: correct ECE offset in query qp output
    https://git.kernel.org/netdev/net/c/3fc2a9e89b35
  - [net,6/7] net/mlx5e: Update netdev features after changing XDP state
    https://git.kernel.org/netdev/net/c/f6279f113ad5
  - [net,7/7] net/mlx5: Fix mlx5_get_next_dev() peer device matching
    https://git.kernel.org/netdev/net/c/1c5de097bea3

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


