* [pull request][net 00/13] mlx5 fixes 2021-11-30
@ 2021-12-01  6:36 Saeed Mahameed
  2021-12-01  6:36 ` [net 01/13] net/mlx5e: IPsec: Fix Software parser inner l3 type setting in case of encapsulation Saeed Mahameed
                   ` (12 more replies)
  0 siblings, 13 replies; 15+ messages in thread
From: Saeed Mahameed @ 2021-12-01  6:36 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski; +Cc: netdev, Saeed Mahameed

From: Saeed Mahameed <saeedm@nvidia.com>

Hi Dave, Hi Jakub,

This series provides bug fixes to the mlx5 driver.
Please pull and let me know if there is any problem.

Thanks,
Saeed.


The following changes since commit b0f38e15979fa8851e88e8aa371367f264e7b6e9:

  natsemi: xtensa: fix section mismatch warnings (2021-11-30 18:13:37 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-fixes-2021-11-30

for you to fetch changes up to 8c8cf0382257b28378eeff535150c087a653ca19:

  net/mlx5e: SHAMPO, Fix constant expression result (2021-11-30 22:35:06 -0800)

----------------------------------------------------------------
mlx5-fixes-2021-11-30

----------------------------------------------------------------
Amir Tzin (1):
      net/mlx5: Fix use after free in mlx5_health_wait_pci_up

Aya Levin (1):
      net/mlx5: Fix access to a non-supported register

Ben Ben-Ishay (1):
      net/mlx5e: SHAMPO, Fix constant expression result

Dmytro Linkin (2):
      net/mlx5: E-switch, Respect BW share of the new group
      net/mlx5: E-Switch, Check group pointer before reading bw_share value

Gal Pressman (1):
      net/mlx5: Fix too early queueing of log timestamp work

Maor Dickman (1):
      net/mlx5: E-Switch, Use indirect table only if all destinations support it

Maor Gottlieb (1):
      net/mlx5: Lag, Fix recreation of VF LAG

Mark Bloch (1):
      net/mlx5: E-Switch, fix single FDB creation on BlueField

Moshe Shemesh (1):
      net/mlx5: Move MODIFY_RQT command to ignore list in internal error state

Raed Salem (2):
      net/mlx5e: IPsec: Fix Software parser inner l3 type setting in case of encapsulation
      net/mlx5e: Fix missing IPsec statistics on uplink representor

Tariq Toukan (1):
      net/mlx5e: Sync TIR params updates against concurrent create/modify

 drivers/net/ethernet/mellanox/mlx5/core/cmd.c      |  2 +-
 .../net/ethernet/mellanox/mlx5/core/en/rx_res.c    | 41 +++++++++++++++++++++-
 .../net/ethernet/mellanox/mlx5/core/en/rx_res.h    |  6 ++--
 .../mellanox/mlx5/core/en_accel/ipsec_rxtx.c       |  2 +-
 .../ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c | 24 +------------
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c   |  4 +++
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    |  8 ++---
 drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c  |  4 +--
 .../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 20 ++++++++---
 drivers/net/ethernet/mellanox/mlx5/core/health.c   |  5 +--
 .../net/ethernet/mellanox/mlx5/core/lag/port_sel.c |  1 +
 drivers/net/ethernet/mellanox/mlx5/core/lib/tout.c |  5 ++-
 drivers/net/ethernet/mellanox/mlx5/core/lib/tout.h |  1 +
 drivers/net/ethernet/mellanox/mlx5/core/main.c     | 30 ++++++++--------
 include/linux/mlx5/mlx5_ifc.h                      |  5 ++-
 15 files changed, 97 insertions(+), 61 deletions(-)


* [net 01/13] net/mlx5e: IPsec: Fix Software parser inner l3 type setting in case of encapsulation
  2021-12-01  6:36 [pull request][net 00/13] mlx5 fixes 2021-11-30 Saeed Mahameed
@ 2021-12-01  6:36 ` Saeed Mahameed
  2021-12-01 15:00   ` patchwork-bot+netdevbpf
  2021-12-01  6:36 ` [net 02/13] net/mlx5e: Fix missing IPsec statistics on uplink representor Saeed Mahameed
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 15+ messages in thread
From: Saeed Mahameed @ 2021-12-01  6:36 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Raed Salem, Maor Dickman, Saeed Mahameed

From: Raed Salem <raeds@nvidia.com>

The current code wrongly uses the skb->protocol field, which reflects the
outer l3 protocol, to set the inner l3 type in the Software Parser (SWP)
fields of the ethernet segment (eseg). In flows where an inner l3 header
exists, such as VXLAN over ESP, this picks the outer protocol type
instead of the inner one, thus breaking cases where the inner and outer
headers have different protocols.

Fix by setting the inner l3 type in SWP according to the inner l3 ip
header version.

Fixes: 2ac9cfe78223 ("net/mlx5e: IPSec, Add Innova IPSec offload TX data path")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c
index fb5397324aa4..2db9573a3fe6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c
@@ -191,7 +191,7 @@ static void mlx5e_ipsec_set_swp(struct sk_buff *skb,
 			eseg->swp_inner_l3_offset = skb_inner_network_offset(skb) / 2;
 			eseg->swp_inner_l4_offset =
 				(skb->csum_start + skb->head - skb->data) / 2;
-			if (skb->protocol == htons(ETH_P_IPV6))
+			if (inner_ip_hdr(skb)->version == 6)
 				eseg->swp_flags |= MLX5_ETH_WQE_SWP_INNER_L3_IPV6;
 			break;
 		default:
-- 
2.31.1
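
The idea behind the fix above can be sketched in userspace as follows.
This is only an illustration, not driver code: inner_l3_is_ipv6() is a
hypothetical helper that inspects the version nibble of the inner IP
header, which is what inner_ip_hdr(skb)->version reads in the patch,
rather than trusting the outer ethertype.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a kernel API): decide whether the IP header
 * starting at hdr is IPv6 by looking at the version nibble, i.e. the
 * upper four bits of the first byte. Both IPv4 and IPv6 headers keep
 * the version there, so the check works without knowing the ethertype.
 */
static bool inner_l3_is_ipv6(const uint8_t *hdr, size_t len)
{
	if (len < 1)
		return false;
	return (hdr[0] >> 4) == 6;
}
```

With an IPv4 outer header and an IPv6 inner header, a check on the outer
protocol would report IPv4, while the helper above reports IPv6 for the
inner header, which is the case the patch addresses.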



* [net 02/13] net/mlx5e: Fix missing IPsec statistics on uplink representor
  2021-12-01  6:36 [pull request][net 00/13] mlx5 fixes 2021-11-30 Saeed Mahameed
  2021-12-01  6:36 ` [net 01/13] net/mlx5e: IPsec: Fix Software parser inner l3 type setting in case of encapsulation Saeed Mahameed
@ 2021-12-01  6:36 ` Saeed Mahameed
  2021-12-01  6:36 ` [net 03/13] net/mlx5e: Sync TIR params updates against concurrent create/modify Saeed Mahameed
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2021-12-01  6:36 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Raed Salem, Alaa Hleihel, Saeed Mahameed

From: Raed Salem <raeds@nvidia.com>

The cited patch added IPsec support to the uplink representor. However,
the uplink representor has its own private statistics groups, and the
IPsec stats are not part of them, which effectively hides the IPsec
stats when the uplink representor stats are queried.

Resolve by adding IPsec stats to uplink representor private statistics.

Fixes: 5589b8f1a2c7 ("net/mlx5e: Add IPsec support to uplink representor")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Alaa Hleihel <alaa@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index e58a9ec42553..48895d79796a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -1080,6 +1080,10 @@ static mlx5e_stats_grp_t mlx5e_ul_rep_stats_grps[] = {
 	&MLX5E_STATS_GRP(pme),
 	&MLX5E_STATS_GRP(channels),
 	&MLX5E_STATS_GRP(per_port_buff_congest),
+#ifdef CONFIG_MLX5_EN_IPSEC
+	&MLX5E_STATS_GRP(ipsec_sw),
+	&MLX5E_STATS_GRP(ipsec_hw),
+#endif
 };
 
 static unsigned int mlx5e_ul_rep_stats_grps_num(struct mlx5e_priv *priv)
-- 
2.31.1



* [net 03/13] net/mlx5e: Sync TIR params updates against concurrent create/modify
  2021-12-01  6:36 [pull request][net 00/13] mlx5 fixes 2021-11-30 Saeed Mahameed
  2021-12-01  6:36 ` [net 01/13] net/mlx5e: IPsec: Fix Software parser inner l3 type setting in case of encapsulation Saeed Mahameed
  2021-12-01  6:36 ` [net 02/13] net/mlx5e: Fix missing IPsec statistics on uplink representor Saeed Mahameed
@ 2021-12-01  6:36 ` Saeed Mahameed
  2021-12-01  6:37 ` [net 04/13] net/mlx5: Move MODIFY_RQT command to ignore list in internal error state Saeed Mahameed
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2021-12-01  6:36 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Tariq Toukan, Moshe Shemesh, Maxim Mikityanskiy, Saeed Mahameed

From: Tariq Toukan <tariqt@nvidia.com>

Transport Interface Receive (TIR) objects perform the packet processing and
reassembly and are also responsible for demultiplexing the packets into the
different RQs.

There are certain TIR context attributes that propagate to the pointed RQs
and are applied to them (like packet_merge offloads (LRO/SHAMPO) and
tunneled_offload_en).  When TIRs do not agree on attribute values, a "last
one wins" policy is applied.  Hence, if not synced properly, a race between
a TIR params update and a concurrent TIR create/modify operation might
yield a mismatch between the shadow parameters in SW and the actual applied
state of the RQs in HW.

tunneled_offload_en is a fixed attribute per profile, while packet merge
offload state might be toggled and get out-of-sync. When this happens,
packet_merge offload might be working although not requested, or the
opposite.

All updates to packet_merge state and all create/modify operations of
regular redirection/steering TIRs are done under the same priv->state_lock,
so they do not run in parallel, and no race is possible.

However, there are other kinds of TIRs (acceleration offload TIRs, like TLS
TIRs) which are created on demand for each new connection without holding
the coarse priv->state_lock, hence might race.

Fix this by synchronizing all packet_merge state reads and writes against
all TIR create/modify operations. Include the modify operations of the
regular redirection steering TIRs under the new lock, for better code
layering and division of responsibilities.

Fixes: 1182f3659357 ("net/mlx5e: kTLS, Add kTLS RX HW offload support")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../ethernet/mellanox/mlx5/core/en/rx_res.c   | 41 ++++++++++++++++++-
 .../ethernet/mellanox/mlx5/core/en/rx_res.h   |  6 +--
 .../mellanox/mlx5/core/en_accel/ktls_rx.c     | 24 +----------
 3 files changed, 44 insertions(+), 27 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
index 142953847996..0015a81eb9a1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
@@ -13,6 +13,9 @@ struct mlx5e_rx_res {
 	unsigned int max_nch;
 	u32 drop_rqn;
 
+	struct mlx5e_packet_merge_param pkt_merge_param;
+	struct rw_semaphore pkt_merge_param_sem;
+
 	struct mlx5e_rss *rss[MLX5E_MAX_NUM_RSS];
 	bool rss_active;
 	u32 rss_rqns[MLX5E_INDIR_RQT_SIZE];
@@ -392,6 +395,7 @@ static int mlx5e_rx_res_ptp_init(struct mlx5e_rx_res *res)
 	if (err)
 		goto out;
 
+	/* Separated from the channels RQs, does not share pkt_merge state with them */
 	mlx5e_tir_builder_build_rqt(builder, res->mdev->mlx5e_res.hw_objs.td.tdn,
 				    mlx5e_rqt_get_rqtn(&res->ptp.rqt),
 				    inner_ft_support);
@@ -447,6 +451,9 @@ int mlx5e_rx_res_init(struct mlx5e_rx_res *res, struct mlx5_core_dev *mdev,
 	res->max_nch = max_nch;
 	res->drop_rqn = drop_rqn;
 
+	res->pkt_merge_param = *init_pkt_merge_param;
+	init_rwsem(&res->pkt_merge_param_sem);
+
 	err = mlx5e_rx_res_rss_init_def(res, init_pkt_merge_param, init_nch);
 	if (err)
 		goto err_out;
@@ -513,7 +520,7 @@ u32 mlx5e_rx_res_get_tirn_ptp(struct mlx5e_rx_res *res)
 	return mlx5e_tir_get_tirn(&res->ptp.tir);
 }
 
-u32 mlx5e_rx_res_get_rqtn_direct(struct mlx5e_rx_res *res, unsigned int ix)
+static u32 mlx5e_rx_res_get_rqtn_direct(struct mlx5e_rx_res *res, unsigned int ix)
 {
 	return mlx5e_rqt_get_rqtn(&res->channels[ix].direct_rqt);
 }
@@ -656,6 +663,9 @@ int mlx5e_rx_res_packet_merge_set_param(struct mlx5e_rx_res *res,
 	if (!builder)
 		return -ENOMEM;
 
+	down_write(&res->pkt_merge_param_sem);
+	res->pkt_merge_param = *pkt_merge_param;
+
 	mlx5e_tir_builder_build_packet_merge(builder, pkt_merge_param);
 
 	final_err = 0;
@@ -681,6 +691,7 @@ int mlx5e_rx_res_packet_merge_set_param(struct mlx5e_rx_res *res,
 		}
 	}
 
+	up_write(&res->pkt_merge_param_sem);
 	mlx5e_tir_builder_free(builder);
 	return final_err;
 }
@@ -689,3 +700,31 @@ struct mlx5e_rss_params_hash mlx5e_rx_res_get_current_hash(struct mlx5e_rx_res *
 {
 	return mlx5e_rss_get_hash(res->rss[0]);
 }
+
+int mlx5e_rx_res_tls_tir_create(struct mlx5e_rx_res *res, unsigned int rxq,
+				struct mlx5e_tir *tir)
+{
+	bool inner_ft_support = res->features & MLX5E_RX_RES_FEATURE_INNER_FT;
+	struct mlx5e_tir_builder *builder;
+	u32 rqtn;
+	int err;
+
+	builder = mlx5e_tir_builder_alloc(false);
+	if (!builder)
+		return -ENOMEM;
+
+	rqtn = mlx5e_rx_res_get_rqtn_direct(res, rxq);
+
+	mlx5e_tir_builder_build_rqt(builder, res->mdev->mlx5e_res.hw_objs.td.tdn, rqtn,
+				    inner_ft_support);
+	mlx5e_tir_builder_build_direct(builder);
+	mlx5e_tir_builder_build_tls(builder);
+	down_read(&res->pkt_merge_param_sem);
+	mlx5e_tir_builder_build_packet_merge(builder, &res->pkt_merge_param);
+	err = mlx5e_tir_init(tir, builder, res->mdev, false);
+	up_read(&res->pkt_merge_param_sem);
+
+	mlx5e_tir_builder_free(builder);
+
+	return err;
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h
index d09f7d174a51..b39b20a720e0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h
@@ -37,9 +37,6 @@ u32 mlx5e_rx_res_get_tirn_rss(struct mlx5e_rx_res *res, enum mlx5_traffic_types
 u32 mlx5e_rx_res_get_tirn_rss_inner(struct mlx5e_rx_res *res, enum mlx5_traffic_types tt);
 u32 mlx5e_rx_res_get_tirn_ptp(struct mlx5e_rx_res *res);
 
-/* RQTN getters for modules that create their own TIRs */
-u32 mlx5e_rx_res_get_rqtn_direct(struct mlx5e_rx_res *res, unsigned int ix);
-
 /* Activate/deactivate API */
 void mlx5e_rx_res_channels_activate(struct mlx5e_rx_res *res, struct mlx5e_channels *chs);
 void mlx5e_rx_res_channels_deactivate(struct mlx5e_rx_res *res);
@@ -69,4 +66,7 @@ struct mlx5e_rss *mlx5e_rx_res_rss_get(struct mlx5e_rx_res *res, u32 rss_idx);
 /* Workaround for hairpin */
 struct mlx5e_rss_params_hash mlx5e_rx_res_get_current_hash(struct mlx5e_rx_res *res);
 
+/* Accel TIRs */
+int mlx5e_rx_res_tls_tir_create(struct mlx5e_rx_res *res, unsigned int rxq,
+				struct mlx5e_tir *tir);
 #endif /* __MLX5_EN_RX_RES_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c
index a2a9f68579dd..15711814d2d2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c
@@ -100,25 +100,6 @@ mlx5e_ktls_rx_resync_create_resp_list(void)
 	return resp_list;
 }
 
-static int mlx5e_ktls_create_tir(struct mlx5_core_dev *mdev, struct mlx5e_tir *tir, u32 rqtn)
-{
-	struct mlx5e_tir_builder *builder;
-	int err;
-
-	builder = mlx5e_tir_builder_alloc(false);
-	if (!builder)
-		return -ENOMEM;
-
-	mlx5e_tir_builder_build_rqt(builder, mdev->mlx5e_res.hw_objs.td.tdn, rqtn, false);
-	mlx5e_tir_builder_build_direct(builder);
-	mlx5e_tir_builder_build_tls(builder);
-	err = mlx5e_tir_init(tir, builder, mdev, false);
-
-	mlx5e_tir_builder_free(builder);
-
-	return err;
-}
-
 static void accel_rule_handle_work(struct work_struct *work)
 {
 	struct mlx5e_ktls_offload_context_rx *priv_rx;
@@ -609,7 +590,6 @@ int mlx5e_ktls_add_rx(struct net_device *netdev, struct sock *sk,
 	struct mlx5_core_dev *mdev;
 	struct mlx5e_priv *priv;
 	int rxq, err;
-	u32 rqtn;
 
 	tls_ctx = tls_get_ctx(sk);
 	priv = netdev_priv(netdev);
@@ -635,9 +615,7 @@ int mlx5e_ktls_add_rx(struct net_device *netdev, struct sock *sk,
 	priv_rx->sw_stats = &priv->tls->sw_stats;
 	mlx5e_set_ktls_rx_priv_ctx(tls_ctx, priv_rx);
 
-	rqtn = mlx5e_rx_res_get_rqtn_direct(priv->rx_res, rxq);
-
-	err = mlx5e_ktls_create_tir(mdev, &priv_rx->tir, rqtn);
+	err = mlx5e_rx_res_tls_tir_create(priv->rx_res, rxq, &priv_rx->tir);
 	if (err)
 		goto err_create_tir;
 
-- 
2.31.1



* [net 04/13] net/mlx5: Move MODIFY_RQT command to ignore list in internal error state
  2021-12-01  6:36 [pull request][net 00/13] mlx5 fixes 2021-11-30 Saeed Mahameed
                   ` (2 preceding siblings ...)
  2021-12-01  6:36 ` [net 03/13] net/mlx5e: Sync TIR params updates against concurrent create/modify Saeed Mahameed
@ 2021-12-01  6:37 ` Saeed Mahameed
  2021-12-01  6:37 ` [net 05/13] net/mlx5: Lag, Fix recreation of VF LAG Saeed Mahameed
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2021-12-01  6:37 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski; +Cc: netdev, Moshe Shemesh, Saeed Mahameed

From: Moshe Shemesh <moshe@nvidia.com>

When the device is in internal error state, the command interface isn't
accessible and the driver decides which commands to fail and which
to ignore.

Move the MODIFY_RQT command to the ignore list in order to avoid
the following redundant warning messages in internal error state:

mlx5_core 0000:82:00.1: mlx5e_rss_disable:419:(pid 23754): Failed to redirect RQT 0x0 to drop RQ 0xc00848: err = -5
mlx5_core 0000:82:00.1: mlx5e_rx_res_channels_deactivate:598:(pid 23754): Failed to redirect direct RQT 0x1 to drop RQ 0xc00848 (channel 0): err = -5
mlx5_core 0000:82:00.1: mlx5e_rx_res_channels_deactivate:607:(pid 23754): Failed to redirect XSK RQT 0x19 to drop RQ 0xc00848 (channel 0): err = -5

Fixes: 43ec0f41fa73 ("net/mlx5e: Hide all implementation details of mlx5e_rx_res")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 8eaa24d865c5..a46284ca5172 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -341,6 +341,7 @@ static int mlx5_internal_err_ret_value(struct mlx5_core_dev *dev, u16 op,
 	case MLX5_CMD_OP_DEALLOC_SF:
 	case MLX5_CMD_OP_DESTROY_UCTX:
 	case MLX5_CMD_OP_DESTROY_UMEM:
+	case MLX5_CMD_OP_MODIFY_RQT:
 		return MLX5_CMD_STAT_OK;
 
 	case MLX5_CMD_OP_QUERY_HCA_CAP:
@@ -446,7 +447,6 @@ static int mlx5_internal_err_ret_value(struct mlx5_core_dev *dev, u16 op,
 	case MLX5_CMD_OP_MODIFY_TIS:
 	case MLX5_CMD_OP_QUERY_TIS:
 	case MLX5_CMD_OP_CREATE_RQT:
-	case MLX5_CMD_OP_MODIFY_RQT:
 	case MLX5_CMD_OP_QUERY_RQT:
 
 	case MLX5_CMD_OP_CREATE_FLOW_TABLE:
-- 
2.31.1
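
The fail-versus-ignore decision described above can be sketched as a
plain switch. The opcode enum and internal_err_ret_value() below are
hypothetical and only mirror the shape of the driver function; the real
opcodes and status codes live in the mlx5 headers. The -EIO value
matches the "err = -5" seen in the quoted warnings.

```c
#include <errno.h>

/* Hypothetical opcodes for illustration only. */
enum cmd_op {
	OP_DESTROY_RQT,
	OP_MODIFY_RQT,
	OP_CREATE_RQT,
	OP_QUERY_RQT,
};

/* Return value to report for a command issued while the device is in
 * internal error state: 0 for commands on the "ignore" list (the caller
 * sees success), -EIO for commands that must fail.
 */
static int internal_err_ret_value(enum cmd_op op)
{
	switch (op) {
	case OP_DESTROY_RQT:
	case OP_MODIFY_RQT:	/* moved to the ignore list by this patch */
		return 0;
	case OP_CREATE_RQT:
	case OP_QUERY_RQT:
	default:
		return -EIO;
	}
}
```

Teardown-style commands are natural candidates for the ignore list,
since failing them only produces noise when the device is already gone.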



* [net 05/13] net/mlx5: Lag, Fix recreation of VF LAG
  2021-12-01  6:36 [pull request][net 00/13] mlx5 fixes 2021-11-30 Saeed Mahameed
                   ` (3 preceding siblings ...)
  2021-12-01  6:37 ` [net 04/13] net/mlx5: Move MODIFY_RQT command to ignore list in internal error state Saeed Mahameed
@ 2021-12-01  6:37 ` Saeed Mahameed
  2021-12-01  6:37 ` [net 06/13] net/mlx5: E-switch, Respect BW share of the new group Saeed Mahameed
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2021-12-01  6:37 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Maor Gottlieb, Mark Bloch, Saeed Mahameed

From: Maor Gottlieb <maorg@nvidia.com>

Driver needs to nullify the port select attributes of the LAG when
port selection is destroyed, otherwise it breaks recreation of the
LAG.
It fixes the below kernel oops:

 [  587.906377] BUG: kernel NULL pointer dereference, address: 0000000000000008
 [  587.908843] #PF: supervisor read access in kernel mode
 [  587.910730] #PF: error_code(0x0000) - not-present page
 [  587.912580] PGD 0 P4D 0
 [  587.913632] Oops: 0000 [#1] SMP PTI
 [  587.914644] CPU: 5 PID: 165 Comm: kworker/u20:5 Tainted: G           OE     5.9.0_mlnx #1
 [  587.916152] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 [  587.918332] Workqueue: mlx5_lag mlx5_do_bond_work [mlx5_core]
 [  587.919479] RIP: 0010:mlx5_del_flow_rules+0x10/0x270 [mlx5_core]
 [  587.920568] mlx5_core 0000:08:00.1 enp8s0f1: Link up
 [  587.920680] Code: c0 09 80 a0 e8 cf 42 a4 e0 48 c7 c3 f4 ff ff ff e8 8a 88 dd e0 e9 ab fe ff ff 0f 1f 44 00 00 41 56 41 55 49 89 fd 41 54 55 53 <48> 8b 47 08 48 8b 68 28 48 85 ed 74 2e 48 8d 7d 38 e8 6a 64 34 e1
 [  587.925116] bond0: (slave enp8s0f1): Enslaving as an active interface with an up link
 [  587.930415] RSP: 0018:ffffc9000048fd88 EFLAGS: 00010282
 [  587.930417] RAX: ffff88846c14fac0 RBX: ffff88846cddcb80 RCX: 0000000080400007
 [  587.930417] RDX: 0000000080400008 RSI: ffff88846cddcb80 RDI: 0000000000000000
 [  587.930419] RBP: ffff88845fd80140 R08: 0000000000000001 R09: ffffffffa074ba00
 [  587.938132] R10: ffff88846c14fec0 R11: 0000000000000001 R12: ffff88846c122f10
 [  587.939473] R13: 0000000000000000 R14: 0000000000000001 R15: ffff88846d7a0000
 [  587.940800] FS:  0000000000000000(0000) GS:ffff88846fa80000(0000) knlGS:0000000000000000
 [  587.942416] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [  587.943536] CR2: 0000000000000008 CR3: 000000000240a002 CR4: 0000000000770ee0
 [  587.944904] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 [  587.946308] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 [  587.947639] PKRU: 55555554
 [  587.948236] Call Trace:
 [  587.948834]  mlx5_lag_destroy_definer.isra.3+0x16/0x90 [mlx5_core]
 [  587.950033]  mlx5_lag_destroy_definers+0x5b/0x80 [mlx5_core]
 [  587.951128]  mlx5_deactivate_lag+0x6e/0x80 [mlx5_core]
 [  587.952146]  mlx5_do_bond+0x150/0x450 [mlx5_core]
 [  587.953086]  mlx5_do_bond_work+0x3e/0x50 [mlx5_core]
 [  587.954086]  process_one_work+0x1eb/0x3e0
 [  587.954899]  worker_thread+0x2d/0x3c0
 [  587.955656]  ? process_one_work+0x3e0/0x3e0
 [  587.956493]  kthread+0x115/0x130
 [  587.957174]  ? kthread_park+0x90/0x90
 [  587.957929]  ret_from_fork+0x1f/0x30
 [  587.973055] ---[ end trace 71ccd6eca89f5513 ]---

Fixes: b7267869e923 ("net/mlx5: Lag, add support to create/destroy/modify port selection")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c
index ad63dd45c8fb..a6592f9c3c05 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c
@@ -608,4 +608,5 @@ void mlx5_lag_port_sel_destroy(struct mlx5_lag *ldev)
 	if (port_sel->tunnel)
 		mlx5_destroy_ttc_table(port_sel->inner.ttc);
 	mlx5_lag_destroy_definers(ldev);
+	memset(port_sel, 0, sizeof(*port_sel));
 }
-- 
2.31.1
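
The pattern used by the fix above, clearing a structure after its
resources are destroyed so that no stale pointers survive into the next
create cycle, can be sketched in userspace. struct port_sel and
port_sel_destroy() below are simplified, hypothetical stand-ins for the
driver objects, and the sketch assumes a platform where all-zero bytes
represent a NULL pointer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the driver objects. */
struct definer {
	int id;
};

struct port_sel {
	struct definer *outer;
	struct definer *inner;
	int tunnel;
};

/* Release everything the port selection owns, then zero the whole
 * structure so that later code never sees dangling pointers or stale
 * flags. free(NULL) is a no-op, so calling this twice is harmless.
 */
static void port_sel_destroy(struct port_sel *port_sel)
{
	free(port_sel->outer);
	free(port_sel->inner);
	memset(port_sel, 0, sizeof(*port_sel));
}
```

Without the final memset(), a second destroy in this sketch would free
the same pointers again, which is the userspace analogue of the stale
state that breaks LAG recreation.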



* [net 06/13] net/mlx5: E-switch, Respect BW share of the new group
  2021-12-01  6:36 [pull request][net 00/13] mlx5 fixes 2021-11-30 Saeed Mahameed
                   ` (4 preceding siblings ...)
  2021-12-01  6:37 ` [net 05/13] net/mlx5: Lag, Fix recreation of VF LAG Saeed Mahameed
@ 2021-12-01  6:37 ` Saeed Mahameed
  2021-12-01  6:37 ` [net 07/13] net/mlx5: E-Switch, fix single FDB creation on BlueField Saeed Mahameed
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2021-12-01  6:37 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Dmytro Linkin, Roi Dayan, Parav Pandit, Mark Bloch,
	Saeed Mahameed

From: Dmytro Linkin <dlinkin@nvidia.com>

To enable the transmit scheduler on a vport, FW requires a non-zero
configuration for the vport's TSAR. If a vport is added to a group which
has a configured BW share value, and the TX rate values of the vport are
zero, then the scheduler wouldn't be enabled on this vport.
Fix that by calling BW normalization if BW share of the new group is
configured.

Fixes: 0fe132eac38c ("net/mlx5: E-switch, Allow to add vports to rate groups")
Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c
index c6cc67cb4f6a..4501e3d737f8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c
@@ -423,7 +423,7 @@ static int esw_qos_vport_update_group(struct mlx5_eswitch *esw,
 		return err;
 
 	/* Recalculate bw share weights of old and new groups */
-	if (vport->qos.bw_share) {
+	if (vport->qos.bw_share || new_group->bw_share) {
 		esw_qos_normalize_vports_min_rate(esw, curr_group, extack);
 		esw_qos_normalize_vports_min_rate(esw, new_group, extack);
 	}
-- 
2.31.1
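
The changed condition can be read as a small predicate. The function
below is a hypothetical userspace restatement, not driver code: it
returns true when the bw_share weights of the old and new groups need to
be recalculated after a vport moves between groups.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical restatement of the condition in the patch: normalize
 * when either the vport itself or the group it joins has a BW share
 * configured. Before the fix only the vport's own value was checked.
 */
static bool needs_bw_normalization(uint32_t vport_bw_share,
				   uint32_t new_group_bw_share)
{
	return vport_bw_share || new_group_bw_share;
}
```

The case the fix adds is a vport with zero bw_share joining a group with
a non-zero bw_share: the predicate is now true, so normalization runs
and the vport's scheduler ends up with a non-zero configuration.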



* [net 07/13] net/mlx5: E-Switch, fix single FDB creation on BlueField
  2021-12-01  6:36 [pull request][net 00/13] mlx5 fixes 2021-11-30 Saeed Mahameed
                   ` (5 preceding siblings ...)
  2021-12-01  6:37 ` [net 06/13] net/mlx5: E-switch, Respect BW share of the new group Saeed Mahameed
@ 2021-12-01  6:37 ` Saeed Mahameed
  2021-12-01  6:37 ` [net 08/13] net/mlx5: E-Switch, Check group pointer before reading bw_share value Saeed Mahameed
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2021-12-01  6:37 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Mark Bloch, Maor Gottlieb, Saeed Mahameed

From: Mark Bloch <mbloch@nvidia.com>

Always use MLX5_FLOW_TABLE_OTHER_VPORT flag when creating egress ACL
table for single FDB. Not doing so on BlueField will make firmware fail
the command. On BlueField the E-Switch manager is the ECPF (vport 0xFFFE)
which is filled in the flow table creation command but as the
other_vport field wasn't set the firmware complains about a bad parameter.

This is different from a regular HCA where the E-Switch manager vport is
the PF (vport 0x0). Passing MLX5_FLOW_TABLE_OTHER_VPORT will make the
firmware happy both on BlueField and on regular HCAs without special
condition for each.

This fixes the below firmware syndrome:
mlx5_cmd_check:819:(pid 571): CREATE_FLOW_TABLE(0x930) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x754a4)

Fixes: db202995f503 ("net/mlx5: E-Switch, add logic to enable shared FDB")
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index a46455694f7a..275af1d2b4d3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -2512,6 +2512,7 @@ static int esw_set_master_egress_rule(struct mlx5_core_dev *master,
 	struct mlx5_eswitch *esw = master->priv.eswitch;
 	struct mlx5_flow_table_attr ft_attr = {
 		.max_fte = 1, .prio = 0, .level = 0,
+		.flags = MLX5_FLOW_TABLE_OTHER_VPORT,
 	};
 	struct mlx5_flow_namespace *egress_ns;
 	struct mlx5_flow_table *acl;
-- 
2.31.1



* [net 08/13] net/mlx5: E-Switch, Check group pointer before reading bw_share value
  2021-12-01  6:36 [pull request][net 00/13] mlx5 fixes 2021-11-30 Saeed Mahameed
                   ` (6 preceding siblings ...)
  2021-12-01  6:37 ` [net 07/13] net/mlx5: E-Switch, fix single FDB creation on BlueField Saeed Mahameed
@ 2021-12-01  6:37 ` Saeed Mahameed
  2021-12-01  6:37 ` [net 09/13] net/mlx5: E-Switch, Use indirect table only if all destinations support it Saeed Mahameed
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2021-12-01  6:37 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Dmytro Linkin, Roi Dayan, Saeed Mahameed

From: Dmytro Linkin <dlinkin@nvidia.com>

If log_esw_max_sched_depth is not supported, the group pointer of the
vport is NULL. Hence, check the pointer before reading the bw_share value.

Fixes: 0fe132eac38c ("net/mlx5: E-switch, Allow to add vports to rate groups")
Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c
index 4501e3d737f8..d377ddc70fc7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c
@@ -130,7 +130,7 @@ static u32 esw_qos_calculate_min_rate_divider(struct mlx5_eswitch *esw,
 	/* If vports min rate divider is 0 but their group has bw_share configured, then
 	 * need to set bw_share for vports to minimal value.
 	 */
-	if (!group_level && !max_guarantee && group->bw_share)
+	if (!group_level && !max_guarantee && group && group->bw_share)
 		return 1;
 	return 0;
 }
-- 
2.31.1
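
The guarded check can be exercised outside the kernel with a reduced
version of the fallback branch. struct group and
min_rate_divider_fallback() are hypothetical names that isolate only the
final condition of the driver function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the driver's group structure. */
struct group {
	uint32_t bw_share;
};

/* Mirrors only the fallback branch: return 1 when the vports' min rate
 * divider is 0 but their group has bw_share configured. The 'group &&'
 * test relies on short-circuit evaluation, so group->bw_share is never
 * read through a NULL pointer.
 */
static uint32_t min_rate_divider_fallback(bool group_level,
					  uint32_t max_guarantee,
					  const struct group *group)
{
	if (!group_level && !max_guarantee && group && group->bw_share)
		return 1;
	return 0;
}
```

Passing NULL for the group, as happens when group scheduling is not
supported, now simply yields 0 instead of dereferencing the pointer.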



* [net 09/13] net/mlx5: E-Switch, Use indirect table only if all destinations support it
  2021-12-01  6:36 [pull request][net 00/13] mlx5 fixes 2021-11-30 Saeed Mahameed
                   ` (7 preceding siblings ...)
  2021-12-01  6:37 ` [net 08/13] net/mlx5: E-Switch, Check group pointer before reading bw_share value Saeed Mahameed
@ 2021-12-01  6:37 ` Saeed Mahameed
  2021-12-01  6:37 ` [net 10/13] net/mlx5: Fix use after free in mlx5_health_wait_pci_up Saeed Mahameed
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2021-12-01  6:37 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Maor Dickman, Roi Dayan, Saeed Mahameed

From: Maor Dickman <maord@nvidia.com>

When adding a rule with multiple destinations, an indirect table is used for
all of the destinations if at least one of the destinations supports it. This
can cause creation of invalid indirect tables for the destinations that don't
support it.

Fix it by using an indirect table only if all destinations support it.

Fixes: a508728a4c8b ("net/mlx5e: VF tunnel RX traffic offloading")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../mellanox/mlx5/core/eswitch_offloads.c     | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 275af1d2b4d3..32bc08a39925 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -329,14 +329,25 @@ static bool
 esw_is_indir_table(struct mlx5_eswitch *esw, struct mlx5_flow_attr *attr)
 {
 	struct mlx5_esw_flow_attr *esw_attr = attr->esw_attr;
+	bool result = false;
 	int i;
 
-	for (i = esw_attr->split_count; i < esw_attr->out_count; i++)
+	/* Indirect table is supported only for flows with in_port uplink
+	 * and the destination is vport on the same eswitch as the uplink,
+	 * return false in case at least one of destinations doesn't meet
+	 * this criteria.
+	 */
+	for (i = esw_attr->split_count; i < esw_attr->out_count; i++) {
 		if (esw_attr->dests[i].rep &&
 		    mlx5_esw_indir_table_needed(esw, attr, esw_attr->dests[i].rep->vport,
-						esw_attr->dests[i].mdev))
-			return true;
-	return false;
+						esw_attr->dests[i].mdev)) {
+			result = true;
+		} else {
+			result = false;
+			break;
+		}
+	}
+	return result;
 }
 
 static int
-- 
2.31.1



* [net 10/13] net/mlx5: Fix use after free in mlx5_health_wait_pci_up
  2021-12-01  6:36 [pull request][net 00/13] mlx5 fixes 2021-11-30 Saeed Mahameed
                   ` (8 preceding siblings ...)
  2021-12-01  6:37 ` [net 09/13] net/mlx5: E-Switch, Use indirect table only if all destinations support it Saeed Mahameed
@ 2021-12-01  6:37 ` Saeed Mahameed
  2021-12-01  6:37 ` [net 11/13] net/mlx5: Fix too early queueing of log timestamp work Saeed Mahameed
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2021-12-01  6:37 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Amir Tzin, Moshe Shemesh, Saeed Mahameed

From: Amir Tzin <amirtz@nvidia.com>

The device health recovery flow calls mlx5_health_wait_pci_up() which
queries the device for FW_RESET timeout after freeing the device
timeouts structure on mlx5_function_teardown(). Fix this bug by moving
timeouts structure init/cleanup to the device's init/uninit phases.
Since it is necessary to reset default software timeouts on function
reload, extract setting of default values from mlx5_tout_init() and
call mlx5_tout_set_def_val() directly from mlx5_function_setup().

Fixes: 5945e1adeab5 ("net/mlx5: Read timeout values from init segment")
Reported-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Amir Tzin <amirtz@nvidia.com>
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
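The lifetime change described above can be modelled with a small standalone
sketch. All names here (struct dev, dev_init(), function_setup() and so on)
and the 60000 ms default are hypothetical illustrations, not the driver's
code. The point is only that the timeouts array is allocated and freed in the
device init/uninit phase, while setup merely resets defaults and teardown no
longer frees anything.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical timeout table with a single entry. */
enum { TOUT_FW_RESET, TOUT_MAX };

struct dev {
	unsigned long long *timeouts;
};

/* Illustrative default value, not taken from the driver. */
static const unsigned long long tout_defaults[TOUT_MAX] = { 60000 };

/* Device init phase: allocate once. Returns 0 on success, -1 on failure. */
static int dev_init(struct dev *d)
{
	d->timeouts = calloc(TOUT_MAX, sizeof(*d->timeouts));
	return d->timeouts ? 0 : -1;
}

/* Device uninit phase: the only place that frees the table. */
static void dev_uninit(struct dev *d)
{
	free(d->timeouts);
	d->timeouts = NULL;
}

/* Function setup (runs on every reload): reset software defaults only. */
static void function_setup(struct dev *d)
{
	for (int i = 0; i < TOUT_MAX; i++)
		d->timeouts[i] = tout_defaults[i];
}

/* Function teardown: deliberately leaves the table allocated. */
static void function_teardown(struct dev *d)
{
	(void)d;
}

/* Safe to call between teardown and the next setup. */
static unsigned long long tout_ms(const struct dev *d, int type)
{
	assert(d->timeouts != NULL);
	return d->timeouts[type];
}
```

In this model a recovery flow that reads a timeout after function_teardown()
still sees valid memory, and the next function_setup() restores the defaults.
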
 .../ethernet/mellanox/mlx5/core/lib/tout.c    |  5 ++---
 .../ethernet/mellanox/mlx5/core/lib/tout.h    |  1 +
 .../net/ethernet/mellanox/mlx5/core/main.c    | 22 ++++++++++---------
 3 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.c
index 0dd96a6b140d..c1df0d3595d8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.c
@@ -31,11 +31,11 @@ static void tout_set(struct mlx5_core_dev *dev, u64 val, enum mlx5_timeouts_type
 	dev->timeouts->to[type] = val;
 }
 
-static void tout_set_def_val(struct mlx5_core_dev *dev)
+void mlx5_tout_set_def_val(struct mlx5_core_dev *dev)
 {
 	int i;
 
-	for (i = MLX5_TO_FW_PRE_INIT_TIMEOUT_MS; i < MAX_TIMEOUT_TYPES; i++)
+	for (i = 0; i < MAX_TIMEOUT_TYPES; i++)
 		tout_set(dev, tout_def_sw_val[i], i);
 }
 
@@ -45,7 +45,6 @@ int mlx5_tout_init(struct mlx5_core_dev *dev)
 	if (!dev->timeouts)
 		return -ENOMEM;
 
-	tout_set_def_val(dev);
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.h
index 31faa5c17aa9..1c42ead782fa 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.h
@@ -34,6 +34,7 @@ int mlx5_tout_init(struct mlx5_core_dev *dev);
 void mlx5_tout_cleanup(struct mlx5_core_dev *dev);
 void mlx5_tout_query_iseg(struct mlx5_core_dev *dev);
 int mlx5_tout_query_dtor(struct mlx5_core_dev *dev);
+void mlx5_tout_set_def_val(struct mlx5_core_dev *dev);
 u64 _mlx5_tout_ms(struct mlx5_core_dev *dev, enum mlx5_timeouts_types type);
 
 #define mlx5_tout_ms(dev, type) _mlx5_tout_ms(dev, MLX5_TO_##type##_MS)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index a92a92a52346..e127c0530b3a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -992,11 +992,7 @@ static int mlx5_function_setup(struct mlx5_core_dev *dev, bool boot)
 	if (mlx5_core_is_pf(dev))
 		pcie_print_link_status(dev->pdev);
 
-	err = mlx5_tout_init(dev);
-	if (err) {
-		mlx5_core_err(dev, "Failed initializing timeouts, aborting\n");
-		return err;
-	}
+	mlx5_tout_set_def_val(dev);
 
 	/* wait for firmware to accept initialization segments configurations
 	 */
@@ -1005,13 +1001,13 @@ static int mlx5_function_setup(struct mlx5_core_dev *dev, bool boot)
 	if (err) {
 		mlx5_core_err(dev, "Firmware over %llu MS in pre-initializing state, aborting\n",
 			      mlx5_tout_ms(dev, FW_PRE_INIT_TIMEOUT));
-		goto err_tout_cleanup;
+		return err;
 	}
 
 	err = mlx5_cmd_init(dev);
 	if (err) {
 		mlx5_core_err(dev, "Failed initializing command interface, aborting\n");
-		goto err_tout_cleanup;
+		return err;
 	}
 
 	mlx5_tout_query_iseg(dev);
@@ -1094,8 +1090,6 @@ static int mlx5_function_setup(struct mlx5_core_dev *dev, bool boot)
 err_cmd_cleanup:
 	mlx5_cmd_set_state(dev, MLX5_CMDIF_STATE_DOWN);
 	mlx5_cmd_cleanup(dev);
-err_tout_cleanup:
-	mlx5_tout_cleanup(dev);
 
 	return err;
 }
@@ -1114,7 +1108,6 @@ static int mlx5_function_teardown(struct mlx5_core_dev *dev, bool boot)
 	mlx5_core_disable_hca(dev, 0);
 	mlx5_cmd_set_state(dev, MLX5_CMDIF_STATE_DOWN);
 	mlx5_cmd_cleanup(dev);
-	mlx5_tout_cleanup(dev);
 
 	return 0;
 }
@@ -1476,6 +1469,12 @@ int mlx5_mdev_init(struct mlx5_core_dev *dev, int profile_idx)
 					    mlx5_debugfs_root);
 	INIT_LIST_HEAD(&priv->traps);
 
+	err = mlx5_tout_init(dev);
+	if (err) {
+		mlx5_core_err(dev, "Failed initializing timeouts, aborting\n");
+		goto err_timeout_init;
+	}
+
 	err = mlx5_health_init(dev);
 	if (err)
 		goto err_health_init;
@@ -1501,6 +1500,8 @@ int mlx5_mdev_init(struct mlx5_core_dev *dev, int profile_idx)
 err_pagealloc_init:
 	mlx5_health_cleanup(dev);
 err_health_init:
+	mlx5_tout_cleanup(dev);
+err_timeout_init:
 	debugfs_remove(dev->priv.dbg_root);
 	mutex_destroy(&priv->pgdir_mutex);
 	mutex_destroy(&priv->alloc_mutex);
@@ -1518,6 +1519,7 @@ void mlx5_mdev_uninit(struct mlx5_core_dev *dev)
 	mlx5_adev_cleanup(dev);
 	mlx5_pagealloc_cleanup(dev);
 	mlx5_health_cleanup(dev);
+	mlx5_tout_cleanup(dev);
 	debugfs_remove_recursive(dev->priv.dbg_root);
 	mutex_destroy(&priv->pgdir_mutex);
 	mutex_destroy(&priv->alloc_mutex);
-- 
2.31.1



* [net 11/13] net/mlx5: Fix too early queueing of log timestamp work
  2021-12-01  6:36 [pull request][net 00/13] mlx5 fixes 2021-11-30 Saeed Mahameed
                   ` (9 preceding siblings ...)
  2021-12-01  6:37 ` [net 10/13] net/mlx5: Fix use after free in mlx5_health_wait_pci_up Saeed Mahameed
@ 2021-12-01  6:37 ` Saeed Mahameed
  2021-12-01  6:37 ` [net 12/13] net/mlx5: Fix access to a non-supported register Saeed Mahameed
  2021-12-01  6:37 ` [net 13/13] net/mlx5e: SHAMPO, Fix constant expression result Saeed Mahameed
  12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2021-12-01  6:37 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Gal Pressman, Moshe Shemesh, Saeed Mahameed

From: Gal Pressman <gal@nvidia.com>

The log timestamp work should not be queued before the command interface
is initialized. Move it to a later stage in the init flow.

Fixes: 5a1023deeed0 ("net/mlx5: Add periodic update of host time to firmware")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
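The ordering problem can be modelled with a toy sketch. The struct and the
function names are hypothetical and only mirror the shape of the init flow.
The model assumes that work queued with a zero delay may run immediately, so
queueing it before the command interface is ready counts as "too early".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state for the model. */
struct dev {
	bool is_pf;
	bool cmd_ready;     /* command interface initialized */
	int work_queued;    /* how many times the work was queued */
	int work_too_early; /* queued while cmd_ready was false */
};

static void queue_log_ts_work(struct dev *d)
{
	d->work_queued++;
	if (!d->cmd_ready)
		d->work_too_early++;
}

/* After the fix: init only prepares the work item, it does not queue it. */
static void health_init(struct dev *d)
{
	assert(d != NULL);
	d->work_queued = 0;
	d->work_too_early = 0;
}

static void cmd_init(struct dev *d)
{
	d->cmd_ready = true;
}

/* After the fix: the work is queued when health polling starts. */
static void start_health_poll(struct dev *d)
{
	if (d->is_pf)
		queue_log_ts_work(d);
}
```

Calling health_init(), cmd_init() and start_health_poll() in that order queues
the work exactly once and never while the command interface is down.
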
 drivers/net/ethernet/mellanox/mlx5/core/health.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index 64f1abc4dc36..380f50d5462d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -835,6 +835,9 @@ void mlx5_start_health_poll(struct mlx5_core_dev *dev)
 
 	health->timer.expires = jiffies + msecs_to_jiffies(poll_interval_ms);
 	add_timer(&health->timer);
+
+	if (mlx5_core_is_pf(dev))
+		queue_delayed_work(health->wq, &health->update_fw_log_ts_work, 0);
 }
 
 void mlx5_stop_health_poll(struct mlx5_core_dev *dev, bool disable_health)
@@ -902,8 +905,6 @@ int mlx5_health_init(struct mlx5_core_dev *dev)
 	INIT_WORK(&health->fatal_report_work, mlx5_fw_fatal_reporter_err_work);
 	INIT_WORK(&health->report_work, mlx5_fw_reporter_err_work);
 	INIT_DELAYED_WORK(&health->update_fw_log_ts_work, mlx5_health_log_ts_update);
-	if (mlx5_core_is_pf(dev))
-		queue_delayed_work(health->wq, &health->update_fw_log_ts_work, 0);
 
 	return 0;
 
-- 
2.31.1



* [net 12/13] net/mlx5: Fix access to a non-supported register
  2021-12-01  6:36 [pull request][net 00/13] mlx5 fixes 2021-11-30 Saeed Mahameed
                   ` (10 preceding siblings ...)
  2021-12-01  6:37 ` [net 11/13] net/mlx5: Fix too early queueing of log timestamp work Saeed Mahameed
@ 2021-12-01  6:37 ` Saeed Mahameed
  2021-12-01  6:37 ` [net 13/13] net/mlx5e: SHAMPO, Fix constant expression result Saeed Mahameed
  12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2021-12-01  6:37 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Aya Levin, Gal Pressman, Moshe Shemesh, Saeed Mahameed

From: Aya Levin <ayal@nvidia.com>

Validate that the MRTC register is supported before triggering the delayed
work that accesses it.

Fixes: 5a1023deeed0 ("net/mlx5: Add periodic update of host time to firmware")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
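As a sanity check on the mlx5_ifc.h hunk in this patch, the sketch below
verifies that splitting regs_63_to_32 into three fields keeps the total width
at 0x20 bits. The field widths are copied from the diff. The bit position 45
for mrtc is an assumption inferred from the neighbouring field names, and
reg_supported() is a hypothetical helper over a plain 64-bit mask, not the
MLX5_CAP_MCAM_REG() macro.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths in bits, copied from the mlx5_ifc.h hunk. */
enum {
	REGS_63_TO_46_BITS = 0x12,
	MRTC_BITS          = 0x1,
	REGS_44_TO_32_BITS = 0xd,
	ORIGINAL_BITS      = 0x20, /* width of the old regs_63_to_32 field */
};

/* Assumption: bit index implied by the names regs_63_to_46 / regs_44_to_32. */
enum { MRTC_BIT = 45 };

/* Returns nonzero if the three new fields cover exactly the old field. */
static int split_preserves_width(void)
{
	return REGS_63_TO_46_BITS + MRTC_BITS + REGS_44_TO_32_BITS == ORIGINAL_BITS;
}

/* Hypothetical helper: test one bit of a 64-bit register-support mask. */
static int reg_supported(uint64_t mask, unsigned int bit)
{
	assert(bit < 64);
	return (int)((mask >> bit) & 1u);
}
```

The widths add up as 18 + 1 + 13 = 32, so the fields that follow keep their
offsets in the structure.
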
 drivers/net/ethernet/mellanox/mlx5/core/health.c | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/main.c   | 8 +++-----
 include/linux/mlx5/mlx5_ifc.h                    | 5 ++++-
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index 380f50d5462d..3ca998874c50 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -836,7 +836,7 @@ void mlx5_start_health_poll(struct mlx5_core_dev *dev)
 	health->timer.expires = jiffies + msecs_to_jiffies(poll_interval_ms);
 	add_timer(&health->timer);
 
-	if (mlx5_core_is_pf(dev))
+	if (mlx5_core_is_pf(dev) && MLX5_CAP_MCAM_REG(dev, mrtc))
 		queue_delayed_work(health->wq, &health->update_fw_log_ts_work, 0);
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index e127c0530b3a..7df9c7f8d9c8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1071,18 +1071,16 @@ static int mlx5_function_setup(struct mlx5_core_dev *dev, bool boot)
 
 	mlx5_set_driver_version(dev);
 
-	mlx5_start_health_poll(dev);
-
 	err = mlx5_query_hca_caps(dev);
 	if (err) {
 		mlx5_core_err(dev, "query hca failed\n");
-		goto stop_health;
+		goto reclaim_boot_pages;
 	}
 
+	mlx5_start_health_poll(dev);
+
 	return 0;
 
-stop_health:
-	mlx5_stop_health_poll(dev, boot);
 reclaim_boot_pages:
 	mlx5_reclaim_startup_pages(dev);
 err_disable_hca:
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 3636df90899a..fbaab440a484 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -9698,7 +9698,10 @@ struct mlx5_ifc_mcam_access_reg_bits {
 	u8         regs_84_to_68[0x11];
 	u8         tracer_registers[0x4];
 
-	u8         regs_63_to_32[0x20];
+	u8         regs_63_to_46[0x12];
+	u8         mrtc[0x1];
+	u8         regs_44_to_32[0xd];
+
 	u8         regs_31_to_0[0x20];
 };
 
-- 
2.31.1



* [net 13/13] net/mlx5e: SHAMPO, Fix constant expression result
  2021-12-01  6:36 [pull request][net 00/13] mlx5 fixes 2021-11-30 Saeed Mahameed
                   ` (11 preceding siblings ...)
  2021-12-01  6:37 ` [net 12/13] net/mlx5: Fix access to a non-supported register Saeed Mahameed
@ 2021-12-01  6:37 ` Saeed Mahameed
  12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2021-12-01  6:37 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Ben Ben-Ishay, Tariq Toukan, Saeed Mahameed

From: Ben Ben-Ishay <benishay@nvidia.com>

mlx5e_build_shampo_hd_umr() incorrectly declares the counters i and index
as unsigned, so the err_unmap error path could get stuck in an endless loop.
Change i to int to solve the first issue.
Drop the index wrap-around check to solve the second issue; the caller
validates that index cannot wrap around.

Fixes: 64509b052525 ("net/mlx5e: Add data path for SHAMPO feature")
Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
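The underlying C behaviour can be reproduced in isolation. This is a minimal
sketch with hypothetical helper names. It shows that a pre-decrement loop of
the form "while (--i >= 0)" terminates with a signed counter, and that
decrementing a 16-bit unsigned counter at zero wraps to its maximum value, so
a ">= 0" or "< 0" comparison on it is a constant expression.

```c
#include <assert.h>
#include <stdint.h>

/* Signed counter: the unwind loop runs exactly i times and stops. */
static int unwind_steps_signed(int i)
{
	int steps = 0;

	assert(i >= 0);
	while (--i >= 0)
		steps++;
	return steps;
}

/* Unsigned 16-bit counter: decrementing 0 wraps to UINT16_MAX, so the
 * value can never compare below zero. Returns nonzero if the wrap occurs.
 */
static int unsigned_predecrement_wraps(void)
{
	uint16_t i = 0;

	--i;
	return i == UINT16_MAX;
}
```

For a u16 counter the condition "--i >= 0" is therefore always true and
"--index < 0" is always false, which matches the two issues the patch
addresses.
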
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 96967b0a2441..793511d5ee4c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -543,13 +543,13 @@ static int mlx5e_build_shampo_hd_umr(struct mlx5e_rq *rq,
 				     u16 klm_entries, u16 index)
 {
 	struct mlx5e_shampo_hd *shampo = rq->mpwqe.shampo;
-	u16 entries, pi, i, header_offset, err, wqe_bbs, new_entries;
+	u16 entries, pi, header_offset, err, wqe_bbs, new_entries;
 	u32 lkey = rq->mdev->mlx5e_res.hw_objs.mkey;
 	struct page *page = shampo->last_page;
 	u64 addr = shampo->last_addr;
 	struct mlx5e_dma_info *dma_info;
 	struct mlx5e_umr_wqe *umr_wqe;
-	int headroom;
+	int headroom, i;
 
 	headroom = rq->buff.headroom;
 	new_entries = klm_entries - (shampo->pi & (MLX5_UMR_KLM_ALIGNMENT - 1));
@@ -601,9 +601,7 @@ static int mlx5e_build_shampo_hd_umr(struct mlx5e_rq *rq,
 
 err_unmap:
 	while (--i >= 0) {
-		if (--index < 0)
-			index = shampo->hd_per_wq - 1;
-		dma_info = &shampo->info[index];
+		dma_info = &shampo->info[--index];
 		if (!(i & (MLX5E_SHAMPO_WQ_HEADER_PER_PAGE - 1))) {
 			dma_info->addr = ALIGN_DOWN(dma_info->addr, PAGE_SIZE);
 			mlx5e_page_release(rq, dma_info, true);
-- 
2.31.1



* Re: [net 01/13] net/mlx5e: IPsec: Fix Software parser inner l3 type setting in case of encapsulation
  2021-12-01  6:36 ` [net 01/13] net/mlx5e: IPsec: Fix Software parser inner l3 type setting in case of encapsulation Saeed Mahameed
@ 2021-12-01 15:00   ` patchwork-bot+netdevbpf
  0 siblings, 0 replies; 15+ messages in thread
From: patchwork-bot+netdevbpf @ 2021-12-01 15:00 UTC (permalink / raw)
  To: Saeed Mahameed; +Cc: davem, kuba, netdev, raeds, maord, saeedm

Hello:

This series was applied to netdev/net.git (master)
by Saeed Mahameed <saeedm@nvidia.com>:

On Tue, 30 Nov 2021 22:36:57 -0800 you wrote:
> From: Raed Salem <raeds@nvidia.com>
> 
> Current code wrongly uses the skb->protocol field which reflects the
> outer l3 protocol to set the inner l3 type in Software Parser (SWP)
> fields settings in the ethernet segment (eseg) in flows where inner
> l3 exists like in Vxlan over ESP flow, the above method wrongly use
> the outer protocol type instead of the inner one. thus breaking cases
> where inner and outer headers have different protocols.
> 
> [...]

Here is the summary with links:
  - [net,01/13] net/mlx5e: IPsec: Fix Software parser inner l3 type setting in case of encapsulation
    https://git.kernel.org/netdev/net/c/c65d638ab390
  - [net,02/13] net/mlx5e: Fix missing IPsec statistics on uplink representor
    https://git.kernel.org/netdev/net/c/51ebf5db67f5
  - [net,03/13] net/mlx5e: Sync TIR params updates against concurrent create/modify
    https://git.kernel.org/netdev/net/c/4cce2ccf08fb
  - [net,04/13] net/mlx5: Move MODIFY_RQT command to ignore list in internal error state
    https://git.kernel.org/netdev/net/c/e45c0b34493c
  - [net,05/13] net/mlx5: Lag, Fix recreation of VF LAG
    https://git.kernel.org/netdev/net/c/ffdf45315226
  - [net,06/13] net/mlx5: E-switch, Respect BW share of the new group
    https://git.kernel.org/netdev/net/c/1e59b32e45e4
  - [net,07/13] net/mlx5: E-Switch, fix single FDB creation on BlueField
    https://git.kernel.org/netdev/net/c/43a0696f1156
  - [net,08/13] net/mlx5: E-Switch, Check group pointer before reading bw_share value
    https://git.kernel.org/netdev/net/c/5c4e8ae7aa48
  - [net,09/13] net/mlx5: E-Switch, Use indirect table only if all destinations support it
    https://git.kernel.org/netdev/net/c/e219440da0c3
  - [net,10/13] net/mlx5: Fix use after free in mlx5_health_wait_pci_up
    https://git.kernel.org/netdev/net/c/76091b0fb609
  - [net,11/13] net/mlx5: Fix too early queueing of log timestamp work
    https://git.kernel.org/netdev/net/c/924cc4633f04
  - [net,12/13] net/mlx5: Fix access to a non-supported register
    https://git.kernel.org/netdev/net/c/502e82b91361
  - [net,13/13] net/mlx5e: SHAMPO, Fix constant expression result
    https://git.kernel.org/netdev/net/c/8c8cf0382257

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




