All of lore.kernel.org
 help / color / mirror / Atom feed
* [pull request][net 00/14] mlx5 fixes 2022-11-21
@ 2022-11-22  2:25 Saeed Mahameed
  2022-11-22  2:25 ` [net 01/14] net/mlx5: Do not query pci info while pci disabled Saeed Mahameed
                   ` (13 more replies)
  0 siblings, 14 replies; 26+ messages in thread
From: Saeed Mahameed @ 2022-11-22  2:25 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan

From: Saeed Mahameed <saeedm@nvidia.com>

This series provides bug fixes to mlx5 driver.
Please pull and let me know if there is any problem.

Thanks,
Saeed.


The following changes since commit badbda1a01860c80c6ab60f329ef46c713653a27:

  octeontx2-af: cn10k: mcs: Fix copy and paste bug in mcs_bbe_intr_handler() (2022-11-21 13:04:28 +0000)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-fixes-2022-11-21

for you to fetch changes up to 8514e325ef016e3fdabaa015ed1adaa6e6d8722a:

  net/mlx5e: Fix possible race condition in macsec extended packet number update routine (2022-11-21 18:14:35 -0800)

----------------------------------------------------------------
mlx5-fixes-2022-11-21

----------------------------------------------------------------
Chris Mi (1):
      net/mlx5e: Offload rule only when all encaps are valid

Eli Cohen (1):
      net/mlx5: Lag, avoid lockdep warnings

Emeel Hakim (3):
      net/mlx5e: Fix MACsec SA initialization routine
      net/mlx5e: Fix MACsec update SecY
      net/mlx5e: Fix possible race condition in macsec extended packet number update routine

Moshe Shemesh (4):
      net/mlx5: Fix FW tracer timestamp calculation
      net/mlx5: cmdif, Print info on any firmware cmd failure to tracepoint
      net/mlx5: Fix handling of entry refcount when command is not issued to FW
      net/mlx5: Fix sync reset event handler error flow

Roi Dayan (1):
      net/mlx5: E-Switch, Set correctly vport destination

Roy Novich (1):
      net/mlx5: Do not query pci info while pci disabled

Shay Drory (1):
      net/mlx5: SF: Fix probing active SFs during driver probe phase

Tariq Toukan (2):
      net/mlx5e: Fix missing alignment in size of MTT/KLM entries
      net/mlx5e: Remove leftovers from old XSK queues enumeration

 drivers/net/ethernet/mellanox/mlx5/core/cmd.c      |  47 +++++-----
 .../mellanox/mlx5/core/diag/cmd_tracepoint.h       |  45 ++++++++++
 .../ethernet/mellanox/mlx5/core/diag/fw_tracer.c   |   2 +-
 .../ethernet/mellanox/mlx5/core/en/tc_tun_encap.c  |  16 ++--
 .../ethernet/mellanox/mlx5/core/en/tc_tun_encap.h  |   3 +-
 .../ethernet/mellanox/mlx5/core/en_accel/macsec.c  |  19 ++--
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   |  18 ----
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |   5 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c    |  17 ++--
 .../ethernet/mellanox/mlx5/core/eswitch_offloads.c |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c |   9 +-
 drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c  |   3 +-
 drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h  |  14 ++-
 .../net/ethernet/mellanox/mlx5/core/lag/mpesw.c    | 100 +++++++++++++--------
 .../net/ethernet/mellanox/mlx5/core/lag/mpesw.h    |   1 -
 drivers/net/ethernet/mellanox/mlx5/core/main.c     |   9 +-
 .../net/ethernet/mellanox/mlx5/core/sf/dev/dev.c   |  88 ++++++++++++++++++
 include/linux/mlx5/driver.h                        |   1 +
 18 files changed, 281 insertions(+), 118 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/diag/cmd_tracepoint.h

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [net 01/14] net/mlx5: Do not query pci info while pci disabled
  2022-11-22  2:25 [pull request][net 00/14] mlx5 fixes 2022-11-21 Saeed Mahameed
@ 2022-11-22  2:25 ` Saeed Mahameed
  2022-11-23  5:00   ` patchwork-bot+netdevbpf
  2022-11-22  2:25 ` [net 02/14] net/mlx5: Fix FW tracer timestamp calculation Saeed Mahameed
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 26+ messages in thread
From: Saeed Mahameed @ 2022-11-22  2:25 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Roy Novich, Moshe Shemesh,
	Aya Levin

From: Roy Novich <royno@nvidia.com>

The driver should not interact with PCI while PCI is disabled. Trying to
do so may result in being unable to get vital signs during PCI reset,
driver gets timed out and fails to recover.

Fixes: fad1783a6d66 ("net/mlx5: Print more info on pci error handlers")
Signed-off-by: Roy Novich <royno@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/main.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 283c4cc28944..e58775a7d955 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1798,7 +1798,8 @@ static pci_ers_result_t mlx5_pci_err_detected(struct pci_dev *pdev,
 	res = state == pci_channel_io_perm_failure ?
 		PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_NEED_RESET;
 
-	mlx5_pci_trace(dev, "Exit, result = %d, %s\n",  res, result2str(res));
+	mlx5_core_info(dev, "%s Device state = %d pci_status: %d. Exit, result = %d, %s\n",
+		       __func__, dev->state, dev->pci_status, res, result2str(res));
 	return res;
 }
 
@@ -1837,7 +1838,8 @@ static pci_ers_result_t mlx5_pci_slot_reset(struct pci_dev *pdev)
 	struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
 	int err;
 
-	mlx5_pci_trace(dev, "Enter\n");
+	mlx5_core_info(dev, "%s Device state = %d pci_status: %d. Enter\n",
+		       __func__, dev->state, dev->pci_status);
 
 	err = mlx5_pci_enable_device(dev);
 	if (err) {
@@ -1859,7 +1861,8 @@ static pci_ers_result_t mlx5_pci_slot_reset(struct pci_dev *pdev)
 
 	res = PCI_ERS_RESULT_RECOVERED;
 out:
-	mlx5_pci_trace(dev, "Exit, err = %d, result = %d, %s\n", err, res, result2str(res));
+	mlx5_core_info(dev, "%s Device state = %d pci_status: %d. Exit, err = %d, result = %d, %s\n",
+		       __func__, dev->state, dev->pci_status, err, res, result2str(res));
 	return res;
 }
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [net 02/14] net/mlx5: Fix FW tracer timestamp calculation
  2022-11-22  2:25 [pull request][net 00/14] mlx5 fixes 2022-11-21 Saeed Mahameed
  2022-11-22  2:25 ` [net 01/14] net/mlx5: Do not query pci info while pci disabled Saeed Mahameed
@ 2022-11-22  2:25 ` Saeed Mahameed
  2022-11-22  2:25 ` [net 03/14] net/mlx5: SF: Fix probing active SFs during driver probe phase Saeed Mahameed
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Saeed Mahameed @ 2022-11-22  2:25 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Moshe Shemesh, Feras Daoud

From: Moshe Shemesh <moshe@nvidia.com>

Fix a bug in calculation of FW tracer timestamp. Decreasing one in the
calculation should effect only bits 52_7 and not effect bits 6_0 of the
timestamp, otherwise bits 6_0 are always set in this calculation.

Fixes: 70dd6fdb8987 ("net/mlx5: FW tracer, parse traces and kernel tracing support")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Feras Daoud <ferasda@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c
index 978a2bb8e122..21831386b26e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c
@@ -638,7 +638,7 @@ static void mlx5_tracer_handle_timestamp_trace(struct mlx5_fw_tracer *tracer,
 			trace_timestamp = (timestamp_event.timestamp & MASK_52_7) |
 					  (str_frmt->timestamp & MASK_6_0);
 		else
-			trace_timestamp = ((timestamp_event.timestamp & MASK_52_7) - 1) |
+			trace_timestamp = ((timestamp_event.timestamp - 1) & MASK_52_7) |
 					  (str_frmt->timestamp & MASK_6_0);
 
 		mlx5_tracer_print_trace(str_frmt, dev, trace_timestamp);
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [net 03/14] net/mlx5: SF: Fix probing active SFs during driver probe phase
  2022-11-22  2:25 [pull request][net 00/14] mlx5 fixes 2022-11-21 Saeed Mahameed
  2022-11-22  2:25 ` [net 01/14] net/mlx5: Do not query pci info while pci disabled Saeed Mahameed
  2022-11-22  2:25 ` [net 02/14] net/mlx5: Fix FW tracer timestamp calculation Saeed Mahameed
@ 2022-11-22  2:25 ` Saeed Mahameed
  2022-11-23 14:57   ` Maciej Fijalkowski
  2022-11-22  2:25 ` [net 04/14] net/mlx5: cmdif, Print info on any firmware cmd failure to tracepoint Saeed Mahameed
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 26+ messages in thread
From: Saeed Mahameed @ 2022-11-22  2:25 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Parav Pandit

From: Shay Drory <shayd@nvidia.com>

When SF devices and SF port representors are located on different
functions, unloading and reloading of SF parent driver doesn't recreate
the existing SF present in the device.
Fix it by querying SFs and probe active SFs during driver probe phase.

Fixes: 90d010b8634b ("net/mlx5: SF, Add auxiliary device support")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../ethernet/mellanox/mlx5/core/sf/dev/dev.c  | 88 +++++++++++++++++++
 1 file changed, 88 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c
index 7da012ff0d41..8e2abbab05f0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c
@@ -18,6 +18,10 @@ struct mlx5_sf_dev_table {
 	phys_addr_t base_address;
 	u64 sf_bar_length;
 	struct notifier_block nb;
+	struct mutex table_lock; /* Serializes sf life cycle and vhca state change handler */
+	struct workqueue_struct *active_wq;
+	struct work_struct work;
+	u8 stop_active_wq:1;
 	struct mlx5_core_dev *dev;
 };
 
@@ -168,6 +172,7 @@ mlx5_sf_dev_state_change_handler(struct notifier_block *nb, unsigned long event_
 		return 0;
 
 	sf_index = event->function_id - base_id;
+	mutex_lock(&table->table_lock);
 	sf_dev = xa_load(&table->devices, sf_index);
 	switch (event->new_vhca_state) {
 	case MLX5_VHCA_STATE_INVALID:
@@ -191,6 +196,7 @@ mlx5_sf_dev_state_change_handler(struct notifier_block *nb, unsigned long event_
 	default:
 		break;
 	}
+	mutex_unlock(&table->table_lock);
 	return 0;
 }
 
@@ -215,6 +221,78 @@ static int mlx5_sf_dev_vhca_arm_all(struct mlx5_sf_dev_table *table)
 	return 0;
 }
 
+static void mlx5_sf_dev_add_active_work(struct work_struct *work)
+{
+	struct mlx5_sf_dev_table *table = container_of(work, struct mlx5_sf_dev_table, work);
+	u32 out[MLX5_ST_SZ_DW(query_vhca_state_out)] = {};
+	struct mlx5_core_dev *dev = table->dev;
+	u16 max_functions;
+	u16 function_id;
+	u16 sw_func_id;
+	int err = 0;
+	u8 state;
+	int i;
+
+	max_functions = mlx5_sf_max_functions(dev);
+	function_id = MLX5_CAP_GEN(dev, sf_base_id);
+	for (i = 0; i < max_functions; i++, function_id++) {
+		if (table->stop_active_wq)
+			return;
+		err = mlx5_cmd_query_vhca_state(dev, function_id, out, sizeof(out));
+		if (err)
+			/* A failure of specific vhca doesn't mean others will
+			 * fail as well.
+			 */
+			continue;
+		state = MLX5_GET(query_vhca_state_out, out, vhca_state_context.vhca_state);
+		if (state != MLX5_VHCA_STATE_ACTIVE)
+			continue;
+
+		sw_func_id = MLX5_GET(query_vhca_state_out, out, vhca_state_context.sw_function_id);
+		mutex_lock(&table->table_lock);
+		/* Don't probe device which is already probe */
+		if (!xa_load(&table->devices, i))
+			mlx5_sf_dev_add(dev, i, function_id, sw_func_id);
+		/* There is a race where SF got inactive after the query
+		 * above. e.g.: the query returns that the state of the
+		 * SF is active, and after that the eswitch manager set it to
+		 * inactive.
+		 * This case cannot be managed in SW, since the probing of the
+		 * SF is on one system, and the inactivation is on a different
+		 * system.
+		 * If the inactive is done after the SF perform init_hca(),
+		 * the SF will fully probe and then removed. If it was
+		 * done before init_hca(), the SF probe will fail.
+		 */
+		mutex_unlock(&table->table_lock);
+	}
+}
+
+/* In case SFs are generated externally, probe active SFs */
+static int mlx5_sf_dev_queue_active_work(struct mlx5_sf_dev_table *table)
+{
+	if (MLX5_CAP_GEN(table->dev, eswitch_manager))
+		return 0; /* the table is local */
+
+	/* Use a workqueue to probe active SFs, which are in large
+	 * quantity and may take up to minutes to probe.
+	 */
+	table->active_wq = create_singlethread_workqueue("mlx5_active_sf");
+	if (!table->active_wq)
+		return -ENOMEM;
+	INIT_WORK(&table->work, &mlx5_sf_dev_add_active_work);
+	queue_work(table->active_wq, &table->work);
+	return 0;
+}
+
+static void mlx5_sf_dev_destroy_active_work(struct mlx5_sf_dev_table *table)
+{
+	if (table->active_wq) {
+		table->stop_active_wq = true;
+		destroy_workqueue(table->active_wq);
+	}
+}
+
 void mlx5_sf_dev_table_create(struct mlx5_core_dev *dev)
 {
 	struct mlx5_sf_dev_table *table;
@@ -240,11 +318,17 @@ void mlx5_sf_dev_table_create(struct mlx5_core_dev *dev)
 	table->base_address = pci_resource_start(dev->pdev, 2);
 	table->max_sfs = max_sfs;
 	xa_init(&table->devices);
+	mutex_init(&table->table_lock);
 	dev->priv.sf_dev_table = table;
 
 	err = mlx5_vhca_event_notifier_register(dev, &table->nb);
 	if (err)
 		goto vhca_err;
+
+	err = mlx5_sf_dev_queue_active_work(table);
+	if (err)
+		goto add_active_err;
+
 	err = mlx5_sf_dev_vhca_arm_all(table);
 	if (err)
 		goto arm_err;
@@ -252,6 +336,8 @@ void mlx5_sf_dev_table_create(struct mlx5_core_dev *dev)
 	return;
 
 arm_err:
+	mlx5_sf_dev_destroy_active_work(table);
+add_active_err:
 	mlx5_vhca_event_notifier_unregister(dev, &table->nb);
 vhca_err:
 	table->max_sfs = 0;
@@ -279,7 +365,9 @@ void mlx5_sf_dev_table_destroy(struct mlx5_core_dev *dev)
 	if (!table)
 		return;
 
+	mlx5_sf_dev_destroy_active_work(table);
 	mlx5_vhca_event_notifier_unregister(dev, &table->nb);
+	mutex_destroy(&table->table_lock);
 
 	/* Now that event handler is not running, it is safe to destroy
 	 * the sf device without race.
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [net 04/14] net/mlx5: cmdif, Print info on any firmware cmd failure to tracepoint
  2022-11-22  2:25 [pull request][net 00/14] mlx5 fixes 2022-11-21 Saeed Mahameed
                   ` (2 preceding siblings ...)
  2022-11-22  2:25 ` [net 03/14] net/mlx5: SF: Fix probing active SFs during driver probe phase Saeed Mahameed
@ 2022-11-22  2:25 ` Saeed Mahameed
  2022-11-23 15:06   ` Maciej Fijalkowski
  2022-11-22  2:25 ` [net 05/14] net/mlx5: Fix handling of entry refcount when command is not issued to FW Saeed Mahameed
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 26+ messages in thread
From: Saeed Mahameed @ 2022-11-22  2:25 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Moshe Shemesh, Shay Drory,
	Maor Gottlieb

From: Moshe Shemesh <moshe@nvidia.com>

While moving to new CMD API (quiet API), some pre-existing flows may call the new API
function that in case of error, returns the error instead of printing it as previously done.
For such flows we bring back the print but to tracepoint this time for sys admins to
have the ability to check for errors especially for commands using the new quiet API.

Tracepoint output example:
         devlink-1333    [001] .....   822.746922: mlx5_cmd: ACCESS_REG(0x805) op_mod(0x0) failed, status bad resource(0x5), syndrome (0xb06e1f), err(-22)

Fixes: f23519e542e5 ("net/mlx5: cmdif, Add new api for command execution")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 41 +++++++++--------
 .../mellanox/mlx5/core/diag/cmd_tracepoint.h  | 45 +++++++++++++++++++
 include/linux/mlx5/driver.h                   |  1 +
 3 files changed, 68 insertions(+), 19 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/diag/cmd_tracepoint.h

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 2e0d59ca62b5..df3e284ca5c6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -45,6 +45,8 @@
 #include "mlx5_core.h"
 #include "lib/eq.h"
 #include "lib/tout.h"
+#define CREATE_TRACE_POINTS
+#include "diag/cmd_tracepoint.h"
 
 enum {
 	CMD_IF_REV = 5,
@@ -785,27 +787,14 @@ EXPORT_SYMBOL(mlx5_cmd_out_err);
 static void cmd_status_print(struct mlx5_core_dev *dev, void *in, void *out)
 {
 	u16 opcode, op_mod;
-	u32 syndrome;
-	u8  status;
 	u16 uid;
-	int err;
-
-	syndrome = MLX5_GET(mbox_out, out, syndrome);
-	status = MLX5_GET(mbox_out, out, status);
 
 	opcode = MLX5_GET(mbox_in, in, opcode);
 	op_mod = MLX5_GET(mbox_in, in, op_mod);
 	uid    = MLX5_GET(mbox_in, in, uid);
 
-	err = cmd_status_to_err(status);
-
 	if (!uid && opcode != MLX5_CMD_OP_DESTROY_MKEY)
 		mlx5_cmd_out_err(dev, opcode, op_mod, out);
-	else
-		mlx5_core_dbg(dev,
-			"%s(0x%x) op_mod(0x%x) uid(%d) failed, status %s(0x%x), syndrome (0x%x), err(%d)\n",
-			mlx5_command_str(opcode), opcode, op_mod, uid,
-			cmd_status_str(status), status, syndrome, err);
 }
 
 int mlx5_cmd_check(struct mlx5_core_dev *dev, int err, void *in, void *out)
@@ -1892,6 +1881,16 @@ static int cmd_exec(struct mlx5_core_dev *dev, void *in, int in_size, void *out,
 	return err;
 }
 
+static void mlx5_cmd_err_trace(struct mlx5_core_dev *dev, u16 opcode, u16 op_mod, void *out)
+{
+	u32 syndrome = MLX5_GET(mbox_out, out, syndrome);
+	u8 status = MLX5_GET(mbox_out, out, status);
+
+	trace_mlx5_cmd(mlx5_command_str(opcode), opcode, op_mod,
+		       cmd_status_str(status), status, syndrome,
+		       cmd_status_to_err(status));
+}
+
 static void cmd_status_log(struct mlx5_core_dev *dev, u16 opcode, u8 status,
 			   u32 syndrome, int err)
 {
@@ -1914,7 +1913,7 @@ static void cmd_status_log(struct mlx5_core_dev *dev, u16 opcode, u8 status,
 }
 
 /* preserve -EREMOTEIO for outbox.status != OK, otherwise return err as is */
-static int cmd_status_err(struct mlx5_core_dev *dev, int err, u16 opcode, void *out)
+static int cmd_status_err(struct mlx5_core_dev *dev, int err, u16 opcode, u16 op_mod, void *out)
 {
 	u32 syndrome = MLX5_GET(mbox_out, out, syndrome);
 	u8 status = MLX5_GET(mbox_out, out, status);
@@ -1922,8 +1921,10 @@ static int cmd_status_err(struct mlx5_core_dev *dev, int err, u16 opcode, void *
 	if (err == -EREMOTEIO) /* -EREMOTEIO is preserved */
 		err = -EIO;
 
-	if (!err && status != MLX5_CMD_STAT_OK)
+	if (!err && status != MLX5_CMD_STAT_OK) {
 		err = -EREMOTEIO;
+		mlx5_cmd_err_trace(dev, opcode, op_mod, out);
+	}
 
 	cmd_status_log(dev, opcode, status, syndrome, err);
 	return err;
@@ -1951,9 +1952,9 @@ int mlx5_cmd_do(struct mlx5_core_dev *dev, void *in, int in_size, void *out, int
 {
 	int err = cmd_exec(dev, in, in_size, out, out_size, NULL, NULL, false);
 	u16 opcode = MLX5_GET(mbox_in, in, opcode);
+	u16 op_mod = MLX5_GET(mbox_in, in, op_mod);
 
-	err = cmd_status_err(dev, err, opcode, out);
-	return err;
+	return cmd_status_err(dev, err, opcode, op_mod, out);
 }
 EXPORT_SYMBOL(mlx5_cmd_do);
 
@@ -1997,8 +1998,9 @@ int mlx5_cmd_exec_polling(struct mlx5_core_dev *dev, void *in, int in_size,
 {
 	int err = cmd_exec(dev, in, in_size, out, out_size, NULL, NULL, true);
 	u16 opcode = MLX5_GET(mbox_in, in, opcode);
+	u16 op_mod = MLX5_GET(mbox_in, in, op_mod);
 
-	err = cmd_status_err(dev, err, opcode, out);
+	err = cmd_status_err(dev, err, opcode, op_mod, out);
 	return mlx5_cmd_check(dev, err, in, out);
 }
 EXPORT_SYMBOL(mlx5_cmd_exec_polling);
@@ -2034,7 +2036,7 @@ static void mlx5_cmd_exec_cb_handler(int status, void *_work)
 	struct mlx5_async_ctx *ctx;
 
 	ctx = work->ctx;
-	status = cmd_status_err(ctx->dev, status, work->opcode, work->out);
+	status = cmd_status_err(ctx->dev, status, work->opcode, work->op_mod, work->out);
 	work->user_callback(status, work);
 	if (atomic_dec_and_test(&ctx->num_inflight))
 		complete(&ctx->inflight_done);
@@ -2049,6 +2051,7 @@ int mlx5_cmd_exec_cb(struct mlx5_async_ctx *ctx, void *in, int in_size,
 	work->ctx = ctx;
 	work->user_callback = callback;
 	work->opcode = MLX5_GET(mbox_in, in, opcode);
+	work->op_mod = MLX5_GET(mbox_in, in, op_mod);
 	work->out = out;
 	if (WARN_ON(!atomic_inc_not_zero(&ctx->num_inflight)))
 		return -EIO;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/cmd_tracepoint.h b/drivers/net/ethernet/mellanox/mlx5/core/diag/cmd_tracepoint.h
new file mode 100644
index 000000000000..406ebe17405f
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/cmd_tracepoint.h
@@ -0,0 +1,45 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM mlx5
+
+#if !defined(_MLX5_CMD_TP_H_) || defined(TRACE_HEADER_MULTI_READ)
+#define _MLX5_CMD_TP_H_
+
+#include <linux/tracepoint.h>
+#include <linux/trace_seq.h>
+
+TRACE_EVENT(mlx5_cmd,
+	    TP_PROTO(const char *command_str, u16 opcode, u16 op_mod,
+		     const char *status_str, u8 status, u32 syndrome, int err),
+	    TP_ARGS(command_str, opcode, op_mod, status_str, status, syndrome, err),
+	    TP_STRUCT__entry(__string(command_str, command_str)
+			     __field(u16, opcode)
+			     __field(u16, op_mod)
+			    __string(status_str, status_str)
+			    __field(u8, status)
+			    __field(u32, syndrome)
+			    __field(int, err)
+			    ),
+	    TP_fast_assign(__assign_str(command_str, command_str);
+			__entry->opcode = opcode;
+			__entry->op_mod = op_mod;
+			__assign_str(status_str, status_str);
+			__entry->status = status;
+			__entry->syndrome = syndrome;
+			__entry->err = err;
+	    ),
+	    TP_printk("%s(0x%x) op_mod(0x%x) failed, status %s(0x%x), syndrome (0x%x), err(%d)",
+		      __get_str(command_str), __entry->opcode, __entry->op_mod,
+		      __get_str(status_str), __entry->status, __entry->syndrome,
+		      __entry->err)
+);
+
+#endif /* _MLX5_CMD_TP_H_ */
+
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH ./diag
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_FILE cmd_tracepoint
+#include <trace/define_trace.h>
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index af2ceb4160bc..06cbad166225 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -981,6 +981,7 @@ struct mlx5_async_work {
 	struct mlx5_async_ctx *ctx;
 	mlx5_async_cbk_t user_callback;
 	u16 opcode; /* cmd opcode */
+	u16 op_mod; /* cmd op_mod */
 	void *out; /* pointer to the cmd output buffer */
 };
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [net 05/14] net/mlx5: Fix handling of entry refcount when command is not issued to FW
  2022-11-22  2:25 [pull request][net 00/14] mlx5 fixes 2022-11-21 Saeed Mahameed
                   ` (3 preceding siblings ...)
  2022-11-22  2:25 ` [net 04/14] net/mlx5: cmdif, Print info on any firmware cmd failure to tracepoint Saeed Mahameed
@ 2022-11-22  2:25 ` Saeed Mahameed
  2022-11-22  2:25 ` [net 06/14] net/mlx5: Lag, avoid lockdep warnings Saeed Mahameed
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Saeed Mahameed @ 2022-11-22  2:25 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Moshe Shemesh,
	Eran Ben Elisha, Jack Wang

From: Moshe Shemesh <moshe@nvidia.com>

In case command interface is down, or the command is not allowed, driver
did not increment the entry refcount, but might have decrement as part
of forced completion handling.

Fix that by always increment and decrement the refcount to make it
symmetric for all flows.

Fixes: 50b2412b7e78 ("net/mlx5: Avoid possible free of command entry while timeout comp handler")
Signed-off-by: Eran Ben Elisha <eranbe@nvidia.com>
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reported-by: Jack Wang <jinpu.wang@ionos.com>
Tested-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index df3e284ca5c6..74bd05e5dda2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -1005,6 +1005,7 @@ static void cmd_work_handler(struct work_struct *work)
 		cmd_ent_get(ent);
 	set_bit(MLX5_CMD_ENT_STATE_PENDING_COMP, &ent->state);
 
+	cmd_ent_get(ent); /* for the _real_ FW event on completion */
 	/* Skip sending command to fw if internal error */
 	if (mlx5_cmd_is_down(dev) || !opcode_allowed(&dev->cmd, ent->op)) {
 		ent->ret = -ENXIO;
@@ -1012,7 +1013,6 @@ static void cmd_work_handler(struct work_struct *work)
 		return;
 	}
 
-	cmd_ent_get(ent); /* for the _real_ FW event on completion */
 	/* ring doorbell after the descriptor is valid */
 	mlx5_core_dbg(dev, "writing 0x%x to command doorbell\n", 1 << ent->idx);
 	wmb();
@@ -1661,8 +1661,8 @@ static void mlx5_cmd_comp_handler(struct mlx5_core_dev *dev, u64 vec, bool force
 				cmd_ent_put(ent); /* timeout work was canceled */
 
 			if (!forced || /* Real FW completion */
-			    pci_channel_offline(dev->pdev) || /* FW is inaccessible */
-			    dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
+			     mlx5_cmd_is_down(dev) || /* No real FW completion is expected */
+			     !opcode_allowed(cmd, ent->op))
 				cmd_ent_put(ent);
 
 			ent->ts2 = ktime_get_ns();
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [net 06/14] net/mlx5: Lag, avoid lockdep warnings
  2022-11-22  2:25 [pull request][net 00/14] mlx5 fixes 2022-11-21 Saeed Mahameed
                   ` (4 preceding siblings ...)
  2022-11-22  2:25 ` [net 05/14] net/mlx5: Fix handling of entry refcount when command is not issued to FW Saeed Mahameed
@ 2022-11-22  2:25 ` Saeed Mahameed
  2022-11-22  2:25 ` [net 07/14] net/mlx5: E-Switch, Set correctly vport destination Saeed Mahameed
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Saeed Mahameed @ 2022-11-22  2:25 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Eli Cohen, Mark Bloch

From: Eli Cohen <elic@nvidia.com>

ldev->lock is used to serialize lag change operations. Since multiport
eswtich functionality was added, we now change the mode dynamically.
However, acquiring ldev->lock is not allowed as it could possibly lead
to a deadlock as reported by the lockdep mechanism.

[  836.154963] WARNING: possible circular locking dependency detected
[  836.155850] 5.19.0-rc5_net_56b7df2 #1 Not tainted
[  836.156549] ------------------------------------------------------
[  836.157418] handler1/12198 is trying to acquire lock:
[  836.158178] ffff888187d52b58 (&ldev->lock){+.+.}-{3:3}, at: mlx5_lag_do_mirred+0x3b/0x70 [mlx5_core]
[  836.159575]
[  836.159575] but task is already holding lock:
[  836.160474] ffff8881d4de2930 (&block->cb_lock){++++}-{3:3}, at: tc_setup_cb_add+0x5b/0x200
[  836.161669] which lock already depends on the new lock.
[  836.162905]
[  836.162905] the existing dependency chain (in reverse order) is:
[  836.164008] -> #3 (&block->cb_lock){++++}-{3:3}:
[  836.164946]        down_write+0x25/0x60
[  836.165548]        tcf_block_get_ext+0x1c6/0x5d0
[  836.166253]        ingress_init+0x74/0xa0 [sch_ingress]
[  836.167028]        qdisc_create.constprop.0+0x130/0x5e0
[  836.167805]        tc_modify_qdisc+0x481/0x9f0
[  836.168490]        rtnetlink_rcv_msg+0x16e/0x5a0
[  836.169189]        netlink_rcv_skb+0x4e/0xf0
[  836.169861]        netlink_unicast+0x190/0x250
[  836.170543]        netlink_sendmsg+0x243/0x4b0
[  836.171226]        sock_sendmsg+0x33/0x40
[  836.171860]        ____sys_sendmsg+0x1d1/0x1f0
[  836.172535]        ___sys_sendmsg+0xab/0xf0
[  836.173183]        __sys_sendmsg+0x51/0x90
[  836.173836]        do_syscall_64+0x3d/0x90
[  836.174471]        entry_SYSCALL_64_after_hwframe+0x46/0xb0
[  836.175282]

[  836.175282] -> #2 (rtnl_mutex){+.+.}-{3:3}:
[  836.176190]        __mutex_lock+0x6b/0xf80
[  836.176830]        register_netdevice_notifier+0x21/0x120
[  836.177631]        rtnetlink_init+0x2d/0x1e9
[  836.178289]        netlink_proto_init+0x163/0x179
[  836.178994]        do_one_initcall+0x63/0x300
[  836.179672]        kernel_init_freeable+0x2cb/0x31b
[  836.180403]        kernel_init+0x17/0x140
[  836.181035]        ret_from_fork+0x1f/0x30

 [  836.181687] -> #1 (pernet_ops_rwsem){+.+.}-{3:3}:
[  836.182628]        down_write+0x25/0x60
[  836.183235]        unregister_netdevice_notifier+0x1c/0xb0
[  836.184029]        mlx5_ib_roce_cleanup+0x94/0x120 [mlx5_ib]
[  836.184855]        __mlx5_ib_remove+0x35/0x60 [mlx5_ib]
[  836.185637]        mlx5_eswitch_unregister_vport_reps+0x22f/0x440 [mlx5_core]
[  836.186698]        auxiliary_bus_remove+0x18/0x30
[  836.187409]        device_release_driver_internal+0x1f6/0x270
[  836.188253]        bus_remove_device+0xef/0x160
[  836.188939]        device_del+0x18b/0x3f0
[  836.189562]        mlx5_rescan_drivers_locked+0xd6/0x2d0 [mlx5_core]
[  836.190516]        mlx5_lag_remove_devices+0x69/0xe0 [mlx5_core]
[  836.191414]        mlx5_do_bond_work+0x441/0x620 [mlx5_core]
[  836.192278]        process_one_work+0x25c/0x590
[  836.192963]        worker_thread+0x4f/0x3d0
[  836.193609]        kthread+0xcb/0xf0
[  836.194189]        ret_from_fork+0x1f/0x30

[  836.194826] -> #0 (&ldev->lock){+.+.}-{3:3}:
[  836.195734]        __lock_acquire+0x15b8/0x2a10
[  836.196426]        lock_acquire+0xce/0x2d0
[  836.197057]        __mutex_lock+0x6b/0xf80
[  836.197708]        mlx5_lag_do_mirred+0x3b/0x70 [mlx5_core]
[  836.198575]        tc_act_parse_mirred+0x25b/0x800 [mlx5_core]
[  836.199467]        parse_tc_actions+0x168/0x5a0 [mlx5_core]
[  836.200340]        __mlx5e_add_fdb_flow+0x263/0x480 [mlx5_core]
[  836.201241]        mlx5e_configure_flower+0x8a0/0x1820 [mlx5_core]
[  836.202187]        tc_setup_cb_add+0xd7/0x200
[  836.202856]        fl_hw_replace_filter+0x14c/0x1f0 [cls_flower]
[  836.203739]        fl_change+0xbbe/0x1730 [cls_flower]
[  836.204501]        tc_new_tfilter+0x407/0xd90
[  836.205168]        rtnetlink_rcv_msg+0x406/0x5a0
[  836.205877]        netlink_rcv_skb+0x4e/0xf0
[  836.206535]        netlink_unicast+0x190/0x250
[  836.207217]        netlink_sendmsg+0x243/0x4b0
[  836.207915]        sock_sendmsg+0x33/0x40
[  836.208538]        ____sys_sendmsg+0x1d1/0x1f0
[  836.209219]        ___sys_sendmsg+0xab/0xf0
[  836.209878]        __sys_sendmsg+0x51/0x90
[  836.210510]        do_syscall_64+0x3d/0x90
[  836.211137]        entry_SYSCALL_64_after_hwframe+0x46/0xb0

[  836.211954] other info that might help us debug this:
[  836.213174] Chain exists of:
[  836.213174]   &ldev->lock --> rtnl_mutex --> &block->cb_lock
   836.214650]  Possible unsafe locking scenario:
[  836.214650]
[  836.215574]        CPU0                    CPU1
[  836.216255]        ----                    ----
[  836.216943]   lock(&block->cb_lock);
[  836.217518]                                lock(rtnl_mutex);
[  836.218348]                                lock(&block->cb_lock);
[  836.219212]   lock(&ldev->lock);
[  836.219758]
[  836.219758]  *** DEADLOCK ***
[  836.219758]
 [  836.220747] 2 locks held by handler1/12198:
[  836.221390]  #0: ffff8881d4de2930 (&block->cb_lock){++++}-{3:3}, at: tc_setup_cb_add+0x5b/0x200
[  836.222646]  #1: ffff88810c9a92c0 (&esw->mode_lock){++++}-{3:3}, at: mlx5_esw_hold+0x39/0x50 [mlx5_core]

[  836.224063] stack backtrace:
[  836.224799] CPU: 6 PID: 12198 Comm: handler1 Not tainted 5.19.0-rc5_net_56b7df2 #1
[  836.225923] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[  836.227476] Call Trace:
[  836.227929]  <TASK>
[  836.228332]  dump_stack_lvl+0x57/0x7d
[  836.228924]  check_noncircular+0x104/0x120
[  836.229562]  __lock_acquire+0x15b8/0x2a10
[  836.230201]  lock_acquire+0xce/0x2d0
[  836.230776]  ? mlx5_lag_do_mirred+0x3b/0x70 [mlx5_core]
[  836.231614]  ? find_held_lock+0x2b/0x80
[  836.232221]  __mutex_lock+0x6b/0xf80
[  836.232799]  ? mlx5_lag_do_mirred+0x3b/0x70 [mlx5_core]
[  836.233636]  ? mlx5_lag_do_mirred+0x3b/0x70 [mlx5_core]
[  836.234451]  ? xa_load+0xc3/0x190
[  836.234995]  mlx5_lag_do_mirred+0x3b/0x70 [mlx5_core]
[  836.235803]  tc_act_parse_mirred+0x25b/0x800 [mlx5_core]
[  836.236636]  ? tc_act_can_offload_mirred+0x135/0x210 [mlx5_core]
[  836.237550]  parse_tc_actions+0x168/0x5a0 [mlx5_core]
[  836.238364]  __mlx5e_add_fdb_flow+0x263/0x480 [mlx5_core]
[  836.239202]  mlx5e_configure_flower+0x8a0/0x1820 [mlx5_core]
[  836.240076]  ? lock_acquire+0xce/0x2d0
[  836.240668]  ? tc_setup_cb_add+0x5b/0x200
[  836.241294]  tc_setup_cb_add+0xd7/0x200
[  836.241917]  fl_hw_replace_filter+0x14c/0x1f0 [cls_flower]
[  836.242709]  fl_change+0xbbe/0x1730 [cls_flower]
[  836.243408]  tc_new_tfilter+0x407/0xd90
[  836.244043]  ? tc_del_tfilter+0x880/0x880
[  836.244672]  rtnetlink_rcv_msg+0x406/0x5a0
[  836.245310]  ? netlink_deliver_tap+0x7a/0x4b0
[  836.245991]  ? if_nlmsg_stats_size+0x2b0/0x2b0
[  836.246675]  netlink_rcv_skb+0x4e/0xf0
[  836.258046]  netlink_unicast+0x190/0x250
[  836.258669]  netlink_sendmsg+0x243/0x4b0
[  836.259288]  sock_sendmsg+0x33/0x40
[  836.259857]  ____sys_sendmsg+0x1d1/0x1f0
[  836.260473]  ___sys_sendmsg+0xab/0xf0
[  836.261064]  ? lock_acquire+0xce/0x2d0
[  836.261669]  ? find_held_lock+0x2b/0x80
[  836.262272]  ? __fget_files+0xb9/0x190
[  836.262871]  ? __fget_files+0xd3/0x190
[  836.263462]  __sys_sendmsg+0x51/0x90
[  836.264064]  do_syscall_64+0x3d/0x90
[  836.264652]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[  836.265425] RIP: 0033:0x7fdbe5e2677d

[  836.266012] Code: 28 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 ba ee
ff ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f
05 <48> 3d 00 f0 ff ff 77 33 44 89 c7 48 89 44 24 08 e8 ee ee ff ff 48
[  836.268485] RSP: 002b:00007fdbe48a75a0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
[  836.269598] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fdbe5e2677d
[  836.270576] RDX: 0000000000000000 RSI: 00007fdbe48a7640 RDI: 000000000000003c
[  836.271565] RBP: 00007fdbe48a8368 R08: 0000000000000000 R09: 0000000000000000
[  836.272546] R10: 00007fdbe48a84b0 R11: 0000000000000293 R12: 0000557bd17dc860
[  836.273527] R13: 0000000000000000 R14: 0000557bd17dc860 R15: 00007fdbe48a7640

[  836.274521]  </TASK>

To avoid using mode holding ldev->lock in the configure flow, we queue a
work to the lag workqueue and cease wait on a completion object.

In addition, we remove the lock from mlx5_lag_do_mirred() since it is
not really protecting anything.

It should be noted that an actual deadlock has not been observed.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/lag/lag.c |   3 +-
 .../net/ethernet/mellanox/mlx5/core/lag/lag.h |  14 ++-
 .../ethernet/mellanox/mlx5/core/lag/mpesw.c   | 100 +++++++++++-------
 .../ethernet/mellanox/mlx5/core/lag/mpesw.h   |   1 -
 4 files changed, 78 insertions(+), 40 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
index a9f4ede4a9bf..be1307a63e6d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
@@ -228,9 +228,8 @@ static void mlx5_ldev_free(struct kref *ref)
 	if (ldev->nb.notifier_call)
 		unregister_netdevice_notifier_net(&init_net, &ldev->nb);
 	mlx5_lag_mp_cleanup(ldev);
-	mlx5_lag_mpesw_cleanup(ldev);
-	cancel_work_sync(&ldev->mpesw_work);
 	destroy_workqueue(ldev->wq);
+	mlx5_lag_mpesw_cleanup(ldev);
 	mutex_destroy(&ldev->lock);
 	kfree(ldev);
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
index ce2ce8ccbd70..f30ac2de639f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
@@ -50,6 +50,19 @@ struct lag_tracker {
 	enum netdev_lag_hash hash_type;
 };
 
+enum mpesw_op {
+	MLX5_MPESW_OP_ENABLE,
+	MLX5_MPESW_OP_DISABLE,
+};
+
+struct mlx5_mpesw_work_st {
+	struct work_struct work;
+	struct mlx5_lag    *lag;
+	enum mpesw_op	   op;
+	struct completion  comp;
+	int result;
+};
+
 /* LAG data of a ConnectX card.
  * It serves both its phys functions.
  */
@@ -66,7 +79,6 @@ struct mlx5_lag {
 	struct lag_tracker        tracker;
 	struct workqueue_struct   *wq;
 	struct delayed_work       bond_work;
-	struct work_struct	  mpesw_work;
 	struct notifier_block     nb;
 	struct lag_mp             lag_mp;
 	struct mlx5_lag_port_sel  port_sel;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c
index f643202b29c6..c17e8f1ec914 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c
@@ -7,63 +7,95 @@
 #include "eswitch.h"
 #include "lib/mlx5.h"
 
-void mlx5_mpesw_work(struct work_struct *work)
+static int add_mpesw_rule(struct mlx5_lag *ldev)
 {
-	struct mlx5_lag *ldev = container_of(work, struct mlx5_lag, mpesw_work);
+	struct mlx5_core_dev *dev = ldev->pf[MLX5_LAG_P1].dev;
+	int err;
 
-	mutex_lock(&ldev->lock);
-	mlx5_disable_lag(ldev);
-	mutex_unlock(&ldev->lock);
-}
+	if (atomic_add_return(1, &ldev->lag_mpesw.mpesw_rule_count) != 1)
+		return 0;
 
-static void mlx5_lag_disable_mpesw(struct mlx5_core_dev *dev)
-{
-	struct mlx5_lag *ldev = dev->priv.lag;
+	if (ldev->mode != MLX5_LAG_MODE_NONE) {
+		err = -EINVAL;
+		goto out_err;
+	}
 
-	if (!queue_work(ldev->wq, &ldev->mpesw_work))
-		mlx5_core_warn(dev, "failed to queue work\n");
+	err = mlx5_activate_lag(ldev, NULL, MLX5_LAG_MODE_MPESW, false);
+	if (err) {
+		mlx5_core_warn(dev, "Failed to create LAG in MPESW mode (%d)\n", err);
+		goto out_err;
+	}
+
+	return 0;
+
+out_err:
+	atomic_dec(&ldev->lag_mpesw.mpesw_rule_count);
+	return err;
 }
 
-void mlx5_lag_del_mpesw_rule(struct mlx5_core_dev *dev)
+static void del_mpesw_rule(struct mlx5_lag *ldev)
 {
-	struct mlx5_lag *ldev = dev->priv.lag;
+	if (!atomic_dec_return(&ldev->lag_mpesw.mpesw_rule_count) &&
+	    ldev->mode == MLX5_LAG_MODE_MPESW)
+		mlx5_disable_lag(ldev);
+}
 
-	if (!ldev)
-		return;
+static void mlx5_mpesw_work(struct work_struct *work)
+{
+	struct mlx5_mpesw_work_st *mpesww = container_of(work, struct mlx5_mpesw_work_st, work);
+	struct mlx5_lag *ldev = mpesww->lag;
 
 	mutex_lock(&ldev->lock);
-	if (!atomic_dec_return(&ldev->lag_mpesw.mpesw_rule_count) &&
-	    ldev->mode == MLX5_LAG_MODE_MPESW)
-		mlx5_lag_disable_mpesw(dev);
+	if (mpesww->op == MLX5_MPESW_OP_ENABLE)
+		mpesww->result = add_mpesw_rule(ldev);
+	else if (mpesww->op == MLX5_MPESW_OP_DISABLE)
+		del_mpesw_rule(ldev);
 	mutex_unlock(&ldev->lock);
+
+	complete(&mpesww->comp);
 }
 
-int mlx5_lag_add_mpesw_rule(struct mlx5_core_dev *dev)
+static int mlx5_lag_mpesw_queue_work(struct mlx5_core_dev *dev,
+				     enum mpesw_op op)
 {
 	struct mlx5_lag *ldev = dev->priv.lag;
+	struct mlx5_mpesw_work_st *work;
 	int err = 0;
 
 	if (!ldev)
 		return 0;
 
-	mutex_lock(&ldev->lock);
-	if (atomic_add_return(1, &ldev->lag_mpesw.mpesw_rule_count) != 1)
-		goto out;
+	work = kzalloc(sizeof(*work), GFP_KERNEL);
+	if (!work)
+		return -ENOMEM;
 
-	if (ldev->mode != MLX5_LAG_MODE_NONE) {
+	INIT_WORK(&work->work, mlx5_mpesw_work);
+	init_completion(&work->comp);
+	work->op = op;
+	work->lag = ldev;
+
+	if (!queue_work(ldev->wq, &work->work)) {
+		mlx5_core_warn(dev, "failed to queue mpesw work\n");
 		err = -EINVAL;
 		goto out;
 	}
-
-	err = mlx5_activate_lag(ldev, NULL, MLX5_LAG_MODE_MPESW, false);
-	if (err)
-		mlx5_core_warn(dev, "Failed to create LAG in MPESW mode (%d)\n", err);
-
+	wait_for_completion(&work->comp);
+	err = work->result;
 out:
-	mutex_unlock(&ldev->lock);
+	kfree(work);
 	return err;
 }
 
+void mlx5_lag_del_mpesw_rule(struct mlx5_core_dev *dev)
+{
+	mlx5_lag_mpesw_queue_work(dev, MLX5_MPESW_OP_DISABLE);
+}
+
+int mlx5_lag_add_mpesw_rule(struct mlx5_core_dev *dev)
+{
+	return mlx5_lag_mpesw_queue_work(dev, MLX5_MPESW_OP_ENABLE);
+}
+
 int mlx5_lag_do_mirred(struct mlx5_core_dev *mdev, struct net_device *out_dev)
 {
 	struct mlx5_lag *ldev = mdev->priv.lag;
@@ -71,12 +103,9 @@ int mlx5_lag_do_mirred(struct mlx5_core_dev *mdev, struct net_device *out_dev)
 	if (!netif_is_bond_master(out_dev) || !ldev)
 		return 0;
 
-	mutex_lock(&ldev->lock);
-	if (ldev->mode == MLX5_LAG_MODE_MPESW) {
-		mutex_unlock(&ldev->lock);
+	if (ldev->mode == MLX5_LAG_MODE_MPESW)
 		return -EOPNOTSUPP;
-	}
-	mutex_unlock(&ldev->lock);
+
 	return 0;
 }
 
@@ -90,11 +119,10 @@ bool mlx5_lag_mpesw_is_activated(struct mlx5_core_dev *dev)
 
 void mlx5_lag_mpesw_init(struct mlx5_lag *ldev)
 {
-	INIT_WORK(&ldev->mpesw_work, mlx5_mpesw_work);
 	atomic_set(&ldev->lag_mpesw.mpesw_rule_count, 0);
 }
 
 void mlx5_lag_mpesw_cleanup(struct mlx5_lag *ldev)
 {
-	cancel_delayed_work_sync(&ldev->bond_work);
+	WARN_ON(atomic_read(&ldev->lag_mpesw.mpesw_rule_count));
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.h b/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.h
index be4abcb8fcd5..88e8daffcf92 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.h
@@ -12,7 +12,6 @@ struct lag_mpesw {
 	atomic_t mpesw_rule_count;
 };
 
-void mlx5_mpesw_work(struct work_struct *work);
 int mlx5_lag_do_mirred(struct mlx5_core_dev *mdev, struct net_device *out_dev);
 bool mlx5_lag_mpesw_is_activated(struct mlx5_core_dev *dev);
 #if IS_ENABLED(CONFIG_MLX5_ESWITCH)
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [net 07/14] net/mlx5: E-Switch, Set correctly vport destination
  2022-11-22  2:25 [pull request][net 00/14] mlx5 fixes 2022-11-21 Saeed Mahameed
                   ` (5 preceding siblings ...)
  2022-11-22  2:25 ` [net 06/14] net/mlx5: Lag, avoid lockdep warnings Saeed Mahameed
@ 2022-11-22  2:25 ` Saeed Mahameed
  2022-11-22  2:25 ` [net 08/14] net/mlx5: Fix sync reset event handler error flow Saeed Mahameed
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Saeed Mahameed @ 2022-11-22  2:25 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Roi Dayan, Chris Mi

From: Roi Dayan <roid@nvidia.com>

The cited commit moved from using reformat_id integer to packet_reformat
pointer which introduced the possibility to null pointer dereference.
When setting packet reformat flag and pkt_reformat pointer must
exists so checking MLX5_ESW_DEST_ENCAP is not enough, we need
to make sure the pkt_reformat is valid and check for MLX5_ESW_DEST_ENCAP_VALID.
If the dest encap valid flag does not exists then pkt_reformat can be
either invalid address or null.
Also, to make sure we don't try to access invalid pkt_reformat set it to
null when invalidated and invalidate it before calling add flow code as
its logically more correct and to be safe.

Fixes: 2b688ea5efde ("net/mlx5: Add flow steering actions to fs_cmd shim layer")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Chris Mi <cmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c  | 10 ++++++----
 .../net/ethernet/mellanox/mlx5/core/eswitch_offloads.c |  2 +-
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
index 5aff97914367..5b6a79d2034e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
@@ -224,15 +224,16 @@ void mlx5e_tc_encap_flows_del(struct mlx5e_priv *priv,
 	list_for_each_entry(flow, flow_list, tmp_list) {
 		if (!mlx5e_is_offloaded_flow(flow) || flow_flag_test(flow, SLOW))
 			continue;
-		spec = &flow->attr->parse_attr->spec;
-
-		/* update from encap rule to slow path rule */
-		rule = mlx5e_tc_offload_to_slow_path(esw, flow, spec);
 
 		attr = mlx5e_tc_get_encap_attr(flow);
 		esw_attr = attr->esw_attr;
 		/* mark the flow's encap dest as non-valid */
 		esw_attr->dests[flow->tmp_entry_index].flags &= ~MLX5_ESW_DEST_ENCAP_VALID;
+		esw_attr->dests[flow->tmp_entry_index].pkt_reformat = NULL;
+
+		/* update from encap rule to slow path rule */
+		spec = &flow->attr->parse_attr->spec;
+		rule = mlx5e_tc_offload_to_slow_path(esw, flow, spec);
 
 		if (IS_ERR(rule)) {
 			err = PTR_ERR(rule);
@@ -251,6 +252,7 @@ void mlx5e_tc_encap_flows_del(struct mlx5e_priv *priv,
 	/* we know that the encap is valid */
 	e->flags &= ~MLX5_ENCAP_ENTRY_VALID;
 	mlx5_packet_reformat_dealloc(priv->mdev, e->pkt_reformat);
+	e->pkt_reformat = NULL;
 }
 
 static void mlx5e_take_tmp_flow(struct mlx5e_tc_flow *flow,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 728ca9f2bb9d..3fda75fe168c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -433,7 +433,7 @@ esw_setup_vport_dest(struct mlx5_flow_destination *dest, struct mlx5_flow_act *f
 		    mlx5_lag_mpesw_is_activated(esw->dev))
 			dest[dest_idx].type = MLX5_FLOW_DESTINATION_TYPE_UPLINK;
 	}
-	if (esw_attr->dests[attr_idx].flags & MLX5_ESW_DEST_ENCAP) {
+	if (esw_attr->dests[attr_idx].flags & MLX5_ESW_DEST_ENCAP_VALID) {
 		if (pkt_reformat) {
 			flow_act->action |= MLX5_FLOW_CONTEXT_ACTION_PACKET_REFORMAT;
 			flow_act->pkt_reformat = esw_attr->dests[attr_idx].pkt_reformat;
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [net 08/14] net/mlx5: Fix sync reset event handler error flow
  2022-11-22  2:25 [pull request][net 00/14] mlx5 fixes 2022-11-21 Saeed Mahameed
                   ` (6 preceding siblings ...)
  2022-11-22  2:25 ` [net 07/14] net/mlx5: E-Switch, Set correctly vport destination Saeed Mahameed
@ 2022-11-22  2:25 ` Saeed Mahameed
  2022-11-22  2:25 ` [net 09/14] net/mlx5e: Fix missing alignment in size of MTT/KLM entries Saeed Mahameed
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Saeed Mahameed @ 2022-11-22  2:25 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Moshe Shemesh, Aya Levin

From: Moshe Shemesh <moshe@nvidia.com>

When sync reset now event handling fails on mlx5_pci_link_toggle() then
no reset was done. However, since mlx5_cmd_fast_teardown_hca() was
already done, the firmware function is closed and the driver is left
without firmware functionality.

Fix it by setting device error state and reopen the firmware resources.
Reopening is done by the thread that was called for devlink reload
fw_activate as it already holds the devlink lock.

Fixes: 5ec697446f46 ("net/mlx5: Add support for devlink reload action fw activate")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
index 9d908a0ccfef..1e46f9afa40e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
@@ -9,7 +9,8 @@ enum {
 	MLX5_FW_RESET_FLAGS_RESET_REQUESTED,
 	MLX5_FW_RESET_FLAGS_NACK_RESET_REQUEST,
 	MLX5_FW_RESET_FLAGS_PENDING_COMP,
-	MLX5_FW_RESET_FLAGS_DROP_NEW_REQUESTS
+	MLX5_FW_RESET_FLAGS_DROP_NEW_REQUESTS,
+	MLX5_FW_RESET_FLAGS_RELOAD_REQUIRED
 };
 
 struct mlx5_fw_reset {
@@ -406,7 +407,7 @@ static void mlx5_sync_reset_now_event(struct work_struct *work)
 	err = mlx5_pci_link_toggle(dev);
 	if (err) {
 		mlx5_core_warn(dev, "mlx5_pci_link_toggle failed, no reset done, err %d\n", err);
-		goto done;
+		set_bit(MLX5_FW_RESET_FLAGS_RELOAD_REQUIRED, &fw_reset->reset_flags);
 	}
 
 	mlx5_enter_error_state(dev, true);
@@ -482,6 +483,10 @@ int mlx5_fw_reset_wait_reset_done(struct mlx5_core_dev *dev)
 		goto out;
 	}
 	err = fw_reset->ret;
+	if (test_and_clear_bit(MLX5_FW_RESET_FLAGS_RELOAD_REQUIRED, &fw_reset->reset_flags)) {
+		mlx5_unload_one_devl_locked(dev);
+		mlx5_load_one_devl_locked(dev, false);
+	}
 out:
 	clear_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, &fw_reset->reset_flags);
 	return err;
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [net 09/14] net/mlx5e: Fix missing alignment in size of MTT/KLM entries
  2022-11-22  2:25 [pull request][net 00/14] mlx5 fixes 2022-11-21 Saeed Mahameed
                   ` (7 preceding siblings ...)
  2022-11-22  2:25 ` [net 08/14] net/mlx5: Fix sync reset event handler error flow Saeed Mahameed
@ 2022-11-22  2:25 ` Saeed Mahameed
  2022-11-22  2:25 ` [net 10/14] net/mlx5e: Offload rule only when all encaps are valid Saeed Mahameed
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Saeed Mahameed @ 2022-11-22  2:25 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Gal Pressman

From: Tariq Toukan <tariqt@nvidia.com>

In the cited patch, an alignment required by the HW spec was mistakenly
dropped. Bring it back to fix error completions like the below:

mlx5_core 0000:00:08.0 eth2: Error cqe on cqn 0x40b, ci 0x0, qn 0x104f, opcode 0xd, syndrome 0x2, vendor syndrome 0x68
00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000030: 00 00 00 00 86 00 68 02 25 00 10 4f 00 00 bb d2
WQE DUMP: WQ size 1024 WQ cur size 0, WQE index 0x0, len: 192
00000000: 00 00 00 25 00 10 4f 0c 00 00 00 00 00 18 2e 00
00000010: 90 00 00 00 00 02 00 00 00 00 00 00 20 00 00 00
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000080: 08 00 00 00 48 6a 00 02 08 00 00 00 0e 10 00 02
00000090: 08 00 00 00 0c db 00 02 08 00 00 00 0e 82 00 02
000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Fixes: 9f123f740428 ("net/mlx5e: Improve MTT/KSM alignment")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index e3a4f01bcceb..5e41dfdf79c8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -206,10 +206,11 @@ static void mlx5e_disable_blocking_events(struct mlx5e_priv *priv)
 static u16 mlx5e_mpwrq_umr_octowords(u32 entries, enum mlx5e_mpwrq_umr_mode umr_mode)
 {
 	u8 umr_entry_size = mlx5e_mpwrq_umr_entry_size(umr_mode);
+	u32 sz;
 
-	WARN_ON_ONCE(entries * umr_entry_size % MLX5_OCTWORD);
+	sz = ALIGN(entries * umr_entry_size, MLX5_UMR_MTT_ALIGNMENT);
 
-	return entries * umr_entry_size / MLX5_OCTWORD;
+	return sz / MLX5_OCTWORD;
 }
 
 static inline void mlx5e_build_umr_wqe(struct mlx5e_rq *rq,
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [net 10/14] net/mlx5e: Offload rule only when all encaps are valid
  2022-11-22  2:25 [pull request][net 00/14] mlx5 fixes 2022-11-21 Saeed Mahameed
                   ` (8 preceding siblings ...)
  2022-11-22  2:25 ` [net 09/14] net/mlx5e: Fix missing alignment in size of MTT/KLM entries Saeed Mahameed
@ 2022-11-22  2:25 ` Saeed Mahameed
  2022-11-22  2:25 ` [net 11/14] net/mlx5e: Remove leftovers from old XSK queues enumeration Saeed Mahameed
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Saeed Mahameed @ 2022-11-22  2:25 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Chris Mi, Roi Dayan

From: Chris Mi <cmi@nvidia.com>

The cited commit adds a for loop to support multiple encapsulations.
But it only checks if the last encap is valid.

Fix it by setting slow path flag when one of the encap is invalid.

Fixes: f493f15534ec ("net/mlx5e: Move flow attr reformat action bit to per dest flags")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../mellanox/mlx5/core/en/tc_tun_encap.c        |  6 ++----
 .../mellanox/mlx5/core/en/tc_tun_encap.h        |  3 +--
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 17 ++++++-----------
 3 files changed, 9 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
index 5b6a79d2034e..ff73d25bc6eb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
@@ -764,8 +764,7 @@ int mlx5e_attach_encap(struct mlx5e_priv *priv,
 		       struct net_device *mirred_dev,
 		       int out_index,
 		       struct netlink_ext_ack *extack,
-		       struct net_device **encap_dev,
-		       bool *encap_valid)
+		       struct net_device **encap_dev)
 {
 	struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
 	struct mlx5e_tc_flow_parse_attr *parse_attr;
@@ -880,9 +879,8 @@ int mlx5e_attach_encap(struct mlx5e_priv *priv,
 	if (e->flags & MLX5_ENCAP_ENTRY_VALID) {
 		attr->esw_attr->dests[out_index].pkt_reformat = e->pkt_reformat;
 		attr->esw_attr->dests[out_index].flags |= MLX5_ESW_DEST_ENCAP_VALID;
-		*encap_valid = true;
 	} else {
-		*encap_valid = false;
+		flow_flag_set(flow, SLOW);
 	}
 	mutex_unlock(&esw->offloads.encap_tbl_lock);
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.h b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.h
index d542b8476491..8ad273dde40e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.h
@@ -17,8 +17,7 @@ int mlx5e_attach_encap(struct mlx5e_priv *priv,
 		       struct net_device *mirred_dev,
 		       int out_index,
 		       struct netlink_ext_ack *extack,
-		       struct net_device **encap_dev,
-		       bool *encap_valid);
+		       struct net_device **encap_dev);
 
 int mlx5e_attach_decap(struct mlx5e_priv *priv,
 		       struct mlx5e_tc_flow *flow,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 5a6aa61ec82a..bd9936af4582 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -1634,7 +1634,6 @@ set_encap_dests(struct mlx5e_priv *priv,
 		struct mlx5e_tc_flow *flow,
 		struct mlx5_flow_attr *attr,
 		struct netlink_ext_ack *extack,
-		bool *encap_valid,
 		bool *vf_tun)
 {
 	struct mlx5e_tc_flow_parse_attr *parse_attr;
@@ -1651,7 +1650,6 @@ set_encap_dests(struct mlx5e_priv *priv,
 	parse_attr = attr->parse_attr;
 	esw_attr = attr->esw_attr;
 	*vf_tun = false;
-	*encap_valid = true;
 
 	for (out_index = 0; out_index < MLX5_MAX_FLOW_FWD_VPORTS; out_index++) {
 		struct net_device *out_dev;
@@ -1668,7 +1666,7 @@ set_encap_dests(struct mlx5e_priv *priv,
 			goto out;
 		}
 		err = mlx5e_attach_encap(priv, flow, attr, out_dev, out_index,
-					 extack, &encap_dev, encap_valid);
+					 extack, &encap_dev);
 		dev_put(out_dev);
 		if (err)
 			goto out;
@@ -1732,8 +1730,8 @@ mlx5e_tc_add_fdb_flow(struct mlx5e_priv *priv,
 	struct mlx5e_tc_flow_parse_attr *parse_attr;
 	struct mlx5_flow_attr *attr = flow->attr;
 	struct mlx5_esw_flow_attr *esw_attr;
-	bool vf_tun, encap_valid;
 	u32 max_prio, max_chain;
+	bool vf_tun;
 	int err = 0;
 
 	parse_attr = attr->parse_attr;
@@ -1823,7 +1821,7 @@ mlx5e_tc_add_fdb_flow(struct mlx5e_priv *priv,
 		esw_attr->int_port = int_port;
 	}
 
-	err = set_encap_dests(priv, flow, attr, extack, &encap_valid, &vf_tun);
+	err = set_encap_dests(priv, flow, attr, extack, &vf_tun);
 	if (err)
 		goto err_out;
 
@@ -1853,7 +1851,7 @@ mlx5e_tc_add_fdb_flow(struct mlx5e_priv *priv,
 	 * (1) there's no error
 	 * (2) there's an encap action and we don't have valid neigh
 	 */
-	if (!encap_valid || flow_flag_test(flow, SLOW))
+	if (flow_flag_test(flow, SLOW))
 		flow->rule[0] = mlx5e_tc_offload_to_slow_path(esw, flow, &parse_attr->spec);
 	else
 		flow->rule[0] = mlx5e_tc_offload_fdb_rules(esw, flow, &parse_attr->spec, attr);
@@ -3759,7 +3757,7 @@ alloc_flow_post_acts(struct mlx5e_tc_flow *flow, struct netlink_ext_ack *extack)
 	struct mlx5e_post_act *post_act = get_post_action(flow->priv);
 	struct mlx5_flow_attr *attr, *next_attr = NULL;
 	struct mlx5e_post_act_handle *handle;
-	bool vf_tun, encap_valid = true;
+	bool vf_tun;
 	int err;
 
 	/* This is going in reverse order as needed.
@@ -3781,13 +3779,10 @@ alloc_flow_post_acts(struct mlx5e_tc_flow *flow, struct netlink_ext_ack *extack)
 		if (list_is_last(&attr->list, &flow->attrs))
 			break;
 
-		err = set_encap_dests(flow->priv, flow, attr, extack, &encap_valid, &vf_tun);
+		err = set_encap_dests(flow->priv, flow, attr, extack, &vf_tun);
 		if (err)
 			goto out_free;
 
-		if (!encap_valid)
-			flow_flag_set(flow, SLOW);
-
 		err = actions_prepare_mod_hdr_actions(flow->priv, flow, attr, extack);
 		if (err)
 			goto out_free;
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [net 11/14] net/mlx5e: Remove leftovers from old XSK queues enumeration
  2022-11-22  2:25 [pull request][net 00/14] mlx5 fixes 2022-11-21 Saeed Mahameed
                   ` (9 preceding siblings ...)
  2022-11-22  2:25 ` [net 10/14] net/mlx5e: Offload rule only when all encaps are valid Saeed Mahameed
@ 2022-11-22  2:25 ` Saeed Mahameed
  2022-11-22  2:25 ` [net 12/14] net/mlx5e: Fix MACsec SA initialization routine Saeed Mahameed
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Saeed Mahameed @ 2022-11-22  2:25 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Gal Pressman

From: Tariq Toukan <tariqt@nvidia.com>

Before the cited commit, for N channels, a dedicated set of N queues was
created to support XSK, in indices [N, 2N-1], doubling the number of
queues.

In addition, changing the number of channels was prohibited, as it would
shift the indices.

Remove these two leftovers, as we moved XSK to a new queueing scheme,
starting from index 0.

Fixes: 3db4c85cde7a ("net/mlx5e: xsk: Use queue indices starting from 0 for XSK queues")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../ethernet/mellanox/mlx5/core/en_ethtool.c   | 18 ------------------
 1 file changed, 18 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 24aa25da482b..1728e197558d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -35,7 +35,6 @@
 #include "en.h"
 #include "en/port.h"
 #include "en/params.h"
-#include "en/xsk/pool.h"
 #include "en/ptp.h"
 #include "lib/clock.h"
 #include "en/fs_ethtool.h"
@@ -412,15 +411,8 @@ void mlx5e_ethtool_get_channels(struct mlx5e_priv *priv,
 				struct ethtool_channels *ch)
 {
 	mutex_lock(&priv->state_lock);
-
 	ch->max_combined   = priv->max_nch;
 	ch->combined_count = priv->channels.params.num_channels;
-	if (priv->xsk.refcnt) {
-		/* The upper half are XSK queues. */
-		ch->max_combined *= 2;
-		ch->combined_count *= 2;
-	}
-
 	mutex_unlock(&priv->state_lock);
 }
 
@@ -454,16 +446,6 @@ int mlx5e_ethtool_set_channels(struct mlx5e_priv *priv,
 
 	mutex_lock(&priv->state_lock);
 
-	/* Don't allow changing the number of channels if there is an active
-	 * XSK, because the numeration of the XSK and regular RQs will change.
-	 */
-	if (priv->xsk.refcnt) {
-		err = -EINVAL;
-		netdev_err(priv->netdev, "%s: AF_XDP is active, cannot change the number of channels\n",
-			   __func__);
-		goto out;
-	}
-
 	/* Don't allow changing the number of channels if HTB offload is active,
 	 * because the numeration of the QoS SQs will change, while per-queue
 	 * qdiscs are attached.
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [net 12/14] net/mlx5e: Fix MACsec SA initialization routine
  2022-11-22  2:25 [pull request][net 00/14] mlx5 fixes 2022-11-21 Saeed Mahameed
                   ` (10 preceding siblings ...)
  2022-11-22  2:25 ` [net 11/14] net/mlx5e: Remove leftovers from old XSK queues enumeration Saeed Mahameed
@ 2022-11-22  2:25 ` Saeed Mahameed
  2022-11-22  2:25 ` [net 13/14] net/mlx5e: Fix MACsec update SecY Saeed Mahameed
  2022-11-22  2:25 ` [net 14/14] net/mlx5e: Fix possible race condition in macsec extended packet number update routine Saeed Mahameed
  13 siblings, 0 replies; 26+ messages in thread
From: Saeed Mahameed @ 2022-11-22  2:25 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Emeel Hakim, Raed Salem

From: Emeel Hakim <ehakim@nvidia.com>

Currently as part of MACsec SA initialization routine
extended packet number (EPN) object attribute is always
being set without checking if EPN is actually enabled,
the above could lead to a NULL dereference.
Fix by adding such a check.

Fixes: 4411a6c0abd3 ("net/mlx5e: Support MACsec offload extended packet number (EPN)")
Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../ethernet/mellanox/mlx5/core/en_accel/macsec.c  | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c
index 2ef36cb9555a..8f8a735a4501 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c
@@ -368,15 +368,15 @@ static int mlx5e_macsec_init_sa(struct macsec_context *ctx,
 	obj_attrs.aso_pdn = macsec->aso.pdn;
 	obj_attrs.epn_state = sa->epn_state;
 
-	if (is_tx) {
-		obj_attrs.ssci = cpu_to_be32((__force u32)ctx->sa.tx_sa->ssci);
-		key = &ctx->sa.tx_sa->key;
-	} else {
-		obj_attrs.ssci = cpu_to_be32((__force u32)ctx->sa.rx_sa->ssci);
-		key = &ctx->sa.rx_sa->key;
+	key = (is_tx) ? &ctx->sa.tx_sa->key : &ctx->sa.rx_sa->key;
+
+	if (sa->epn_state.epn_enabled) {
+		obj_attrs.ssci = (is_tx) ? cpu_to_be32((__force u32)ctx->sa.tx_sa->ssci) :
+					   cpu_to_be32((__force u32)ctx->sa.rx_sa->ssci);
+
+		memcpy(&obj_attrs.salt, &key->salt, sizeof(key->salt));
 	}
 
-	memcpy(&obj_attrs.salt, &key->salt, sizeof(key->salt));
 	obj_attrs.replay_window = ctx->secy->replay_window;
 	obj_attrs.replay_protect = ctx->secy->replay_protect;
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [net 13/14] net/mlx5e: Fix MACsec update SecY
  2022-11-22  2:25 [pull request][net 00/14] mlx5 fixes 2022-11-21 Saeed Mahameed
                   ` (11 preceding siblings ...)
  2022-11-22  2:25 ` [net 12/14] net/mlx5e: Fix MACsec SA initialization routine Saeed Mahameed
@ 2022-11-22  2:25 ` Saeed Mahameed
  2022-11-23 15:21   ` Maciej Fijalkowski
  2022-11-22  2:25 ` [net 14/14] net/mlx5e: Fix possible race condition in macsec extended packet number update routine Saeed Mahameed
  13 siblings, 1 reply; 26+ messages in thread
From: Saeed Mahameed @ 2022-11-22  2:25 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Emeel Hakim, Raed Salem

From: Emeel Hakim <ehakim@nvidia.com>

Currently updating SecY destroys and re-creates RX SA objects,
the re-created RX SA objects are not identical to the destroyed
objects and it disagree on the encryption enabled property which
holds the value false after recreation, this value is not
supported with offload which leads to no traffic after an update.
Fix by recreating an identical objects.

Fixes: 5a39816a75e5 ("net/mlx5e: Add MACsec offload SecY support")
Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c
index 8f8a735a4501..4f96c69c6cc4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c
@@ -1155,7 +1155,7 @@ static int macsec_upd_secy_hw_address(struct macsec_context *ctx,
 				continue;
 
 			if (rx_sa->active) {
-				err = mlx5e_macsec_init_sa(ctx, rx_sa, false, false);
+				err = mlx5e_macsec_init_sa(ctx, rx_sa, true, false);
 				if (err)
 					goto out;
 			}
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [net 14/14] net/mlx5e: Fix possible race condition in macsec extended packet number update routine
  2022-11-22  2:25 [pull request][net 00/14] mlx5 fixes 2022-11-21 Saeed Mahameed
                   ` (12 preceding siblings ...)
  2022-11-22  2:25 ` [net 13/14] net/mlx5e: Fix MACsec update SecY Saeed Mahameed
@ 2022-11-22  2:25 ` Saeed Mahameed
  13 siblings, 0 replies; 26+ messages in thread
From: Saeed Mahameed @ 2022-11-22  2:25 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Emeel Hakim

From: Emeel Hakim <ehakim@nvidia.com>

Currenty extended packet number (EPN) update routine is accessing
macsec object without holding the general macsec lock hence facing
a possible race condition when an EPN update occurs while updating
or deleting the SA.
Fix by holding the general macsec lock before accessing the object.

Fixes: 4411a6c0abd3 ("net/mlx5e: Support MACsec offload extended packet number (EPN)")
Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c
index 4f96c69c6cc4..3dc6c987b8da 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c
@@ -1536,6 +1536,8 @@ static void macsec_async_event(struct work_struct *work)
 
 	async_work = container_of(work, struct mlx5e_macsec_async_work, work);
 	macsec = async_work->macsec;
+	mutex_lock(&macsec->lock);
+
 	mdev = async_work->mdev;
 	obj_id = async_work->obj_id;
 	macsec_sa = get_macsec_tx_sa_from_obj_id(macsec, obj_id);
@@ -1557,6 +1559,7 @@ static void macsec_async_event(struct work_struct *work)
 
 out_async_work:
 	kfree(async_work);
+	mutex_unlock(&macsec->lock);
 }
 
 static int macsec_obj_change_event(struct notifier_block *nb, unsigned long event, void *data)
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [net 01/14] net/mlx5: Do not query pci info while pci disabled
  2022-11-22  2:25 ` [net 01/14] net/mlx5: Do not query pci info while pci disabled Saeed Mahameed
@ 2022-11-23  5:00   ` patchwork-bot+netdevbpf
  0 siblings, 0 replies; 26+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-11-23  5:00 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: davem, kuba, pabeni, edumazet, saeedm, netdev, tariqt, royno,
	moshe, ayal

Hello:

This series was applied to netdev/net.git (master)
by Saeed Mahameed <saeedm@nvidia.com>:

On Mon, 21 Nov 2022 18:25:46 -0800 you wrote:
> From: Roy Novich <royno@nvidia.com>
> 
> The driver should not interact with PCI while PCI is disabled. Trying to
> do so may result in being unable to get vital signs during PCI reset,
> driver gets timed out and fails to recover.
> 
> Fixes: fad1783a6d66 ("net/mlx5: Print more info on pci error handlers")
> Signed-off-by: Roy Novich <royno@nvidia.com>
> Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
> Reviewed-by: Aya Levin <ayal@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> 
> [...]

Here is the summary with links:
  - [net,01/14] net/mlx5: Do not query pci info while pci disabled
    https://git.kernel.org/netdev/net/c/394164f9d5a3
  - [net,02/14] net/mlx5: Fix FW tracer timestamp calculation
    https://git.kernel.org/netdev/net/c/61db3d7b99a3
  - [net,03/14] net/mlx5: SF: Fix probing active SFs during driver probe phase
    https://git.kernel.org/netdev/net/c/4f57332d6a55
  - [net,04/14] net/mlx5: cmdif, Print info on any firmware cmd failure to tracepoint
    https://git.kernel.org/netdev/net/c/870c2481174b
  - [net,05/14] net/mlx5: Fix handling of entry refcount when command is not issued to FW
    https://git.kernel.org/netdev/net/c/aaf2e65cac7f
  - [net,06/14] net/mlx5: Lag, avoid lockdep warnings
    https://git.kernel.org/netdev/net/c/0d4e8ed139d8
  - [net,07/14] net/mlx5: E-Switch, Set correctly vport destination
    https://git.kernel.org/netdev/net/c/6d942e404489
  - [net,08/14] net/mlx5: Fix sync reset event handler error flow
    https://git.kernel.org/netdev/net/c/e1ad07b9227f
  - [net,09/14] net/mlx5e: Fix missing alignment in size of MTT/KLM entries
    https://git.kernel.org/netdev/net/c/3e874cb1e0a3
  - [net,10/14] net/mlx5e: Offload rule only when all encaps are valid
    https://git.kernel.org/netdev/net/c/f377422044b2
  - [net,11/14] net/mlx5e: Remove leftovers from old XSK queues enumeration
    https://git.kernel.org/netdev/net/c/11abca031ee3
  - [net,12/14] net/mlx5e: Fix MACsec SA initialization routine
    https://git.kernel.org/netdev/net/c/d20a56b0eb00
  - [net,13/14] net/mlx5e: Fix MACsec update SecY
    https://git.kernel.org/netdev/net/c/94ffd6e0c7db
  - [net,14/14] net/mlx5e: Fix possible race condition in macsec extended packet number update routine
    https://git.kernel.org/netdev/net/c/8514e325ef01

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [net 03/14] net/mlx5: SF: Fix probing active SFs during driver probe phase
  2022-11-22  2:25 ` [net 03/14] net/mlx5: SF: Fix probing active SFs during driver probe phase Saeed Mahameed
@ 2022-11-23 14:57   ` Maciej Fijalkowski
  2022-11-23 17:11     ` Parav Pandit
  2022-11-23 23:36     ` Saeed Mahameed
  0 siblings, 2 replies; 26+ messages in thread
From: Maciej Fijalkowski @ 2022-11-23 14:57 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Parav Pandit

On Mon, Nov 21, 2022 at 06:25:48PM -0800, Saeed Mahameed wrote:
> From: Shay Drory <shayd@nvidia.com>
> 
> When SF devices and SF port representors are located on different
> functions, unloading and reloading of SF parent driver doesn't recreate
> the existing SF present in the device.
> Fix it by querying SFs and probe active SFs during driver probe phase.

Maybe shed some light on how it's actually being done? Have a few words
that you're adding a workqueue dedicated for that etc. There is also a
new mutex, I was always expecting that such mechanisms get a bit of
explanation/justification why there is a need for its introduction.

Not sure if including some example reproducer in here is mandatory or not
(and therefore splat, if any). General feeling is that commit message
could be beefed up.

> 
> Fixes: 90d010b8634b ("net/mlx5: SF, Add auxiliary device support")
> Signed-off-by: Shay Drory <shayd@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> ---
>  .../ethernet/mellanox/mlx5/core/sf/dev/dev.c  | 88 +++++++++++++++++++
>  1 file changed, 88 insertions(+)
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [net 04/14] net/mlx5: cmdif, Print info on any firmware cmd failure to tracepoint
  2022-11-22  2:25 ` [net 04/14] net/mlx5: cmdif, Print info on any firmware cmd failure to tracepoint Saeed Mahameed
@ 2022-11-23 15:06   ` Maciej Fijalkowski
  2022-11-23 23:48     ` Saeed Mahameed
  0 siblings, 1 reply; 26+ messages in thread
From: Maciej Fijalkowski @ 2022-11-23 15:06 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	Saeed Mahameed, netdev, Tariq Toukan, Moshe Shemesh, Shay Drory,
	Maor Gottlieb

On Mon, Nov 21, 2022 at 06:25:49PM -0800, Saeed Mahameed wrote:
> From: Moshe Shemesh <moshe@nvidia.com>
> 
> While moving to new CMD API (quiet API), some pre-existing flows may call the new API
> function that in case of error, returns the error instead of printing it as previously done.
> For such flows we bring back the print but to tracepoint this time for sys admins to
> have the ability to check for errors especially for commands using the new quiet API.
> 

WARNING: Possible unwrapped commit description (prefer a maximum 75 chars
per line)


> Tracepoint output example:
>          devlink-1333    [001] .....   822.746922: mlx5_cmd: ACCESS_REG(0x805) op_mod(0x0) failed, status bad resource(0x5), syndrome (0xb06e1f), err(-22)
> 
> Fixes: f23519e542e5 ("net/mlx5: cmdif, Add new api for command execution")
> Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
> Reviewed-by: Shay Drory <shayd@nvidia.com>
> Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 41 +++++++++--------
>  .../mellanox/mlx5/core/diag/cmd_tracepoint.h  | 45 +++++++++++++++++++
>  include/linux/mlx5/driver.h                   |  1 +
>  3 files changed, 68 insertions(+), 19 deletions(-)
>  create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/diag/cmd_tracepoint.h

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [net 13/14] net/mlx5e: Fix MACsec update SecY
  2022-11-22  2:25 ` [net 13/14] net/mlx5e: Fix MACsec update SecY Saeed Mahameed
@ 2022-11-23 15:21   ` Maciej Fijalkowski
  2022-11-23 23:57     ` Saeed Mahameed
  0 siblings, 1 reply; 26+ messages in thread
From: Maciej Fijalkowski @ 2022-11-23 15:21 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	Saeed Mahameed, netdev, Tariq Toukan, Emeel Hakim, Raed Salem

On Mon, Nov 21, 2022 at 06:25:58PM -0800, Saeed Mahameed wrote:
> From: Emeel Hakim <ehakim@nvidia.com>
> 
> Currently updating SecY destroys and re-creates RX SA objects,
> the re-created RX SA objects are not identical to the destroyed
> objects and it disagree on the encryption enabled property which

nit: disagrees?

> holds the value false after recreation, this value is not
> supported with offload which leads to no traffic after an update.
> Fix by recreating an identical objects.
> 
> Fixes: 5a39816a75e5 ("net/mlx5e: Add MACsec offload SecY support")
> Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
> Reviewed-by: Raed Salem <raeds@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c
> index 8f8a735a4501..4f96c69c6cc4 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c
> @@ -1155,7 +1155,7 @@ static int macsec_upd_secy_hw_address(struct macsec_context *ctx,
>  				continue;
>  
>  			if (rx_sa->active) {
> -				err = mlx5e_macsec_init_sa(ctx, rx_sa, false, false);
> +				err = mlx5e_macsec_init_sa(ctx, rx_sa, true, false);
>  				if (err)
>  					goto out;
>  			}
> -- 
> 2.38.1
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [net 03/14] net/mlx5: SF: Fix probing active SFs during driver probe phase
  2022-11-23 14:57   ` Maciej Fijalkowski
@ 2022-11-23 17:11     ` Parav Pandit
  2022-11-23 17:44       ` Maciej Fijalkowski
  2022-11-23 23:36     ` Saeed Mahameed
  1 sibling, 1 reply; 26+ messages in thread
From: Parav Pandit @ 2022-11-23 17:11 UTC (permalink / raw)
  To: Maciej Fijalkowski, Saeed Mahameed
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	Saeed Mahameed, netdev, Tariq Toukan, Shay Drory



> From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> Sent: Wednesday, November 23, 2022 9:58 AM
> 
> On Mon, Nov 21, 2022 at 06:25:48PM -0800, Saeed Mahameed wrote:
> > From: Shay Drory <shayd@nvidia.com>
> >
> > When SF devices and SF port representors are located on different
> > functions, unloading and reloading of SF parent driver doesn't
> > recreate the existing SF present in the device.
> > Fix it by querying SFs and probe active SFs during driver probe phase.
> 
> Maybe shed some light on how it's actually being done? Have a few words
> that you're adding a workqueue dedicated for that etc. There is also a new
> mutex, I was always expecting that such mechanisms get a bit of
> explanation/justification why there is a need for its introduction.

Linux kernel has a clear coding guideline is to not explain 'how' of the code.
It is described in section 8 of [1].
It largely expands to commit log too as code following [1].
So, commit log and comment skipped the 'how' part. :)

You likely meant to ask why workqueue is used and not 'how'.
Having in the code comment is more useful than the commit log here.
So, Shay already explained this in the code function mlx5_sf_dev_queue_active_work() where wq is created.

Regarding mutex, there is well defined mandatory checkpatch.pl requirement to explain what does this mutex protect.
This is also covered in the code comment.

[1] https://www.kernel.org/doc/html/v4.10/process/coding-style.html

> 
> Not sure if including some example reproducer in here is mandatory or not
> (and therefore splat, if any). General feeling is that commit message could be
> beefed up.
> 
> >
> > Fixes: 90d010b8634b ("net/mlx5: SF, Add auxiliary device support")
> > Signed-off-by: Shay Drory <shayd@nvidia.com>
> > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> > ---
> >  .../ethernet/mellanox/mlx5/core/sf/dev/dev.c  | 88
> > +++++++++++++++++++
> >  1 file changed, 88 insertions(+)
> >

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [net 03/14] net/mlx5: SF: Fix probing active SFs during driver probe phase
  2022-11-23 17:11     ` Parav Pandit
@ 2022-11-23 17:44       ` Maciej Fijalkowski
  0 siblings, 0 replies; 26+ messages in thread
From: Maciej Fijalkowski @ 2022-11-23 17:44 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Saeed Mahameed, David S. Miller, Jakub Kicinski, Paolo Abeni,
	Eric Dumazet, Saeed Mahameed, netdev, Tariq Toukan, Shay Drory

On Wed, Nov 23, 2022 at 05:11:46PM +0000, Parav Pandit wrote:
> 
> 
> > From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > Sent: Wednesday, November 23, 2022 9:58 AM
> > 
> > On Mon, Nov 21, 2022 at 06:25:48PM -0800, Saeed Mahameed wrote:
> > > From: Shay Drory <shayd@nvidia.com>
> > >
> > > When SF devices and SF port representors are located on different
> > > functions, unloading and reloading of SF parent driver doesn't
> > > recreate the existing SF present in the device.
> > > Fix it by querying SFs and probe active SFs during driver probe phase.
> > 
> > Maybe shed some light on how it's actually being done? Have a few words
> > that you're adding a workqueue dedicated for that etc. There is also a new
> > mutex, I was always expecting that such mechanisms get a bit of
> > explanation/justification why there is a need for its introduction.
> 
> Linux kernel has a clear coding guideline is to not explain 'how' of the code.
> It is described in section 8 of [1].

I think that you're just being picky over here

> It largely expands to commit log too as code following [1].

Who told you that it applies to commit message?

> So, commit log and comment skipped the 'how' part. :)
> 
> You likely meant to ask why workqueue is used and not 'how'.
> Having in the code comment is more useful than the commit log here.

This is your subjective opinion about the usefulness and mine was to say
that commit message could be more elaborative. Series got merged anyway
from what I see so let's cut the discussion here.

> So, Shay already explained this in the code function mlx5_sf_dev_queue_active_work() where wq is created.
> 
> Regarding mutex, there is well defined mandatory checkpatch.pl requirement to explain what does this mutex protect.

P.S. Please fix your mail client to don't go over 80 chars per line

> This is also covered in the code comment.
> 
> [1] https://www.kernel.org/doc/html/v4.10/process/coding-style.html
> 
> > 
> > Not sure if including some example reproducer in here is mandatory or not
> > (and therefore splat, if any). General feeling is that commit message could be
> > beefed up.
> > 
> > >
> > > Fixes: 90d010b8634b ("net/mlx5: SF, Add auxiliary device support")
> > > Signed-off-by: Shay Drory <shayd@nvidia.com>
> > > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > > Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> > > ---
> > >  .../ethernet/mellanox/mlx5/core/sf/dev/dev.c  | 88
> > > +++++++++++++++++++
> > >  1 file changed, 88 insertions(+)
> > >

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [net 03/14] net/mlx5: SF: Fix probing active SFs during driver probe phase
  2022-11-23 14:57   ` Maciej Fijalkowski
  2022-11-23 17:11     ` Parav Pandit
@ 2022-11-23 23:36     ` Saeed Mahameed
  1 sibling, 0 replies; 26+ messages in thread
From: Saeed Mahameed @ 2022-11-23 23:36 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Parav Pandit

On 23 Nov 15:57, Maciej Fijalkowski wrote:
>On Mon, Nov 21, 2022 at 06:25:48PM -0800, Saeed Mahameed wrote:
>> From: Shay Drory <shayd@nvidia.com>
>>
>> When SF devices and SF port representors are located on different
>> functions, unloading and reloading of SF parent driver doesn't recreate
>> the existing SF present in the device.
>> Fix it by querying SFs and probe active SFs during driver probe phase.
>
>Maybe shed some light on how it's actually being done? Have a few words
>that you're adding a workqueue dedicated for that etc. There is also a
>new mutex, I was always expecting that such mechanisms get a bit of
>explanation/justification why there is a need for its introduction.
>

it's needed so we can synchronize the new sync operation on load with the
pre-existing SF adding mechanism "device events" .. 

But i don't believe it's a requirement to explain the code behind a
commit, I think for this case it's self explanatory for someone who
understand how the basic mechanism of adding SFs to a PF funciton works in
mlx5. but yes I agree, some more information would've been useful, we will
be more verbose for future patches.

>Not sure if including some example reproducer in here is mandatory or not
>(and therefore splat, if any). General feeling is that commit message
>could be beefed up.
>

reproducer is really as simple as stated in the commit message, 
a devlink reload when you have SFs loaded on a dual function system 
(smart nic) where the switchdev SF admin is on a remote cpu function.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [net 04/14] net/mlx5: cmdif, Print info on any firmware cmd failure to tracepoint
  2022-11-23 15:06   ` Maciej Fijalkowski
@ 2022-11-23 23:48     ` Saeed Mahameed
  2022-11-24  1:55       ` Jakub Kicinski
  0 siblings, 1 reply; 26+ messages in thread
From: Saeed Mahameed @ 2022-11-23 23:48 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	Saeed Mahameed, netdev, Tariq Toukan, Moshe Shemesh, Shay Drory,
	Maor Gottlieb

On 23 Nov 16:06, Maciej Fijalkowski wrote:
>On Mon, Nov 21, 2022 at 06:25:49PM -0800, Saeed Mahameed wrote:
>> From: Moshe Shemesh <moshe@nvidia.com>
>>
>> While moving to new CMD API (quiet API), some pre-existing flows may call the new API
>> function that in case of error, returns the error instead of printing it as previously done.
>> For such flows we bring back the print but to tracepoint this time for sys admins to
>> have the ability to check for errors especially for commands using the new quiet API.
>>
>
>WARNING: Possible unwrapped commit description (prefer a maximum 75 chars
>per line)
>

we don't enforce this in netdev, especially when you want to share output,
etc .. 

also for future reference in mlx5 we allow up to 95 chars per code line, we
got wide screens :P, so also please ignore these warnings.

chkpatch = "!./scripts/checkpatch.pl --max-line-length=95 --strict --show-types --ignore COMMIT_LOG_LONG_LINE,FILE_PATH_CHANGES,MACRO_ARG_REUSE,MACRO_ARG_PRECEDENCE"


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [net 13/14] net/mlx5e: Fix MACsec update SecY
  2022-11-23 15:21   ` Maciej Fijalkowski
@ 2022-11-23 23:57     ` Saeed Mahameed
  0 siblings, 0 replies; 26+ messages in thread
From: Saeed Mahameed @ 2022-11-23 23:57 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	Saeed Mahameed, netdev, Tariq Toukan, Emeel Hakim, Raed Salem

On 23 Nov 16:21, Maciej Fijalkowski wrote:
>On Mon, Nov 21, 2022 at 06:25:58PM -0800, Saeed Mahameed wrote:
>> From: Emeel Hakim <ehakim@nvidia.com>
>>
>> Currently updating SecY destroys and re-creates RX SA objects,
>> the re-created RX SA objects are not identical to the destroyed
>> objects and it disagree on the encryption enabled property which
>
>nit: disagrees?
>

I think this is minor, let's be considerate, many contributors are non
native english speakers, so as long as the commit message is
understandable and sound, i think we should just ignore such simple
mistakes. We are not here to review proper grammar.. 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [net 04/14] net/mlx5: cmdif, Print info on any firmware cmd failure to tracepoint
  2022-11-23 23:48     ` Saeed Mahameed
@ 2022-11-24  1:55       ` Jakub Kicinski
  2022-11-24  4:37         ` Saeed Mahameed
  0 siblings, 1 reply; 26+ messages in thread
From: Jakub Kicinski @ 2022-11-24  1:55 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Maciej Fijalkowski, David S. Miller, Paolo Abeni, Eric Dumazet,
	Saeed Mahameed, netdev, Tariq Toukan, Moshe Shemesh, Shay Drory,
	Maor Gottlieb

On Wed, 23 Nov 2022 15:48:07 -0800 Saeed Mahameed wrote:
> >> While moving to new CMD API (quiet API), some pre-existing flows may call the new API
> >> function that in case of error, returns the error instead of printing it as previously done.
> >> For such flows we bring back the print but to tracepoint this time for sys admins to
> >> have the ability to check for errors especially for commands using the new quiet API.
> >
> > WARNING: Possible unwrapped commit description (prefer a maximum 75 chars
> > per line)
> 
> we don't enforce this in netdev, especially when you want to share output,
> etc .. 

Maciej is right, it's one of those things I don't have stamina to
enforce. But you just gave me motivation, please start complying with
the rules.

> also for future reference in mlx5 we allow up to 95 chars per code line, we
> got wide screens :P, so also please ignore these warnings.

"We got wide screens"? Your inability to write concise code is
certainly not a virtue. Don't be flippant about it :/

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [net 04/14] net/mlx5: cmdif, Print info on any firmware cmd failure to tracepoint
  2022-11-24  1:55       ` Jakub Kicinski
@ 2022-11-24  4:37         ` Saeed Mahameed
  0 siblings, 0 replies; 26+ messages in thread
From: Saeed Mahameed @ 2022-11-24  4:37 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Maciej Fijalkowski, David S. Miller, Paolo Abeni, Eric Dumazet,
	Saeed Mahameed, netdev, Tariq Toukan, Moshe Shemesh, Shay Drory,
	Maor Gottlieb

On 23 Nov 17:55, Jakub Kicinski wrote:
>On Wed, 23 Nov 2022 15:48:07 -0800 Saeed Mahameed wrote:
>> >> While moving to new CMD API (quiet API), some pre-existing flows may call the new API
>> >> function that in case of error, returns the error instead of printing it as previously done.
>> >> For such flows we bring back the print but to tracepoint this time for sys admins to
>> >> have the ability to check for errors especially for commands using the new quiet API.
>> >
>> > WARNING: Possible unwrapped commit description (prefer a maximum 75 chars
>> > per line)
>>
>> we don't enforce this in netdev, especially when you want to share output,
>> etc ..
>
>Maciej is right, it's one of those things I don't have stamina to
>enforce. But you just gave me motivation, please start complying with
>the rules.
>
>> also for future reference in mlx5 we allow up to 95 chars per code line, we
>> got wide screens :P, so also please ignore these warnings.
>
>"We got wide screens"? Your inability to write concise code is
>certainly not a virtue. Don't be flippant about it :/
it was an innocent very commonly used joke, what are you talking about ???

Anyway, I don't appreciate or respect your tune! if you want me to respect
your rules, you got to give some respect to your drivers maintainers !!

We have certain coding style in mlx5 and we gonna stick to it.. it's been
the case for the past 8 years, i am not going to suddenly turn the wheel
and just start obeying like a monkey.

And please watch your language, we do write concise code, your inability to
understand it is not my problem, and i don't expect you to understand it..
How the code looks like and how we interact with our HW is our problem, of
course suggestions and contributions are always welcome, but please be nice
about it.

mlx5 is the largest netdev driver and i got more than 20 active contributors
who are very happy with it. Please show some respect to those who worked
their asses off for years before you even knew what netdev is.

thank you!


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2022-11-24  4:37 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-22  2:25 [pull request][net 00/14] mlx5 fixes 2022-11-21 Saeed Mahameed
2022-11-22  2:25 ` [net 01/14] net/mlx5: Do not query pci info while pci disabled Saeed Mahameed
2022-11-23  5:00   ` patchwork-bot+netdevbpf
2022-11-22  2:25 ` [net 02/14] net/mlx5: Fix FW tracer timestamp calculation Saeed Mahameed
2022-11-22  2:25 ` [net 03/14] net/mlx5: SF: Fix probing active SFs during driver probe phase Saeed Mahameed
2022-11-23 14:57   ` Maciej Fijalkowski
2022-11-23 17:11     ` Parav Pandit
2022-11-23 17:44       ` Maciej Fijalkowski
2022-11-23 23:36     ` Saeed Mahameed
2022-11-22  2:25 ` [net 04/14] net/mlx5: cmdif, Print info on any firmware cmd failure to tracepoint Saeed Mahameed
2022-11-23 15:06   ` Maciej Fijalkowski
2022-11-23 23:48     ` Saeed Mahameed
2022-11-24  1:55       ` Jakub Kicinski
2022-11-24  4:37         ` Saeed Mahameed
2022-11-22  2:25 ` [net 05/14] net/mlx5: Fix handling of entry refcount when command is not issued to FW Saeed Mahameed
2022-11-22  2:25 ` [net 06/14] net/mlx5: Lag, avoid lockdep warnings Saeed Mahameed
2022-11-22  2:25 ` [net 07/14] net/mlx5: E-Switch, Set correctly vport destination Saeed Mahameed
2022-11-22  2:25 ` [net 08/14] net/mlx5: Fix sync reset event handler error flow Saeed Mahameed
2022-11-22  2:25 ` [net 09/14] net/mlx5e: Fix missing alignment in size of MTT/KLM entries Saeed Mahameed
2022-11-22  2:25 ` [net 10/14] net/mlx5e: Offload rule only when all encaps are valid Saeed Mahameed
2022-11-22  2:25 ` [net 11/14] net/mlx5e: Remove leftovers from old XSK queues enumeration Saeed Mahameed
2022-11-22  2:25 ` [net 12/14] net/mlx5e: Fix MACsec SA initialization routine Saeed Mahameed
2022-11-22  2:25 ` [net 13/14] net/mlx5e: Fix MACsec update SecY Saeed Mahameed
2022-11-23 15:21   ` Maciej Fijalkowski
2022-11-23 23:57     ` Saeed Mahameed
2022-11-22  2:25 ` [net 14/14] net/mlx5e: Fix possible race condition in macsec extended packet number update routine Saeed Mahameed

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.