All of lore.kernel.org
 help / color / mirror / Atom feed
* [pull request][net 00/11] mlx5 fixes 2022-05-17
@ 2022-05-18  6:34 Saeed Mahameed
  2022-05-18  6:34 ` [net 01/11] net/mlx5: DR, Fix missing flow_source when creating multi-destination FW table Saeed Mahameed
                   ` (10 more replies)
  0 siblings, 11 replies; 13+ messages in thread
From: Saeed Mahameed @ 2022-05-18  6:34 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni; +Cc: netdev, Saeed Mahameed

From: Saeed Mahameed <saeedm@nvidia.com>

This series provides bug fixes to mlx5 driver.
Please pull and let me know if there is any problem.

Thanks,
Saeed.


The following changes since commit 23dd4581350d4ffa23d58976ec46408f8f4c1e16:

  NFC: nci: fix sleep in atomic context bugs caused by nci_skb_alloc (2022-05-17 17:55:53 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-fixes-2022-05-17

for you to fetch changes up to 16d42d313350946f4b9a8b74a13c99f0461a6572:

  net/mlx5: Drain fw_reset when removing device (2022-05-17 23:03:57 -0700)

----------------------------------------------------------------
mlx5-fixes-2022-05-17

----------------------------------------------------------------
Aya Levin (1):
      net/mlx5e: Block rx-gro-hw feature in switchdev mode

Gal Pressman (1):
      net/mlx5e: Remove HW-GRO from reported features

Maor Dickman (1):
      net/mlx5: DR, Fix missing flow_source when creating multi-destination FW table

Maxim Mikityanskiy (3):
      net/mlx5e: Wrap mlx5e_trap_napi_poll into rcu_read_lock
      net/mlx5e: Properly block LRO when XDP is enabled
      net/mlx5e: Properly block HW GRO when XDP is enabled

Paul Blakey (2):
      net/mlx5e: CT: Fix support for GRE tuples
      net/mlx5e: CT: Fix setting flow_source for smfs ct tuples

Shay Drory (2):
      net/mlx5: Initialize flow steering during driver probe
      net/mlx5: Drain fw_reset when removing device

Yevgeny Kliteynik (1):
      net/mlx5: DR, Ignore modify TTL on RX if device doesn't support it

 .../ethernet/mellanox/mlx5/core/en/tc/ct_fs_smfs.c |  58 +++++----
 drivers/net/ethernet/mellanox/mlx5/core/en/trap.c  |  13 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |  27 ++++-
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c  | 131 ++++++++++++---------
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.h  |   6 +-
 drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c |  25 +++-
 drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h |   1 +
 drivers/net/ethernet/mellanox/mlx5/core/main.c     |  19 ++-
 .../mellanox/mlx5/core/steering/dr_action.c        |  71 +++++++----
 .../ethernet/mellanox/mlx5/core/steering/dr_fw.c   |   4 +-
 .../mellanox/mlx5/core/steering/dr_ste_v0.c        |   4 +-
 .../mellanox/mlx5/core/steering/dr_types.h         |   3 +-
 .../ethernet/mellanox/mlx5/core/steering/fs_dr.c   |   4 +-
 .../ethernet/mellanox/mlx5/core/steering/mlx5dr.h  |   3 +-
 14 files changed, 246 insertions(+), 123 deletions(-)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [net 01/11] net/mlx5: DR, Fix missing flow_source when creating multi-destination FW table
  2022-05-18  6:34 [pull request][net 00/11] mlx5 fixes 2022-05-17 Saeed Mahameed
@ 2022-05-18  6:34 ` Saeed Mahameed
  2022-05-18 11:10   ` patchwork-bot+netdevbpf
  2022-05-18  6:34 ` [net 02/11] net/mlx5: Initialize flow steering during driver probe Saeed Mahameed
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 13+ messages in thread
From: Saeed Mahameed @ 2022-05-18  6:34 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Maor Dickman, Yevgeny Kliteynik, Saeed Mahameed

From: Maor Dickman <maord@nvidia.com>

In order to support multiple destination FTEs with SW steering
FW table is created with single FTE with multiple actions and
SW steering rule forward to it. When creating this table, flow
source isn't set according to the original FTE.

Fix this by passing the original FTE flow source to the created
FW table.

Fixes: 34583beea4b7 ("net/mlx5: DR, Create multi-destination table for SW-steering use")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/steering/dr_action.c    | 6 ++++--
 drivers/net/ethernet/mellanox/mlx5/core/steering/dr_fw.c    | 4 +++-
 drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h | 3 ++-
 drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c    | 4 +++-
 drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h   | 3 ++-
 5 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
index 850937cd8bf9..b52b539c8d2c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
@@ -842,7 +842,8 @@ struct mlx5dr_action *
 mlx5dr_action_create_mult_dest_tbl(struct mlx5dr_domain *dmn,
 				   struct mlx5dr_action_dest *dests,
 				   u32 num_of_dests,
-				   bool ignore_flow_level)
+				   bool ignore_flow_level,
+				   u32 flow_source)
 {
 	struct mlx5dr_cmd_flow_destination_hw_info *hw_dests;
 	struct mlx5dr_action **ref_actions;
@@ -914,7 +915,8 @@ mlx5dr_action_create_mult_dest_tbl(struct mlx5dr_domain *dmn,
 				      reformat_req,
 				      &action->dest_tbl->fw_tbl.id,
 				      &action->dest_tbl->fw_tbl.group_id,
-				      ignore_flow_level);
+				      ignore_flow_level,
+				      flow_source);
 	if (ret)
 		goto free_action;
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_fw.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_fw.c
index 68a4c32d5f34..f05ef0cd54ba 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_fw.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_fw.c
@@ -104,7 +104,8 @@ int mlx5dr_fw_create_md_tbl(struct mlx5dr_domain *dmn,
 			    bool reformat_req,
 			    u32 *tbl_id,
 			    u32 *group_id,
-			    bool ignore_flow_level)
+			    bool ignore_flow_level,
+			    u32 flow_source)
 {
 	struct mlx5dr_cmd_create_flow_table_attr ft_attr = {};
 	struct mlx5dr_cmd_fte_info fte_info = {};
@@ -139,6 +140,7 @@ int mlx5dr_fw_create_md_tbl(struct mlx5dr_domain *dmn,
 	fte_info.val = val;
 	fte_info.dest_arr = dest;
 	fte_info.ignore_flow_level = ignore_flow_level;
+	fte_info.flow_context.flow_source = flow_source;
 
 	ret = mlx5dr_cmd_set_fte(dmn->mdev, 0, 0, &ft_info, *group_id, &fte_info);
 	if (ret) {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
index 46866a5fc5ca..98320e3945ad 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
@@ -1461,7 +1461,8 @@ int mlx5dr_fw_create_md_tbl(struct mlx5dr_domain *dmn,
 			    bool reformat_req,
 			    u32 *tbl_id,
 			    u32 *group_id,
-			    bool ignore_flow_level);
+			    bool ignore_flow_level,
+			    u32 flow_source);
 void mlx5dr_fw_destroy_md_tbl(struct mlx5dr_domain *dmn, u32 tbl_id,
 			      u32 group_id);
 #endif  /* _DR_TYPES_H_ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
index 045b0cf90063..728f81882589 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
@@ -520,6 +520,7 @@ static int mlx5_cmd_dr_create_fte(struct mlx5_flow_root_namespace *ns,
 	} else if (num_term_actions > 1) {
 		bool ignore_flow_level =
 			!!(fte->action.flags & FLOW_ACT_IGNORE_FLOW_LEVEL);
+		u32 flow_source = fte->flow_context.flow_source;
 
 		if (num_actions == MLX5_FLOW_CONTEXT_ACTION_MAX ||
 		    fs_dr_num_actions == MLX5_FLOW_CONTEXT_ACTION_MAX) {
@@ -529,7 +530,8 @@ static int mlx5_cmd_dr_create_fte(struct mlx5_flow_root_namespace *ns,
 		tmp_action = mlx5dr_action_create_mult_dest_tbl(domain,
 								term_actions,
 								num_term_actions,
-								ignore_flow_level);
+								ignore_flow_level,
+								flow_source);
 		if (!tmp_action) {
 			err = -EOPNOTSUPP;
 			goto free_actions;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h
index ec5cbec0d455..7626c85643b1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h
@@ -99,7 +99,8 @@ struct mlx5dr_action *
 mlx5dr_action_create_mult_dest_tbl(struct mlx5dr_domain *dmn,
 				   struct mlx5dr_action_dest *dests,
 				   u32 num_of_dests,
-				   bool ignore_flow_level);
+				   bool ignore_flow_level,
+				   u32 flow_source);
 
 struct mlx5dr_action *mlx5dr_action_create_drop(void);
 
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [net 02/11] net/mlx5: Initialize flow steering during driver probe
  2022-05-18  6:34 [pull request][net 00/11] mlx5 fixes 2022-05-17 Saeed Mahameed
  2022-05-18  6:34 ` [net 01/11] net/mlx5: DR, Fix missing flow_source when creating multi-destination FW table Saeed Mahameed
@ 2022-05-18  6:34 ` Saeed Mahameed
  2022-05-18  6:34 ` [net 03/11] net/mlx5: DR, Ignore modify TTL on RX if device doesn't support it Saeed Mahameed
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Saeed Mahameed @ 2022-05-18  6:34 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Shay Drory, Mark Bloch, Saeed Mahameed

From: Shay Drory <shayd@nvidia.com>

Currently, software objects of flow steering are created and destroyed
during reload flow. In case a device is unloaded, the following error
is printed during grace period:

 mlx5_core 0000:00:0b.0: mlx5_fw_fatal_reporter_err_work:690:(pid 95):
    Driver is in error state. Unloading

As a solution to fix use-after-free bugs, where we try to access
these objects, when reading the value of flow_steering_mode devlink
param[1], let's split flow steering creation and destruction into two
routines:
    * init and cleanup: memory, cache, and pools allocation/free.
    * create and destroy: namespaces initialization and cleanup.

While at it, re-order the cleanup function to mirror the init function.

[1]
Kasan trace:

[  385.119849 ] BUG: KASAN: use-after-free in mlx5_devlink_fs_mode_get+0x3b/0xa0
[  385.119849 ] Read of size 4 at addr ffff888104b79308 by task bash/291
[  385.119849 ]
[  385.119849 ] CPU: 1 PID: 291 Comm: bash Not tainted 5.17.0-rc1+ #2
[  385.119849 ] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
[  385.119849 ] Call Trace:
[  385.119849 ]  <TASK>
[  385.119849 ]  dump_stack_lvl+0x6e/0x91
[  385.119849 ]  print_address_description.constprop.0+0x1f/0x160
[  385.119849 ]  ? mlx5_devlink_fs_mode_get+0x3b/0xa0
[  385.119849 ]  ? mlx5_devlink_fs_mode_get+0x3b/0xa0
[  385.119849 ]  kasan_report.cold+0x83/0xdf
[  385.119849 ]  ? devlink_param_notify+0x20/0x190
[  385.119849 ]  ? mlx5_devlink_fs_mode_get+0x3b/0xa0
[  385.119849 ]  mlx5_devlink_fs_mode_get+0x3b/0xa0
[  385.119849 ]  devlink_nl_param_fill+0x18a/0xa50
[  385.119849 ]  ? _raw_spin_lock_irqsave+0x8d/0xe0
[  385.119849 ]  ? devlink_flash_update_timeout_notify+0xf0/0xf0
[  385.119849 ]  ? __wake_up_common+0x4b/0x1e0
[  385.119849 ]  ? preempt_count_sub+0x14/0xc0
[  385.119849 ]  ? _raw_spin_unlock_irqrestore+0x28/0x40
[  385.119849 ]  ? __wake_up_common_lock+0xe3/0x140
[  385.119849 ]  ? __wake_up_common+0x1e0/0x1e0
[  385.119849 ]  ? __sanitizer_cov_trace_const_cmp8+0x27/0x80
[  385.119849 ]  ? __rcu_read_unlock+0x48/0x70
[  385.119849 ]  ? kasan_unpoison+0x23/0x50
[  385.119849 ]  ? __kasan_slab_alloc+0x2c/0x80
[  385.119849 ]  ? memset+0x20/0x40
[  385.119849 ]  ? __sanitizer_cov_trace_const_cmp4+0x25/0x80
[  385.119849 ]  devlink_param_notify+0xce/0x190
[  385.119849 ]  devlink_unregister+0x92/0x2b0
[  385.119849 ]  remove_one+0x41/0x140
[  385.119849 ]  pci_device_remove+0x68/0x140
[  385.119849 ]  ? pcibios_free_irq+0x10/0x10
[  385.119849 ]  __device_release_driver+0x294/0x3f0
[  385.119849 ]  device_driver_detach+0x82/0x130
[  385.119849 ]  unbind_store+0x193/0x1b0
[  385.119849 ]  ? subsys_interface_unregister+0x270/0x270
[  385.119849 ]  drv_attr_store+0x4e/0x70
[  385.119849 ]  ? drv_attr_show+0x60/0x60
[  385.119849 ]  sysfs_kf_write+0xa7/0xc0
[  385.119849 ]  kernfs_fop_write_iter+0x23a/0x2f0
[  385.119849 ]  ? sysfs_kf_bin_read+0x160/0x160
[  385.119849 ]  new_sync_write+0x311/0x430
[  385.119849 ]  ? new_sync_read+0x480/0x480
[  385.119849 ]  ? _raw_spin_lock+0x87/0xe0
[  385.119849 ]  ? __sanitizer_cov_trace_cmp4+0x25/0x80
[  385.119849 ]  ? security_file_permission+0x94/0xa0
[  385.119849 ]  vfs_write+0x4c7/0x590
[  385.119849 ]  ksys_write+0xf6/0x1e0
[  385.119849 ]  ? __x64_sys_read+0x50/0x50
[  385.119849 ]  ? fpregs_assert_state_consistent+0x99/0xa0
[  385.119849 ]  do_syscall_64+0x3d/0x90
[  385.119849 ]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  385.119849 ] RIP: 0033:0x7fc36ef38504
[  385.119849 ] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f
80 00 00 00 00 48 8d 05 f9 61 0d 00 8b 00 85 c0 75 13 b8 01 00 00 00 0f
05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 41 54 49 89 d4 55 48 89 f5 53
[  385.119849 ] RSP: 002b:00007ffde0ff3d08 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  385.119849 ] RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007fc36ef38504
[  385.119849 ] RDX: 000000000000000c RSI: 00007fc370521040 RDI: 0000000000000001
[  385.119849 ] RBP: 00007fc370521040 R08: 00007fc36f00b8c0 R09: 00007fc36ee4b740
[  385.119849 ] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fc36f00a760
[  385.119849 ] R13: 000000000000000c R14: 00007fc36f005760 R15: 000000000000000c
[  385.119849 ]  </TASK>
[  385.119849 ]
[  385.119849 ] Allocated by task 65:
[  385.119849 ]  kasan_save_stack+0x1e/0x40
[  385.119849 ]  __kasan_kmalloc+0x81/0xa0
[  385.119849 ]  mlx5_init_fs+0x11b/0x1160
[  385.119849 ]  mlx5_load+0x13c/0x220
[  385.119849 ]  mlx5_load_one+0xda/0x160
[  385.119849 ]  mlx5_recover_device+0xb8/0x100
[  385.119849 ]  mlx5_health_try_recover+0x2f9/0x3a1
[  385.119849 ]  devlink_health_reporter_recover+0x75/0x100
[  385.119849 ]  devlink_health_report+0x26c/0x4b0
[  385.275909 ]  mlx5_fw_fatal_reporter_err_work+0x11e/0x1b0
[  385.275909 ]  process_one_work+0x520/0x970
[  385.275909 ]  worker_thread+0x378/0x950
[  385.275909 ]  kthread+0x1bb/0x200
[  385.275909 ]  ret_from_fork+0x1f/0x30
[  385.275909 ]
[  385.275909 ] Freed by task 65:
[  385.275909 ]  kasan_save_stack+0x1e/0x40
[  385.275909 ]  kasan_set_track+0x21/0x30
[  385.275909 ]  kasan_set_free_info+0x20/0x30
[  385.275909 ]  __kasan_slab_free+0xfc/0x140
[  385.275909 ]  kfree+0xa5/0x3b0
[  385.275909 ]  mlx5_unload+0x2e/0xb0
[  385.275909 ]  mlx5_unload_one+0x86/0xb0
[  385.275909 ]  mlx5_fw_fatal_reporter_err_work.cold+0xca/0xcf
[  385.275909 ]  process_one_work+0x520/0x970
[  385.275909 ]  worker_thread+0x378/0x950
[  385.275909 ]  kthread+0x1bb/0x200
[  385.275909 ]  ret_from_fork+0x1f/0x30
[  385.275909 ]
[  385.275909 ] The buggy address belongs to the object at ffff888104b79300
[  385.275909 ]  which belongs to the cache kmalloc-128 of size 128
[  385.275909 ] The buggy address is located 8 bytes inside of
[  385.275909 ]  128-byte region [ffff888104b79300, ffff888104b79380)
[  385.275909 ] The buggy address belongs to the page:
[  385.275909 ] page:00000000de44dd39 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x104b78
[  385.275909 ] head:00000000de44dd39 order:1 compound_mapcount:0
[  385.275909 ] flags: 0x8000000000010200(slab|head|zone=2)
[  385.275909 ] raw: 8000000000010200 0000000000000000 dead000000000122 ffff8881000428c0
[  385.275909 ] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000
[  385.275909 ] page dumped because: kasan: bad access detected
[  385.275909 ]
[  385.275909 ] Memory state around the buggy address:
[  385.275909 ]  ffff888104b79200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc
[  385.275909 ]  ffff888104b79280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  385.275909 ] >ffff888104b79300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  385.275909 ]                       ^
[  385.275909 ]  ffff888104b79380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  385.275909 ]  ffff888104b79400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  385.275909 ]]

Fixes: e890acd5ff18 ("net/mlx5: Add devlink flow_steering_mode parameter")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/fs_core.c | 131 ++++++++++--------
 .../net/ethernet/mellanox/mlx5/core/fs_core.h |   6 +-
 .../net/ethernet/mellanox/mlx5/core/main.c    |  15 +-
 3 files changed, 91 insertions(+), 61 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index 816d991f7621..3ad67e6b5586 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -2663,28 +2663,6 @@ static void cleanup_root_ns(struct mlx5_flow_root_namespace *root_ns)
 	clean_tree(&root_ns->ns.node);
 }
 
-void mlx5_cleanup_fs(struct mlx5_core_dev *dev)
-{
-	struct mlx5_flow_steering *steering = dev->priv.steering;
-
-	cleanup_root_ns(steering->root_ns);
-	cleanup_root_ns(steering->fdb_root_ns);
-	steering->fdb_root_ns = NULL;
-	kfree(steering->fdb_sub_ns);
-	steering->fdb_sub_ns = NULL;
-	cleanup_root_ns(steering->port_sel_root_ns);
-	cleanup_root_ns(steering->sniffer_rx_root_ns);
-	cleanup_root_ns(steering->sniffer_tx_root_ns);
-	cleanup_root_ns(steering->rdma_rx_root_ns);
-	cleanup_root_ns(steering->rdma_tx_root_ns);
-	cleanup_root_ns(steering->egress_root_ns);
-	mlx5_cleanup_fc_stats(dev);
-	kmem_cache_destroy(steering->ftes_cache);
-	kmem_cache_destroy(steering->fgs_cache);
-	mlx5_ft_pool_destroy(dev);
-	kfree(steering);
-}
-
 static int init_sniffer_tx_root_ns(struct mlx5_flow_steering *steering)
 {
 	struct fs_prio *prio;
@@ -3086,42 +3064,27 @@ static int init_egress_root_ns(struct mlx5_flow_steering *steering)
 	return err;
 }
 
-int mlx5_init_fs(struct mlx5_core_dev *dev)
+void mlx5_fs_core_cleanup(struct mlx5_core_dev *dev)
 {
-	struct mlx5_flow_steering *steering;
-	int err = 0;
-
-	err = mlx5_init_fc_stats(dev);
-	if (err)
-		return err;
-
-	err = mlx5_ft_pool_init(dev);
-	if (err)
-		return err;
-
-	steering = kzalloc(sizeof(*steering), GFP_KERNEL);
-	if (!steering) {
-		err = -ENOMEM;
-		goto err;
-	}
-
-	steering->dev = dev;
-	dev->priv.steering = steering;
+	struct mlx5_flow_steering *steering = dev->priv.steering;
 
-	if (mlx5_fs_dr_is_supported(dev))
-		steering->mode = MLX5_FLOW_STEERING_MODE_SMFS;
-	else
-		steering->mode = MLX5_FLOW_STEERING_MODE_DMFS;
+	cleanup_root_ns(steering->root_ns);
+	cleanup_root_ns(steering->fdb_root_ns);
+	steering->fdb_root_ns = NULL;
+	kfree(steering->fdb_sub_ns);
+	steering->fdb_sub_ns = NULL;
+	cleanup_root_ns(steering->port_sel_root_ns);
+	cleanup_root_ns(steering->sniffer_rx_root_ns);
+	cleanup_root_ns(steering->sniffer_tx_root_ns);
+	cleanup_root_ns(steering->rdma_rx_root_ns);
+	cleanup_root_ns(steering->rdma_tx_root_ns);
+	cleanup_root_ns(steering->egress_root_ns);
+}
 
-	steering->fgs_cache = kmem_cache_create("mlx5_fs_fgs",
-						sizeof(struct mlx5_flow_group), 0,
-						0, NULL);
-	steering->ftes_cache = kmem_cache_create("mlx5_fs_ftes", sizeof(struct fs_fte), 0,
-						 0, NULL);
-	if (!steering->ftes_cache || !steering->fgs_cache) {
-		err = -ENOMEM;
-		goto err;
-	}
+int mlx5_fs_core_init(struct mlx5_core_dev *dev)
+{
+	struct mlx5_flow_steering *steering = dev->priv.steering;
+	int err = 0;
 
 	if ((((MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_ETH) &&
 	      (MLX5_CAP_GEN(dev, nic_flow_table))) ||
@@ -3180,8 +3143,64 @@ int mlx5_init_fs(struct mlx5_core_dev *dev)
 	}
 
 	return 0;
+
+err:
+	mlx5_fs_core_cleanup(dev);
+	return err;
+}
+
+void mlx5_fs_core_free(struct mlx5_core_dev *dev)
+{
+	struct mlx5_flow_steering *steering = dev->priv.steering;
+
+	kmem_cache_destroy(steering->ftes_cache);
+	kmem_cache_destroy(steering->fgs_cache);
+	kfree(steering);
+	mlx5_ft_pool_destroy(dev);
+	mlx5_cleanup_fc_stats(dev);
+}
+
+int mlx5_fs_core_alloc(struct mlx5_core_dev *dev)
+{
+	struct mlx5_flow_steering *steering;
+	int err = 0;
+
+	err = mlx5_init_fc_stats(dev);
+	if (err)
+		return err;
+
+	err = mlx5_ft_pool_init(dev);
+	if (err)
+		goto err;
+
+	steering = kzalloc(sizeof(*steering), GFP_KERNEL);
+	if (!steering) {
+		err = -ENOMEM;
+		goto err;
+	}
+
+	steering->dev = dev;
+	dev->priv.steering = steering;
+
+	if (mlx5_fs_dr_is_supported(dev))
+		steering->mode = MLX5_FLOW_STEERING_MODE_SMFS;
+	else
+		steering->mode = MLX5_FLOW_STEERING_MODE_DMFS;
+
+	steering->fgs_cache = kmem_cache_create("mlx5_fs_fgs",
+						sizeof(struct mlx5_flow_group), 0,
+						0, NULL);
+	steering->ftes_cache = kmem_cache_create("mlx5_fs_ftes", sizeof(struct fs_fte), 0,
+						 0, NULL);
+	if (!steering->ftes_cache || !steering->fgs_cache) {
+		err = -ENOMEM;
+		goto err;
+	}
+
+	return 0;
+
 err:
-	mlx5_cleanup_fs(dev);
+	mlx5_fs_core_free(dev);
 	return err;
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
index c488a7c5b07e..3f20523e514f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
@@ -298,8 +298,10 @@ int mlx5_flow_namespace_set_peer(struct mlx5_flow_root_namespace *ns,
 int mlx5_flow_namespace_set_mode(struct mlx5_flow_namespace *ns,
 				 enum mlx5_flow_steering_mode mode);
 
-int mlx5_init_fs(struct mlx5_core_dev *dev);
-void mlx5_cleanup_fs(struct mlx5_core_dev *dev);
+int mlx5_fs_core_alloc(struct mlx5_core_dev *dev);
+void mlx5_fs_core_free(struct mlx5_core_dev *dev);
+int mlx5_fs_core_init(struct mlx5_core_dev *dev);
+void mlx5_fs_core_cleanup(struct mlx5_core_dev *dev);
 
 int mlx5_fs_egress_acls_init(struct mlx5_core_dev *dev, int total_vports);
 void mlx5_fs_egress_acls_cleanup(struct mlx5_core_dev *dev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 2589e39eb9c7..c6737720bf3a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -938,6 +938,12 @@ static int mlx5_init_once(struct mlx5_core_dev *dev)
 		goto err_sf_table_cleanup;
 	}
 
+	err = mlx5_fs_core_alloc(dev);
+	if (err) {
+		mlx5_core_err(dev, "Failed to alloc flow steering\n");
+		goto err_fs;
+	}
+
 	dev->dm = mlx5_dm_create(dev);
 	if (IS_ERR(dev->dm))
 		mlx5_core_warn(dev, "Failed to init device memory%d\n", err);
@@ -948,6 +954,8 @@ static int mlx5_init_once(struct mlx5_core_dev *dev)
 
 	return 0;
 
+err_fs:
+	mlx5_sf_table_cleanup(dev);
 err_sf_table_cleanup:
 	mlx5_sf_hw_table_cleanup(dev);
 err_sf_hw_table_cleanup:
@@ -985,6 +993,7 @@ static void mlx5_cleanup_once(struct mlx5_core_dev *dev)
 	mlx5_hv_vhca_destroy(dev->hv_vhca);
 	mlx5_fw_tracer_destroy(dev->tracer);
 	mlx5_dm_cleanup(dev);
+	mlx5_fs_core_free(dev);
 	mlx5_sf_table_cleanup(dev);
 	mlx5_sf_hw_table_cleanup(dev);
 	mlx5_vhca_event_cleanup(dev);
@@ -1191,7 +1200,7 @@ static int mlx5_load(struct mlx5_core_dev *dev)
 		goto err_tls_start;
 	}
 
-	err = mlx5_init_fs(dev);
+	err = mlx5_fs_core_init(dev);
 	if (err) {
 		mlx5_core_err(dev, "Failed to init flow steering\n");
 		goto err_fs;
@@ -1236,7 +1245,7 @@ static int mlx5_load(struct mlx5_core_dev *dev)
 err_vhca:
 	mlx5_vhca_event_stop(dev);
 err_set_hca:
-	mlx5_cleanup_fs(dev);
+	mlx5_fs_core_cleanup(dev);
 err_fs:
 	mlx5_accel_tls_cleanup(dev);
 err_tls_start:
@@ -1265,7 +1274,7 @@ static void mlx5_unload(struct mlx5_core_dev *dev)
 	mlx5_ec_cleanup(dev);
 	mlx5_sf_hw_table_destroy(dev);
 	mlx5_vhca_event_stop(dev);
-	mlx5_cleanup_fs(dev);
+	mlx5_fs_core_cleanup(dev);
 	mlx5_accel_ipsec_cleanup(dev);
 	mlx5_accel_tls_cleanup(dev);
 	mlx5_fpga_device_stop(dev);
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [net 03/11] net/mlx5: DR, Ignore modify TTL on RX if device doesn't support it
  2022-05-18  6:34 [pull request][net 00/11] mlx5 fixes 2022-05-17 Saeed Mahameed
  2022-05-18  6:34 ` [net 01/11] net/mlx5: DR, Fix missing flow_source when creating multi-destination FW table Saeed Mahameed
  2022-05-18  6:34 ` [net 02/11] net/mlx5: Initialize flow steering during driver probe Saeed Mahameed
@ 2022-05-18  6:34 ` Saeed Mahameed
  2022-05-18  6:34 ` [net 04/11] net/mlx5e: Wrap mlx5e_trap_napi_poll into rcu_read_lock Saeed Mahameed
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Saeed Mahameed @ 2022-05-18  6:34 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Yevgeny Kliteynik, Alex Vesker, Saeed Mahameed

From: Yevgeny Kliteynik <kliteyn@nvidia.com>

When modifying TTL, packet's csum has to be recalculated.
Due to HW issue in ConnectX-5, csum recalculation for modify
TTL on RX is supported through a work-around that is specifically
enabled by configuration.
If the work-around isn't enabled, rather than adding an unsupported
action the modify TTL action on RX should be ignored.
Ignoring modify TTL action might result in zero actions, so in such
cases we will not convert the match STE to modify STE, as it is done
by FW in DMFS.

This patch fixes an issue where modify TTL action was ignored both
on RX and TX instead of only on RX.

Fixes: 4ff725e1d4ad ("net/mlx5: DR, Ignore modify TTL if device doesn't support it")
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../mellanox/mlx5/core/steering/dr_action.c   | 65 +++++++++++++------
 .../mellanox/mlx5/core/steering/dr_ste_v0.c   |  4 +-
 2 files changed, 48 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
index b52b539c8d2c..1383550f44c1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
@@ -530,6 +530,37 @@ static int dr_action_handle_cs_recalc(struct mlx5dr_domain *dmn,
 	return 0;
 }
 
+static void dr_action_modify_ttl_adjust(struct mlx5dr_domain *dmn,
+					struct mlx5dr_ste_actions_attr *attr,
+					bool rx_rule,
+					bool *recalc_cs_required)
+{
+	*recalc_cs_required = false;
+
+	/* if device supports csum recalculation - no adjustment needed */
+	if (mlx5dr_ste_supp_ttl_cs_recalc(&dmn->info.caps))
+		return;
+
+	/* no adjustment needed on TX rules */
+	if (!rx_rule)
+		return;
+
+	if (!MLX5_CAP_ESW_FLOWTABLE(dmn->mdev, fdb_ipv4_ttl_modify)) {
+		/* Ignore the modify TTL action.
+		 * It is always kept as last HW action.
+		 */
+		attr->modify_actions--;
+		return;
+	}
+
+	if (dmn->type == MLX5DR_DOMAIN_TYPE_FDB)
+		/* Due to a HW bug on some devices, modifying TTL on RX flows
+		 * will cause an incorrect checksum calculation. In such cases
+		 * we will use a FW table to recalculate the checksum.
+		 */
+		*recalc_cs_required = true;
+}
+
 static void dr_action_print_sequence(struct mlx5dr_domain *dmn,
 				     struct mlx5dr_action *actions[],
 				     int last_idx)
@@ -650,8 +681,9 @@ int mlx5dr_actions_build_ste_arr(struct mlx5dr_matcher *matcher,
 		case DR_ACTION_TYP_MODIFY_HDR:
 			attr.modify_index = action->rewrite->index;
 			attr.modify_actions = action->rewrite->num_of_actions;
-			recalc_cs_required = action->rewrite->modify_ttl &&
-					     !mlx5dr_ste_supp_ttl_cs_recalc(&dmn->info.caps);
+			if (action->rewrite->modify_ttl)
+				dr_action_modify_ttl_adjust(dmn, &attr, rx_rule,
+							    &recalc_cs_required);
 			break;
 		case DR_ACTION_TYP_L2_TO_TNL_L2:
 		case DR_ACTION_TYP_L2_TO_TNL_L3:
@@ -732,12 +764,7 @@ int mlx5dr_actions_build_ste_arr(struct mlx5dr_matcher *matcher,
 	*new_hw_ste_arr_sz = nic_matcher->num_of_builders;
 	last_ste = ste_arr + DR_STE_SIZE * (nic_matcher->num_of_builders - 1);
 
-	/* Due to a HW bug in some devices, modifying TTL on RX flows will
-	 * cause an incorrect checksum calculation. In this case we will
-	 * use a FW table to recalculate.
-	 */
-	if (dmn->type == MLX5DR_DOMAIN_TYPE_FDB &&
-	    rx_rule && recalc_cs_required && dest_action) {
+	if (recalc_cs_required && dest_action) {
 		ret = dr_action_handle_cs_recalc(dmn, dest_action, &attr.final_icm_addr);
 		if (ret) {
 			mlx5dr_err(dmn,
@@ -1558,12 +1585,6 @@ dr_action_modify_check_is_ttl_modify(const void *sw_action)
 	return sw_field == MLX5_ACTION_IN_FIELD_OUT_IP_TTL;
 }
 
-static bool dr_action_modify_ttl_ignore(struct mlx5dr_domain *dmn)
-{
-	return !mlx5dr_ste_supp_ttl_cs_recalc(&dmn->info.caps) &&
-	       !MLX5_CAP_ESW_FLOWTABLE(dmn->mdev, fdb_ipv4_ttl_modify);
-}
-
 static int dr_actions_convert_modify_header(struct mlx5dr_action *action,
 					    u32 max_hw_actions,
 					    u32 num_sw_actions,
@@ -1575,6 +1596,7 @@ static int dr_actions_convert_modify_header(struct mlx5dr_action *action,
 	const struct mlx5dr_ste_action_modify_field *hw_dst_action_info;
 	const struct mlx5dr_ste_action_modify_field *hw_src_action_info;
 	struct mlx5dr_domain *dmn = action->rewrite->dmn;
+	__be64 *modify_ttl_sw_action = NULL;
 	int ret, i, hw_idx = 0;
 	__be64 *sw_action;
 	__be64 hw_action;
@@ -1587,8 +1609,14 @@ static int dr_actions_convert_modify_header(struct mlx5dr_action *action,
 	action->rewrite->allow_rx = 1;
 	action->rewrite->allow_tx = 1;
 
-	for (i = 0; i < num_sw_actions; i++) {
-		sw_action = &sw_actions[i];
+	for (i = 0; i < num_sw_actions || modify_ttl_sw_action; i++) {
+		/* modify TTL is handled separately, as a last action */
+		if (i == num_sw_actions) {
+			sw_action = modify_ttl_sw_action;
+			modify_ttl_sw_action = NULL;
+		} else {
+			sw_action = &sw_actions[i];
+		}
 
 		ret = dr_action_modify_check_field_limitation(action,
 							      sw_action);
@@ -1597,10 +1625,9 @@ static int dr_actions_convert_modify_header(struct mlx5dr_action *action,
 
 		if (!(*modify_ttl) &&
 		    dr_action_modify_check_is_ttl_modify(sw_action)) {
-			if (dr_action_modify_ttl_ignore(dmn))
-				continue;
-
+			modify_ttl_sw_action = sw_action;
 			*modify_ttl = true;
+			continue;
 		}
 
 		/* Convert SW action to HW action */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c
index 5a322335f204..2010d4ac6519 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c
@@ -420,7 +420,7 @@ dr_ste_v0_set_actions_tx(struct mlx5dr_domain *dmn,
 	 * encapsulation. The reason for that is that we support
 	 * modify headers for outer headers only
 	 */
-	if (action_type_set[DR_ACTION_TYP_MODIFY_HDR]) {
+	if (action_type_set[DR_ACTION_TYP_MODIFY_HDR] && attr->modify_actions) {
 		dr_ste_v0_set_entry_type(last_ste, DR_STE_TYPE_MODIFY_PKT);
 		dr_ste_v0_set_rewrite_actions(last_ste,
 					      attr->modify_actions,
@@ -513,7 +513,7 @@ dr_ste_v0_set_actions_rx(struct mlx5dr_domain *dmn,
 		}
 	}
 
-	if (action_type_set[DR_ACTION_TYP_MODIFY_HDR]) {
+	if (action_type_set[DR_ACTION_TYP_MODIFY_HDR] && attr->modify_actions) {
 		if (dr_ste_v0_get_entry_type(last_ste) == DR_STE_TYPE_MODIFY_PKT)
 			dr_ste_v0_arr_init_next(&last_ste,
 						added_stes,
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [net 04/11] net/mlx5e: Wrap mlx5e_trap_napi_poll into rcu_read_lock
  2022-05-18  6:34 [pull request][net 00/11] mlx5 fixes 2022-05-17 Saeed Mahameed
                   ` (2 preceding siblings ...)
  2022-05-18  6:34 ` [net 03/11] net/mlx5: DR, Ignore modify TTL on RX if device doesn't support it Saeed Mahameed
@ 2022-05-18  6:34 ` Saeed Mahameed
  2022-05-18  6:34 ` [net 05/11] net/mlx5e: Block rx-gro-hw feature in switchdev mode Saeed Mahameed
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Saeed Mahameed @ 2022-05-18  6:34 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@nvidia.com>

The body of mlx5e_napi_poll is wrapped into rcu_read_lock to be able to
read the XDP program pointer using rcu_dereference. However, the trap RQ
NAPI doesn't use rcu_read_lock, because the trap RQ works only in the
non-linear mode, and mlx5e_skb_from_cqe_nonlinear, until recently,
didn't support XDP and didn't call rcu_dereference.

Starting from the cited commit, mlx5e_skb_from_cqe_nonlinear supports
XDP and calls rcu_dereference, but mlx5e_trap_napi_poll doesn't wrap it
into rcu_read_lock. It leads to RCU-lockdep warnings like this:

    WARNING: suspicious RCU usage

This commit fixes the issue by adding an rcu_read_lock to
mlx5e_trap_napi_poll, similarly to mlx5e_napi_poll.

Fixes: ea5d49bdae8b ("net/mlx5e: Add XDP multi buffer support to the non-linear legacy RQ")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/trap.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/trap.c b/drivers/net/ethernet/mellanox/mlx5/core/en/trap.c
index a55b066746cb..857840ab1e91 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/trap.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/trap.c
@@ -14,19 +14,26 @@ static int mlx5e_trap_napi_poll(struct napi_struct *napi, int budget)
 	bool busy = false;
 	int work_done = 0;
 
+	rcu_read_lock();
+
 	ch_stats->poll++;
 
 	work_done = mlx5e_poll_rx_cq(&rq->cq, budget);
 	busy |= work_done == budget;
 	busy |= rq->post_wqes(rq);
 
-	if (busy)
-		return budget;
+	if (busy) {
+		work_done = budget;
+		goto out;
+	}
 
 	if (unlikely(!napi_complete_done(napi, work_done)))
-		return work_done;
+		goto out;
 
 	mlx5e_cq_arm(&rq->cq);
+
+out:
+	rcu_read_unlock();
 	return work_done;
 }
 
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [net 05/11] net/mlx5e: Block rx-gro-hw feature in switchdev mode
  2022-05-18  6:34 [pull request][net 00/11] mlx5 fixes 2022-05-17 Saeed Mahameed
                   ` (3 preceding siblings ...)
  2022-05-18  6:34 ` [net 04/11] net/mlx5e: Wrap mlx5e_trap_napi_poll into rcu_read_lock Saeed Mahameed
@ 2022-05-18  6:34 ` Saeed Mahameed
  2022-05-18  6:34 ` [net 06/11] net/mlx5e: Properly block LRO when XDP is enabled Saeed Mahameed
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Saeed Mahameed @ 2022-05-18  6:34 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Aya Levin, Tariq Toukan, Saeed Mahameed

From: Aya Levin <ayal@nvidia.com>

When the driver is in switchdev mode and rx-gro-hw is set, the RQ needs
special CQE handling. Till then, block setting of rx-gro-hw feature in
switchdev mode, to avoid failure while setting the feature due to
failure while opening the RQ.

Fixes: f97d5c2a453e ("net/mlx5e: Add handle SHAMPO cqe support")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 2f1dedc721d1..999241961714 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3864,6 +3864,10 @@ static netdev_features_t mlx5e_fix_uplink_rep_features(struct net_device *netdev
 	if (netdev->features & NETIF_F_NTUPLE)
 		netdev_warn(netdev, "Disabling ntuple, not supported in switchdev mode\n");
 
+	features &= ~NETIF_F_GRO_HW;
+	if (netdev->features & NETIF_F_GRO_HW)
+		netdev_warn(netdev, "Disabling HW_GRO, not supported in switchdev mode\n");
+
 	return features;
 }
 
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [net 06/11] net/mlx5e: Properly block LRO when XDP is enabled
  2022-05-18  6:34 [pull request][net 00/11] mlx5 fixes 2022-05-17 Saeed Mahameed
                   ` (4 preceding siblings ...)
  2022-05-18  6:34 ` [net 05/11] net/mlx5e: Block rx-gro-hw feature in switchdev mode Saeed Mahameed
@ 2022-05-18  6:34 ` Saeed Mahameed
  2022-05-18  6:34 ` [net 07/11] net/mlx5e: Properly block HW GRO " Saeed Mahameed
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Saeed Mahameed @ 2022-05-18  6:34 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@nvidia.com>

LRO is incompatible and mutually exclusive with XDP. However, the needed
checks are only made when enabling XDP. If LRO is enabled when XDP is
already active, the command will succeed, and XDP will be skipped in the
data path, although still enabled.

This commit fixes the bug by checking the XDP status in
mlx5e_fix_features and disabling LRO if XDP is enabled.

Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 999241961714..3969916cfafe 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3900,6 +3900,13 @@ static netdev_features_t mlx5e_fix_features(struct net_device *netdev,
 		}
 	}
 
+	if (params->xdp_prog) {
+		if (features & NETIF_F_LRO) {
+			netdev_warn(netdev, "LRO is incompatible with XDP\n");
+			features &= ~NETIF_F_LRO;
+		}
+	}
+
 	if (MLX5E_GET_PFLAG(params, MLX5E_PFLAG_RX_CQE_COMPRESS)) {
 		features &= ~NETIF_F_RXHASH;
 		if (netdev->features & NETIF_F_RXHASH)
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [net 07/11] net/mlx5e: Properly block HW GRO when XDP is enabled
  2022-05-18  6:34 [pull request][net 00/11] mlx5 fixes 2022-05-17 Saeed Mahameed
                   ` (5 preceding siblings ...)
  2022-05-18  6:34 ` [net 06/11] net/mlx5e: Properly block LRO when XDP is enabled Saeed Mahameed
@ 2022-05-18  6:34 ` Saeed Mahameed
  2022-05-18  6:34 ` [net 08/11] net/mlx5e: Remove HW-GRO from reported features Saeed Mahameed
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Saeed Mahameed @ 2022-05-18  6:34 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Maxim Mikityanskiy, Tariq Toukan, Saeed Mahameed

From: Maxim Mikityanskiy <maximmi@nvidia.com>

HW GRO is incompatible and mutually exclusive with XDP and XSK. However,
the needed checks are only made when enabling XDP. If HW GRO is enabled
when XDP is already active, the command will succeed, and XDP will be
skipped in the data path, although still enabled.

This commit fixes the bug by checking the XDP and XSK status in
mlx5e_fix_features and disabling HW GRO if XDP is enabled.

Fixes: 83439f3c37aa ("net/mlx5e: Add HW-GRO offload")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 3969916cfafe..6afd07901a10 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3905,6 +3905,18 @@ static netdev_features_t mlx5e_fix_features(struct net_device *netdev,
 			netdev_warn(netdev, "LRO is incompatible with XDP\n");
 			features &= ~NETIF_F_LRO;
 		}
+		if (features & NETIF_F_GRO_HW) {
+			netdev_warn(netdev, "HW GRO is incompatible with XDP\n");
+			features &= ~NETIF_F_GRO_HW;
+		}
+	}
+
+	if (priv->xsk.refcnt) {
+		if (features & NETIF_F_GRO_HW) {
+			netdev_warn(netdev, "HW GRO is incompatible with AF_XDP (%u XSKs are active)\n",
+				    priv->xsk.refcnt);
+			features &= ~NETIF_F_GRO_HW;
+		}
 	}
 
 	if (MLX5E_GET_PFLAG(params, MLX5E_PFLAG_RX_CQE_COMPRESS)) {
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [net 08/11] net/mlx5e: Remove HW-GRO from reported features
  2022-05-18  6:34 [pull request][net 00/11] mlx5 fixes 2022-05-17 Saeed Mahameed
                   ` (6 preceding siblings ...)
  2022-05-18  6:34 ` [net 07/11] net/mlx5e: Properly block HW GRO " Saeed Mahameed
@ 2022-05-18  6:34 ` Saeed Mahameed
  2022-05-18  6:34 ` [net 09/11] net/mlx5e: CT: Fix support for GRE tuples Saeed Mahameed
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Saeed Mahameed @ 2022-05-18  6:34 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Gal Pressman, Saeed Mahameed

From: Gal Pressman <gal@nvidia.com>

We got reports of certain HW-GRO flows causing kernel call traces, which
might be related to firmware. To be on the safe side, disable the
feature for now and re-enable it once a driver/firmware fix is found.

Fixes: 83439f3c37aa ("net/mlx5e: Add HW-GRO offload")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 6afd07901a10..fa229998606c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4873,10 +4873,6 @@ static void mlx5e_build_nic_netdev(struct net_device *netdev)
 	netdev->hw_features      |= NETIF_F_HW_VLAN_CTAG_FILTER;
 	netdev->hw_features      |= NETIF_F_HW_VLAN_STAG_TX;
 
-	if (!!MLX5_CAP_GEN(mdev, shampo) &&
-	    mlx5e_check_fragmented_striding_rq_cap(mdev))
-		netdev->hw_features    |= NETIF_F_GRO_HW;
-
 	if (mlx5e_tunnel_any_tx_proto_supported(mdev)) {
 		netdev->hw_enc_features |= NETIF_F_HW_CSUM;
 		netdev->hw_enc_features |= NETIF_F_TSO;
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [net 09/11] net/mlx5e: CT: Fix support for GRE tuples
  2022-05-18  6:34 [pull request][net 00/11] mlx5 fixes 2022-05-17 Saeed Mahameed
                   ` (7 preceding siblings ...)
  2022-05-18  6:34 ` [net 08/11] net/mlx5e: Remove HW-GRO from reported features Saeed Mahameed
@ 2022-05-18  6:34 ` Saeed Mahameed
  2022-05-18  6:34 ` [net 10/11] net/mlx5e: CT: Fix setting flow_source for smfs ct tuples Saeed Mahameed
  2022-05-18  6:34 ` [net 11/11] net/mlx5: Drain fw_reset when removing device Saeed Mahameed
  10 siblings, 0 replies; 13+ messages in thread
From: Saeed Mahameed @ 2022-05-18  6:34 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Paul Blakey, Oz Shlomo, Saeed Mahameed

From: Paul Blakey <paulb@nvidia.com>

cited commit removed support for GRE tuples when software steering was enabled.

To bring back support for GRE tuples, add GRE ipv4/ipv6 matchers.

Fixes: 3ee61ebb0df1 ("net/mlx5: CT: Add software steering ct flow steering provider")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../mellanox/mlx5/core/en/tc/ct_fs_smfs.c     | 56 +++++++++++--------
 1 file changed, 32 insertions(+), 24 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/ct_fs_smfs.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/ct_fs_smfs.c
index 59988e24b704..271261bf1dc2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/ct_fs_smfs.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/ct_fs_smfs.c
@@ -23,7 +23,7 @@ struct mlx5_ct_fs_smfs_matcher {
 };
 
 struct mlx5_ct_fs_smfs_matchers {
-	struct mlx5_ct_fs_smfs_matcher smfs_matchers[4];
+	struct mlx5_ct_fs_smfs_matcher smfs_matchers[6];
 	struct list_head used;
 };
 
@@ -44,7 +44,8 @@ struct mlx5_ct_fs_smfs_rule {
 };
 
 static inline void
-mlx5_ct_fs_smfs_fill_mask(struct mlx5_ct_fs *fs, struct mlx5_flow_spec *spec, bool ipv4, bool tcp)
+mlx5_ct_fs_smfs_fill_mask(struct mlx5_ct_fs *fs, struct mlx5_flow_spec *spec, bool ipv4, bool tcp,
+			  bool gre)
 {
 	void *headers_c = MLX5_ADDR_OF(fte_match_param, spec->match_criteria, outer_headers);
 
@@ -77,7 +78,7 @@ mlx5_ct_fs_smfs_fill_mask(struct mlx5_ct_fs *fs, struct mlx5_flow_spec *spec, bo
 		MLX5_SET_TO_ONES(fte_match_set_lyr_2_4, headers_c, tcp_dport);
 		MLX5_SET(fte_match_set_lyr_2_4, headers_c, tcp_flags,
 			 ntohs(MLX5_CT_TCP_FLAGS_MASK));
-	} else {
+	} else if (!gre) {
 		MLX5_SET_TO_ONES(fte_match_set_lyr_2_4, headers_c, udp_sport);
 		MLX5_SET_TO_ONES(fte_match_set_lyr_2_4, headers_c, udp_dport);
 	}
@@ -87,7 +88,7 @@ mlx5_ct_fs_smfs_fill_mask(struct mlx5_ct_fs *fs, struct mlx5_flow_spec *spec, bo
 
 static struct mlx5dr_matcher *
 mlx5_ct_fs_smfs_matcher_create(struct mlx5_ct_fs *fs, struct mlx5dr_table *tbl, bool ipv4,
-			       bool tcp, u32 priority)
+			       bool tcp, bool gre, u32 priority)
 {
 	struct mlx5dr_matcher *dr_matcher;
 	struct mlx5_flow_spec *spec;
@@ -96,7 +97,7 @@ mlx5_ct_fs_smfs_matcher_create(struct mlx5_ct_fs *fs, struct mlx5dr_table *tbl,
 	if (!spec)
 		return ERR_PTR(-ENOMEM);
 
-	mlx5_ct_fs_smfs_fill_mask(fs, spec, ipv4, tcp);
+	mlx5_ct_fs_smfs_fill_mask(fs, spec, ipv4, tcp, gre);
 	spec->match_criteria_enable = MLX5_MATCH_MISC_PARAMETERS_2 | MLX5_MATCH_OUTER_HEADERS;
 
 	dr_matcher = mlx5_smfs_matcher_create(tbl, priority, spec);
@@ -108,7 +109,7 @@ mlx5_ct_fs_smfs_matcher_create(struct mlx5_ct_fs *fs, struct mlx5dr_table *tbl,
 }
 
 static struct mlx5_ct_fs_smfs_matcher *
-mlx5_ct_fs_smfs_matcher_get(struct mlx5_ct_fs *fs, bool nat, bool ipv4, bool tcp)
+mlx5_ct_fs_smfs_matcher_get(struct mlx5_ct_fs *fs, bool nat, bool ipv4, bool tcp, bool gre)
 {
 	struct mlx5_ct_fs_smfs *fs_smfs = mlx5_ct_fs_priv(fs);
 	struct mlx5_ct_fs_smfs_matcher *m, *smfs_matcher;
@@ -119,7 +120,7 @@ mlx5_ct_fs_smfs_matcher_get(struct mlx5_ct_fs *fs, bool nat, bool ipv4, bool tcp
 	int prio;
 
 	matchers = nat ? &fs_smfs->matchers_nat : &fs_smfs->matchers;
-	smfs_matcher = &matchers->smfs_matchers[ipv4 * 2 + tcp];
+	smfs_matcher = &matchers->smfs_matchers[ipv4 * 3 + tcp * 2 + gre];
 
 	if (refcount_inc_not_zero(&smfs_matcher->ref))
 		return smfs_matcher;
@@ -145,11 +146,11 @@ mlx5_ct_fs_smfs_matcher_get(struct mlx5_ct_fs *fs, bool nat, bool ipv4, bool tcp
 	}
 
 	tbl = nat ? fs_smfs->ct_nat_tbl : fs_smfs->ct_tbl;
-	dr_matcher = mlx5_ct_fs_smfs_matcher_create(fs, tbl, ipv4, tcp, prio);
+	dr_matcher = mlx5_ct_fs_smfs_matcher_create(fs, tbl, ipv4, tcp, gre, prio);
 	if (IS_ERR(dr_matcher)) {
 		netdev_warn(fs->netdev,
-			    "ct_fs_smfs: failed to create matcher (nat %d, ipv4 %d, tcp %d), err: %ld\n",
-			    nat, ipv4, tcp, PTR_ERR(dr_matcher));
+			    "ct_fs_smfs: failed to create matcher (nat %d, ipv4 %d, tcp %d, gre %d), err: %ld\n",
+			    nat, ipv4, tcp, gre, PTR_ERR(dr_matcher));
 
 		smfs_matcher = ERR_CAST(dr_matcher);
 		goto out_unlock;
@@ -222,16 +223,17 @@ mlx5_ct_fs_smfs_destroy(struct mlx5_ct_fs *fs)
 static inline bool
 mlx5_tc_ct_valid_used_dissector_keys(const u32 used_keys)
 {
-#define DISSECTOR_BIT(name) BIT(FLOW_DISSECTOR_KEY_ ## name)
-	const u32 basic_keys = DISSECTOR_BIT(BASIC) | DISSECTOR_BIT(CONTROL) |
-			       DISSECTOR_BIT(PORTS) | DISSECTOR_BIT(META);
-	const u32 ipv4_tcp = basic_keys | DISSECTOR_BIT(IPV4_ADDRS) | DISSECTOR_BIT(TCP);
-	const u32 ipv4_udp = basic_keys | DISSECTOR_BIT(IPV4_ADDRS);
-	const u32 ipv6_tcp = basic_keys | DISSECTOR_BIT(IPV6_ADDRS) | DISSECTOR_BIT(TCP);
-	const u32 ipv6_udp = basic_keys | DISSECTOR_BIT(IPV6_ADDRS);
+#define DISS_BIT(name) BIT(FLOW_DISSECTOR_KEY_ ## name)
+	const u32 basic_keys = DISS_BIT(BASIC) | DISS_BIT(CONTROL) | DISS_BIT(META);
+	const u32 ipv4_tcp = basic_keys | DISS_BIT(IPV4_ADDRS) | DISS_BIT(PORTS) | DISS_BIT(TCP);
+	const u32 ipv6_tcp = basic_keys | DISS_BIT(IPV6_ADDRS) | DISS_BIT(PORTS) | DISS_BIT(TCP);
+	const u32 ipv4_udp = basic_keys | DISS_BIT(IPV4_ADDRS) | DISS_BIT(PORTS);
+	const u32 ipv6_udp = basic_keys | DISS_BIT(IPV6_ADDRS) | DISS_BIT(PORTS);
+	const u32 ipv4_gre = basic_keys | DISS_BIT(IPV4_ADDRS);
+	const u32 ipv6_gre = basic_keys | DISS_BIT(IPV6_ADDRS);
 
 	return (used_keys == ipv4_tcp || used_keys == ipv4_udp || used_keys == ipv6_tcp ||
-		used_keys == ipv6_udp);
+		used_keys == ipv6_udp || used_keys == ipv4_gre || used_keys == ipv6_gre);
 }
 
 static bool
@@ -254,20 +256,24 @@ mlx5_ct_fs_smfs_ct_validate_flow_rule(struct mlx5_ct_fs *fs, struct flow_rule *f
 	flow_rule_match_control(flow_rule, &control);
 	flow_rule_match_ipv4_addrs(flow_rule, &ipv4_addrs);
 	flow_rule_match_ipv6_addrs(flow_rule, &ipv6_addrs);
-	flow_rule_match_ports(flow_rule, &ports);
-	flow_rule_match_tcp(flow_rule, &tcp);
+	if (basic.key->ip_proto != IPPROTO_GRE)
+		flow_rule_match_ports(flow_rule, &ports);
+	if (basic.key->ip_proto == IPPROTO_TCP)
+		flow_rule_match_tcp(flow_rule, &tcp);
 
 	if (basic.mask->n_proto != htons(0xFFFF) ||
 	    (basic.key->n_proto != htons(ETH_P_IP) && basic.key->n_proto != htons(ETH_P_IPV6)) ||
 	    basic.mask->ip_proto != 0xFF ||
-	    (basic.key->ip_proto != IPPROTO_UDP && basic.key->ip_proto != IPPROTO_TCP)) {
+	    (basic.key->ip_proto != IPPROTO_UDP && basic.key->ip_proto != IPPROTO_TCP &&
+	     basic.key->ip_proto != IPPROTO_GRE)) {
 		ct_dbg("rule uses unexpected basic match (n_proto 0x%04x/0x%04x, ip_proto 0x%02x/0x%02x)",
 		       ntohs(basic.key->n_proto), ntohs(basic.mask->n_proto),
 		       basic.key->ip_proto, basic.mask->ip_proto);
 		return false;
 	}
 
-	if (ports.mask->src != htons(0xFFFF) || ports.mask->dst != htons(0xFFFF)) {
+	if (basic.key->ip_proto != IPPROTO_GRE &&
+	    (ports.mask->src != htons(0xFFFF) || ports.mask->dst != htons(0xFFFF))) {
 		ct_dbg("rule uses ports match (src 0x%04x, dst 0x%04x)",
 		       ports.mask->src, ports.mask->dst);
 		return false;
@@ -291,7 +297,7 @@ mlx5_ct_fs_smfs_ct_rule_add(struct mlx5_ct_fs *fs, struct mlx5_flow_spec *spec,
 	struct mlx5dr_action *actions[5];
 	struct mlx5dr_rule *rule;
 	int num_actions = 0, err;
-	bool nat, tcp, ipv4;
+	bool nat, tcp, ipv4, gre;
 
 	if (!mlx5_ct_fs_smfs_ct_validate_flow_rule(fs, flow_rule))
 		return ERR_PTR(-EOPNOTSUPP);
@@ -314,8 +320,10 @@ mlx5_ct_fs_smfs_ct_rule_add(struct mlx5_ct_fs *fs, struct mlx5_flow_spec *spec,
 	ipv4 = mlx5e_tc_get_ip_version(spec, true) == 4;
 	tcp = MLX5_GET(fte_match_param, spec->match_value,
 		       outer_headers.ip_protocol) == IPPROTO_TCP;
+	gre = MLX5_GET(fte_match_param, spec->match_value,
+		       outer_headers.ip_protocol) == IPPROTO_GRE;
 
-	smfs_matcher = mlx5_ct_fs_smfs_matcher_get(fs, nat, ipv4, tcp);
+	smfs_matcher = mlx5_ct_fs_smfs_matcher_get(fs, nat, ipv4, tcp, gre);
 	if (IS_ERR(smfs_matcher)) {
 		err = PTR_ERR(smfs_matcher);
 		goto err_matcher;
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [net 10/11] net/mlx5e: CT: Fix setting flow_source for smfs ct tuples
  2022-05-18  6:34 [pull request][net 00/11] mlx5 fixes 2022-05-17 Saeed Mahameed
                   ` (8 preceding siblings ...)
  2022-05-18  6:34 ` [net 09/11] net/mlx5e: CT: Fix support for GRE tuples Saeed Mahameed
@ 2022-05-18  6:34 ` Saeed Mahameed
  2022-05-18  6:34 ` [net 11/11] net/mlx5: Drain fw_reset when removing device Saeed Mahameed
  10 siblings, 0 replies; 13+ messages in thread
From: Saeed Mahameed @ 2022-05-18  6:34 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Paul Blakey, Oz Shlomo, Saeed Mahameed

From: Paul Blakey <paulb@nvidia.com>

Cited patch sets flow_source to ANY overriding the provided spec
flow_source, avoiding the optimization done by commit c9c079b4deaa
("net/mlx5: CT: Set flow source hint from provided tuple device").

To fix the above, set the dr_rule flow_source from provided flow spec.

Fixes: 3ee61ebb0df1 ("net/mlx5: CT: Add software steering ct flow steering provider")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/tc/ct_fs_smfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/ct_fs_smfs.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/ct_fs_smfs.c
index 271261bf1dc2..bec9ed0103a9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/ct_fs_smfs.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/ct_fs_smfs.c
@@ -330,7 +330,7 @@ mlx5_ct_fs_smfs_ct_rule_add(struct mlx5_ct_fs *fs, struct mlx5_flow_spec *spec,
 	}
 
 	rule = mlx5_smfs_rule_create(smfs_matcher->dr_matcher, spec, num_actions, actions,
-				     MLX5_FLOW_CONTEXT_FLOW_SOURCE_ANY_VPORT);
+				     spec->flow_context.flow_source);
 	if (!rule) {
 		err = -EINVAL;
 		goto err_create;
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [net 11/11] net/mlx5: Drain fw_reset when removing device
  2022-05-18  6:34 [pull request][net 00/11] mlx5 fixes 2022-05-17 Saeed Mahameed
                   ` (9 preceding siblings ...)
  2022-05-18  6:34 ` [net 10/11] net/mlx5e: CT: Fix setting flow_source for smfs ct tuples Saeed Mahameed
@ 2022-05-18  6:34 ` Saeed Mahameed
  10 siblings, 0 replies; 13+ messages in thread
From: Saeed Mahameed @ 2022-05-18  6:34 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Shay Drory, Moshe Shemesh, Saeed Mahameed

From: Shay Drory <shayd@nvidia.com>

In case fw sync reset is called in parallel to device removal, device
might stuck in the following deadlock:
         CPU 0                        CPU 1
         -----                        -----
                                  remove_one
                                   uninit_one (locks intf_state_mutex)
mlx5_sync_reset_now_event()
work in fw_reset->wq.
 mlx5_enter_error_state()
  mutex_lock (intf_state_mutex)
                                   cleanup_once
                                    fw_reset_cleanup()
                                     destroy_workqueue(fw_reset->wq)

Drain the fw_reset WQ, and make sure no new work is being queued, before
entering uninit_one().
The Drain is done before devlink_unregister() since fw_reset, in some
flows, is using devlink API devlink_remote_reload_actions_performed().

Fixes: 38b9f903f22b ("net/mlx5: Handle sync reset request event")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../ethernet/mellanox/mlx5/core/fw_reset.c    | 25 ++++++++++++++++---
 .../ethernet/mellanox/mlx5/core/fw_reset.h    |  1 +
 .../net/ethernet/mellanox/mlx5/core/main.c    |  4 +++
 3 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
index ca1aba845dd6..81eb67fb95b0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
@@ -8,7 +8,8 @@
 enum {
 	MLX5_FW_RESET_FLAGS_RESET_REQUESTED,
 	MLX5_FW_RESET_FLAGS_NACK_RESET_REQUEST,
-	MLX5_FW_RESET_FLAGS_PENDING_COMP
+	MLX5_FW_RESET_FLAGS_PENDING_COMP,
+	MLX5_FW_RESET_FLAGS_DROP_NEW_REQUESTS
 };
 
 struct mlx5_fw_reset {
@@ -208,7 +209,10 @@ static void poll_sync_reset(struct timer_list *t)
 
 	if (fatal_error) {
 		mlx5_core_warn(dev, "Got Device Reset\n");
-		queue_work(fw_reset->wq, &fw_reset->reset_reload_work);
+		if (!test_bit(MLX5_FW_RESET_FLAGS_DROP_NEW_REQUESTS, &fw_reset->reset_flags))
+			queue_work(fw_reset->wq, &fw_reset->reset_reload_work);
+		else
+			mlx5_core_err(dev, "Device is being removed, Drop new reset work\n");
 		return;
 	}
 
@@ -433,9 +437,12 @@ static int fw_reset_event_notifier(struct notifier_block *nb, unsigned long acti
 	struct mlx5_fw_reset *fw_reset = mlx5_nb_cof(nb, struct mlx5_fw_reset, nb);
 	struct mlx5_eqe *eqe = data;
 
+	if (test_bit(MLX5_FW_RESET_FLAGS_DROP_NEW_REQUESTS, &fw_reset->reset_flags))
+		return NOTIFY_DONE;
+
 	switch (eqe->sub_type) {
 	case MLX5_GENERAL_SUBTYPE_FW_LIVE_PATCH_EVENT:
-			queue_work(fw_reset->wq, &fw_reset->fw_live_patch_work);
+		queue_work(fw_reset->wq, &fw_reset->fw_live_patch_work);
 		break;
 	case MLX5_GENERAL_SUBTYPE_PCI_SYNC_FOR_FW_UPDATE_EVENT:
 		mlx5_sync_reset_events_handle(fw_reset, eqe);
@@ -479,6 +486,18 @@ void mlx5_fw_reset_events_stop(struct mlx5_core_dev *dev)
 	mlx5_eq_notifier_unregister(dev, &dev->priv.fw_reset->nb);
 }
 
+void mlx5_drain_fw_reset(struct mlx5_core_dev *dev)
+{
+	struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset;
+
+	set_bit(MLX5_FW_RESET_FLAGS_DROP_NEW_REQUESTS, &fw_reset->reset_flags);
+	cancel_work_sync(&fw_reset->fw_live_patch_work);
+	cancel_work_sync(&fw_reset->reset_request_work);
+	cancel_work_sync(&fw_reset->reset_reload_work);
+	cancel_work_sync(&fw_reset->reset_now_work);
+	cancel_work_sync(&fw_reset->reset_abort_work);
+}
+
 int mlx5_fw_reset_init(struct mlx5_core_dev *dev)
 {
 	struct mlx5_fw_reset *fw_reset = kzalloc(sizeof(*fw_reset), GFP_KERNEL);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h
index 694fc7cb2684..dc141c7e641a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h
@@ -16,6 +16,7 @@ int mlx5_fw_reset_set_live_patch(struct mlx5_core_dev *dev);
 int mlx5_fw_reset_wait_reset_done(struct mlx5_core_dev *dev);
 void mlx5_fw_reset_events_start(struct mlx5_core_dev *dev);
 void mlx5_fw_reset_events_stop(struct mlx5_core_dev *dev);
+void mlx5_drain_fw_reset(struct mlx5_core_dev *dev);
 int mlx5_fw_reset_init(struct mlx5_core_dev *dev);
 void mlx5_fw_reset_cleanup(struct mlx5_core_dev *dev);
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index c6737720bf3a..ef196cb764e2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1627,6 +1627,10 @@ static void remove_one(struct pci_dev *pdev)
 	struct mlx5_core_dev *dev  = pci_get_drvdata(pdev);
 	struct devlink *devlink = priv_to_devlink(dev);
 
+	/* mlx5_drain_fw_reset() is using devlink APIs. Hence, we must drain
+	 * fw_reset before unregistering the devlink.
+	 */
+	mlx5_drain_fw_reset(dev);
 	devlink_unregister(devlink);
 	mlx5_sriov_disable(pdev);
 	mlx5_crdump_disable(dev);
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [net 01/11] net/mlx5: DR, Fix missing flow_source when creating multi-destination FW table
  2022-05-18  6:34 ` [net 01/11] net/mlx5: DR, Fix missing flow_source when creating multi-destination FW table Saeed Mahameed
@ 2022-05-18 11:10   ` patchwork-bot+netdevbpf
  0 siblings, 0 replies; 13+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-05-18 11:10 UTC (permalink / raw)
  To: Saeed Mahameed; +Cc: davem, kuba, pabeni, netdev, maord, kliteyn, saeedm

Hello:

This series was applied to netdev/net.git (master)
by Saeed Mahameed <saeedm@nvidia.com>:

On Tue, 17 May 2022 23:34:17 -0700 you wrote:
> From: Maor Dickman <maord@nvidia.com>
> 
> In order to support multiple destination FTEs with SW steering
> FW table is created with single FTE with multiple actions and
> SW steering rule forward to it. When creating this table, flow
> source isn't set according to the original FTE.
> 
> [...]

Here is the summary with links:
  - [net,01/11] net/mlx5: DR, Fix missing flow_source when creating multi-destination FW table
    https://git.kernel.org/netdev/net/c/2c5fc6cd269a
  - [net,02/11] net/mlx5: Initialize flow steering during driver probe
    https://git.kernel.org/netdev/net/c/b33886971dbc
  - [net,03/11] net/mlx5: DR, Ignore modify TTL on RX if device doesn't support it
    https://git.kernel.org/netdev/net/c/785d7ed29551
  - [net,04/11] net/mlx5e: Wrap mlx5e_trap_napi_poll into rcu_read_lock
    https://git.kernel.org/netdev/net/c/379169740b0a
  - [net,05/11] net/mlx5e: Block rx-gro-hw feature in switchdev mode
    https://git.kernel.org/netdev/net/c/15a5078cab30
  - [net,06/11] net/mlx5e: Properly block LRO when XDP is enabled
    https://git.kernel.org/netdev/net/c/cf6e34c8c22f
  - [net,07/11] net/mlx5e: Properly block HW GRO when XDP is enabled
    https://git.kernel.org/netdev/net/c/b0617e7b3500
  - [net,08/11] net/mlx5e: Remove HW-GRO from reported features
    https://git.kernel.org/netdev/net/c/6bbd723035ba
  - [net,09/11] net/mlx5e: CT: Fix support for GRE tuples
    https://git.kernel.org/netdev/net/c/8e1dcf499a67
  - [net,10/11] net/mlx5e: CT: Fix setting flow_source for smfs ct tuples
    https://git.kernel.org/netdev/net/c/04c551bad371
  - [net,11/11] net/mlx5: Drain fw_reset when removing device
    https://git.kernel.org/netdev/net/c/16d42d313350

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-05-18 11:10 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-18  6:34 [pull request][net 00/11] mlx5 fixes 2022-05-17 Saeed Mahameed
2022-05-18  6:34 ` [net 01/11] net/mlx5: DR, Fix missing flow_source when creating multi-destination FW table Saeed Mahameed
2022-05-18 11:10   ` patchwork-bot+netdevbpf
2022-05-18  6:34 ` [net 02/11] net/mlx5: Initialize flow steering during driver probe Saeed Mahameed
2022-05-18  6:34 ` [net 03/11] net/mlx5: DR, Ignore modify TTL on RX if device doesn't support it Saeed Mahameed
2022-05-18  6:34 ` [net 04/11] net/mlx5e: Wrap mlx5e_trap_napi_poll into rcu_read_lock Saeed Mahameed
2022-05-18  6:34 ` [net 05/11] net/mlx5e: Block rx-gro-hw feature in switchdev mode Saeed Mahameed
2022-05-18  6:34 ` [net 06/11] net/mlx5e: Properly block LRO when XDP is enabled Saeed Mahameed
2022-05-18  6:34 ` [net 07/11] net/mlx5e: Properly block HW GRO " Saeed Mahameed
2022-05-18  6:34 ` [net 08/11] net/mlx5e: Remove HW-GRO from reported features Saeed Mahameed
2022-05-18  6:34 ` [net 09/11] net/mlx5e: CT: Fix support for GRE tuples Saeed Mahameed
2022-05-18  6:34 ` [net 10/11] net/mlx5e: CT: Fix setting flow_source for smfs ct tuples Saeed Mahameed
2022-05-18  6:34 ` [net 11/11] net/mlx5: Drain fw_reset when removing device Saeed Mahameed

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.