* [pull request][net 00/13] mlx5 fixes 2022-08-22
@ 2022-08-22 19:59 Saeed Mahameed
2022-08-22 19:59 ` [net 01/13] net/mlx5e: Properly disable vlan strip on non-UL reps Saeed Mahameed
` (12 more replies)
0 siblings, 13 replies; 15+ messages in thread
From: Saeed Mahameed @ 2022-08-22 19:59 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev
From: Saeed Mahameed <saeedm@nvidia.com>
This series provides bug fixes to mlx5 driver.
Please pull and let me know if there is any problem.
Thanks,
Saeed.
The following changes since commit f1e941dbf80a9b8bab0bffbc4cbe41cc7f4c6fb6:
nfc: pn533: Fix use-after-free bugs caused by pn532_cmd_timeout (2022-08-22 14:51:30 +0100)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-fixes-2022-08-22
for you to fetch changes up to 35419025cb1ee40f8b4c10ab7dbe567ef70b8da4:
net/mlx5: Unlock on error in mlx5_sriov_enable() (2022-08-22 12:57:10 -0700)
----------------------------------------------------------------
mlx5-fixes-2022-08-22
----------------------------------------------------------------
Aya Levin (1):
net/mlx5e: Fix wrong application of the LRO state
Dan Carpenter (4):
net/mlx5: unlock on error path in esw_vfs_changed_event_handler()
net/mlx5e: kTLS, Use _safe() iterator in mlx5e_tls_priv_tx_list_cleanup()
net/mlx5e: Fix use after free in mlx5e_fs_init()
net/mlx5: Unlock on error in mlx5_sriov_enable()
Eli Cohen (2):
net/mlx5: LAG, fix logic over MLX5_LAG_FLAG_NDEVS_READY
net/mlx5: Eswitch, Fix forwarding decision to uplink
Maor Dickman (1):
net/mlx5e: Fix wrong tc flag used when set hw-tc-offload off
Moshe Shemesh (1):
net/mlx5: Avoid false positive lockdep warning by adding lock_class_key
Roi Dayan (1):
net/mlx5e: TC, Add missing policer validation
Roy Novich (1):
net/mlx5: Fix cmd error logging for manage pages cmd
Vlad Buslov (2):
net/mlx5e: Properly disable vlan strip on non-UL reps
net/mlx5: Disable irq when locking lag_lock
.../ethernet/mellanox/mlx5/core/en/tc/act/police.c | 4 ++
.../ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c | 4 +-
drivers/net/ethernet/mellanox/mlx5/core/en_fs.c | 5 +-
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 12 ++---
drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 2 +
.../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 7 ++-
drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 57 +++++++++++++---------
drivers/net/ethernet/mellanox/mlx5/core/main.c | 4 ++
.../net/ethernet/mellanox/mlx5/core/pagealloc.c | 9 ++--
drivers/net/ethernet/mellanox/mlx5/core/sriov.c | 2 +-
include/linux/mlx5/driver.h | 1 +
11 files changed, 65 insertions(+), 42 deletions(-)
^ permalink raw reply [flat|nested] 15+ messages in thread
* [net 01/13] net/mlx5e: Properly disable vlan strip on non-UL reps
2022-08-22 19:59 [pull request][net 00/13] mlx5 fixes 2022-08-22 Saeed Mahameed
@ 2022-08-22 19:59 ` Saeed Mahameed
2022-08-24 1:00 ` patchwork-bot+netdevbpf
2022-08-22 19:59 ` [net 02/13] net/mlx5: LAG, fix logic over MLX5_LAG_FLAG_NDEVS_READY Saeed Mahameed
` (11 subsequent siblings)
12 siblings, 1 reply; 15+ messages in thread
From: Saeed Mahameed @ 2022-08-22 19:59 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Vlad Buslov, Roi Dayan
From: Vlad Buslov <vladbu@nvidia.com>
When querying mlx5 non-uplink representors capabilities with ethtool
rx-vlan-offload is marked as "off [fixed]". However, it is actually always
enabled because mlx5e_params->vlan_strip_disable is 0 by default when
initializing struct mlx5e_params instance. Fix the issue by explicitly
setting the vlan_strip_disable to 'true' for non-uplink representors.
Fixes: cb67b832921c ("net/mlx5e: Introduce SRIOV VF representors")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 0c66774a1720..759f7d3c2cfd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -662,6 +662,8 @@ static void mlx5e_build_rep_params(struct net_device *netdev)
params->mqprio.num_tc = 1;
params->tunneled_offload_en = false;
+ if (rep->vport != MLX5_VPORT_UPLINK)
+ params->vlan_strip_disable = true;
mlx5_query_min_inline(mdev, ¶ms->tx_min_inline_mode);
}
--
2.37.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [net 02/13] net/mlx5: LAG, fix logic over MLX5_LAG_FLAG_NDEVS_READY
2022-08-22 19:59 [pull request][net 00/13] mlx5 fixes 2022-08-22 Saeed Mahameed
2022-08-22 19:59 ` [net 01/13] net/mlx5e: Properly disable vlan strip on non-UL reps Saeed Mahameed
@ 2022-08-22 19:59 ` Saeed Mahameed
2022-08-22 19:59 ` [net 03/13] net/mlx5: Eswitch, Fix forwarding decision to uplink Saeed Mahameed
` (10 subsequent siblings)
12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2022-08-22 19:59 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Eli Cohen, Maor Dickman, Mark Bloch
From: Eli Cohen <elic@nvidia.com>
Only set MLX5_LAG_FLAG_NDEVS_READY if both netdevices are registered.
Doing so guarantees that both ldev->pf[MLX5_LAG_P0].dev and
ldev->pf[MLX5_LAG_P1].dev have valid pointers when
MLX5_LAG_FLAG_NDEVS_READY is set.
The core issue is asymmetry in setting MLX5_LAG_FLAG_NDEVS_READY and
clearing it. Setting it is done wrongly when both
ldev->pf[MLX5_LAG_P0].dev and ldev->pf[MLX5_LAG_P1].dev are set;
clearing it is done right when either of ldev->pf[i].netdev is cleared.
Consider the following scenario:
1. PF0 loads and sets ldev->pf[MLX5_LAG_P0].dev to a valid pointer
2. PF1 loads and sets both ldev->pf[MLX5_LAG_P1].dev and
ldev->pf[MLX5_LAG_P1].netdev with valid pointers. This results in
MLX5_LAG_FLAG_NDEVS_READY is set.
3. PF0 is unloaded before setting dev->pf[MLX5_LAG_P0].netdev.
MLX5_LAG_FLAG_NDEVS_READY remains set.
Further execution of mlx5_do_bond() will result in null pointer
dereference when calling mlx5_lag_is_multipath()
This patch fixes the following call trace actually encountered:
[ 1293.475195] BUG: kernel NULL pointer dereference, address: 00000000000009a8
[ 1293.478756] #PF: supervisor read access in kernel mode
[ 1293.481320] #PF: error_code(0x0000) - not-present page
[ 1293.483686] PGD 0 P4D 0
[ 1293.484434] Oops: 0000 [#1] SMP PTI
[ 1293.485377] CPU: 1 PID: 23690 Comm: kworker/u16:2 Not tainted 5.18.0-rc5_for_upstream_min_debug_2022_05_05_10_13 #1
[ 1293.488039] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 1293.490836] Workqueue: mlx5_lag mlx5_do_bond_work [mlx5_core]
[ 1293.492448] RIP: 0010:mlx5_lag_is_multipath+0x5/0x50 [mlx5_core]
[ 1293.494044] Code: e8 70 40 ff e0 48 8b 14 24 48 83 05 5c 1a 1b 00 01 e9 19 ff ff ff 48 83 05 47 1a 1b 00 01 eb d7 0f 1f 44 00 00 0f 1f 44 00 00 <48> 8b 87 a8 09 00 00 48 85 c0 74 26 48 83 05 a7 1b 1b 00 01 41 b8
[ 1293.498673] RSP: 0018:ffff88811b2fbe40 EFLAGS: 00010202
[ 1293.500152] RAX: ffff88818a94e1c0 RBX: ffff888165eca6c0 RCX: 0000000000000000
[ 1293.501841] RDX: 0000000000000001 RSI: ffff88818a94e1c0 RDI: 0000000000000000
[ 1293.503585] RBP: 0000000000000000 R08: ffff888119886740 R09: ffff888165eca73c
[ 1293.505286] R10: 0000000000000018 R11: 0000000000000018 R12: ffff88818a94e1c0
[ 1293.506979] R13: ffff888112729800 R14: 0000000000000000 R15: ffff888112729858
[ 1293.508753] FS: 0000000000000000(0000) GS:ffff88852cc40000(0000) knlGS:0000000000000000
[ 1293.510782] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1293.512265] CR2: 00000000000009a8 CR3: 00000001032d4002 CR4: 0000000000370ea0
[ 1293.514001] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1293.515806] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Fixes: 8a66e4585979 ("net/mlx5: Change ownership model for lag")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
index 0f34e3c80d1f..f67d29164962 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
@@ -1234,7 +1234,7 @@ void mlx5_lag_add_netdev(struct mlx5_core_dev *dev,
mlx5_ldev_add_netdev(ldev, dev, netdev);
for (i = 0; i < ldev->ports; i++)
- if (!ldev->pf[i].dev)
+ if (!ldev->pf[i].netdev)
break;
if (i >= ldev->ports)
--
2.37.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [net 03/13] net/mlx5: Eswitch, Fix forwarding decision to uplink
2022-08-22 19:59 [pull request][net 00/13] mlx5 fixes 2022-08-22 Saeed Mahameed
2022-08-22 19:59 ` [net 01/13] net/mlx5e: Properly disable vlan strip on non-UL reps Saeed Mahameed
2022-08-22 19:59 ` [net 02/13] net/mlx5: LAG, fix logic over MLX5_LAG_FLAG_NDEVS_READY Saeed Mahameed
@ 2022-08-22 19:59 ` Saeed Mahameed
2022-08-22 19:59 ` [net 04/13] net/mlx5: Disable irq when locking lag_lock Saeed Mahameed
` (9 subsequent siblings)
12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2022-08-22 19:59 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Eli Cohen, Maor Dickman
From: Eli Cohen <elic@nvidia.com>
Make sure to modify the rule for uplink forwarding only for the case
where destination vport number is MLX5_VPORT_UPLINK.
Fixes: 94db33177819 ("net/mlx5: Support multiport eswitch mode")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index ed73132129aa..10b0b260f02b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -427,7 +427,8 @@ esw_setup_vport_dest(struct mlx5_flow_destination *dest, struct mlx5_flow_act *f
dest[dest_idx].vport.vhca_id =
MLX5_CAP_GEN(esw_attr->dests[attr_idx].mdev, vhca_id);
dest[dest_idx].vport.flags |= MLX5_FLOW_DEST_VPORT_VHCA_ID;
- if (mlx5_lag_mpesw_is_activated(esw->dev))
+ if (dest[dest_idx].vport.num == MLX5_VPORT_UPLINK &&
+ mlx5_lag_mpesw_is_activated(esw->dev))
dest[dest_idx].type = MLX5_FLOW_DESTINATION_TYPE_UPLINK;
}
if (esw_attr->dests[attr_idx].flags & MLX5_ESW_DEST_ENCAP) {
--
2.37.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [net 04/13] net/mlx5: Disable irq when locking lag_lock
2022-08-22 19:59 [pull request][net 00/13] mlx5 fixes 2022-08-22 Saeed Mahameed
` (2 preceding siblings ...)
2022-08-22 19:59 ` [net 03/13] net/mlx5: Eswitch, Fix forwarding decision to uplink Saeed Mahameed
@ 2022-08-22 19:59 ` Saeed Mahameed
2022-08-22 19:59 ` [net 05/13] net/mlx5: Fix cmd error logging for manage pages cmd Saeed Mahameed
` (8 subsequent siblings)
12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2022-08-22 19:59 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Vlad Buslov, Mark Bloch
From: Vlad Buslov <vladbu@nvidia.com>
The lag_lock is taken from both process and softirq contexts which results
lockdep warning[0] about potential deadlock. However, just disabling
softirqs by using *_bh spinlock API is not enough since it will cause
warning in some contexts where the lock is obtained with hard irqs
disabled. To fix the issue save current irq state, disable them before
obtaining the lock an re-enable irqs from saved state after releasing it.
[0]:
[Sun Aug 7 13:12:29 2022] ================================
[Sun Aug 7 13:12:29 2022] WARNING: inconsistent lock state
[Sun Aug 7 13:12:29 2022] 5.19.0_for_upstream_debug_2022_08_04_16_06 #1 Not tainted
[Sun Aug 7 13:12:29 2022] --------------------------------
[Sun Aug 7 13:12:29 2022] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[Sun Aug 7 13:12:29 2022] swapper/0/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
[Sun Aug 7 13:12:29 2022] ffffffffa06dc0d8 (lag_lock){+.?.}-{2:2}, at: mlx5_lag_is_shared_fdb+0x1f/0x120 [mlx5_core]
[Sun Aug 7 13:12:29 2022] {SOFTIRQ-ON-W} state was registered at:
[Sun Aug 7 13:12:29 2022] lock_acquire+0x1c1/0x550
[Sun Aug 7 13:12:29 2022] _raw_spin_lock+0x2c/0x40
[Sun Aug 7 13:12:29 2022] mlx5_lag_add_netdev+0x13b/0x480 [mlx5_core]
[Sun Aug 7 13:12:29 2022] mlx5e_nic_enable+0x114/0x470 [mlx5_core]
[Sun Aug 7 13:12:29 2022] mlx5e_attach_netdev+0x30e/0x6a0 [mlx5_core]
[Sun Aug 7 13:12:29 2022] mlx5e_resume+0x105/0x160 [mlx5_core]
[Sun Aug 7 13:12:29 2022] mlx5e_probe+0xac3/0x14f0 [mlx5_core]
[Sun Aug 7 13:12:29 2022] auxiliary_bus_probe+0x9d/0xe0
[Sun Aug 7 13:12:29 2022] really_probe+0x1e0/0xaa0
[Sun Aug 7 13:12:29 2022] __driver_probe_device+0x219/0x480
[Sun Aug 7 13:12:29 2022] driver_probe_device+0x49/0x130
[Sun Aug 7 13:12:29 2022] __driver_attach+0x1e4/0x4d0
[Sun Aug 7 13:12:29 2022] bus_for_each_dev+0x11e/0x1a0
[Sun Aug 7 13:12:29 2022] bus_add_driver+0x3f4/0x5a0
[Sun Aug 7 13:12:29 2022] driver_register+0x20f/0x390
[Sun Aug 7 13:12:29 2022] __auxiliary_driver_register+0x14e/0x260
[Sun Aug 7 13:12:29 2022] mlx5e_init+0x38/0x90 [mlx5_core]
[Sun Aug 7 13:12:29 2022] vhost_iotlb_itree_augment_rotate+0xcb/0x180 [vhost_iotlb]
[Sun Aug 7 13:12:29 2022] do_one_initcall+0xc4/0x400
[Sun Aug 7 13:12:29 2022] do_init_module+0x18a/0x620
[Sun Aug 7 13:12:29 2022] load_module+0x563a/0x7040
[Sun Aug 7 13:12:29 2022] __do_sys_finit_module+0x122/0x1d0
[Sun Aug 7 13:12:29 2022] do_syscall_64+0x3d/0x90
[Sun Aug 7 13:12:29 2022] entry_SYSCALL_64_after_hwframe+0x46/0xb0
[Sun Aug 7 13:12:29 2022] irq event stamp: 3596508
[Sun Aug 7 13:12:29 2022] hardirqs last enabled at (3596508): [<ffffffff813687c2>] __local_bh_enable_ip+0xa2/0x100
[Sun Aug 7 13:12:29 2022] hardirqs last disabled at (3596507): [<ffffffff813687da>] __local_bh_enable_ip+0xba/0x100
[Sun Aug 7 13:12:29 2022] softirqs last enabled at (3596488): [<ffffffff81368a2a>] irq_exit_rcu+0x11a/0x170
[Sun Aug 7 13:12:29 2022] softirqs last disabled at (3596495): [<ffffffff81368a2a>] irq_exit_rcu+0x11a/0x170
[Sun Aug 7 13:12:29 2022]
other info that might help us debug this:
[Sun Aug 7 13:12:29 2022] Possible unsafe locking scenario:
[Sun Aug 7 13:12:29 2022] CPU0
[Sun Aug 7 13:12:29 2022] ----
[Sun Aug 7 13:12:29 2022] lock(lag_lock);
[Sun Aug 7 13:12:29 2022] <Interrupt>
[Sun Aug 7 13:12:29 2022] lock(lag_lock);
[Sun Aug 7 13:12:29 2022]
*** DEADLOCK ***
[Sun Aug 7 13:12:29 2022] 4 locks held by swapper/0/0:
[Sun Aug 7 13:12:29 2022] #0: ffffffff84643260 (rcu_read_lock){....}-{1:2}, at: mlx5e_napi_poll+0x43/0x20a0 [mlx5_core]
[Sun Aug 7 13:12:29 2022] #1: ffffffff84643260 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb_list_internal+0x2d7/0xd60
[Sun Aug 7 13:12:29 2022] #2: ffff888144a18b58 (&br->hash_lock){+.-.}-{2:2}, at: br_fdb_update+0x301/0x570
[Sun Aug 7 13:12:29 2022] #3: ffffffff84643260 (rcu_read_lock){....}-{1:2}, at: atomic_notifier_call_chain+0x5/0x1d0
[Sun Aug 7 13:12:29 2022]
stack backtrace:
[Sun Aug 7 13:12:29 2022] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0_for_upstream_debug_2022_08_04_16_06 #1
[Sun Aug 7 13:12:29 2022] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[Sun Aug 7 13:12:29 2022] Call Trace:
[Sun Aug 7 13:12:29 2022] <IRQ>
[Sun Aug 7 13:12:29 2022] dump_stack_lvl+0x57/0x7d
[Sun Aug 7 13:12:29 2022] mark_lock.part.0.cold+0x5f/0x92
[Sun Aug 7 13:12:29 2022] ? lock_chain_count+0x20/0x20
[Sun Aug 7 13:12:29 2022] ? unwind_next_frame+0x1c4/0x1b50
[Sun Aug 7 13:12:29 2022] ? secondary_startup_64_no_verify+0xcd/0xdb
[Sun Aug 7 13:12:29 2022] ? mlx5e_napi_poll+0x4e9/0x20a0 [mlx5_core]
[Sun Aug 7 13:12:29 2022] ? mlx5e_napi_poll+0x4e9/0x20a0 [mlx5_core]
[Sun Aug 7 13:12:29 2022] ? stack_access_ok+0x1d0/0x1d0
[Sun Aug 7 13:12:29 2022] ? start_kernel+0x3a7/0x3c5
[Sun Aug 7 13:12:29 2022] __lock_acquire+0x1260/0x6720
[Sun Aug 7 13:12:29 2022] ? lock_chain_count+0x20/0x20
[Sun Aug 7 13:12:29 2022] ? lock_chain_count+0x20/0x20
[Sun Aug 7 13:12:29 2022] ? register_lock_class+0x1880/0x1880
[Sun Aug 7 13:12:29 2022] ? mark_lock.part.0+0xed/0x3060
[Sun Aug 7 13:12:29 2022] ? stack_trace_save+0x91/0xc0
[Sun Aug 7 13:12:29 2022] lock_acquire+0x1c1/0x550
[Sun Aug 7 13:12:29 2022] ? mlx5_lag_is_shared_fdb+0x1f/0x120 [mlx5_core]
[Sun Aug 7 13:12:29 2022] ? lockdep_hardirqs_on_prepare+0x400/0x400
[Sun Aug 7 13:12:29 2022] ? __lock_acquire+0xd6f/0x6720
[Sun Aug 7 13:12:29 2022] _raw_spin_lock+0x2c/0x40
[Sun Aug 7 13:12:29 2022] ? mlx5_lag_is_shared_fdb+0x1f/0x120 [mlx5_core]
[Sun Aug 7 13:12:29 2022] mlx5_lag_is_shared_fdb+0x1f/0x120 [mlx5_core]
[Sun Aug 7 13:12:29 2022] mlx5_esw_bridge_rep_vport_num_vhca_id_get+0x1a0/0x600 [mlx5_core]
[Sun Aug 7 13:12:29 2022] ? mlx5_esw_bridge_update_work+0x90/0x90 [mlx5_core]
[Sun Aug 7 13:12:29 2022] ? lock_acquire+0x1c1/0x550
[Sun Aug 7 13:12:29 2022] mlx5_esw_bridge_switchdev_event+0x185/0x8f0 [mlx5_core]
[Sun Aug 7 13:12:29 2022] ? mlx5_esw_bridge_port_obj_attr_set+0x3e0/0x3e0 [mlx5_core]
[Sun Aug 7 13:12:29 2022] ? check_chain_key+0x24a/0x580
[Sun Aug 7 13:12:29 2022] atomic_notifier_call_chain+0xd7/0x1d0
[Sun Aug 7 13:12:29 2022] br_switchdev_fdb_notify+0xea/0x100
[Sun Aug 7 13:12:29 2022] ? br_switchdev_set_port_flag+0x310/0x310
[Sun Aug 7 13:12:29 2022] fdb_notify+0x11b/0x150
[Sun Aug 7 13:12:29 2022] br_fdb_update+0x34c/0x570
[Sun Aug 7 13:12:29 2022] ? lock_chain_count+0x20/0x20
[Sun Aug 7 13:12:29 2022] ? br_fdb_add_local+0x50/0x50
[Sun Aug 7 13:12:29 2022] ? br_allowed_ingress+0x5f/0x1070
[Sun Aug 7 13:12:29 2022] ? check_chain_key+0x24a/0x580
[Sun Aug 7 13:12:29 2022] br_handle_frame_finish+0x786/0x18e0
[Sun Aug 7 13:12:29 2022] ? check_chain_key+0x24a/0x580
[Sun Aug 7 13:12:29 2022] ? br_handle_local_finish+0x20/0x20
[Sun Aug 7 13:12:29 2022] ? __lock_acquire+0xd6f/0x6720
[Sun Aug 7 13:12:29 2022] ? sctp_inet_bind_verify+0x4d/0x190
[Sun Aug 7 13:12:29 2022] ? xlog_unpack_data+0x2e0/0x310
[Sun Aug 7 13:12:29 2022] ? br_handle_local_finish+0x20/0x20
[Sun Aug 7 13:12:29 2022] br_nf_hook_thresh+0x227/0x380 [br_netfilter]
[Sun Aug 7 13:12:29 2022] ? setup_pre_routing+0x460/0x460 [br_netfilter]
[Sun Aug 7 13:12:29 2022] ? br_handle_local_finish+0x20/0x20
[Sun Aug 7 13:12:29 2022] ? br_nf_pre_routing_ipv6+0x48b/0x69c [br_netfilter]
[Sun Aug 7 13:12:29 2022] br_nf_pre_routing_finish_ipv6+0x5c2/0xbf0 [br_netfilter]
[Sun Aug 7 13:12:29 2022] ? br_handle_local_finish+0x20/0x20
[Sun Aug 7 13:12:29 2022] br_nf_pre_routing_ipv6+0x4c6/0x69c [br_netfilter]
[Sun Aug 7 13:12:29 2022] ? br_validate_ipv6+0x9e0/0x9e0 [br_netfilter]
[Sun Aug 7 13:12:29 2022] ? br_nf_forward_arp+0xb70/0xb70 [br_netfilter]
[Sun Aug 7 13:12:29 2022] ? br_nf_pre_routing+0xacf/0x1160 [br_netfilter]
[Sun Aug 7 13:12:29 2022] br_handle_frame+0x8a9/0x1270
[Sun Aug 7 13:12:29 2022] ? br_handle_frame_finish+0x18e0/0x18e0
[Sun Aug 7 13:12:29 2022] ? register_lock_class+0x1880/0x1880
[Sun Aug 7 13:12:29 2022] ? br_handle_local_finish+0x20/0x20
[Sun Aug 7 13:12:29 2022] ? bond_handle_frame+0xf9/0xac0 [bonding]
[Sun Aug 7 13:12:29 2022] ? br_handle_frame_finish+0x18e0/0x18e0
[Sun Aug 7 13:12:29 2022] __netif_receive_skb_core+0x7c0/0x2c70
[Sun Aug 7 13:12:29 2022] ? check_chain_key+0x24a/0x580
[Sun Aug 7 13:12:29 2022] ? generic_xdp_tx+0x5b0/0x5b0
[Sun Aug 7 13:12:29 2022] ? __lock_acquire+0xd6f/0x6720
[Sun Aug 7 13:12:29 2022] ? register_lock_class+0x1880/0x1880
[Sun Aug 7 13:12:29 2022] ? check_chain_key+0x24a/0x580
[Sun Aug 7 13:12:29 2022] __netif_receive_skb_list_core+0x2d7/0x8a0
[Sun Aug 7 13:12:29 2022] ? lock_acquire+0x1c1/0x550
[Sun Aug 7 13:12:29 2022] ? process_backlog+0x960/0x960
[Sun Aug 7 13:12:29 2022] ? lockdep_hardirqs_on_prepare+0x129/0x400
[Sun Aug 7 13:12:29 2022] ? kvm_clock_get_cycles+0x14/0x20
[Sun Aug 7 13:12:29 2022] netif_receive_skb_list_internal+0x5f4/0xd60
[Sun Aug 7 13:12:29 2022] ? do_xdp_generic+0x150/0x150
[Sun Aug 7 13:12:29 2022] ? mlx5e_poll_rx_cq+0xf6b/0x2960 [mlx5_core]
[Sun Aug 7 13:12:29 2022] ? mlx5e_poll_ico_cq+0x3d/0x1590 [mlx5_core]
[Sun Aug 7 13:12:29 2022] napi_complete_done+0x188/0x710
[Sun Aug 7 13:12:29 2022] mlx5e_napi_poll+0x4e9/0x20a0 [mlx5_core]
[Sun Aug 7 13:12:29 2022] ? __queue_work+0x53c/0xeb0
[Sun Aug 7 13:12:29 2022] __napi_poll+0x9f/0x540
[Sun Aug 7 13:12:29 2022] net_rx_action+0x420/0xb70
[Sun Aug 7 13:12:29 2022] ? napi_threaded_poll+0x470/0x470
[Sun Aug 7 13:12:29 2022] ? __common_interrupt+0x79/0x1a0
[Sun Aug 7 13:12:29 2022] __do_softirq+0x271/0x92c
[Sun Aug 7 13:12:29 2022] irq_exit_rcu+0x11a/0x170
[Sun Aug 7 13:12:29 2022] common_interrupt+0x7d/0xa0
[Sun Aug 7 13:12:29 2022] </IRQ>
[Sun Aug 7 13:12:29 2022] <TASK>
[Sun Aug 7 13:12:29 2022] asm_common_interrupt+0x22/0x40
[Sun Aug 7 13:12:29 2022] RIP: 0010:default_idle+0x42/0x60
[Sun Aug 7 13:12:29 2022] Code: c1 83 e0 07 48 c1 e9 03 83 c0 03 0f b6 14 11 38 d0 7c 04 84 d2 75 14 8b 05 6b f1 22 02 85 c0 7e 07 0f 00 2d 80 3b 4a 00 fb f4 <c3> 48 c7 c7 e0 07 7e 85 e8 21 bd 40 fe eb de 66 66 2e 0f 1f 84 00
[Sun Aug 7 13:12:29 2022] RSP: 0018:ffffffff84407e18 EFLAGS: 00000242
[Sun Aug 7 13:12:29 2022] RAX: 0000000000000001 RBX: ffffffff84ec4a68 RCX: 1ffffffff0afc0fc
[Sun Aug 7 13:12:29 2022] RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffff835b1fac
[Sun Aug 7 13:12:29 2022] RBP: 0000000000000000 R08: 0000000000000001 R09: ffff8884d2c44ac3
[Sun Aug 7 13:12:29 2022] R10: ffffed109a588958 R11: 00000000ffffffff R12: 0000000000000000
[Sun Aug 7 13:12:29 2022] R13: ffffffff84efac20 R14: 0000000000000000 R15: dffffc0000000000
[Sun Aug 7 13:12:29 2022] ? default_idle_call+0xcc/0x460
[Sun Aug 7 13:12:29 2022] default_idle_call+0xec/0x460
[Sun Aug 7 13:12:29 2022] do_idle+0x394/0x450
[Sun Aug 7 13:12:29 2022] ? arch_cpu_idle_exit+0x40/0x40
[Sun Aug 7 13:12:29 2022] cpu_startup_entry+0x19/0x20
[Sun Aug 7 13:12:29 2022] rest_init+0x156/0x250
[Sun Aug 7 13:12:29 2022] arch_call_rest_init+0xf/0x15
[Sun Aug 7 13:12:29 2022] start_kernel+0x3a7/0x3c5
[Sun Aug 7 13:12:29 2022] secondary_startup_64_no_verify+0xcd/0xdb
[Sun Aug 7 13:12:29 2022] </TASK>
Fixes: ff9b7521468b ("net/mlx5: Bridge, support LAG")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../net/ethernet/mellanox/mlx5/core/lag/lag.c | 55 +++++++++++--------
1 file changed, 33 insertions(+), 22 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
index f67d29164962..065102278cb8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
@@ -1067,30 +1067,32 @@ static void mlx5_ldev_add_netdev(struct mlx5_lag *ldev,
struct net_device *netdev)
{
unsigned int fn = mlx5_get_dev_index(dev);
+ unsigned long flags;
if (fn >= ldev->ports)
return;
- spin_lock(&lag_lock);
+ spin_lock_irqsave(&lag_lock, flags);
ldev->pf[fn].netdev = netdev;
ldev->tracker.netdev_state[fn].link_up = 0;
ldev->tracker.netdev_state[fn].tx_enabled = 0;
- spin_unlock(&lag_lock);
+ spin_unlock_irqrestore(&lag_lock, flags);
}
static void mlx5_ldev_remove_netdev(struct mlx5_lag *ldev,
struct net_device *netdev)
{
+ unsigned long flags;
int i;
- spin_lock(&lag_lock);
+ spin_lock_irqsave(&lag_lock, flags);
for (i = 0; i < ldev->ports; i++) {
if (ldev->pf[i].netdev == netdev) {
ldev->pf[i].netdev = NULL;
break;
}
}
- spin_unlock(&lag_lock);
+ spin_unlock_irqrestore(&lag_lock, flags);
}
static void mlx5_ldev_add_mdev(struct mlx5_lag *ldev,
@@ -1246,12 +1248,13 @@ void mlx5_lag_add_netdev(struct mlx5_core_dev *dev,
bool mlx5_lag_is_roce(struct mlx5_core_dev *dev)
{
struct mlx5_lag *ldev;
+ unsigned long flags;
bool res;
- spin_lock(&lag_lock);
+ spin_lock_irqsave(&lag_lock, flags);
ldev = mlx5_lag_dev(dev);
res = ldev && __mlx5_lag_is_roce(ldev);
- spin_unlock(&lag_lock);
+ spin_unlock_irqrestore(&lag_lock, flags);
return res;
}
@@ -1260,12 +1263,13 @@ EXPORT_SYMBOL(mlx5_lag_is_roce);
bool mlx5_lag_is_active(struct mlx5_core_dev *dev)
{
struct mlx5_lag *ldev;
+ unsigned long flags;
bool res;
- spin_lock(&lag_lock);
+ spin_lock_irqsave(&lag_lock, flags);
ldev = mlx5_lag_dev(dev);
res = ldev && __mlx5_lag_is_active(ldev);
- spin_unlock(&lag_lock);
+ spin_unlock_irqrestore(&lag_lock, flags);
return res;
}
@@ -1274,13 +1278,14 @@ EXPORT_SYMBOL(mlx5_lag_is_active);
bool mlx5_lag_is_master(struct mlx5_core_dev *dev)
{
struct mlx5_lag *ldev;
+ unsigned long flags;
bool res;
- spin_lock(&lag_lock);
+ spin_lock_irqsave(&lag_lock, flags);
ldev = mlx5_lag_dev(dev);
res = ldev && __mlx5_lag_is_active(ldev) &&
dev == ldev->pf[MLX5_LAG_P1].dev;
- spin_unlock(&lag_lock);
+ spin_unlock_irqrestore(&lag_lock, flags);
return res;
}
@@ -1289,12 +1294,13 @@ EXPORT_SYMBOL(mlx5_lag_is_master);
bool mlx5_lag_is_sriov(struct mlx5_core_dev *dev)
{
struct mlx5_lag *ldev;
+ unsigned long flags;
bool res;
- spin_lock(&lag_lock);
+ spin_lock_irqsave(&lag_lock, flags);
ldev = mlx5_lag_dev(dev);
res = ldev && __mlx5_lag_is_sriov(ldev);
- spin_unlock(&lag_lock);
+ spin_unlock_irqrestore(&lag_lock, flags);
return res;
}
@@ -1303,13 +1309,14 @@ EXPORT_SYMBOL(mlx5_lag_is_sriov);
bool mlx5_lag_is_shared_fdb(struct mlx5_core_dev *dev)
{
struct mlx5_lag *ldev;
+ unsigned long flags;
bool res;
- spin_lock(&lag_lock);
+ spin_lock_irqsave(&lag_lock, flags);
ldev = mlx5_lag_dev(dev);
res = ldev && __mlx5_lag_is_sriov(ldev) &&
test_bit(MLX5_LAG_MODE_FLAG_SHARED_FDB, &ldev->mode_flags);
- spin_unlock(&lag_lock);
+ spin_unlock_irqrestore(&lag_lock, flags);
return res;
}
@@ -1352,9 +1359,10 @@ struct net_device *mlx5_lag_get_roce_netdev(struct mlx5_core_dev *dev)
{
struct net_device *ndev = NULL;
struct mlx5_lag *ldev;
+ unsigned long flags;
int i;
- spin_lock(&lag_lock);
+ spin_lock_irqsave(&lag_lock, flags);
ldev = mlx5_lag_dev(dev);
if (!(ldev && __mlx5_lag_is_roce(ldev)))
@@ -1373,7 +1381,7 @@ struct net_device *mlx5_lag_get_roce_netdev(struct mlx5_core_dev *dev)
dev_hold(ndev);
unlock:
- spin_unlock(&lag_lock);
+ spin_unlock_irqrestore(&lag_lock, flags);
return ndev;
}
@@ -1383,10 +1391,11 @@ u8 mlx5_lag_get_slave_port(struct mlx5_core_dev *dev,
struct net_device *slave)
{
struct mlx5_lag *ldev;
+ unsigned long flags;
u8 port = 0;
int i;
- spin_lock(&lag_lock);
+ spin_lock_irqsave(&lag_lock, flags);
ldev = mlx5_lag_dev(dev);
if (!(ldev && __mlx5_lag_is_roce(ldev)))
goto unlock;
@@ -1401,7 +1410,7 @@ u8 mlx5_lag_get_slave_port(struct mlx5_core_dev *dev,
port = ldev->v2p_map[port * ldev->buckets];
unlock:
- spin_unlock(&lag_lock);
+ spin_unlock_irqrestore(&lag_lock, flags);
return port;
}
EXPORT_SYMBOL(mlx5_lag_get_slave_port);
@@ -1422,8 +1431,9 @@ struct mlx5_core_dev *mlx5_lag_get_peer_mdev(struct mlx5_core_dev *dev)
{
struct mlx5_core_dev *peer_dev = NULL;
struct mlx5_lag *ldev;
+ unsigned long flags;
- spin_lock(&lag_lock);
+ spin_lock_irqsave(&lag_lock, flags);
ldev = mlx5_lag_dev(dev);
if (!ldev)
goto unlock;
@@ -1433,7 +1443,7 @@ struct mlx5_core_dev *mlx5_lag_get_peer_mdev(struct mlx5_core_dev *dev)
ldev->pf[MLX5_LAG_P1].dev;
unlock:
- spin_unlock(&lag_lock);
+ spin_unlock_irqrestore(&lag_lock, flags);
return peer_dev;
}
EXPORT_SYMBOL(mlx5_lag_get_peer_mdev);
@@ -1446,6 +1456,7 @@ int mlx5_lag_query_cong_counters(struct mlx5_core_dev *dev,
int outlen = MLX5_ST_SZ_BYTES(query_cong_statistics_out);
struct mlx5_core_dev **mdev;
struct mlx5_lag *ldev;
+ unsigned long flags;
int num_ports;
int ret, i, j;
void *out;
@@ -1462,7 +1473,7 @@ int mlx5_lag_query_cong_counters(struct mlx5_core_dev *dev,
memset(values, 0, sizeof(*values) * num_counters);
- spin_lock(&lag_lock);
+ spin_lock_irqsave(&lag_lock, flags);
ldev = mlx5_lag_dev(dev);
if (ldev && __mlx5_lag_is_active(ldev)) {
num_ports = ldev->ports;
@@ -1472,7 +1483,7 @@ int mlx5_lag_query_cong_counters(struct mlx5_core_dev *dev,
num_ports = 1;
mdev[MLX5_LAG_P1] = dev;
}
- spin_unlock(&lag_lock);
+ spin_unlock_irqrestore(&lag_lock, flags);
for (i = 0; i < num_ports; ++i) {
u32 in[MLX5_ST_SZ_DW(query_cong_statistics_in)] = {};
--
2.37.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [net 05/13] net/mlx5: Fix cmd error logging for manage pages cmd
2022-08-22 19:59 [pull request][net 00/13] mlx5 fixes 2022-08-22 Saeed Mahameed
` (3 preceding siblings ...)
2022-08-22 19:59 ` [net 04/13] net/mlx5: Disable irq when locking lag_lock Saeed Mahameed
@ 2022-08-22 19:59 ` Saeed Mahameed
2022-08-22 19:59 ` [net 06/13] net/mlx5: Avoid false positive lockdep warning by adding lock_class_key Saeed Mahameed
` (7 subsequent siblings)
12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2022-08-22 19:59 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Roy Novich, Moshe Shemesh
From: Roy Novich <royno@nvidia.com>
When the driver unloads, give/reclaim_pages may fail as PF driver in
teardown flow, current code will lead to the following kernel log print
'failed reclaiming pages: err 0'.
Fix it to get same behavior as before the cited commits,
by calling mlx5_cmd_check before handling error state.
mlx5_cmd_check will verify if the returned error is an actual error
needed to be handled by the driver or not and will return an
appropriate value.
Fixes: 8d564292a166 ("net/mlx5: Remove redundant error on reclaim pages")
Fixes: 4dac2f10ada0 ("net/mlx5: Remove redundant notify fail on give pages")
Signed-off-by: Roy Novich <royno@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c b/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
index ec76a8b1acc1..60596357bfc7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
@@ -376,8 +376,8 @@ static int give_pages(struct mlx5_core_dev *dev, u16 func_id, int npages,
goto out_dropped;
}
}
+ err = mlx5_cmd_check(dev, err, in, out);
if (err) {
- err = mlx5_cmd_check(dev, err, in, out);
mlx5_core_warn(dev, "func_id 0x%x, npages %d, err %d\n",
func_id, npages, err);
goto out_dropped;
@@ -524,10 +524,13 @@ static int reclaim_pages(struct mlx5_core_dev *dev, u16 func_id, int npages,
dev->priv.reclaim_pages_discard += npages;
}
/* if triggered by FW event and failed by FW then ignore */
- if (event && err == -EREMOTEIO)
+ if (event && err == -EREMOTEIO) {
err = 0;
+ goto out_free;
+ }
+
+ err = mlx5_cmd_check(dev, err, in, out);
if (err) {
- err = mlx5_cmd_check(dev, err, in, out);
mlx5_core_err(dev, "failed reclaiming pages: err %d\n", err);
goto out_free;
}
--
2.37.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [net 06/13] net/mlx5: Avoid false positive lockdep warning by adding lock_class_key
2022-08-22 19:59 [pull request][net 00/13] mlx5 fixes 2022-08-22 Saeed Mahameed
` (4 preceding siblings ...)
2022-08-22 19:59 ` [net 05/13] net/mlx5: Fix cmd error logging for manage pages cmd Saeed Mahameed
@ 2022-08-22 19:59 ` Saeed Mahameed
2022-08-22 19:59 ` [net 07/13] net/mlx5e: Fix wrong application of the LRO state Saeed Mahameed
` (6 subsequent siblings)
12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2022-08-22 19:59 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Moshe Shemesh, Shay Drory
From: Moshe Shemesh <moshe@nvidia.com>
Add a lock_class_key per mlx5 device to avoid a false positive
"possible circular locking dependency" warning by lockdep, on flows
which lock more than one mlx5 device, such as adding SF.
kernel log:
======================================================
WARNING: possible circular locking dependency detected
5.19.0-rc8+ #2 Not tainted
------------------------------------------------------
kworker/u20:0/8 is trying to acquire lock:
ffff88812dfe0d98 (&dev->intf_state_mutex){+.+.}-{3:3}, at: mlx5_init_one+0x2e/0x490 [mlx5_core]
but task is already holding lock:
ffff888101aa7898 (&(¬ifier->n_head)->rwsem){++++}-{3:3}, at: blocking_notifier_call_chain+0x5a/0x130
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&(¬ifier->n_head)->rwsem){++++}-{3:3}:
down_write+0x90/0x150
blocking_notifier_chain_register+0x53/0xa0
mlx5_sf_table_init+0x369/0x4a0 [mlx5_core]
mlx5_init_one+0x261/0x490 [mlx5_core]
probe_one+0x430/0x680 [mlx5_core]
local_pci_probe+0xd6/0x170
work_for_cpu_fn+0x4e/0xa0
process_one_work+0x7c2/0x1340
worker_thread+0x6f6/0xec0
kthread+0x28f/0x330
ret_from_fork+0x1f/0x30
-> #0 (&dev->intf_state_mutex){+.+.}-{3:3}:
__lock_acquire+0x2fc7/0x6720
lock_acquire+0x1c1/0x550
__mutex_lock+0x12c/0x14b0
mlx5_init_one+0x2e/0x490 [mlx5_core]
mlx5_sf_dev_probe+0x29c/0x370 [mlx5_core]
auxiliary_bus_probe+0x9d/0xe0
really_probe+0x1e0/0xaa0
__driver_probe_device+0x219/0x480
driver_probe_device+0x49/0x130
__device_attach_driver+0x1b8/0x280
bus_for_each_drv+0x123/0x1a0
__device_attach+0x1a3/0x460
bus_probe_device+0x1a2/0x260
device_add+0x9b1/0x1b40
__auxiliary_device_add+0x88/0xc0
mlx5_sf_dev_state_change_handler+0x67e/0x9d0 [mlx5_core]
blocking_notifier_call_chain+0xd5/0x130
mlx5_vhca_state_work_handler+0x2b0/0x3f0 [mlx5_core]
process_one_work+0x7c2/0x1340
worker_thread+0x59d/0xec0
kthread+0x28f/0x330
ret_from_fork+0x1f/0x30
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&(¬ifier->n_head)->rwsem);
lock(&dev->intf_state_mutex);
lock(&(¬ifier->n_head)->rwsem);
lock(&dev->intf_state_mutex);
*** DEADLOCK ***
4 locks held by kworker/u20:0/8:
#0: ffff888150612938 ((wq_completion)mlx5_events){+.+.}-{0:0}, at: process_one_work+0x6e2/0x1340
#1: ffff888100cafdb8 ((work_completion)(&work->work)#3){+.+.}-{0:0}, at: process_one_work+0x70f/0x1340
#2: ffff888101aa7898 (&(¬ifier->n_head)->rwsem){++++}-{3:3}, at: blocking_notifier_call_chain+0x5a/0x130
#3: ffff88813682d0e8 (&dev->mutex){....}-{3:3}, at:__device_attach+0x76/0x460
stack backtrace:
CPU: 6 PID: 8 Comm: kworker/u20:0 Not tainted 5.19.0-rc8+
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Workqueue: mlx5_events mlx5_vhca_state_work_handler [mlx5_core]
Call Trace:
<TASK>
dump_stack_lvl+0x57/0x7d
check_noncircular+0x278/0x300
? print_circular_bug+0x460/0x460
? lock_chain_count+0x20/0x20
? register_lock_class+0x1880/0x1880
__lock_acquire+0x2fc7/0x6720
? register_lock_class+0x1880/0x1880
? register_lock_class+0x1880/0x1880
lock_acquire+0x1c1/0x550
? mlx5_init_one+0x2e/0x490 [mlx5_core]
? lockdep_hardirqs_on_prepare+0x400/0x400
__mutex_lock+0x12c/0x14b0
? mlx5_init_one+0x2e/0x490 [mlx5_core]
? mlx5_init_one+0x2e/0x490 [mlx5_core]
? _raw_read_unlock+0x1f/0x30
? mutex_lock_io_nested+0x1320/0x1320
? __ioremap_caller.constprop.0+0x306/0x490
? mlx5_sf_dev_probe+0x269/0x370 [mlx5_core]
? iounmap+0x160/0x160
mlx5_init_one+0x2e/0x490 [mlx5_core]
mlx5_sf_dev_probe+0x29c/0x370 [mlx5_core]
? mlx5_sf_dev_remove+0x130/0x130 [mlx5_core]
auxiliary_bus_probe+0x9d/0xe0
really_probe+0x1e0/0xaa0
__driver_probe_device+0x219/0x480
? auxiliary_match_id+0xe9/0x140
driver_probe_device+0x49/0x130
__device_attach_driver+0x1b8/0x280
? driver_allows_async_probing+0x140/0x140
bus_for_each_drv+0x123/0x1a0
? bus_for_each_dev+0x1a0/0x1a0
? lockdep_hardirqs_on_prepare+0x286/0x400
? trace_hardirqs_on+0x2d/0x100
__device_attach+0x1a3/0x460
? device_driver_attach+0x1e0/0x1e0
? kobject_uevent_env+0x22d/0xf10
bus_probe_device+0x1a2/0x260
device_add+0x9b1/0x1b40
? dev_set_name+0xab/0xe0
? __fw_devlink_link_to_suppliers+0x260/0x260
? memset+0x20/0x40
? lockdep_init_map_type+0x21a/0x7d0
__auxiliary_device_add+0x88/0xc0
? auxiliary_device_init+0x86/0xa0
mlx5_sf_dev_state_change_handler+0x67e/0x9d0 [mlx5_core]
blocking_notifier_call_chain+0xd5/0x130
mlx5_vhca_state_work_handler+0x2b0/0x3f0 [mlx5_core]
? mlx5_vhca_event_arm+0x100/0x100 [mlx5_core]
? lock_downgrade+0x6e0/0x6e0
? lockdep_hardirqs_on_prepare+0x286/0x400
process_one_work+0x7c2/0x1340
? lockdep_hardirqs_on_prepare+0x400/0x400
? pwq_dec_nr_in_flight+0x230/0x230
? rwlock_bug.part.0+0x90/0x90
worker_thread+0x59d/0xec0
? process_one_work+0x1340/0x1340
kthread+0x28f/0x330
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x1f/0x30
</TASK>
Fixes: 6a3273217469 ("net/mlx5: SF, Port function state change support")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/main.c | 4 ++++
include/linux/mlx5/driver.h | 1 +
2 files changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index bec8d6d0b5f6..c085b031abfc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1530,7 +1530,9 @@ int mlx5_mdev_init(struct mlx5_core_dev *dev, int profile_idx)
memcpy(&dev->profile, &profile[profile_idx], sizeof(dev->profile));
INIT_LIST_HEAD(&priv->ctx_list);
spin_lock_init(&priv->ctx_lock);
+ lockdep_register_key(&dev->lock_key);
mutex_init(&dev->intf_state_mutex);
+ lockdep_set_class(&dev->intf_state_mutex, &dev->lock_key);
mutex_init(&priv->bfregs.reg_head.lock);
mutex_init(&priv->bfregs.wc_head.lock);
@@ -1597,6 +1599,7 @@ int mlx5_mdev_init(struct mlx5_core_dev *dev, int profile_idx)
mutex_destroy(&priv->bfregs.wc_head.lock);
mutex_destroy(&priv->bfregs.reg_head.lock);
mutex_destroy(&dev->intf_state_mutex);
+ lockdep_unregister_key(&dev->lock_key);
return err;
}
@@ -1618,6 +1621,7 @@ void mlx5_mdev_uninit(struct mlx5_core_dev *dev)
mutex_destroy(&priv->bfregs.wc_head.lock);
mutex_destroy(&priv->bfregs.reg_head.lock);
mutex_destroy(&dev->intf_state_mutex);
+ lockdep_unregister_key(&dev->lock_key);
}
static int probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 96b16fbe1aa4..7b7ce602c808 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -779,6 +779,7 @@ struct mlx5_core_dev {
enum mlx5_device_state state;
/* sync interface state */
struct mutex intf_state_mutex;
+ struct lock_class_key lock_key;
unsigned long intf_state;
struct mlx5_priv priv;
struct mlx5_profile profile;
--
2.37.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [net 07/13] net/mlx5e: Fix wrong application of the LRO state
2022-08-22 19:59 [pull request][net 00/13] mlx5 fixes 2022-08-22 Saeed Mahameed
` (5 preceding siblings ...)
2022-08-22 19:59 ` [net 06/13] net/mlx5: Avoid false positive lockdep warning by adding lock_class_key Saeed Mahameed
@ 2022-08-22 19:59 ` Saeed Mahameed
2022-08-22 19:59 ` [net 08/13] net/mlx5e: TC, Add missing policer validation Saeed Mahameed
` (5 subsequent siblings)
12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2022-08-22 19:59 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Aya Levin, Tariq Toukan, Maxim Mikityanskiy
From: Aya Levin <ayal@nvidia.com>
Driver caches packet merge type in mlx5e_params instance which must be
in perfect sync with the netdev_feature's bit.
Prior to this patch, in certain conditions (*) LRO state was set in
mlx5e_params, while netdev_feature's bit was off. Causing the LRO to
be applied on the RQs (HW level).
(*) This can happen only on profile init (mlx5e_build_nic_params()),
when RQ expect non-linear SKB and PCI is fast enough in comparison to
link width.
Solution: remove setting of packet merge type from
mlx5e_build_nic_params() as netdev features are not updated.
Fixes: 619a8f2a42f1 ("net/mlx5e: Use linear SKB in Striding RQ")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 8 --------
1 file changed, 8 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index d858667736a3..c65b6a2883d3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4769,14 +4769,6 @@ void mlx5e_build_nic_params(struct mlx5e_priv *priv, struct mlx5e_xsk *xsk, u16
/* RQ */
mlx5e_build_rq_params(mdev, params);
- /* HW LRO */
- if (MLX5_CAP_ETH(mdev, lro_cap) &&
- params->rq_wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ) {
- /* No XSK params: checking the availability of striding RQ in general. */
- if (!mlx5e_rx_mpwqe_is_linear_skb(mdev, params, NULL))
- params->packet_merge.type = slow_pci_heuristic(mdev) ?
- MLX5E_PACKET_MERGE_NONE : MLX5E_PACKET_MERGE_LRO;
- }
params->packet_merge.timeout = mlx5e_choose_lro_timeout(mdev, MLX5E_DEFAULT_LRO_TIMEOUT);
/* CQ moderation params */
--
2.37.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [net 08/13] net/mlx5e: TC, Add missing policer validation
2022-08-22 19:59 [pull request][net 00/13] mlx5 fixes 2022-08-22 Saeed Mahameed
` (6 preceding siblings ...)
2022-08-22 19:59 ` [net 07/13] net/mlx5e: Fix wrong application of the LRO state Saeed Mahameed
@ 2022-08-22 19:59 ` Saeed Mahameed
2022-08-22 19:59 ` [net 09/13] net/mlx5e: Fix wrong tc flag used when set hw-tc-offload off Saeed Mahameed
` (4 subsequent siblings)
12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2022-08-22 19:59 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Roi Dayan, Maor Dickman
From: Roi Dayan <roid@nvidia.com>
There is a missing policer validation when offloading police action
with tc action api. Add it.
Fixes: 7d1a5ce46e47 ("net/mlx5e: TC, Support tc action api for police")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/police.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/police.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/police.c
index 37522352e4b2..c8e5ca65bb6e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/police.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/police.c
@@ -79,6 +79,10 @@ tc_act_police_offload(struct mlx5e_priv *priv,
struct mlx5e_flow_meter_handle *meter;
int err = 0;
+ err = mlx5e_policer_validate(&fl_act->action, act, fl_act->extack);
+ if (err)
+ return err;
+
err = fill_meter_params_from_act(act, ¶ms);
if (err)
return err;
--
2.37.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [net 09/13] net/mlx5e: Fix wrong tc flag used when set hw-tc-offload off
2022-08-22 19:59 [pull request][net 00/13] mlx5 fixes 2022-08-22 Saeed Mahameed
` (7 preceding siblings ...)
2022-08-22 19:59 ` [net 08/13] net/mlx5e: TC, Add missing policer validation Saeed Mahameed
@ 2022-08-22 19:59 ` Saeed Mahameed
2022-08-22 19:59 ` [net 10/13] net/mlx5: unlock on error path in esw_vfs_changed_event_handler() Saeed Mahameed
` (3 subsequent siblings)
12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2022-08-22 19:59 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Maor Dickman
From: Maor Dickman <maord@nvidia.com>
The cited commit reintroduced the ability to set hw-tc-offload
in switchdev mode by reusing NIC mode calls without modifying it
to support both modes, this can cause an illegal memory access
when trying to turn hw-tc-offload off.
Fix this by using the right TC_FLAG when checking if tc rules
are installed while disabling hw-tc-offload.
Fixes: d3cbd4254df8 ("net/mlx5e: Add ndo_set_feature for uplink representor")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index c65b6a2883d3..02eb2f0fa2ae 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3682,7 +3682,9 @@ static int set_feature_hw_tc(struct net_device *netdev, bool enable)
int err = 0;
#if IS_ENABLED(CONFIG_MLX5_CLS_ACT)
- if (!enable && mlx5e_tc_num_filters(priv, MLX5_TC_FLAG(NIC_OFFLOAD))) {
+ int tc_flag = mlx5e_is_uplink_rep(priv) ? MLX5_TC_FLAG(ESW_OFFLOAD) :
+ MLX5_TC_FLAG(NIC_OFFLOAD);
+ if (!enable && mlx5e_tc_num_filters(priv, tc_flag)) {
netdev_err(netdev,
"Active offloaded tc filters, can't turn hw_tc_offload off\n");
return -EINVAL;
--
2.37.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [net 10/13] net/mlx5: unlock on error path in esw_vfs_changed_event_handler()
2022-08-22 19:59 [pull request][net 00/13] mlx5 fixes 2022-08-22 Saeed Mahameed
` (8 preceding siblings ...)
2022-08-22 19:59 ` [net 09/13] net/mlx5e: Fix wrong tc flag used when set hw-tc-offload off Saeed Mahameed
@ 2022-08-22 19:59 ` Saeed Mahameed
2022-08-22 19:59 ` [net 11/13] net/mlx5e: kTLS, Use _safe() iterator in mlx5e_tls_priv_tx_list_cleanup() Saeed Mahameed
` (2 subsequent siblings)
12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2022-08-22 19:59 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Dan Carpenter
From: Dan Carpenter <dan.carpenter@oracle.com>
Unlock before returning on this error path.
Fixes: f1bc646c9a06 ("net/mlx5: Use devl_ API in mlx5_esw_offloads_devlink_port_register")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 10b0b260f02b..a9f4c652f859 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -3116,8 +3116,10 @@ esw_vfs_changed_event_handler(struct mlx5_eswitch *esw, const u32 *out)
err = mlx5_eswitch_load_vf_vports(esw, new_num_vfs,
MLX5_VPORT_UC_ADDR_CHANGE);
- if (err)
+ if (err) {
+ devl_unlock(devlink);
return;
+ }
}
esw->esw_funcs.num_vfs = new_num_vfs;
devl_unlock(devlink);
--
2.37.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [net 11/13] net/mlx5e: kTLS, Use _safe() iterator in mlx5e_tls_priv_tx_list_cleanup()
2022-08-22 19:59 [pull request][net 00/13] mlx5 fixes 2022-08-22 Saeed Mahameed
` (9 preceding siblings ...)
2022-08-22 19:59 ` [net 10/13] net/mlx5: unlock on error path in esw_vfs_changed_event_handler() Saeed Mahameed
@ 2022-08-22 19:59 ` Saeed Mahameed
2022-08-22 19:59 ` [net 12/13] net/mlx5e: Fix use after free in mlx5e_fs_init() Saeed Mahameed
2022-08-22 19:59 ` [net 13/13] net/mlx5: Unlock on error in mlx5_sriov_enable() Saeed Mahameed
12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2022-08-22 19:59 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Dan Carpenter, Tariq Toukan
From: Dan Carpenter <dan.carpenter@oracle.com>
Use the list_for_each_entry_safe() macro to prevent dereferencing "obj"
after it has been freed.
Fixes: c4dfe704f53f ("net/mlx5e: kTLS, Recycle objects of device-offloaded TLS TX connections")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
index 0aef69527226..3a1f76eac542 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
@@ -246,7 +246,7 @@ static void mlx5e_tls_priv_tx_cleanup(struct mlx5e_ktls_offload_context_tx *priv
static void mlx5e_tls_priv_tx_list_cleanup(struct mlx5_core_dev *mdev,
struct list_head *list, int size)
{
- struct mlx5e_ktls_offload_context_tx *obj;
+ struct mlx5e_ktls_offload_context_tx *obj, *n;
struct mlx5e_async_ctx *bulk_async;
int i;
@@ -255,7 +255,7 @@ static void mlx5e_tls_priv_tx_list_cleanup(struct mlx5_core_dev *mdev,
return;
i = 0;
- list_for_each_entry(obj, list, list_node) {
+ list_for_each_entry_safe(obj, n, list, list_node) {
mlx5e_tls_priv_tx_cleanup(obj, &bulk_async[i]);
i++;
}
--
2.37.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [net 12/13] net/mlx5e: Fix use after free in mlx5e_fs_init()
2022-08-22 19:59 [pull request][net 00/13] mlx5 fixes 2022-08-22 Saeed Mahameed
` (10 preceding siblings ...)
2022-08-22 19:59 ` [net 11/13] net/mlx5e: kTLS, Use _safe() iterator in mlx5e_tls_priv_tx_list_cleanup() Saeed Mahameed
@ 2022-08-22 19:59 ` Saeed Mahameed
2022-08-22 19:59 ` [net 13/13] net/mlx5: Unlock on error in mlx5_sriov_enable() Saeed Mahameed
12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2022-08-22 19:59 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Dan Carpenter, Tariq Toukan
From: Dan Carpenter <dan.carpenter@oracle.com>
Call mlx5e_fs_vlan_free(fs) before kvfree(fs).
Fixes: af8bbf730068 ("net/mlx5e: Convert mlx5e_flow_steering member of mlx5e_priv to pointer")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_fs.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
index e2a9b9be5c1f..e0ce5a233d0b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
@@ -1395,10 +1395,11 @@ struct mlx5e_flow_steering *mlx5e_fs_init(const struct mlx5e_profile *profile,
}
return fs;
-err_free_fs:
- kvfree(fs);
+
err_free_vlan:
mlx5e_fs_vlan_free(fs);
+err_free_fs:
+ kvfree(fs);
err:
return NULL;
}
--
2.37.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [net 13/13] net/mlx5: Unlock on error in mlx5_sriov_enable()
2022-08-22 19:59 [pull request][net 00/13] mlx5 fixes 2022-08-22 Saeed Mahameed
` (11 preceding siblings ...)
2022-08-22 19:59 ` [net 12/13] net/mlx5e: Fix use after free in mlx5e_fs_init() Saeed Mahameed
@ 2022-08-22 19:59 ` Saeed Mahameed
12 siblings, 0 replies; 15+ messages in thread
From: Saeed Mahameed @ 2022-08-22 19:59 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Dan Carpenter
From: Dan Carpenter <dan.carpenter@oracle.com>
Unlock before returning if mlx5_device_enable_sriov() fails.
Fixes: 84a433a40d0e ("net/mlx5: Lock mlx5 devlink reload callbacks")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/sriov.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
index ee2e1b7c1310..c0e6c487c63c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
@@ -159,11 +159,11 @@ static int mlx5_sriov_enable(struct pci_dev *pdev, int num_vfs)
devl_lock(devlink);
err = mlx5_device_enable_sriov(dev, num_vfs);
+ devl_unlock(devlink);
if (err) {
mlx5_core_warn(dev, "mlx5_device_enable_sriov failed : %d\n", err);
return err;
}
- devl_unlock(devlink);
err = pci_enable_sriov(pdev, num_vfs);
if (err) {
--
2.37.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [net 01/13] net/mlx5e: Properly disable vlan strip on non-UL reps
2022-08-22 19:59 ` [net 01/13] net/mlx5e: Properly disable vlan strip on non-UL reps Saeed Mahameed
@ 2022-08-24 1:00 ` patchwork-bot+netdevbpf
0 siblings, 0 replies; 15+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-08-24 1:00 UTC (permalink / raw)
To: Saeed Mahameed
Cc: davem, kuba, pabeni, edumazet, saeedm, netdev, vladbu, roid
Hello:
This series was applied to netdev/net.git (master)
by Saeed Mahameed <saeedm@nvidia.com>:
On Mon, 22 Aug 2022 12:59:05 -0700 you wrote:
> From: Vlad Buslov <vladbu@nvidia.com>
>
> When querying mlx5 non-uplink representors capabilities with ethtool
> rx-vlan-offload is marked as "off [fixed]". However, it is actually always
> enabled because mlx5e_params->vlan_strip_disable is 0 by default when
> initializing struct mlx5e_params instance. Fix the issue by explicitly
> setting the vlan_strip_disable to 'true' for non-uplink representors.
>
> [...]
Here is the summary with links:
- [net,01/13] net/mlx5e: Properly disable vlan strip on non-UL reps
https://git.kernel.org/netdev/net/c/f37044fd759b
- [net,02/13] net/mlx5: LAG, fix logic over MLX5_LAG_FLAG_NDEVS_READY
https://git.kernel.org/netdev/net/c/a6e675a66175
- [net,03/13] net/mlx5: Eswitch, Fix forwarding decision to uplink
https://git.kernel.org/netdev/net/c/942fca7e762b
- [net,04/13] net/mlx5: Disable irq when locking lag_lock
https://git.kernel.org/netdev/net/c/8e93f29422ff
- [net,05/13] net/mlx5: Fix cmd error logging for manage pages cmd
https://git.kernel.org/netdev/net/c/090f3e4f4089
- [net,06/13] net/mlx5: Avoid false positive lockdep warning by adding lock_class_key
https://git.kernel.org/netdev/net/c/d59b73a66e5e
- [net,07/13] net/mlx5e: Fix wrong application of the LRO state
https://git.kernel.org/netdev/net/c/7b3707fc7904
- [net,08/13] net/mlx5e: TC, Add missing policer validation
https://git.kernel.org/netdev/net/c/f7a4e867f48c
- [net,09/13] net/mlx5e: Fix wrong tc flag used when set hw-tc-offload off
https://git.kernel.org/netdev/net/c/550f96432e6f
- [net,10/13] net/mlx5: unlock on error path in esw_vfs_changed_event_handler()
https://git.kernel.org/netdev/net/c/b868c8fe37bd
- [net,11/13] net/mlx5e: kTLS, Use _safe() iterator in mlx5e_tls_priv_tx_list_cleanup()
https://git.kernel.org/netdev/net/c/6514210b6d0d
- [net,12/13] net/mlx5e: Fix use after free in mlx5e_fs_init()
https://git.kernel.org/netdev/net/c/21234e3a84c7
- [net,13/13] net/mlx5: Unlock on error in mlx5_sriov_enable()
https://git.kernel.org/netdev/net/c/35419025cb1e
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2022-08-24 1:00 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-22 19:59 [pull request][net 00/13] mlx5 fixes 2022-08-22 Saeed Mahameed
2022-08-22 19:59 ` [net 01/13] net/mlx5e: Properly disable vlan strip on non-UL reps Saeed Mahameed
2022-08-24 1:00 ` patchwork-bot+netdevbpf
2022-08-22 19:59 ` [net 02/13] net/mlx5: LAG, fix logic over MLX5_LAG_FLAG_NDEVS_READY Saeed Mahameed
2022-08-22 19:59 ` [net 03/13] net/mlx5: Eswitch, Fix forwarding decision to uplink Saeed Mahameed
2022-08-22 19:59 ` [net 04/13] net/mlx5: Disable irq when locking lag_lock Saeed Mahameed
2022-08-22 19:59 ` [net 05/13] net/mlx5: Fix cmd error logging for manage pages cmd Saeed Mahameed
2022-08-22 19:59 ` [net 06/13] net/mlx5: Avoid false positive lockdep warning by adding lock_class_key Saeed Mahameed
2022-08-22 19:59 ` [net 07/13] net/mlx5e: Fix wrong application of the LRO state Saeed Mahameed
2022-08-22 19:59 ` [net 08/13] net/mlx5e: TC, Add missing policer validation Saeed Mahameed
2022-08-22 19:59 ` [net 09/13] net/mlx5e: Fix wrong tc flag used when set hw-tc-offload off Saeed Mahameed
2022-08-22 19:59 ` [net 10/13] net/mlx5: unlock on error path in esw_vfs_changed_event_handler() Saeed Mahameed
2022-08-22 19:59 ` [net 11/13] net/mlx5e: kTLS, Use _safe() iterator in mlx5e_tls_priv_tx_list_cleanup() Saeed Mahameed
2022-08-22 19:59 ` [net 12/13] net/mlx5e: Fix use after free in mlx5e_fs_init() Saeed Mahameed
2022-08-22 19:59 ` [net 13/13] net/mlx5: Unlock on error in mlx5_sriov_enable() Saeed Mahameed
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.