* [net-next V2 01/14] RDMA/mlx5: Free second uplink ib port
2023-06-07 21:03 [pull request][net-next V2 00/14] mlx5 updates 2023-06-06 Saeed Mahameed
@ 2023-06-07 21:03 ` Saeed Mahameed
2023-06-09 2:40 ` patchwork-bot+netdevbpf
2023-06-07 21:03 ` [net-next V2 02/14] {net/RDMA}/mlx5: introduce lag_for_each_peer Saeed Mahameed
` (12 subsequent siblings)
13 siblings, 1 reply; 16+ messages in thread
From: Saeed Mahameed @ 2023-06-07 21:03 UTC
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Leon Romanovsky,
linux-rdma, Shay Drory, Mark Bloch
From: Shay Drory <shayd@nvidia.com>
The cited patch introduced an IB port for the slave device uplink in
the multiport eswitch case. However, nothing was done for this IB port
when it was unloaded.
Unload the new IB port properly.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/infiniband/hw/mlx5/ib_rep.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/mlx5/ib_rep.c b/drivers/infiniband/hw/mlx5/ib_rep.c
index ddcfc116b19a..a4db22fe1883 100644
--- a/drivers/infiniband/hw/mlx5/ib_rep.c
+++ b/drivers/infiniband/hw/mlx5/ib_rep.c
@@ -126,7 +126,7 @@ mlx5_ib_vport_rep_unload(struct mlx5_eswitch_rep *rep)
!mlx5_lag_is_master(mdev)) {
struct mlx5_core_dev *peer_mdev;
- if (rep->vport == MLX5_VPORT_UPLINK)
+ if (rep->vport == MLX5_VPORT_UPLINK && !mlx5_lag_is_mpesw(mdev))
return;
peer_mdev = mlx5_lag_get_peer_mdev(mdev);
vport_index += mlx5_eswitch_get_total_vports(peer_mdev);
@@ -146,6 +146,9 @@ mlx5_ib_vport_rep_unload(struct mlx5_eswitch_rep *rep)
struct mlx5_core_dev *peer_mdev;
struct mlx5_eswitch *esw;
+ if (mlx5_lag_is_shared_fdb(mdev) && !mlx5_lag_is_master(mdev))
+ return;
+
if (mlx5_lag_is_shared_fdb(mdev)) {
peer_mdev = mlx5_lag_get_peer_mdev(mdev);
esw = peer_mdev->priv.eswitch;
--
2.40.1
* Re: [net-next V2 01/14] RDMA/mlx5: Free second uplink ib port
2023-06-07 21:03 ` [net-next V2 01/14] RDMA/mlx5: Free second uplink ib port Saeed Mahameed
@ 2023-06-09 2:40 ` patchwork-bot+netdevbpf
0 siblings, 0 replies; 16+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-06-09 2:40 UTC
To: Saeed Mahameed
Cc: davem, kuba, pabeni, edumazet, saeedm, netdev, tariqt, leonro,
linux-rdma, shayd, mbloch
Hello:
This series was applied to netdev/net-next.git (main)
by Saeed Mahameed <saeedm@nvidia.com>:
On Wed, 7 Jun 2023 14:03:57 -0700 you wrote:
> From: Shay Drory <shayd@nvidia.com>
>
> The cited patch introduced an IB port for the slave device uplink in
> the multiport eswitch case. However, nothing was done for this IB port
> when it was unloaded.
> Unload the new IB port properly.
>
> [...]
Here is the summary with links:
- [net-next,V2,01/14] RDMA/mlx5: Free second uplink ib port
https://git.kernel.org/netdev/net-next/c/962825e534a9
- [net-next,V2,02/14] {net/RDMA}/mlx5: introduce lag_for_each_peer
https://git.kernel.org/netdev/net-next/c/222dd185833e
- [net-next,V2,03/14] net/mlx5: LAG, check if all eswitches are paired for shared FDB
https://git.kernel.org/netdev/net-next/c/4c103aea4bed
- [net-next,V2,04/14] net/mlx5: LAG, generalize handling of shared FDB
https://git.kernel.org/netdev/net-next/c/86a12124dc02
- [net-next,V2,05/14] net/mlx5: LAG, change mlx5_shared_fdb_supported() to static
https://git.kernel.org/netdev/net-next/c/c83e6ab96ef2
- [net-next,V2,06/14] net/mlx5: LAG, block multipath LAG in case ldev have more than 2 ports
https://git.kernel.org/netdev/net-next/c/d61bab396115
- [net-next,V2,07/14] net/mlx5: LAG, block multiport eswitch LAG in case ldev have more than 2 ports
https://git.kernel.org/netdev/net-next/c/7718c1c8ac32
- [net-next,V2,08/14] net/mlx5: Enable 4 ports VF LAG
https://git.kernel.org/netdev/net-next/c/6ec0b55e72a5
- [net-next,V2,09/14] net/mlx5e: Expose catastrophic steering error counters
https://git.kernel.org/netdev/net-next/c/a33682e4e78e
- [net-next,V2,10/14] net/mlx5e: Remove RX page cache leftovers
https://git.kernel.org/netdev/net-next/c/f4692ab13a1f
- [net-next,V2,11/14] net/mlx5e: TC, refactor access to hash key
https://git.kernel.org/netdev/net-next/c/de1f0a650824
- [net-next,V2,12/14] net/mlx5: Skip inline mode check after mlx5_eswitch_enable_locked() failure
https://git.kernel.org/netdev/net-next/c/97bd788efb90
- [net-next,V2,13/14] mlx5/core: E-Switch, Allocate ECPF vport if it's an eswitch manager
https://git.kernel.org/netdev/net-next/c/eb8e9fae0a22
- [net-next,V2,14/14] net/mlx5e: simplify condition after napi budget handling change
https://git.kernel.org/netdev/net-next/c/803ea346bd3f
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
* [net-next V2 02/14] {net/RDMA}/mlx5: introduce lag_for_each_peer
2023-06-07 21:03 [pull request][net-next V2 00/14] mlx5 updates 2023-06-06 Saeed Mahameed
2023-06-07 21:03 ` [net-next V2 01/14] RDMA/mlx5: Free second uplink ib port Saeed Mahameed
@ 2023-06-07 21:03 ` Saeed Mahameed
2023-06-07 21:03 ` [net-next V2 03/14] net/mlx5: LAG, check if all eswitches are paired for shared FDB Saeed Mahameed
` (11 subsequent siblings)
13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2023-06-07 21:03 UTC
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Leon Romanovsky,
linux-rdma, Shay Drory, Mark Bloch
From: Shay Drory <shayd@nvidia.com>
Introduce a generic API to iterate over all the devices that are part
of the LAG. This API replaces mlx5_lag_get_peer_mdev(), which retrieves
only a single peer device from the LAG.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/infiniband/hw/mlx5/ib_rep.c | 98 ++++++++++++-------
.../net/ethernet/mellanox/mlx5/core/fs_cmd.c | 24 +++--
.../net/ethernet/mellanox/mlx5/core/lag/lag.c | 21 +++-
include/linux/mlx5/driver.h | 8 +-
4 files changed, 100 insertions(+), 51 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/ib_rep.c b/drivers/infiniband/hw/mlx5/ib_rep.c
index a4db22fe1883..c7a4ee896121 100644
--- a/drivers/infiniband/hw/mlx5/ib_rep.c
+++ b/drivers/infiniband/hw/mlx5/ib_rep.c
@@ -30,45 +30,65 @@ mlx5_ib_set_vport_rep(struct mlx5_core_dev *dev,
static void mlx5_ib_register_peer_vport_reps(struct mlx5_core_dev *mdev);
+static void mlx5_ib_num_ports_update(struct mlx5_core_dev *dev, u32 *num_ports)
+{
+ struct mlx5_core_dev *peer_dev;
+ int i;
+
+ mlx5_lag_for_each_peer_mdev(dev, peer_dev, i) {
+ u32 peer_num_ports = mlx5_eswitch_get_total_vports(peer_dev);
+
+ if (mlx5_lag_is_mpesw(peer_dev))
+ *num_ports += peer_num_ports;
+ else
+ /* Only 1 ib port is the representor for all uplinks */
+ *num_ports += peer_num_ports - 1;
+ }
+}
+
static int
mlx5_ib_vport_rep_load(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep)
{
u32 num_ports = mlx5_eswitch_get_total_vports(dev);
+ struct mlx5_core_dev *lag_master = dev;
const struct mlx5_ib_profile *profile;
struct mlx5_core_dev *peer_dev;
struct mlx5_ib_dev *ibdev;
- int second_uplink = false;
- u32 peer_num_ports;
+ int new_uplink = false;
int vport_index;
int ret;
+ int i;
vport_index = rep->vport_index;
if (mlx5_lag_is_shared_fdb(dev)) {
- peer_dev = mlx5_lag_get_peer_mdev(dev);
- peer_num_ports = mlx5_eswitch_get_total_vports(peer_dev);
if (mlx5_lag_is_master(dev)) {
- if (mlx5_lag_is_mpesw(dev))
- num_ports += peer_num_ports;
- else
- num_ports += peer_num_ports - 1;
-
+ mlx5_ib_num_ports_update(dev, &num_ports);
} else {
if (rep->vport == MLX5_VPORT_UPLINK) {
if (!mlx5_lag_is_mpesw(dev))
return 0;
- second_uplink = true;
+ new_uplink = true;
}
+ mlx5_lag_for_each_peer_mdev(dev, peer_dev, i) {
+ u32 peer_n_ports = mlx5_eswitch_get_total_vports(peer_dev);
+
+ if (mlx5_lag_is_master(peer_dev))
+ lag_master = peer_dev;
+ else if (!mlx5_lag_is_mpesw(dev))
+ /* Only 1 ib port is the representor for all uplinks */
+ peer_n_ports--;
- vport_index += peer_num_ports;
- dev = peer_dev;
+ if (mlx5_get_dev_index(peer_dev) < mlx5_get_dev_index(dev))
+ vport_index += peer_n_ports;
+ }
}
}
- if (rep->vport == MLX5_VPORT_UPLINK && !second_uplink)
+ if (rep->vport == MLX5_VPORT_UPLINK && !new_uplink)
profile = &raw_eth_profile;
else
- return mlx5_ib_set_vport_rep(dev, rep, vport_index);
+ return mlx5_ib_set_vport_rep(lag_master, rep, vport_index);
ibdev = ib_alloc_device(mlx5_ib_dev, ib_dev);
if (!ibdev)
@@ -85,8 +105,8 @@ mlx5_ib_vport_rep_load(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep)
vport_index = rep->vport_index;
ibdev->port[vport_index].rep = rep;
ibdev->port[vport_index].roce.netdev =
- mlx5_ib_get_rep_netdev(dev->priv.eswitch, rep->vport);
- ibdev->mdev = dev;
+ mlx5_ib_get_rep_netdev(lag_master->priv.eswitch, rep->vport);
+ ibdev->mdev = lag_master;
ibdev->num_ports = num_ports;
ret = __mlx5_ib_add(ibdev, profile);
@@ -94,8 +114,8 @@ mlx5_ib_vport_rep_load(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep)
goto fail_add;
rep->rep_data[REP_IB].priv = ibdev;
- if (mlx5_lag_is_shared_fdb(dev))
- mlx5_ib_register_peer_vport_reps(dev);
+ if (mlx5_lag_is_shared_fdb(lag_master))
+ mlx5_ib_register_peer_vport_reps(lag_master);
return 0;
@@ -118,23 +138,27 @@ mlx5_ib_vport_rep_unload(struct mlx5_eswitch_rep *rep)
struct mlx5_ib_dev *dev = mlx5_ib_rep_to_dev(rep);
int vport_index = rep->vport_index;
struct mlx5_ib_port *port;
+ int i;
if (WARN_ON(!mdev))
return;
+ if (!dev)
+ return;
+
if (mlx5_lag_is_shared_fdb(mdev) &&
!mlx5_lag_is_master(mdev)) {
- struct mlx5_core_dev *peer_mdev;
-
if (rep->vport == MLX5_VPORT_UPLINK && !mlx5_lag_is_mpesw(mdev))
return;
- peer_mdev = mlx5_lag_get_peer_mdev(mdev);
- vport_index += mlx5_eswitch_get_total_vports(peer_mdev);
+ for (i = 0; i < dev->num_ports; i++) {
+ if (dev->port[i].rep == rep)
+ break;
+ }
+ if (WARN_ON(i == dev->num_ports))
+ return;
+ vport_index = i;
}
- if (!dev)
- return;
-
port = &dev->port[vport_index];
write_lock(&port->roce.netdev_lock);
port->roce.netdev = NULL;
@@ -143,16 +167,18 @@ mlx5_ib_vport_rep_unload(struct mlx5_eswitch_rep *rep)
port->rep = NULL;
if (rep->vport == MLX5_VPORT_UPLINK) {
- struct mlx5_core_dev *peer_mdev;
- struct mlx5_eswitch *esw;
if (mlx5_lag_is_shared_fdb(mdev) && !mlx5_lag_is_master(mdev))
return;
if (mlx5_lag_is_shared_fdb(mdev)) {
- peer_mdev = mlx5_lag_get_peer_mdev(mdev);
- esw = peer_mdev->priv.eswitch;
- mlx5_eswitch_unregister_vport_reps(esw, REP_IB);
+ struct mlx5_core_dev *peer_mdev;
+ struct mlx5_eswitch *esw;
+
+ mlx5_lag_for_each_peer_mdev(mdev, peer_mdev, i) {
+ esw = peer_mdev->priv.eswitch;
+ mlx5_eswitch_unregister_vport_reps(esw, REP_IB);
+ }
}
__mlx5_ib_remove(dev, dev->profile, MLX5_IB_STAGE_MAX);
}
@@ -166,14 +192,14 @@ static const struct mlx5_eswitch_rep_ops rep_ops = {
static void mlx5_ib_register_peer_vport_reps(struct mlx5_core_dev *mdev)
{
- struct mlx5_core_dev *peer_mdev = mlx5_lag_get_peer_mdev(mdev);
+ struct mlx5_core_dev *peer_mdev;
struct mlx5_eswitch *esw;
+ int i;
- if (!peer_mdev)
- return;
-
- esw = peer_mdev->priv.eswitch;
- mlx5_eswitch_register_vport_reps(esw, &rep_ops, REP_IB);
+ mlx5_lag_for_each_peer_mdev(mdev, peer_mdev, i) {
+ esw = peer_mdev->priv.eswitch;
+ mlx5_eswitch_register_vport_reps(esw, &rep_ops, REP_IB);
+ }
}
struct net_device *mlx5_ib_get_rep_netdev(struct mlx5_eswitch *esw,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
index 11374c3744c5..8a10ed4d8cbb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
@@ -244,16 +244,22 @@ static int mlx5_cmd_update_root_ft(struct mlx5_flow_root_namespace *ns,
ft->type == FS_FT_FDB &&
mlx5_lag_is_shared_fdb(dev) &&
mlx5_lag_is_master(dev)) {
- err = mlx5_cmd_set_slave_root_fdb(dev,
- mlx5_lag_get_peer_mdev(dev),
- !disconnect, (!disconnect) ?
- ft->id : 0);
- if (err && !disconnect) {
- MLX5_SET(set_flow_table_root_in, in, op_mod, 0);
- MLX5_SET(set_flow_table_root_in, in, table_id,
- ns->root_ft->id);
- mlx5_cmd_exec_in(dev, set_flow_table_root, in);
+ struct mlx5_core_dev *peer_dev;
+ int i;
+
+ mlx5_lag_for_each_peer_mdev(dev, peer_dev, i) {
+ err = mlx5_cmd_set_slave_root_fdb(dev, peer_dev, !disconnect,
+ (!disconnect) ? ft->id : 0);
+ if (err && !disconnect) {
+ MLX5_SET(set_flow_table_root_in, in, op_mod, 0);
+ MLX5_SET(set_flow_table_root_in, in, table_id,
+ ns->root_ft->id);
+ mlx5_cmd_exec_in(dev, set_flow_table_root, in);
+ }
+ if (err)
+ break;
}
+
}
return err;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
index c820f7d266de..c55e36e0571d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
@@ -1519,26 +1519,37 @@ u8 mlx5_lag_get_num_ports(struct mlx5_core_dev *dev)
}
EXPORT_SYMBOL(mlx5_lag_get_num_ports);
-struct mlx5_core_dev *mlx5_lag_get_peer_mdev(struct mlx5_core_dev *dev)
+struct mlx5_core_dev *mlx5_lag_get_next_peer_mdev(struct mlx5_core_dev *dev, int *i)
{
struct mlx5_core_dev *peer_dev = NULL;
struct mlx5_lag *ldev;
unsigned long flags;
+ int idx;
spin_lock_irqsave(&lag_lock, flags);
ldev = mlx5_lag_dev(dev);
if (!ldev)
goto unlock;
- peer_dev = ldev->pf[MLX5_LAG_P1].dev == dev ?
- ldev->pf[MLX5_LAG_P2].dev :
- ldev->pf[MLX5_LAG_P1].dev;
+ if (*i == ldev->ports)
+ goto unlock;
+ for (idx = *i; idx < ldev->ports; idx++)
+ if (ldev->pf[idx].dev != dev)
+ break;
+
+ if (idx == ldev->ports) {
+ *i = idx;
+ goto unlock;
+ }
+ *i = idx + 1;
+
+ peer_dev = ldev->pf[idx].dev;
unlock:
spin_unlock_irqrestore(&lag_lock, flags);
return peer_dev;
}
-EXPORT_SYMBOL(mlx5_lag_get_peer_mdev);
+EXPORT_SYMBOL(mlx5_lag_get_next_peer_mdev);
int mlx5_lag_query_cong_counters(struct mlx5_core_dev *dev,
u64 *values,
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 94d2be5848ae..9a744c48eec2 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -1174,7 +1174,13 @@ int mlx5_lag_query_cong_counters(struct mlx5_core_dev *dev,
u64 *values,
int num_counters,
size_t *offsets);
-struct mlx5_core_dev *mlx5_lag_get_peer_mdev(struct mlx5_core_dev *dev);
+struct mlx5_core_dev *mlx5_lag_get_next_peer_mdev(struct mlx5_core_dev *dev, int *i);
+
+#define mlx5_lag_for_each_peer_mdev(dev, peer, i) \
+ for (i = 0, peer = mlx5_lag_get_next_peer_mdev(dev, &i); \
+ peer; \
+ peer = mlx5_lag_get_next_peer_mdev(dev, &i))
+
u8 mlx5_lag_get_num_ports(struct mlx5_core_dev *dev);
struct mlx5_uars_page *mlx5_get_uars_page(struct mlx5_core_dev *mdev);
void mlx5_put_uars_page(struct mlx5_core_dev *mdev, struct mlx5_uars_page *up);
--
2.40.1
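
For readers following the iterator introduced in this patch, here is a minimal
userspace sketch of the same cursor-based pattern. It is not the driver code:
struct dev, struct lag, MAX_PORTS, lag_next_peer(), lag_for_each_peer() and
lag_count_peers() are hypothetical stand-ins, and the locking done by the real
mlx5_lag_get_next_peer_mdev() is left out. It only illustrates how the cursor
skips the calling device and how the macro terminates on NULL.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PORTS 4 /* hypothetical upper bound for this sketch */

struct dev { int id; };

struct lag {
	struct dev *pf[MAX_PORTS]; /* one slot per LAG port */
	int ports;                 /* number of populated slots */
};

/*
 * Return the next LAG member that is not `self`, searching from *i,
 * and advance *i past the returned slot. Returns NULL when exhausted.
 */
static struct dev *lag_next_peer(const struct lag *lag,
				 const struct dev *self, int *i)
{
	int idx;

	assert(*i >= 0);
	for (idx = *i; idx < lag->ports; idx++)
		if (lag->pf[idx] != self)
			break;

	if (idx >= lag->ports) {
		*i = lag->ports;
		return NULL;
	}
	*i = idx + 1;
	return lag->pf[idx];
}

/* Visit every LAG member except `self`; `i` is the caller's cursor. */
#define lag_for_each_peer(lag, self, peer, i)			\
	for (i = 0, peer = lag_next_peer(lag, self, &i);	\
	     peer;						\
	     peer = lag_next_peer(lag, self, &i))

/* Count the peers of `self`, i.e. all LAG members other than `self`. */
static int lag_count_peers(const struct lag *lag, const struct dev *self)
{
	struct dev *peer;
	int i, n = 0;

	lag_for_each_peer(lag, self, peer, i)
		n++;
	return n;
}
```

Keeping the cursor in the caller means the helper holds no state between
calls, which is presumably why the real helper can take and release lag_lock
on each call.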
* [net-next V2 03/14] net/mlx5: LAG, check if all eswitches are paired for shared FDB
2023-06-07 21:03 [pull request][net-next V2 00/14] mlx5 updates 2023-06-06 Saeed Mahameed
2023-06-07 21:03 ` [net-next V2 01/14] RDMA/mlx5: Free second uplink ib port Saeed Mahameed
2023-06-07 21:03 ` [net-next V2 02/14] {net/RDMA}/mlx5: introduce lag_for_each_peer Saeed Mahameed
@ 2023-06-07 21:03 ` Saeed Mahameed
2023-06-07 21:04 ` [net-next V2 04/14] net/mlx5: LAG, generalize handling of " Saeed Mahameed
` (10 subsequent siblings)
13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2023-06-07 21:03 UTC
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Leon Romanovsky,
linux-rdma, Shay Drory, Mark Bloch
From: Shay Drory <shayd@nvidia.com>
Shared FDB LAG can only work if all eswitches are paired.
Also, whenever two eswitches are paired, devcom is marked as ready.
Therefore, for a device with two eswitches, checking devcom was
sufficient. However, this is not correct for a device with more than
two eswitches, which will be introduced in a downstream patch.
Hence, explicitly check that all eswitches are paired.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/eswitch.h | 9 +++++++++
drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 4 +++-
2 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index c42c16d9ccbc..d3608f198e0a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -779,6 +779,13 @@ static inline int mlx5_eswitch_num_vfs(struct mlx5_eswitch *esw)
return 0;
}
+static inline int mlx5_eswitch_get_npeers(struct mlx5_eswitch *esw)
+{
+ if (mlx5_esw_allowed(esw))
+ return esw->num_peers;
+ return 0;
+}
+
static inline struct mlx5_flow_table *
mlx5_eswitch_get_slow_fdb(struct mlx5_eswitch *esw)
{
@@ -826,6 +833,8 @@ static inline void
mlx5_eswitch_offloads_single_fdb_del_one(struct mlx5_eswitch *master_esw,
struct mlx5_eswitch *slave_esw) {}
+static inline int mlx5_eswitch_get_npeers(struct mlx5_eswitch *esw) { return 0; }
+
static inline int
mlx5_eswitch_reload_reps(struct mlx5_eswitch *esw)
{
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
index c55e36e0571d..dd8a19d85617 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
@@ -828,7 +828,9 @@ bool mlx5_shared_fdb_supported(struct mlx5_lag *ldev)
MLX5_DEVCOM_ESW_OFFLOADS) &&
MLX5_CAP_GEN(dev1, lag_native_fdb_selection) &&
MLX5_CAP_ESW(dev1, root_ft_on_other_esw) &&
- MLX5_CAP_ESW(dev0, esw_shared_ingress_acl))
+ MLX5_CAP_ESW(dev0, esw_shared_ingress_acl) &&
+ mlx5_eswitch_get_npeers(dev0->priv.eswitch) == MLX5_CAP_GEN(dev0, num_lag_ports) - 1 &&
+ mlx5_eswitch_get_npeers(dev1->priv.eswitch) == MLX5_CAP_GEN(dev1, num_lag_ports) - 1)
return true;
return false;
--
2.40.1
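
The pairing condition added above can be read as "every eswitch has exactly
num_lag_ports - 1 peers". A minimal standalone sketch of that predicate
follows. struct esw_info and all_eswitches_paired() are hypothetical names
made up for illustration, and the plain integer fields stand in for
mlx5_eswitch_get_npeers() and MLX5_CAP_GEN(dev, num_lag_ports).

```c
#include <assert.h>
#include <stdbool.h>

struct esw_info {
	int num_peers;     /* eswitches this one is currently paired with */
	int num_lag_ports; /* LAG ports the device reports */
};

/* True only if every eswitch is paired with all the other LAG ports. */
static bool all_eswitches_paired(const struct esw_info *esw, int n)
{
	assert(n >= 0);
	for (int i = 0; i < n; i++)
		if (esw[i].num_peers != esw[i].num_lag_ports - 1)
			return false;
	return true;
}
```

With two ports the check reduces to "one peer each", which matches the case
where a single devcom readiness flag was enough. With four ports, a device
that has paired with only one neighbour fails the check.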
* [net-next V2 04/14] net/mlx5: LAG, generalize handling of shared FDB
2023-06-07 21:03 [pull request][net-next V2 00/14] mlx5 updates 2023-06-06 Saeed Mahameed
` (2 preceding siblings ...)
2023-06-07 21:03 ` [net-next V2 03/14] net/mlx5: LAG, check if all eswitches are paired for shared FDB Saeed Mahameed
@ 2023-06-07 21:04 ` Saeed Mahameed
2023-06-07 21:04 ` [net-next V2 05/14] net/mlx5: LAG, change mlx5_shared_fdb_supported() to static Saeed Mahameed
` (9 subsequent siblings)
13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2023-06-07 21:04 UTC
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Leon Romanovsky,
linux-rdma, Shay Drory, Mark Bloch
From: Shay Drory <shayd@nvidia.com>
Shared FDB handling assumes that a shared FDB can only be created
from two devices.
To support a shared FDB of more than two devices, iterate over
all LAG ports instead of hard-coding the first two LAG ports
whenever handling shared FDB.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../net/ethernet/mellanox/mlx5/core/lag/lag.c | 66 +++++++++++--------
1 file changed, 38 insertions(+), 28 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
index dd8a19d85617..00773aab9d20 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
@@ -512,8 +512,11 @@ static void mlx5_lag_set_port_sel_mode_offloads(struct mlx5_lag *ldev,
return;
if (MLX5_CAP_PORT_SELECTION(dev0->dev, port_select_flow_table) &&
- tracker->tx_type == NETDEV_LAG_TX_TYPE_HASH)
+ tracker->tx_type == NETDEV_LAG_TX_TYPE_HASH) {
+ if (ldev->ports > 2)
+ ldev->buckets = MLX5_LAG_MAX_HASH_BUCKETS;
set_bit(MLX5_LAG_MODE_FLAG_HASH_BASED, flags);
+ }
}
static int mlx5_lag_set_flags(struct mlx5_lag *ldev, enum mlx5_lag_mode mode,
@@ -782,7 +785,6 @@ void mlx5_disable_lag(struct mlx5_lag *ldev)
{
bool shared_fdb = test_bit(MLX5_LAG_MODE_FLAG_SHARED_FDB, &ldev->mode_flags);
struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev;
- struct mlx5_core_dev *dev1 = ldev->pf[MLX5_LAG_P2].dev;
bool roce_lag;
int err;
int i;
@@ -807,30 +809,35 @@ void mlx5_disable_lag(struct mlx5_lag *ldev)
if (shared_fdb || roce_lag)
mlx5_lag_add_devices(ldev);
- if (shared_fdb) {
- if (!(dev0->priv.flags & MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV))
- mlx5_eswitch_reload_reps(dev0->priv.eswitch);
- if (!(dev1->priv.flags & MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV))
- mlx5_eswitch_reload_reps(dev1->priv.eswitch);
- }
+ if (shared_fdb)
+ for (i = 0; i < ldev->ports; i++)
+ if (!(ldev->pf[i].dev->priv.flags & MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV))
+ mlx5_eswitch_reload_reps(ldev->pf[i].dev->priv.eswitch);
}
bool mlx5_shared_fdb_supported(struct mlx5_lag *ldev)
{
- struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev;
- struct mlx5_core_dev *dev1 = ldev->pf[MLX5_LAG_P2].dev;
-
- if (is_mdev_switchdev_mode(dev0) &&
- is_mdev_switchdev_mode(dev1) &&
- mlx5_eswitch_vport_match_metadata_enabled(dev0->priv.eswitch) &&
- mlx5_eswitch_vport_match_metadata_enabled(dev1->priv.eswitch) &&
- mlx5_devcom_comp_is_ready(dev0->priv.devcom,
- MLX5_DEVCOM_ESW_OFFLOADS) &&
- MLX5_CAP_GEN(dev1, lag_native_fdb_selection) &&
- MLX5_CAP_ESW(dev1, root_ft_on_other_esw) &&
- MLX5_CAP_ESW(dev0, esw_shared_ingress_acl) &&
- mlx5_eswitch_get_npeers(dev0->priv.eswitch) == MLX5_CAP_GEN(dev0, num_lag_ports) - 1 &&
- mlx5_eswitch_get_npeers(dev1->priv.eswitch) == MLX5_CAP_GEN(dev1, num_lag_ports) - 1)
+ struct mlx5_core_dev *dev;
+ int i;
+
+ for (i = MLX5_LAG_P1 + 1; i < ldev->ports; i++) {
+ dev = ldev->pf[i].dev;
+ if (is_mdev_switchdev_mode(dev) &&
+ mlx5_eswitch_vport_match_metadata_enabled(dev->priv.eswitch) &&
+ MLX5_CAP_GEN(dev, lag_native_fdb_selection) &&
+ MLX5_CAP_ESW(dev, root_ft_on_other_esw) &&
+ mlx5_eswitch_get_npeers(dev->priv.eswitch) ==
+ MLX5_CAP_GEN(dev, num_lag_ports) - 1)
+ continue;
+ return false;
+ }
+
+ dev = ldev->pf[MLX5_LAG_P1].dev;
+ if (is_mdev_switchdev_mode(dev) &&
+ mlx5_eswitch_vport_match_metadata_enabled(dev->priv.eswitch) &&
+ mlx5_devcom_comp_is_ready(dev->priv.devcom, MLX5_DEVCOM_ESW_OFFLOADS) &&
+ MLX5_CAP_ESW(dev, esw_shared_ingress_acl) &&
+ mlx5_eswitch_get_npeers(dev->priv.eswitch) == MLX5_CAP_GEN(dev, num_lag_ports) - 1)
return true;
return false;
@@ -867,7 +874,6 @@ static bool mlx5_lag_should_disable_lag(struct mlx5_lag *ldev, bool do_bond)
static void mlx5_do_bond(struct mlx5_lag *ldev)
{
struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev;
- struct mlx5_core_dev *dev1 = ldev->pf[MLX5_LAG_P2].dev;
struct lag_tracker tracker = { };
bool do_bond, roce_lag;
int err;
@@ -908,20 +914,24 @@ static void mlx5_do_bond(struct mlx5_lag *ldev)
for (i = 1; i < ldev->ports; i++)
mlx5_nic_vport_enable_roce(ldev->pf[i].dev);
} else if (shared_fdb) {
+ int i;
+
dev0->priv.flags &= ~MLX5_PRIV_FLAGS_DISABLE_IB_ADEV;
mlx5_rescan_drivers_locked(dev0);
- err = mlx5_eswitch_reload_reps(dev0->priv.eswitch);
- if (!err)
- err = mlx5_eswitch_reload_reps(dev1->priv.eswitch);
+ for (i = 0; i < ldev->ports; i++) {
+ err = mlx5_eswitch_reload_reps(ldev->pf[i].dev->priv.eswitch);
+ if (err)
+ break;
+ }
if (err) {
dev0->priv.flags |= MLX5_PRIV_FLAGS_DISABLE_IB_ADEV;
mlx5_rescan_drivers_locked(dev0);
mlx5_deactivate_lag(ldev);
mlx5_lag_add_devices(ldev);
- mlx5_eswitch_reload_reps(dev0->priv.eswitch);
- mlx5_eswitch_reload_reps(dev1->priv.eswitch);
+ for (i = 0; i < ldev->ports; i++)
+ mlx5_eswitch_reload_reps(ldev->pf[i].dev->priv.eswitch);
mlx5_core_err(dev0, "Failed to enable lag\n");
return;
}
--
2.40.1
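
The reload loop in mlx5_do_bond() above follows a "walk every port, stop at
the first error" shape. A small userspace sketch of that shape is given below.
struct port, port_reload() and reload_all_ports() are hypothetical and exist
only to make the control flow testable. They do not model what
mlx5_eswitch_reload_reps() actually does.

```c
#include <assert.h>
#include <errno.h>

struct port {
	int fail;    /* hypothetical knob: make the reload of this port fail */
	int reloads; /* number of reload attempts seen by this port */
};

/* Stand-in for a per-port reload; returns 0 or a negative errno. */
static int port_reload(struct port *p)
{
	p->reloads++;
	return p->fail ? -EIO : 0;
}

/* Reload ports in order; return the first error and skip the rest. */
static int reload_all_ports(struct port *ports, int n)
{
	int err = 0;

	assert(ports || n == 0);
	for (int i = 0; i < n; i++) {
		err = port_reload(&ports[i]);
		if (err)
			break;
	}
	return err;
}
```

As in the patch, the caller is expected to handle rollback: on error the
driver deactivates the LAG and then reloads every port again without checking
the result.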
* [net-next V2 05/14] net/mlx5: LAG, change mlx5_shared_fdb_supported() to static
2023-06-07 21:03 [pull request][net-next V2 00/14] mlx5 updates 2023-06-06 Saeed Mahameed
` (3 preceding siblings ...)
2023-06-07 21:04 ` [net-next V2 04/14] net/mlx5: LAG, generalize handling of " Saeed Mahameed
@ 2023-06-07 21:04 ` Saeed Mahameed
2023-06-07 21:04 ` [net-next V2 06/14] net/mlx5: LAG, block multipath LAG in case ldev have more than 2 ports Saeed Mahameed
` (8 subsequent siblings)
13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2023-06-07 21:04 UTC
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Leon Romanovsky,
linux-rdma, Shay Drory, Mark Bloch
From: Shay Drory <shayd@nvidia.com>
mlx5_shared_fdb_supported() is used only in a single file. Change the
function to be static.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h | 1 -
2 files changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
index 00773aab9d20..6ce71c42c755 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
@@ -815,7 +815,7 @@ void mlx5_disable_lag(struct mlx5_lag *ldev)
mlx5_eswitch_reload_reps(ldev->pf[i].dev->priv.eswitch);
}
-bool mlx5_shared_fdb_supported(struct mlx5_lag *ldev)
+static bool mlx5_shared_fdb_supported(struct mlx5_lag *ldev)
{
struct mlx5_core_dev *dev;
int i;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
index bc1f1dd3e283..d7e7fa2348a5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
@@ -111,7 +111,6 @@ int mlx5_activate_lag(struct mlx5_lag *ldev,
bool shared_fdb);
int mlx5_lag_dev_get_netdev_idx(struct mlx5_lag *ldev,
struct net_device *ndev);
-bool mlx5_shared_fdb_supported(struct mlx5_lag *ldev);
char *mlx5_get_str_port_sel_mode(enum mlx5_lag_mode mode, unsigned long flags);
void mlx5_infer_tx_enabled(struct lag_tracker *tracker, u8 num_ports,
--
2.40.1
* [net-next V2 06/14] net/mlx5: LAG, block multipath LAG in case ldev have more than 2 ports
2023-06-07 21:03 [pull request][net-next V2 00/14] mlx5 updates 2023-06-06 Saeed Mahameed
` (4 preceding siblings ...)
2023-06-07 21:04 ` [net-next V2 05/14] net/mlx5: LAG, change mlx5_shared_fdb_supported() to static Saeed Mahameed
@ 2023-06-07 21:04 ` Saeed Mahameed
2023-06-07 21:04 ` [net-next V2 07/14] net/mlx5: LAG, block multiport eswitch " Saeed Mahameed
` (7 subsequent siblings)
13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2023-06-07 21:04 UTC
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Leon Romanovsky,
linux-rdma, Shay Drory, Mark Bloch
From: Shay Drory <shayd@nvidia.com>
Multipath LAG is not supported over more than two ports. Add a check to
block multipath LAG over such configurations.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/lag/mp.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/mp.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/mp.c
index d85a8dfc153d..976caa8e6922 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/mp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/mp.c
@@ -14,6 +14,7 @@ static bool __mlx5_lag_is_multipath(struct mlx5_lag *ldev)
return ldev->mode == MLX5_LAG_MODE_MULTIPATH;
}
+#define MLX5_LAG_MULTIPATH_OFFLOADS_SUPPORTED_PORTS 2
static bool mlx5_lag_multipath_check_prereq(struct mlx5_lag *ldev)
{
if (!mlx5_lag_is_ready(ldev))
@@ -22,6 +23,9 @@ static bool mlx5_lag_multipath_check_prereq(struct mlx5_lag *ldev)
if (__mlx5_lag_is_active(ldev) && !__mlx5_lag_is_multipath(ldev))
return false;
+ if (ldev->ports > MLX5_LAG_MULTIPATH_OFFLOADS_SUPPORTED_PORTS)
+ return false;
+
return mlx5_esw_multipath_prereq(ldev->pf[MLX5_LAG_P1].dev,
ldev->pf[MLX5_LAG_P2].dev);
}
--
2.40.1
* [net-next V2 07/14] net/mlx5: LAG, block multiport eswitch LAG in case ldev have more than 2 ports
2023-06-07 21:03 [pull request][net-next V2 00/14] mlx5 updates 2023-06-06 Saeed Mahameed
` (5 preceding siblings ...)
2023-06-07 21:04 ` [net-next V2 06/14] net/mlx5: LAG, block multipath LAG in case ldev have more than 2 ports Saeed Mahameed
@ 2023-06-07 21:04 ` Saeed Mahameed
2023-06-07 21:04 ` [net-next V2 08/14] net/mlx5: Enable 4 ports VF LAG Saeed Mahameed
` (6 subsequent siblings)
13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2023-06-07 21:04 UTC
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Leon Romanovsky,
linux-rdma, Shay Drory, Mark Bloch
From: Shay Drory <shayd@nvidia.com>
Multiport eswitch LAG is not supported over more than two ports. Add a
check to block multiport eswitch LAG over such devices.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c
index 0c0ef600f643..0e869a76dfe4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c
@@ -65,6 +65,7 @@ static int mlx5_mpesw_metadata_set(struct mlx5_lag *ldev)
return err;
}
+#define MLX5_LAG_MPESW_OFFLOADS_SUPPORTED_PORTS 2
static int enable_mpesw(struct mlx5_lag *ldev)
{
struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev;
@@ -74,6 +75,9 @@ static int enable_mpesw(struct mlx5_lag *ldev)
if (ldev->mode != MLX5_LAG_MODE_NONE)
return -EINVAL;
+ if (ldev->ports > MLX5_LAG_MPESW_OFFLOADS_SUPPORTED_PORTS)
+ return -EOPNOTSUPP;
+
if (mlx5_eswitch_mode(dev0) != MLX5_ESWITCH_OFFLOADS ||
!MLX5_CAP_PORT_SELECTION(dev0, port_select_flow_table) ||
!MLX5_CAP_GEN(dev0, create_lag_when_not_master_up) ||
--
2.40.1
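
The two guards added for multipath and multiport eswitch LAG share the same
port limit but report failure differently: the multipath prerequisite check
returns false, while enable_mpesw() returns -EOPNOTSUPP. The sketch below
isolates just that comparison. The macro and function names are hypothetical
shorthand, not the driver's.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MULTIPATH_SUPPORTED_PORTS 2 /* mirrors the limit in lag/mp.c */
#define MPESW_SUPPORTED_PORTS     2 /* mirrors the limit in lag/mpesw.c */

/* Boolean-style prerequisite: false blocks multipath LAG. */
static bool multipath_ports_ok(int ports)
{
	assert(ports > 0);
	return ports <= MULTIPATH_SUPPORTED_PORTS;
}

/* errno-style check: a negative value blocks multiport eswitch LAG. */
static int mpesw_check_ports(int ports)
{
	assert(ports > 0);
	return ports > MPESW_SUPPORTED_PORTS ? -EOPNOTSUPP : 0;
}
```

Both checks use "greater than" rather than "not equal", so configurations with
fewer ports than the limit are not rejected by these guards alone.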
* [net-next V2 08/14] net/mlx5: Enable 4 ports VF LAG
2023-06-07 21:03 [pull request][net-next V2 00/14] mlx5 updates 2023-06-06 Saeed Mahameed
` (6 preceding siblings ...)
2023-06-07 21:04 ` [net-next V2 07/14] net/mlx5: LAG, block multiport eswitch " Saeed Mahameed
@ 2023-06-07 21:04 ` Saeed Mahameed
2023-06-07 21:04 ` [net-next V2 09/14] net/mlx5e: Expose catastrophic steering error counters Saeed Mahameed
` (5 subsequent siblings)
13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2023-06-07 21:04 UTC
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Leon Romanovsky,
linux-rdma, Shay Drory, Mark Bloch
From: Shay Drory <shayd@nvidia.com>
Now that all preparations are done, enable 4-port VF LAG.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 4 ++--
drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c | 5 +++--
drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h | 2 +-
3 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
index 6ce71c42c755..ffd7e17b8ebe 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
@@ -711,7 +711,7 @@ int mlx5_deactivate_lag(struct mlx5_lag *ldev)
return 0;
}
-#define MLX5_LAG_OFFLOADS_SUPPORTED_PORTS 2
+#define MLX5_LAG_OFFLOADS_SUPPORTED_PORTS 4
bool mlx5_lag_check_prereq(struct mlx5_lag *ldev)
{
#ifdef CONFIG_MLX5_ESWITCH
@@ -737,7 +737,7 @@ bool mlx5_lag_check_prereq(struct mlx5_lag *ldev)
if (mlx5_eswitch_mode(ldev->pf[i].dev) != mode)
return false;
- if (mode == MLX5_ESWITCH_OFFLOADS && ldev->ports != MLX5_LAG_OFFLOADS_SUPPORTED_PORTS)
+ if (mode == MLX5_ESWITCH_OFFLOADS && ldev->ports > MLX5_LAG_OFFLOADS_SUPPORTED_PORTS)
return false;
#else
for (i = 0; i < ldev->ports; i++)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
index 8472bbb3cd58..78c94b22bdc0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
@@ -75,13 +75,14 @@ struct mlx5_devcom *mlx5_devcom_register_device(struct mlx5_core_dev *dev)
if (!mlx5_core_is_pf(dev))
return NULL;
- if (MLX5_CAP_GEN(dev, num_lag_ports) != MLX5_DEVCOM_PORTS_SUPPORTED)
+ if (MLX5_CAP_GEN(dev, num_lag_ports) > MLX5_DEVCOM_PORTS_SUPPORTED)
return NULL;
mlx5_dev_list_lock();
sguid0 = mlx5_query_nic_system_image_guid(dev);
list_for_each_entry(iter, &devcom_list, list) {
- struct mlx5_core_dev *tmp_dev = NULL;
+ /* There is at least one device in iter */
+ struct mlx5_core_dev *tmp_dev;
idx = -1;
for (i = 0; i < MLX5_DEVCOM_PORTS_SUPPORTED; i++) {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h
index bb1970ba8730..d953a01b8eaa 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h
@@ -6,7 +6,7 @@
#include <linux/mlx5/driver.h>
-#define MLX5_DEVCOM_PORTS_SUPPORTED 2
+#define MLX5_DEVCOM_PORTS_SUPPORTED 4
enum mlx5_devcom_components {
MLX5_DEVCOM_ESW_OFFLOADS,
--
2.40.1
* [net-next V2 09/14] net/mlx5e: Expose catastrophic steering error counters
2023-06-07 21:03 [pull request][net-next V2 00/14] mlx5 updates 2023-06-06 Saeed Mahameed
` (7 preceding siblings ...)
2023-06-07 21:04 ` [net-next V2 08/14] net/mlx5: Enable 4 ports VF LAG Saeed Mahameed
@ 2023-06-07 21:04 ` Saeed Mahameed
2023-06-07 21:04 ` [net-next V2 10/14] net/mlx5e: Remove RX page cache leftovers Saeed Mahameed
` (4 subsequent siblings)
13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2023-06-07 21:04 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Leon Romanovsky,
linux-rdma, Lama Kayal, Rahul Rameshbabu
From: Lama Kayal <lkayal@nvidia.com>
Add generated_pkt_steering_fail and handled_pkt_steering_fail to the devlink
health reporter.
generated_pkt_steering_fail indicates the number of packets dropped due to
illegal steering operation within the vport steering domain.
handled_pkt_steering_fail indicates the number of packets dropped due to
illegal steering operation, originated by the vport.
Also, update devlink reporter functionality documentation with the newly
exposed counters.
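As a sanity check on the mlx5_ifc.h hunk below, here is a minimal sketch of the
bit-offset arithmetic. The widths are copied from the diff; the SKETCH_* names
and the helper functions are hypothetical and exist only for illustration.

```c
/* Bit widths copied from the mlx5_ifc.h hunk of this patch. */
enum {
	SKETCH_OLD_RESERVED_AT_220 = 0xdc0, /* single reserved range before the patch */
	SKETCH_NEW_RESERVED_AT_220 = 0xc0,
	SKETCH_GENERATED_FAIL_BITS = 0x40,  /* generated_pkt_steering_fail */
	SKETCH_HANDLED_FAIL_BITS   = 0x40,  /* handled_pkt_steering_fail */
	SKETCH_RESERVED_AT_360     = 0xc80,
};

/* The split fields must cover exactly the bits of the old reserved range. */
_Static_assert(SKETCH_NEW_RESERVED_AT_220 + SKETCH_GENERATED_FAIL_BITS +
	       SKETCH_HANDLED_FAIL_BITS + SKETCH_RESERVED_AT_360 ==
	       SKETCH_OLD_RESERVED_AT_220,
	       "split must preserve the total width");

/* Bit offset of generated_pkt_steering_fail within the struct. */
unsigned int sketch_generated_fail_offset(void)
{
	return 0x220 + SKETCH_NEW_RESERVED_AT_220;
}

/* Bit offset of handled_pkt_steering_fail within the struct. */
unsigned int sketch_handled_fail_offset(void)
{
	return sketch_generated_fail_offset() + SKETCH_GENERATED_FAIL_BITS;
}

/* Bit offset of the trailing reserved range; should match its _at_360 name. */
unsigned int sketch_trailing_reserved_offset(void)
{
	return sketch_handled_fail_offset() + SKETCH_HANDLED_FAIL_BITS;
}
```

The trailing offset works out to 0x360, which agrees with the reserved_at_360
name used in the diff.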
Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../ethernet/mellanox/mlx5/devlink.rst | 7 +++++++
.../ethernet/mellanox/mlx5/core/diag/reporter_vnic.c | 10 ++++++++++
include/linux/mlx5/mlx5_ifc.h | 12 ++++++++++--
3 files changed, 27 insertions(+), 2 deletions(-)
diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/devlink.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/devlink.rst
index 3354ca3608ee..a4edf908b707 100644
--- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/devlink.rst
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/devlink.rst
@@ -290,6 +290,13 @@ Description of the vnic counters:
- nic_receive_steering_discard
number of packets that completed RX flow
steering but were discarded due to a mismatch in flow table.
+- generated_pkt_steering_fail
+ number of packets generated by the VNIC experiencing unexpected steering
+ failure (at any point in steering flow).
+- handled_pkt_steering_fail
+ number of packets handled by the VNIC experiencing unexpected steering
+ failure (at any point in steering flow owned by the VNIC, including the FDB
+ for the eswitch owner).
User commands examples:
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/reporter_vnic.c b/drivers/net/ethernet/mellanox/mlx5/core/diag/reporter_vnic.c
index 9114661cd967..b0128336ff01 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/diag/reporter_vnic.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/reporter_vnic.c
@@ -76,6 +76,16 @@ int mlx5_reporter_vnic_diagnose_counters(struct mlx5_core_dev *dev,
if (err)
return err;
+ err = devlink_fmsg_u64_pair_put(fmsg, "generated_pkt_steering_fail",
+ VNIC_ENV_GET64(&vnic, generated_pkt_steering_fail));
+ if (err)
+ return err;
+
+ err = devlink_fmsg_u64_pair_put(fmsg, "handled_pkt_steering_fail",
+ VNIC_ENV_GET64(&vnic, handled_pkt_steering_fail));
+ if (err)
+ return err;
+
err = devlink_fmsg_obj_nest_end(fmsg);
if (err)
return err;
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index b89778d0d326..af3a92ad2e6b 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -1755,7 +1755,9 @@ struct mlx5_ifc_cmd_hca_cap_bits {
u8 reserved_at_328[0x2];
u8 relaxed_ordering_read[0x1];
u8 log_max_pd[0x5];
- u8 reserved_at_330[0x9];
+ u8 reserved_at_330[0x7];
+ u8 vnic_env_cnt_steering_fail[0x1];
+ u8 reserved_at_338[0x1];
u8 q_counter_aggregation[0x1];
u8 q_counter_other_vport[0x1];
u8 log_max_xrcd[0x5];
@@ -3673,7 +3675,13 @@ struct mlx5_ifc_vnic_diagnostic_statistics_bits {
u8 eth_wqe_too_small[0x20];
- u8 reserved_at_220[0xdc0];
+ u8 reserved_at_220[0xc0];
+
+ u8 generated_pkt_steering_fail[0x40];
+
+ u8 handled_pkt_steering_fail[0x40];
+
+ u8 reserved_at_360[0xc80];
};
struct mlx5_ifc_traffic_counter_bits {
--
2.40.1
* [net-next V2 10/14] net/mlx5e: Remove RX page cache leftovers
2023-06-07 21:03 [pull request][net-next V2 00/14] mlx5 updates 2023-06-06 Saeed Mahameed
` (8 preceding siblings ...)
2023-06-07 21:04 ` [net-next V2 09/14] net/mlx5e: Expose catastrophic steering error counters Saeed Mahameed
@ 2023-06-07 21:04 ` Saeed Mahameed
2023-06-07 21:04 ` [net-next V2 11/14] net/mlx5e: TC, refactor access to hash key Saeed Mahameed
` (3 subsequent siblings)
13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2023-06-07 21:04 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Leon Romanovsky,
linux-rdma, Dragos Tatulea
From: Tariq Toukan <tariqt@nvidia.com>
Remove unused definitions left after the removal
of the RX page cache feature.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en.h | 7 -------
1 file changed, 7 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 8e999f238194..ceabe57c511a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -594,13 +594,6 @@ struct mlx5e_mpw_info {
#define MLX5E_MAX_RX_FRAGS 4
-/* a single cache unit is capable to serve one napi call (for non-striding rq)
- * or a MPWQE (for striding rq).
- */
-#define MLX5E_CACHE_UNIT (MLX5_MPWRQ_MAX_PAGES_PER_WQE > NAPI_POLL_WEIGHT ? \
- MLX5_MPWRQ_MAX_PAGES_PER_WQE : NAPI_POLL_WEIGHT)
-#define MLX5E_CACHE_SIZE (4 * roundup_pow_of_two(MLX5E_CACHE_UNIT))
-
struct mlx5e_rq;
typedef void (*mlx5e_fp_handle_rx_cqe)(struct mlx5e_rq*, struct mlx5_cqe64*);
typedef struct sk_buff *
--
2.40.1
* [net-next V2 11/14] net/mlx5e: TC, refactor access to hash key
2023-06-07 21:03 [pull request][net-next V2 00/14] mlx5 updates 2023-06-06 Saeed Mahameed
` (9 preceding siblings ...)
2023-06-07 21:04 ` [net-next V2 10/14] net/mlx5e: Remove RX page cache leftovers Saeed Mahameed
@ 2023-06-07 21:04 ` Saeed Mahameed
2023-06-07 21:04 ` [net-next V2 12/14] net/mlx5: Skip inline mode check after mlx5_eswitch_enable_locked() failure Saeed Mahameed
` (2 subsequent siblings)
13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2023-06-07 21:04 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Leon Romanovsky,
linux-rdma, Oz Shlomo, Paul Blakey
From: Oz Shlomo <ozsh@nvidia.com>
Currently, a temp object is filled and used as a key for rhashtable_lookup.
Lookups only work while the key remains the first attribute in the
relevant rhashtable node object.
Fix this by passing a key instead of an object containing the key.
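The idea can be sketched in user space, assuming a hypothetical node layout in
which the key is deliberately not the first member. The linear scan below only
stands in for a hash lookup; struct sketch_node, sketch_ht_params and
sketch_lookup() are illustrative names, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node layout; the key is NOT the first member. */
struct sketch_node {
	int refcount;
	unsigned long cookie; /* lookup key */
	long counter;
};

/* Local stand-in for the kernel's sizeof_field() helper. */
#define SKETCH_SIZEOF_FIELD(type, member) sizeof(((type *)0)->member)

struct sketch_params {
	size_t key_offset; /* where the key lives inside a node */
	size_t key_len;    /* how many bytes of key to compare */
};

const struct sketch_params sketch_ht_params = {
	.key_offset = offsetof(struct sketch_node, cookie),
	.key_len = SKETCH_SIZEOF_FIELD(struct sketch_node, cookie),
};

/*
 * Linear stand-in for a hash lookup: the caller passes a pointer to the
 * bare key, and it is compared against the key bytes inside each node.
 */
struct sketch_node *sketch_lookup(struct sketch_node *nodes, size_t n,
				  const void *key)
{
	assert(key != NULL);

	for (size_t i = 0; i < n; i++) {
		const char *node_key =
			(const char *)&nodes[i] + sketch_ht_params.key_offset;

		if (memcmp(node_key, key, sketch_ht_params.key_len) == 0)
			return &nodes[i];
	}
	return NULL;
}
```

Because the offset and length are derived from the member itself, the lookup
keeps working if members are reordered, and the caller no longer needs a
temporary node object just to carry the key.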
Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../net/ethernet/mellanox/mlx5/core/en/tc/act_stats.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act_stats.c
index 07c1895a2b23..7aa926e542d3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act_stats.c
@@ -25,8 +25,8 @@ struct mlx5e_tc_act_stats {
static const struct rhashtable_params act_counters_ht_params = {
.head_offset = offsetof(struct mlx5e_tc_act_stats, hash),
- .key_offset = 0,
- .key_len = offsetof(struct mlx5e_tc_act_stats, counter),
+ .key_offset = offsetof(struct mlx5e_tc_act_stats, tc_act_cookie),
+ .key_len = sizeof_field(struct mlx5e_tc_act_stats, tc_act_cookie),
.automatic_shrinking = true,
};
@@ -169,14 +169,11 @@ mlx5e_tc_act_stats_fill_stats(struct mlx5e_tc_act_stats_handle *handle,
{
struct rhashtable *ht = &handle->ht;
struct mlx5e_tc_act_stats *item;
- struct mlx5e_tc_act_stats key;
u64 pkts, bytes, lastused;
int err = 0;
- key.tc_act_cookie = fl_act->cookie;
-
rcu_read_lock();
- item = rhashtable_lookup(ht, &key, act_counters_ht_params);
+ item = rhashtable_lookup(ht, &fl_act->cookie, act_counters_ht_params);
if (!item) {
rcu_read_unlock();
err = -ENOENT;
--
2.40.1
* [net-next V2 12/14] net/mlx5: Skip inline mode check after mlx5_eswitch_enable_locked() failure
2023-06-07 21:03 [pull request][net-next V2 00/14] mlx5 updates 2023-06-06 Saeed Mahameed
` (10 preceding siblings ...)
2023-06-07 21:04 ` [net-next V2 11/14] net/mlx5e: TC, refactor access to hash key Saeed Mahameed
@ 2023-06-07 21:04 ` Saeed Mahameed
2023-06-07 21:04 ` [net-next V2 13/14] mlx5/core: E-Switch, Allocate ECPF vport if it's an eswitch manager Saeed Mahameed
2023-06-07 21:04 ` [net-next V2 14/14] net/mlx5e: simplify condition after napi budget handling change Saeed Mahameed
13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2023-06-07 21:04 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Leon Romanovsky,
linux-rdma, Jiri Pirko
From: Jiri Pirko <jiri@nvidia.com>
Commit bffaa916588e ("net/mlx5: E-Switch, Add control for inline mode")
added inline mode checking to esw_offloads_start() with a warning
printed out in case there is a problem. The inline mode checking was
done even after mlx5_eswitch_enable_locked() call failed, which is
pointless.
Later on, commit 8c98ee77d911 ("net/mlx5e: E-Switch, Add extack messages
to devlink callbacks") converted the error/warning prints to extack
setting, which caused the inline mode check error to overwrite a
possible previous extack message when mlx5_eswitch_enable_locked()
failed. The user then gets a confusing error message.
Fix this by skipping check of inline mode after
mlx5_eswitch_enable_locked() call failed.
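A minimal user-space sketch of the two control flows follows. It reduces the
enable call and the inline mode query to plain parameters, and models extack
as a single message pointer; struct sketch_extack, sketch_start_old() and
sketch_start_new() are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for extack: holds only the last message set. */
struct sketch_extack {
	const char *msg;
};

/* Old flow: the inline mode check still runs after the enable step failed. */
int sketch_start_old(int enable_err, int inline_mismatch,
		     struct sketch_extack *extack)
{
	int err = enable_err;

	if (err)
		extack->msg = "Failed setting eswitch to offloads";
	if (inline_mismatch) /* overwrites the message set above */
		extack->msg = "Inline mode is different between vports";
	return err;
}

/* New flow: return right away on failure, so the first message survives. */
int sketch_start_new(int enable_err, int inline_mismatch,
		     struct sketch_extack *extack)
{
	if (enable_err) {
		extack->msg = "Failed setting eswitch to offloads";
		return enable_err;
	}
	if (inline_mismatch)
		extack->msg = "Inline mode is different between vports";
	return 0;
}
```

With a failing enable step and an inline mode mismatch, the old flow reports
the inline mode message while the new flow keeps the enable failure message.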
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 29de4e759f4f..eafb098db6b0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -2178,6 +2178,7 @@ static int esw_offloads_start(struct mlx5_eswitch *esw,
"Failed setting eswitch to offloads");
esw->mode = MLX5_ESWITCH_LEGACY;
mlx5_rescan_drivers(esw->dev);
+ return err;
}
if (esw->offloads.inline_mode == MLX5_INLINE_MODE_NONE) {
if (mlx5_eswitch_inline_mode_get(esw,
@@ -2187,7 +2188,7 @@ static int esw_offloads_start(struct mlx5_eswitch *esw,
"Inline mode is different between vports");
}
}
- return err;
+ return 0;
}
static void mlx5_esw_offloads_rep_mark_set(struct mlx5_eswitch *esw,
--
2.40.1
* [net-next V2 13/14] mlx5/core: E-Switch, Allocate ECPF vport if it's an eswitch manager
2023-06-07 21:03 [pull request][net-next V2 00/14] mlx5 updates 2023-06-06 Saeed Mahameed
` (11 preceding siblings ...)
2023-06-07 21:04 ` [net-next V2 12/14] net/mlx5: Skip inline mode check after mlx5_eswitch_enable_locked() failure Saeed Mahameed
@ 2023-06-07 21:04 ` Saeed Mahameed
2023-06-07 21:04 ` [net-next V2 14/14] net/mlx5e: simplify condition after napi budget handling change Saeed Mahameed
13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2023-06-07 21:04 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Leon Romanovsky,
linux-rdma, Bodong Wang, Mark Bloch
From: Bodong Wang <bodong@nvidia.com>
Eswitch vport is needed for eswitch manager when creating LAG,
to create egress rules. However, this was not handled when ECPF is
an eswitch manager.
Signed-off-by: Bodong Wang <bodong@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
index 31956cd9d1bb..ecd8864d5d11 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -1601,7 +1601,8 @@ static int mlx5_esw_vports_init(struct mlx5_eswitch *esw)
idx++;
}
- if (mlx5_ecpf_vport_exists(dev)) {
+ if (mlx5_ecpf_vport_exists(dev) ||
+ mlx5_core_is_ecpf_esw_manager(dev)) {
err = mlx5_esw_vport_alloc(esw, idx, MLX5_VPORT_ECPF);
if (err)
goto err;
--
2.40.1
* [net-next V2 14/14] net/mlx5e: simplify condition after napi budget handling change
2023-06-07 21:03 [pull request][net-next V2 00/14] mlx5 updates 2023-06-06 Saeed Mahameed
` (12 preceding siblings ...)
2023-06-07 21:04 ` [net-next V2 13/14] mlx5/core: E-Switch, Allocate ECPF vport if it's an eswitch manager Saeed Mahameed
@ 2023-06-07 21:04 ` Saeed Mahameed
13 siblings, 0 replies; 16+ messages in thread
From: Saeed Mahameed @ 2023-06-07 21:04 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Leon Romanovsky, linux-rdma
From: Jakub Kicinski <kuba@kernel.org>
Since a recent commit, budget can't be 0 here.
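A small sketch of why the two forms of the condition are interchangeable once
budget is known to be nonzero. The function names are hypothetical, and the
check simply compares both forms over every work_done value up to budget.

```c
#include <stdbool.h>

/* Condition before the patch. */
bool sketch_should_trim_old(int work_done, int budget)
{
	return budget && work_done == budget;
}

/* Condition after the patch; relies on budget never being 0 here. */
bool sketch_should_trim_new(int work_done, int budget)
{
	return work_done == budget;
}

/* True when both forms agree for every work_done in [0, budget]. */
bool sketch_forms_agree(int budget)
{
	for (int work_done = 0; work_done <= budget; work_done++)
		if (sketch_should_trim_old(work_done, budget) !=
		    sketch_should_trim_new(work_done, budget))
			return false;
	return true;
}
```

The two forms differ only for budget == 0, which is the case the commit
message says can no longer occur at this point.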
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
index fbb2d963fb7e..a7d9b7cb4297 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
@@ -207,7 +207,7 @@ int mlx5e_napi_poll(struct napi_struct *napi, int budget)
}
ch_stats->aff_change++;
aff_change = true;
- if (budget && work_done == budget)
+ if (work_done == budget)
work_done--;
}
--
2.40.1