All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next v2 0/9] Take devlink lock on mlx4 and mlx5 callbacks
@ 2022-07-28 15:53 Moshe Shemesh
  2022-07-28 15:53 ` [PATCH net-next v2 1/9] net: devlink: remove region snapshot ID tracking dependency on devlink->lock Moshe Shemesh
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: Moshe Shemesh @ 2022-07-28 15:53 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Eric Dumazet, Paolo Abeni
  Cc: Jiri Pirko, netdev, Moshe Shemesh

Prepare mlx4 and mlx5 drivers to have all devlink callbacks called with
devlink instance locked. Change mlx4 driver to use devl_ API where
needed to have devlink reload callbacks locked. Change mlx5 driver to
use devl_ API where needed to have devlink reload and devlink health
callbacks locked.

As mlx5 is the only driver which needed changes to enable calling health
callbacks with devlink instance locked, this patchset also removes
DEVLINK_NL_FLAG_NO_LOCK flag from devlink health callbacks.

This patchset will be followed by a patchset that will remove
DEVLINK_NL_FLAG_NO_LOCK flag from devlink and will remove devlink_mutex.

Jiri Pirko (2):
  net: devlink: remove region snapshot ID tracking dependency on
    devlink->lock
  net: devlink: remove region snapshots list dependency on devlink->lock

Moshe Shemesh (7):
  net/mlx5: Move fw reset unload to mlx5_fw_reset_complete_reload
  net/mlx5: Lock mlx5 devlink reload callbacks
  net/mlx4: Use devl_ API for devlink region create / destroy
  net/mlx4: Use devl_ API for devlink port register / unregister
  net/mlx4: Lock mlx4 devlink reload callback
  net/mlx5: Lock mlx5 devlink health recovery callback
  devlink: Hold the instance lock in health callbacks

 drivers/net/ethernet/mellanox/mlx4/catas.c    |   5 +
 drivers/net/ethernet/mellanox/mlx4/crdump.c   |  20 +--
 drivers/net/ethernet/mellanox/mlx4/main.c     |  44 +++++-
 drivers/net/ethernet/mellanox/mlx5/core/dev.c |  19 +--
 .../net/ethernet/mellanox/mlx5/core/devlink.c |  59 +++++---
 .../net/ethernet/mellanox/mlx5/core/eswitch.c |  18 +--
 .../ethernet/mellanox/mlx5/core/fw_reset.c    |  10 +-
 .../net/ethernet/mellanox/mlx5/core/health.c  |   4 +
 .../net/ethernet/mellanox/mlx5/core/main.c    |  38 ++++-
 .../ethernet/mellanox/mlx5/core/mlx5_core.h   |   2 +
 .../net/ethernet/mellanox/mlx5/core/sriov.c   |   6 +
 net/core/devlink.c                            | 135 ++++++++++--------
 12 files changed, 227 insertions(+), 133 deletions(-)

-- 
2.18.2


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH net-next v2 1/9] net: devlink: remove region snapshot ID tracking dependency on devlink->lock
  2022-07-28 15:53 [PATCH net-next v2 0/9] Take devlink lock on mlx4 and mlx5 callbacks Moshe Shemesh
@ 2022-07-28 15:53 ` Moshe Shemesh
  2022-07-28 15:53 ` [PATCH net-next v2 2/9] net: devlink: remove region snapshots list " Moshe Shemesh
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Moshe Shemesh @ 2022-07-28 15:53 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Eric Dumazet, Paolo Abeni
  Cc: Jiri Pirko, netdev

From: Jiri Pirko <jiri@nvidia.com>

After mlx4 driver is converted to do locked reload, functions to get/put
regions snapshot ID may be called from both locked and unlocked context.

So resolve this by removing dependency on devlink->lock for region
snapshot ID tracking by using internal xa_lock() to maintain
shapshot_ids xa_array consistency.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
---
v1->v2:
- changed GFP_KERNEL to GFP_ATOMIC in __xa_store() calls
---
 net/core/devlink.c | 64 ++++++++++++++++++++++++----------------------
 1 file changed, 33 insertions(+), 31 deletions(-)

diff --git a/net/core/devlink.c b/net/core/devlink.c
index ca4c9939d569..64d150516e45 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -5690,21 +5690,28 @@ static int __devlink_snapshot_id_increment(struct devlink *devlink, u32 id)
 {
 	unsigned long count;
 	void *p;
+	int err;
 
-	devl_assert_locked(devlink);
-
+	xa_lock(&devlink->snapshot_ids);
 	p = xa_load(&devlink->snapshot_ids, id);
-	if (WARN_ON(!p))
-		return -EINVAL;
+	if (WARN_ON(!p)) {
+		err = -EINVAL;
+		goto unlock;
+	}
 
-	if (WARN_ON(!xa_is_value(p)))
-		return -EINVAL;
+	if (WARN_ON(!xa_is_value(p))) {
+		err = -EINVAL;
+		goto unlock;
+	}
 
 	count = xa_to_value(p);
 	count++;
 
-	return xa_err(xa_store(&devlink->snapshot_ids, id, xa_mk_value(count),
-			       GFP_KERNEL));
+	err = xa_err(__xa_store(&devlink->snapshot_ids, id, xa_mk_value(count),
+				GFP_ATOMIC));
+unlock:
+	xa_unlock(&devlink->snapshot_ids);
+	return err;
 }
 
 /**
@@ -5727,25 +5734,26 @@ static void __devlink_snapshot_id_decrement(struct devlink *devlink, u32 id)
 	unsigned long count;
 	void *p;
 
-	devl_assert_locked(devlink);
-
+	xa_lock(&devlink->snapshot_ids);
 	p = xa_load(&devlink->snapshot_ids, id);
 	if (WARN_ON(!p))
-		return;
+		goto unlock;
 
 	if (WARN_ON(!xa_is_value(p)))
-		return;
+		goto unlock;
 
 	count = xa_to_value(p);
 
 	if (count > 1) {
 		count--;
-		xa_store(&devlink->snapshot_ids, id, xa_mk_value(count),
-			 GFP_KERNEL);
+		__xa_store(&devlink->snapshot_ids, id, xa_mk_value(count),
+			   GFP_ATOMIC);
 	} else {
 		/* If this was the last user, we can erase this id */
-		xa_erase(&devlink->snapshot_ids, id);
+		__xa_erase(&devlink->snapshot_ids, id);
 	}
+unlock:
+	xa_unlock(&devlink->snapshot_ids);
 }
 
 /**
@@ -5766,13 +5774,17 @@ static void __devlink_snapshot_id_decrement(struct devlink *devlink, u32 id)
  */
 static int __devlink_snapshot_id_insert(struct devlink *devlink, u32 id)
 {
-	devl_assert_locked(devlink);
+	int err;
 
-	if (xa_load(&devlink->snapshot_ids, id))
+	xa_lock(&devlink->snapshot_ids);
+	if (xa_load(&devlink->snapshot_ids, id)) {
+		xa_unlock(&devlink->snapshot_ids);
 		return -EEXIST;
-
-	return xa_err(xa_store(&devlink->snapshot_ids, id, xa_mk_value(0),
-			       GFP_KERNEL));
+	}
+	err = xa_err(__xa_store(&devlink->snapshot_ids, id, xa_mk_value(0),
+				GFP_ATOMIC));
+	xa_unlock(&devlink->snapshot_ids);
+	return err;
 }
 
 /**
@@ -5793,8 +5805,6 @@ static int __devlink_snapshot_id_insert(struct devlink *devlink, u32 id)
  */
 static int __devlink_region_snapshot_id_get(struct devlink *devlink, u32 *id)
 {
-	devl_assert_locked(devlink);
-
 	return xa_alloc(&devlink->snapshot_ids, id, xa_mk_value(1),
 			xa_limit_32b, GFP_KERNEL);
 }
@@ -11226,13 +11236,7 @@ EXPORT_SYMBOL_GPL(devlink_region_destroy);
  */
 int devlink_region_snapshot_id_get(struct devlink *devlink, u32 *id)
 {
-	int err;
-
-	devl_lock(devlink);
-	err = __devlink_region_snapshot_id_get(devlink, id);
-	devl_unlock(devlink);
-
-	return err;
+	return __devlink_region_snapshot_id_get(devlink, id);
 }
 EXPORT_SYMBOL_GPL(devlink_region_snapshot_id_get);
 
@@ -11248,9 +11252,7 @@ EXPORT_SYMBOL_GPL(devlink_region_snapshot_id_get);
  */
 void devlink_region_snapshot_id_put(struct devlink *devlink, u32 id)
 {
-	devl_lock(devlink);
 	__devlink_snapshot_id_decrement(devlink, id);
-	devl_unlock(devlink);
 }
 EXPORT_SYMBOL_GPL(devlink_region_snapshot_id_put);
 
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net-next v2 2/9] net: devlink: remove region snapshots list dependency on devlink->lock
  2022-07-28 15:53 [PATCH net-next v2 0/9] Take devlink lock on mlx4 and mlx5 callbacks Moshe Shemesh
  2022-07-28 15:53 ` [PATCH net-next v2 1/9] net: devlink: remove region snapshot ID tracking dependency on devlink->lock Moshe Shemesh
@ 2022-07-28 15:53 ` Moshe Shemesh
  2022-07-28 15:53 ` [PATCH net-next v2 3/9] net/mlx5: Move fw reset unload to mlx5_fw_reset_complete_reload Moshe Shemesh
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Moshe Shemesh @ 2022-07-28 15:53 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Eric Dumazet, Paolo Abeni
  Cc: Jiri Pirko, netdev

From: Jiri Pirko <jiri@nvidia.com>

After mlx4 driver is converted to do locked reload,
devlink_region_snapshot_create() may be called from both locked and
unlocked context.

Note that in mlx4 region snapshots could be created on any command
failure. That can happen in any flow that involves commands to FW,
which means most of the driver flows.

So resolve this by removing dependency on devlink->lock for region
snapshots list consistency and introduce new mutex to ensure it.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
---
v1->v2:
- enhanced patch description
---
 net/core/devlink.c | 41 +++++++++++++++++++++++++++++------------
 1 file changed, 29 insertions(+), 12 deletions(-)

diff --git a/net/core/devlink.c b/net/core/devlink.c
index 64d150516e45..274dafd8a594 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -695,6 +695,10 @@ struct devlink_region {
 		const struct devlink_region_ops *ops;
 		const struct devlink_port_region_ops *port_ops;
 	};
+	struct mutex snapshot_lock; /* protects snapshot_list,
+				     * max_snapshots and cur_snapshots
+				     * consistency.
+				     */
 	struct list_head snapshot_list;
 	u32 max_snapshots;
 	u32 cur_snapshots;
@@ -5817,7 +5821,7 @@ static int __devlink_region_snapshot_id_get(struct devlink *devlink, u32 *id)
  *	Multiple snapshots can be created on a region.
  *	The @snapshot_id should be obtained using the getter function.
  *
- *	Must be called only while holding the devlink instance lock.
+ *	Must be called only while holding the region snapshot lock.
  *
  *	@region: devlink region of the snapshot
  *	@data: snapshot data
@@ -5831,7 +5835,7 @@ __devlink_region_snapshot_create(struct devlink_region *region,
 	struct devlink_snapshot *snapshot;
 	int err;
 
-	devl_assert_locked(devlink);
+	lockdep_assert_held(&region->snapshot_lock);
 
 	/* check if region can hold one more snapshot */
 	if (region->cur_snapshots == region->max_snapshots)
@@ -5869,7 +5873,7 @@ static void devlink_region_snapshot_del(struct devlink_region *region,
 {
 	struct devlink *devlink = region->devlink;
 
-	devl_assert_locked(devlink);
+	lockdep_assert_held(&region->snapshot_lock);
 
 	devlink_nl_region_notify(region, snapshot, DEVLINK_CMD_REGION_DEL);
 	region->cur_snapshots--;
@@ -6048,11 +6052,15 @@ static int devlink_nl_cmd_region_del(struct sk_buff *skb,
 	if (!region)
 		return -EINVAL;
 
+	mutex_lock(&region->snapshot_lock);
 	snapshot = devlink_region_snapshot_get_by_id(region, snapshot_id);
-	if (!snapshot)
+	if (!snapshot) {
+		mutex_unlock(&region->snapshot_lock);
 		return -EINVAL;
+	}
 
 	devlink_region_snapshot_del(region, snapshot);
+	mutex_unlock(&region->snapshot_lock);
 	return 0;
 }
 
@@ -6100,9 +6108,12 @@ devlink_nl_cmd_region_new(struct sk_buff *skb, struct genl_info *info)
 		return -EOPNOTSUPP;
 	}
 
+	mutex_lock(&region->snapshot_lock);
+
 	if (region->cur_snapshots == region->max_snapshots) {
 		NL_SET_ERR_MSG_MOD(info->extack, "The region has reached the maximum number of stored snapshots");
-		return -ENOSPC;
+		err = -ENOSPC;
+		goto unlock;
 	}
 
 	snapshot_id_attr = info->attrs[DEVLINK_ATTR_REGION_SNAPSHOT_ID];
@@ -6111,17 +6122,18 @@ devlink_nl_cmd_region_new(struct sk_buff *skb, struct genl_info *info)
 
 		if (devlink_region_snapshot_get_by_id(region, snapshot_id)) {
 			NL_SET_ERR_MSG_MOD(info->extack, "The requested snapshot id is already in use");
-			return -EEXIST;
+			err = -EEXIST;
+			goto unlock;
 		}
 
 		err = __devlink_snapshot_id_insert(devlink, snapshot_id);
 		if (err)
-			return err;
+			goto unlock;
 	} else {
 		err = __devlink_region_snapshot_id_get(devlink, &snapshot_id);
 		if (err) {
 			NL_SET_ERR_MSG_MOD(info->extack, "Failed to allocate a new snapshot id");
-			return err;
+			goto unlock;
 		}
 	}
 
@@ -6159,16 +6171,20 @@ devlink_nl_cmd_region_new(struct sk_buff *skb, struct genl_info *info)
 			goto err_notify;
 	}
 
+	mutex_unlock(&region->snapshot_lock);
 	return 0;
 
 err_snapshot_create:
 	region->ops->destructor(data);
 err_snapshot_capture:
 	__devlink_snapshot_id_decrement(devlink, snapshot_id);
+	mutex_unlock(&region->snapshot_lock);
 	return err;
 
 err_notify:
 	devlink_region_snapshot_del(region, snapshot);
+unlock:
+	mutex_unlock(&region->snapshot_lock);
 	return err;
 }
 
@@ -11094,6 +11110,7 @@ struct devlink_region *devl_region_create(struct devlink *devlink,
 	region->ops = ops;
 	region->size = region_size;
 	INIT_LIST_HEAD(&region->snapshot_list);
+	mutex_init(&region->snapshot_lock);
 	list_add_tail(&region->list, &devlink->region_list);
 	devlink_nl_region_notify(region, NULL, DEVLINK_CMD_REGION_NEW);
 
@@ -11167,6 +11184,7 @@ devlink_port_region_create(struct devlink_port *port,
 	region->port_ops = ops;
 	region->size = region_size;
 	INIT_LIST_HEAD(&region->snapshot_list);
+	mutex_init(&region->snapshot_lock);
 	list_add_tail(&region->list, &port->region_list);
 	devlink_nl_region_notify(region, NULL, DEVLINK_CMD_REGION_NEW);
 
@@ -11196,6 +11214,7 @@ void devl_region_destroy(struct devlink_region *region)
 		devlink_region_snapshot_del(region, snapshot);
 
 	list_del(&region->list);
+	mutex_destroy(&region->snapshot_lock);
 
 	devlink_nl_region_notify(region, NULL, DEVLINK_CMD_REGION_DEL);
 	kfree(region);
@@ -11271,13 +11290,11 @@ EXPORT_SYMBOL_GPL(devlink_region_snapshot_id_put);
 int devlink_region_snapshot_create(struct devlink_region *region,
 				   u8 *data, u32 snapshot_id)
 {
-	struct devlink *devlink = region->devlink;
 	int err;
 
-	devl_lock(devlink);
+	mutex_lock(&region->snapshot_lock);
 	err = __devlink_region_snapshot_create(region, data, snapshot_id);
-	devl_unlock(devlink);
-
+	mutex_unlock(&region->snapshot_lock);
 	return err;
 }
 EXPORT_SYMBOL_GPL(devlink_region_snapshot_create);
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net-next v2 3/9] net/mlx5: Move fw reset unload to mlx5_fw_reset_complete_reload
  2022-07-28 15:53 [PATCH net-next v2 0/9] Take devlink lock on mlx4 and mlx5 callbacks Moshe Shemesh
  2022-07-28 15:53 ` [PATCH net-next v2 1/9] net: devlink: remove region snapshot ID tracking dependency on devlink->lock Moshe Shemesh
  2022-07-28 15:53 ` [PATCH net-next v2 2/9] net: devlink: remove region snapshots list " Moshe Shemesh
@ 2022-07-28 15:53 ` Moshe Shemesh
  2022-07-28 15:53 ` [PATCH net-next v2 4/9] net/mlx5: Lock mlx5 devlink reload callbacks Moshe Shemesh
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Moshe Shemesh @ 2022-07-28 15:53 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Eric Dumazet, Paolo Abeni
  Cc: Jiri Pirko, netdev, Moshe Shemesh

Refactor fw reset code to have the unload driver part done on
mlx5_fw_reset_complete_reload(), so if it was called by the PF which
initiated the reload fw activate flow, the unload part will be handled
by the mlx5_devlink_reload_fw_activate() callback itself and not by the
reset event work.

This will be used by the downstream patch to invoke devlink reload
callbacks with devlink lock held.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/devlink.c  | 11 ++++++++++-
 drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c | 10 +++-------
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
index f85166e587f2..41bb50d94caa 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
@@ -104,7 +104,16 @@ static int mlx5_devlink_reload_fw_activate(struct devlink *devlink, struct netli
 	if (err)
 		return err;
 
-	return mlx5_fw_reset_wait_reset_done(dev);
+	err = mlx5_fw_reset_wait_reset_done(dev);
+	if (err)
+		return err;
+
+	mlx5_unload_one(dev);
+	err = mlx5_health_wait_pci_up(dev);
+	if (err)
+		NL_SET_ERR_MSG_MOD(extack, "FW activate aborted, PCI reads fail after reset");
+
+	return err;
 }
 
 static int mlx5_devlink_trigger_fw_live_patch(struct devlink *devlink,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
index 052af4901c0b..e8896f368362 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
@@ -149,6 +149,9 @@ static void mlx5_fw_reset_complete_reload(struct mlx5_core_dev *dev)
 	if (test_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, &fw_reset->reset_flags)) {
 		complete(&fw_reset->done);
 	} else {
+		mlx5_unload_one(dev);
+		if (mlx5_health_wait_pci_up(dev))
+			mlx5_core_err(dev, "reset reload flow aborted, PCI reads still not working\n");
 		mlx5_load_one(dev, false);
 		devlink_remote_reload_actions_performed(priv_to_devlink(dev), 0,
 							BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT) |
@@ -183,15 +186,9 @@ static void mlx5_sync_reset_reload_work(struct work_struct *work)
 	struct mlx5_fw_reset *fw_reset = container_of(work, struct mlx5_fw_reset,
 						      reset_reload_work);
 	struct mlx5_core_dev *dev = fw_reset->dev;
-	int err;
 
 	mlx5_sync_reset_clear_reset_requested(dev, false);
 	mlx5_enter_error_state(dev, true);
-	mlx5_unload_one(dev);
-	err = mlx5_health_wait_pci_up(dev);
-	if (err)
-		mlx5_core_err(dev, "reset reload flow aborted, PCI reads still not working\n");
-	fw_reset->ret = err;
 	mlx5_fw_reset_complete_reload(dev);
 }
 
@@ -395,7 +392,6 @@ static void mlx5_sync_reset_now_event(struct work_struct *work)
 	}
 
 	mlx5_enter_error_state(dev, true);
-	mlx5_unload_one(dev);
 done:
 	fw_reset->ret = err;
 	mlx5_fw_reset_complete_reload(dev);
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net-next v2 4/9] net/mlx5: Lock mlx5 devlink reload callbacks
  2022-07-28 15:53 [PATCH net-next v2 0/9] Take devlink lock on mlx4 and mlx5 callbacks Moshe Shemesh
                   ` (2 preceding siblings ...)
  2022-07-28 15:53 ` [PATCH net-next v2 3/9] net/mlx5: Move fw reset unload to mlx5_fw_reset_complete_reload Moshe Shemesh
@ 2022-07-28 15:53 ` Moshe Shemesh
  2022-07-28 15:53 ` [PATCH net-next v2 5/9] net/mlx4: Use devl_ API for devlink region create / destroy Moshe Shemesh
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Moshe Shemesh @ 2022-07-28 15:53 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Eric Dumazet, Paolo Abeni
  Cc: Jiri Pirko, netdev, Moshe Shemesh

Change devlink instance locks in mlx5 driver to have devlink reload
callbacks locked, while keeping all driver paths which lead to devl_ API
functions called by the driver locked.

Add mlx5_load_one_devl_locked() and mlx5_unload_one_devl_locked() which
are used by the paths which are already locked such as devlink reload
callbacks.

This patch makes the driver use devl_ API also for traps register as
these functions are called from the driver paths parallel to reload that
requires locking now.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/dev.c | 19 ++-----
 .../net/ethernet/mellanox/mlx5/core/devlink.c | 50 ++++++++++++-------
 .../net/ethernet/mellanox/mlx5/core/eswitch.c | 18 ++-----
 .../net/ethernet/mellanox/mlx5/core/main.c    | 34 ++++++++++++-
 .../ethernet/mellanox/mlx5/core/mlx5_core.h   |  2 +
 .../net/ethernet/mellanox/mlx5/core/sriov.c   |  6 +++
 6 files changed, 79 insertions(+), 50 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/dev.c b/drivers/net/ethernet/mellanox/mlx5/core/dev.c
index ccf2068d2e79..0571e40c6ee5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/dev.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/dev.c
@@ -335,13 +335,12 @@ static void del_adev(struct auxiliary_device *adev)
 
 int mlx5_attach_device(struct mlx5_core_dev *dev)
 {
-	struct devlink *devlink = priv_to_devlink(dev);
 	struct mlx5_priv *priv = &dev->priv;
 	struct auxiliary_device *adev;
 	struct auxiliary_driver *adrv;
 	int ret = 0, i;
 
-	devl_lock(devlink);
+	devl_assert_locked(priv_to_devlink(dev));
 	mutex_lock(&mlx5_intf_mutex);
 	priv->flags &= ~MLX5_PRIV_FLAGS_DETACH;
 	priv->flags |= MLX5_PRIV_FLAGS_MLX5E_LOCKED_FLOW;
@@ -394,20 +393,18 @@ int mlx5_attach_device(struct mlx5_core_dev *dev)
 	}
 	priv->flags &= ~MLX5_PRIV_FLAGS_MLX5E_LOCKED_FLOW;
 	mutex_unlock(&mlx5_intf_mutex);
-	devl_unlock(devlink);
 	return ret;
 }
 
 void mlx5_detach_device(struct mlx5_core_dev *dev)
 {
-	struct devlink *devlink = priv_to_devlink(dev);
 	struct mlx5_priv *priv = &dev->priv;
 	struct auxiliary_device *adev;
 	struct auxiliary_driver *adrv;
 	pm_message_t pm = {};
 	int i;
 
-	devl_lock(devlink);
+	devl_assert_locked(priv_to_devlink(dev));
 	mutex_lock(&mlx5_intf_mutex);
 	priv->flags |= MLX5_PRIV_FLAGS_MLX5E_LOCKED_FLOW;
 	for (i = ARRAY_SIZE(mlx5_adev_devices) - 1; i >= 0; i--) {
@@ -441,21 +438,17 @@ void mlx5_detach_device(struct mlx5_core_dev *dev)
 	priv->flags &= ~MLX5_PRIV_FLAGS_MLX5E_LOCKED_FLOW;
 	priv->flags |= MLX5_PRIV_FLAGS_DETACH;
 	mutex_unlock(&mlx5_intf_mutex);
-	devl_unlock(devlink);
 }
 
 int mlx5_register_device(struct mlx5_core_dev *dev)
 {
-	struct devlink *devlink;
 	int ret;
 
-	devlink = priv_to_devlink(dev);
-	devl_lock(devlink);
+	devl_assert_locked(priv_to_devlink(dev));
 	mutex_lock(&mlx5_intf_mutex);
 	dev->priv.flags &= ~MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV;
 	ret = mlx5_rescan_drivers_locked(dev);
 	mutex_unlock(&mlx5_intf_mutex);
-	devl_unlock(devlink);
 	if (ret)
 		mlx5_unregister_device(dev);
 
@@ -464,15 +457,11 @@ int mlx5_register_device(struct mlx5_core_dev *dev)
 
 void mlx5_unregister_device(struct mlx5_core_dev *dev)
 {
-	struct devlink *devlink;
-
-	devlink = priv_to_devlink(dev);
-	devl_lock(devlink);
+	devl_assert_locked(priv_to_devlink(dev));
 	mutex_lock(&mlx5_intf_mutex);
 	dev->priv.flags = MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV;
 	mlx5_rescan_drivers_locked(dev);
 	mutex_unlock(&mlx5_intf_mutex);
-	devl_unlock(devlink);
 }
 
 static int add_drivers(struct mlx5_core_dev *dev)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
index 41bb50d94caa..1c05a7091698 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
@@ -108,7 +108,7 @@ static int mlx5_devlink_reload_fw_activate(struct devlink *devlink, struct netli
 	if (err)
 		return err;
 
-	mlx5_unload_one(dev);
+	mlx5_unload_one_devl_locked(dev);
 	err = mlx5_health_wait_pci_up(dev);
 	if (err)
 		NL_SET_ERR_MSG_MOD(extack, "FW activate aborted, PCI reads fail after reset");
@@ -143,6 +143,7 @@ static int mlx5_devlink_reload_down(struct devlink *devlink, bool netns_change,
 	struct mlx5_core_dev *dev = devlink_priv(devlink);
 	struct pci_dev *pdev = dev->pdev;
 	bool sf_dev_allocated;
+	int ret = 0;
 
 	sf_dev_allocated = mlx5_sf_dev_allocated(dev);
 	if (sf_dev_allocated) {
@@ -163,19 +164,25 @@ static int mlx5_devlink_reload_down(struct devlink *devlink, bool netns_change,
 		NL_SET_ERR_MSG_MOD(extack, "reload while VFs are present is unfavorable");
 	}
 
+	devl_lock(devlink);
 	switch (action) {
 	case DEVLINK_RELOAD_ACTION_DRIVER_REINIT:
-		mlx5_unload_one(dev);
-		return 0;
+		mlx5_unload_one_devl_locked(dev);
+		break;
 	case DEVLINK_RELOAD_ACTION_FW_ACTIVATE:
 		if (limit == DEVLINK_RELOAD_LIMIT_NO_RESET)
-			return mlx5_devlink_trigger_fw_live_patch(devlink, extack);
-		return mlx5_devlink_reload_fw_activate(devlink, extack);
+			ret = mlx5_devlink_trigger_fw_live_patch(devlink, extack);
+		else
+			ret = mlx5_devlink_reload_fw_activate(devlink, extack);
+		break;
 	default:
 		/* Unsupported action should not get to this function */
 		WARN_ON(1);
-		return -EOPNOTSUPP;
+		ret = -EOPNOTSUPP;
 	}
+
+	devl_unlock(devlink);
+	return ret;
 }
 
 static int mlx5_devlink_reload_up(struct devlink *devlink, enum devlink_reload_action action,
@@ -183,24 +190,29 @@ static int mlx5_devlink_reload_up(struct devlink *devlink, enum devlink_reload_a
 				  struct netlink_ext_ack *extack)
 {
 	struct mlx5_core_dev *dev = devlink_priv(devlink);
+	int ret = 0;
 
+	devl_lock(devlink);
 	*actions_performed = BIT(action);
 	switch (action) {
 	case DEVLINK_RELOAD_ACTION_DRIVER_REINIT:
-		return mlx5_load_one(dev, false);
+		ret = mlx5_load_one_devl_locked(dev, false);
+		break;
 	case DEVLINK_RELOAD_ACTION_FW_ACTIVATE:
 		if (limit == DEVLINK_RELOAD_LIMIT_NO_RESET)
 			break;
 		/* On fw_activate action, also driver is reloaded and reinit performed */
 		*actions_performed |= BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT);
-		return mlx5_load_one(dev, false);
+		ret = mlx5_load_one_devl_locked(dev, false);
+		break;
 	default:
 		/* Unsupported action should not get to this function */
 		WARN_ON(1);
-		return -EOPNOTSUPP;
+		ret = -EOPNOTSUPP;
 	}
 
-	return 0;
+	devl_unlock(devlink);
+	return ret;
 }
 
 static struct mlx5_devlink_trap *mlx5_find_trap_by_id(struct mlx5_core_dev *dev, int trap_id)
@@ -837,28 +849,28 @@ static int mlx5_devlink_traps_register(struct devlink *devlink)
 	struct mlx5_core_dev *core_dev = devlink_priv(devlink);
 	int err;
 
-	err = devlink_trap_groups_register(devlink, mlx5_trap_groups_arr,
-					   ARRAY_SIZE(mlx5_trap_groups_arr));
+	err = devl_trap_groups_register(devlink, mlx5_trap_groups_arr,
+					ARRAY_SIZE(mlx5_trap_groups_arr));
 	if (err)
 		return err;
 
-	err = devlink_traps_register(devlink, mlx5_traps_arr, ARRAY_SIZE(mlx5_traps_arr),
-				     &core_dev->priv);
+	err = devl_traps_register(devlink, mlx5_traps_arr, ARRAY_SIZE(mlx5_traps_arr),
+				  &core_dev->priv);
 	if (err)
 		goto err_trap_group;
 	return 0;
 
 err_trap_group:
-	devlink_trap_groups_unregister(devlink, mlx5_trap_groups_arr,
-				       ARRAY_SIZE(mlx5_trap_groups_arr));
+	devl_trap_groups_unregister(devlink, mlx5_trap_groups_arr,
+				    ARRAY_SIZE(mlx5_trap_groups_arr));
 	return err;
 }
 
 static void mlx5_devlink_traps_unregister(struct devlink *devlink)
 {
-	devlink_traps_unregister(devlink, mlx5_traps_arr, ARRAY_SIZE(mlx5_traps_arr));
-	devlink_trap_groups_unregister(devlink, mlx5_trap_groups_arr,
-				       ARRAY_SIZE(mlx5_trap_groups_arr));
+	devl_traps_unregister(devlink, mlx5_traps_arr, ARRAY_SIZE(mlx5_traps_arr));
+	devl_trap_groups_unregister(devlink, mlx5_trap_groups_arr,
+				    ARRAY_SIZE(mlx5_trap_groups_arr));
 }
 
 int mlx5_devlink_register(struct devlink *devlink)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
index 30a6c9fbf1b6..6aa58044b949 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -1300,20 +1300,19 @@ int mlx5_eswitch_enable_locked(struct mlx5_eswitch *esw, int num_vfs)
  */
 int mlx5_eswitch_enable(struct mlx5_eswitch *esw, int num_vfs)
 {
-	struct devlink *devlink;
 	bool toggle_lag;
 	int ret;
 
 	if (!mlx5_esw_allowed(esw))
 		return 0;
 
+	devl_assert_locked(priv_to_devlink(esw->dev));
+
 	toggle_lag = !mlx5_esw_is_fdb_created(esw);
 
 	if (toggle_lag)
 		mlx5_lag_disable_change(esw->dev);
 
-	devlink = priv_to_devlink(esw->dev);
-	devl_lock(devlink);
 	down_write(&esw->mode_lock);
 	if (!mlx5_esw_is_fdb_created(esw)) {
 		ret = mlx5_eswitch_enable_locked(esw, num_vfs);
@@ -1327,7 +1326,6 @@ int mlx5_eswitch_enable(struct mlx5_eswitch *esw, int num_vfs)
 			esw->esw_funcs.num_vfs = num_vfs;
 	}
 	up_write(&esw->mode_lock);
-	devl_unlock(devlink);
 
 	if (toggle_lag)
 		mlx5_lag_enable_change(esw->dev);
@@ -1338,13 +1336,10 @@ int mlx5_eswitch_enable(struct mlx5_eswitch *esw, int num_vfs)
 /* When disabling sriov, free driver level resources. */
 void mlx5_eswitch_disable_sriov(struct mlx5_eswitch *esw, bool clear_vf)
 {
-	struct devlink *devlink;
-
 	if (!mlx5_esw_allowed(esw))
 		return;
 
-	devlink = priv_to_devlink(esw->dev);
-	devl_lock(devlink);
+	devl_assert_locked(priv_to_devlink(esw->dev));
 	down_write(&esw->mode_lock);
 	/* If driver is unloaded, this function is called twice by remove_one()
 	 * and mlx5_unload(). Prevent the second call.
@@ -1373,7 +1368,6 @@ void mlx5_eswitch_disable_sriov(struct mlx5_eswitch *esw, bool clear_vf)
 
 unlock:
 	up_write(&esw->mode_lock);
-	devl_unlock(devlink);
 }
 
 /* Free resources for corresponding eswitch mode. It is called by devlink
@@ -1407,18 +1401,14 @@ void mlx5_eswitch_disable_locked(struct mlx5_eswitch *esw)
 
 void mlx5_eswitch_disable(struct mlx5_eswitch *esw)
 {
-	struct devlink *devlink;
-
 	if (!mlx5_esw_allowed(esw))
 		return;
 
+	devl_assert_locked(priv_to_devlink(esw->dev));
 	mlx5_lag_disable_change(esw->dev);
-	devlink = priv_to_devlink(esw->dev);
-	devl_lock(devlink);
 	down_write(&esw->mode_lock);
 	mlx5_eswitch_disable_locked(esw);
 	up_write(&esw->mode_lock);
-	devl_unlock(devlink);
 	mlx5_lag_enable_change(esw->dev);
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 8b621c1ddd14..01fcb23eb69a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1304,8 +1304,10 @@ static void mlx5_unload(struct mlx5_core_dev *dev)
 
 int mlx5_init_one(struct mlx5_core_dev *dev)
 {
+	struct devlink *devlink = priv_to_devlink(dev);
 	int err = 0;
 
+	devl_lock(devlink);
 	mutex_lock(&dev->intf_state_mutex);
 	dev->state = MLX5_DEVICE_STATE_UP;
 
@@ -1334,6 +1336,7 @@ int mlx5_init_one(struct mlx5_core_dev *dev)
 		goto err_register;
 
 	mutex_unlock(&dev->intf_state_mutex);
+	devl_unlock(devlink);
 	return 0;
 
 err_register:
@@ -1348,11 +1351,15 @@ int mlx5_init_one(struct mlx5_core_dev *dev)
 err_function:
 	dev->state = MLX5_DEVICE_STATE_INTERNAL_ERROR;
 	mutex_unlock(&dev->intf_state_mutex);
+	devl_unlock(devlink);
 	return err;
 }
 
 void mlx5_uninit_one(struct mlx5_core_dev *dev)
 {
+	struct devlink *devlink = priv_to_devlink(dev);
+
+	devl_lock(devlink);
 	mutex_lock(&dev->intf_state_mutex);
 
 	mlx5_unregister_device(dev);
@@ -1371,13 +1378,15 @@ void mlx5_uninit_one(struct mlx5_core_dev *dev)
 	mlx5_function_teardown(dev, true);
 out:
 	mutex_unlock(&dev->intf_state_mutex);
+	devl_unlock(devlink);
 }
 
-int mlx5_load_one(struct mlx5_core_dev *dev, bool recovery)
+int mlx5_load_one_devl_locked(struct mlx5_core_dev *dev, bool recovery)
 {
 	int err = 0;
 	u64 timeout;
 
+	devl_assert_locked(priv_to_devlink(dev));
 	mutex_lock(&dev->intf_state_mutex);
 	if (test_bit(MLX5_INTERFACE_STATE_UP, &dev->intf_state)) {
 		mlx5_core_warn(dev, "interface is up, NOP\n");
@@ -1419,8 +1428,20 @@ int mlx5_load_one(struct mlx5_core_dev *dev, bool recovery)
 	return err;
 }
 
-void mlx5_unload_one(struct mlx5_core_dev *dev)
+int mlx5_load_one(struct mlx5_core_dev *dev, bool recovery)
 {
+	struct devlink *devlink = priv_to_devlink(dev);
+	int ret;
+
+	devl_lock(devlink);
+	ret = mlx5_load_one_devl_locked(dev, recovery);
+	devl_unlock(devlink);
+	return ret;
+}
+
+void mlx5_unload_one_devl_locked(struct mlx5_core_dev *dev)
+{
+	devl_assert_locked(priv_to_devlink(dev));
 	mutex_lock(&dev->intf_state_mutex);
 
 	mlx5_detach_device(dev);
@@ -1438,6 +1459,15 @@ void mlx5_unload_one(struct mlx5_core_dev *dev)
 	mutex_unlock(&dev->intf_state_mutex);
 }
 
+void mlx5_unload_one(struct mlx5_core_dev *dev)
+{
+	struct devlink *devlink = priv_to_devlink(dev);
+
+	devl_lock(devlink);
+	mlx5_unload_one_devl_locked(dev);
+	devl_unlock(devlink);
+}
+
 static const int types[] = {
 	MLX5_CAP_GENERAL,
 	MLX5_CAP_GENERAL_2,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index 9cc7afea2758..ad61b86d5769 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -290,7 +290,9 @@ void mlx5_mdev_uninit(struct mlx5_core_dev *dev);
 int mlx5_init_one(struct mlx5_core_dev *dev);
 void mlx5_uninit_one(struct mlx5_core_dev *dev);
 void mlx5_unload_one(struct mlx5_core_dev *dev);
+void mlx5_unload_one_devl_locked(struct mlx5_core_dev *dev);
 int mlx5_load_one(struct mlx5_core_dev *dev, bool recovery);
+int mlx5_load_one_devl_locked(struct mlx5_core_dev *dev, bool recovery);
 
 int mlx5_vport_get_other_func_cap(struct mlx5_core_dev *dev, u16 function_id, void *out);
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
index 5757cd6e1819..ee2e1b7c1310 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
@@ -154,13 +154,16 @@ mlx5_device_disable_sriov(struct mlx5_core_dev *dev, int num_vfs, bool clear_vf)
 static int mlx5_sriov_enable(struct pci_dev *pdev, int num_vfs)
 {
 	struct mlx5_core_dev *dev  = pci_get_drvdata(pdev);
+	struct devlink *devlink = priv_to_devlink(dev);
 	int err;
 
+	devl_lock(devlink);
 	err = mlx5_device_enable_sriov(dev, num_vfs);
 	if (err) {
 		mlx5_core_warn(dev, "mlx5_device_enable_sriov failed : %d\n", err);
 		return err;
 	}
+	devl_unlock(devlink);
 
 	err = pci_enable_sriov(pdev, num_vfs);
 	if (err) {
@@ -173,10 +176,13 @@ static int mlx5_sriov_enable(struct pci_dev *pdev, int num_vfs)
 void mlx5_sriov_disable(struct pci_dev *pdev)
 {
 	struct mlx5_core_dev *dev  = pci_get_drvdata(pdev);
+	struct devlink *devlink = priv_to_devlink(dev);
 	int num_vfs = pci_num_vf(dev->pdev);
 
 	pci_disable_sriov(pdev);
+	devl_lock(devlink);
 	mlx5_device_disable_sriov(dev, num_vfs, true);
+	devl_unlock(devlink);
 }
 
 int mlx5_core_sriov_configure(struct pci_dev *pdev, int num_vfs)
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net-next v2 5/9] net/mlx4: Use devl_ API for devlink region create / destroy
  2022-07-28 15:53 [PATCH net-next v2 0/9] Take devlink lock on mlx4 and mlx5 callbacks Moshe Shemesh
                   ` (3 preceding siblings ...)
  2022-07-28 15:53 ` [PATCH net-next v2 4/9] net/mlx5: Lock mlx5 devlink reload callbacks Moshe Shemesh
@ 2022-07-28 15:53 ` Moshe Shemesh
  2022-07-28 15:53 ` [PATCH net-next v2 6/9] net/mlx4: Use devl_ API for devlink port register / unregister Moshe Shemesh
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Moshe Shemesh @ 2022-07-28 15:53 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Eric Dumazet, Paolo Abeni
  Cc: Jiri Pirko, netdev, Moshe Shemesh

Use devl_ API to call devl_region_create() and devl_region_destroy()
instead of devlink_region_create() and devlink_region_destroy().
Add devlink instance lock in mlx4 driver paths to these functions.

This will be used by the downstream patch to invoke mlx4 devlink reload
callbacks with devlink lock held.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx4/crdump.c | 20 ++++++++++----------
 drivers/net/ethernet/mellanox/mlx4/main.c   |  7 +++++++
 2 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/crdump.c b/drivers/net/ethernet/mellanox/mlx4/crdump.c
index ac5468b77488..82a07a31cde7 100644
--- a/drivers/net/ethernet/mellanox/mlx4/crdump.c
+++ b/drivers/net/ethernet/mellanox/mlx4/crdump.c
@@ -226,10 +226,10 @@ int mlx4_crdump_init(struct mlx4_dev *dev)
 
 	/* Create cr-space region */
 	crdump->region_crspace =
-		devlink_region_create(devlink,
-				      &region_cr_space_ops,
-				      MAX_NUM_OF_DUMPS_TO_STORE,
-				      pci_resource_len(pdev, 0));
+		devl_region_create(devlink,
+				   &region_cr_space_ops,
+				   MAX_NUM_OF_DUMPS_TO_STORE,
+				   pci_resource_len(pdev, 0));
 	if (IS_ERR(crdump->region_crspace))
 		mlx4_warn(dev, "crdump: create devlink region %s err %ld\n",
 			  region_cr_space_str,
@@ -237,10 +237,10 @@ int mlx4_crdump_init(struct mlx4_dev *dev)
 
 	/* Create fw-health region */
 	crdump->region_fw_health =
-		devlink_region_create(devlink,
-				      &region_fw_health_ops,
-				      MAX_NUM_OF_DUMPS_TO_STORE,
-				      HEALTH_BUFFER_SIZE);
+		devl_region_create(devlink,
+				   &region_fw_health_ops,
+				   MAX_NUM_OF_DUMPS_TO_STORE,
+				   HEALTH_BUFFER_SIZE);
 	if (IS_ERR(crdump->region_fw_health))
 		mlx4_warn(dev, "crdump: create devlink region %s err %ld\n",
 			  region_fw_health_str,
@@ -253,6 +253,6 @@ void mlx4_crdump_end(struct mlx4_dev *dev)
 {
 	struct mlx4_fw_crdump *crdump = &dev->persist->crdump;
 
-	devlink_region_destroy(crdump->region_fw_health);
-	devlink_region_destroy(crdump->region_crspace);
+	devl_region_destroy(crdump->region_fw_health);
+	devl_region_destroy(crdump->region_crspace);
 }
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index b187c210d4d6..f3d13190b959 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -3732,6 +3732,7 @@ static int __mlx4_init_one(struct pci_dev *pdev, int pci_dev_data,
 	int prb_vf[MLX4_MAX_PORTS + 1] = {0, 0, 0};
 	const int param_map[MLX4_MAX_PORTS + 1][MLX4_MAX_PORTS + 1] = {
 		{2, 0, 0}, {0, 1, 2}, {0, 1, 2} };
+	struct devlink *devlink = priv_to_devlink(priv);
 	unsigned total_vfs = 0;
 	unsigned int i;
 
@@ -3844,7 +3845,9 @@ static int __mlx4_init_one(struct pci_dev *pdev, int pci_dev_data,
 		}
 	}
 
+	devl_lock(devlink);
 	err = mlx4_crdump_init(&priv->dev);
+	devl_unlock(devlink);
 	if (err)
 		goto err_release_regions;
 
@@ -3862,7 +3865,9 @@ static int __mlx4_init_one(struct pci_dev *pdev, int pci_dev_data,
 	mlx4_catas_end(&priv->dev);
 
 err_crdump:
+	devl_lock(devlink);
 	mlx4_crdump_end(&priv->dev);
+	devl_unlock(devlink);
 
 err_release_regions:
 	pci_release_regions(pdev);
@@ -4161,7 +4166,9 @@ static void mlx4_remove_one(struct pci_dev *pdev)
 	else
 		mlx4_info(dev, "%s: interface is down\n", __func__);
 	mlx4_catas_end(dev);
+	devl_lock(devlink);
 	mlx4_crdump_end(dev);
+	devl_unlock(devlink);
 	if (dev->flags & MLX4_FLAG_SRIOV && !active_vfs) {
 		mlx4_warn(dev, "Disabling SR-IOV\n");
 		pci_disable_sriov(pdev);
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net-next v2 6/9] net/mlx4: Use devl_ API for devlink port register / unregister
  2022-07-28 15:53 [PATCH net-next v2 0/9] Take devlink lock on mlx4 and mlx5 callbacks Moshe Shemesh
                   ` (4 preceding siblings ...)
  2022-07-28 15:53 ` [PATCH net-next v2 5/9] net/mlx4: Use devl_ API for devlink region create / destroy Moshe Shemesh
@ 2022-07-28 15:53 ` Moshe Shemesh
  2022-07-28 15:53 ` [PATCH net-next v2 7/9] net/mlx4: Lock mlx4 devlink reload callback Moshe Shemesh
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Moshe Shemesh @ 2022-07-28 15:53 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Eric Dumazet, Paolo Abeni
  Cc: Jiri Pirko, netdev, Moshe Shemesh

Use devl_ API to call devl_port_register() and devl_port_unregister()
instead of devlink_port_register() and devlink_port_unregister(). Add
devlink instance lock in mlx4 driver paths to these functions.

This will be used by the downstream patch to invoke mlx4 devlink reload
callbacks with devlink lock held.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx4/main.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index f3d13190b959..0166d003f22c 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -3033,7 +3033,7 @@ static int mlx4_init_port_info(struct mlx4_dev *dev, int port)
 	struct mlx4_port_info *info = &mlx4_priv(dev)->port[port];
 	int err;
 
-	err = devlink_port_register(devlink, &info->devlink_port, port);
+	err = devl_port_register(devlink, &info->devlink_port, port);
 	if (err)
 		return err;
 
@@ -3071,7 +3071,7 @@ static int mlx4_init_port_info(struct mlx4_dev *dev, int port)
 	err = device_create_file(&dev->persist->pdev->dev, &info->port_attr);
 	if (err) {
 		mlx4_err(dev, "Failed to create file for port %d\n", port);
-		devlink_port_unregister(&info->devlink_port);
+		devl_port_unregister(&info->devlink_port);
 		info->port = -1;
 		return err;
 	}
@@ -3093,7 +3093,7 @@ static int mlx4_init_port_info(struct mlx4_dev *dev, int port)
 		mlx4_err(dev, "Failed to create mtu file for port %d\n", port);
 		device_remove_file(&info->dev->persist->pdev->dev,
 				   &info->port_attr);
-		devlink_port_unregister(&info->devlink_port);
+		devl_port_unregister(&info->devlink_port);
 		info->port = -1;
 		return err;
 	}
@@ -3109,7 +3109,7 @@ static void mlx4_cleanup_port_info(struct mlx4_port_info *info)
 	device_remove_file(&info->dev->persist->pdev->dev, &info->port_attr);
 	device_remove_file(&info->dev->persist->pdev->dev,
 			   &info->port_mtu_attr);
-	devlink_port_unregister(&info->devlink_port);
+	devl_port_unregister(&info->devlink_port);
 
 #ifdef CONFIG_RFS_ACCEL
 	free_irq_cpu_rmap(info->rmap);
@@ -3333,6 +3333,7 @@ static int mlx4_load_one(struct pci_dev *pdev, int pci_dev_data,
 			 int total_vfs, int *nvfs, struct mlx4_priv *priv,
 			 int reset_flow)
 {
+	struct devlink *devlink = priv_to_devlink(priv);
 	struct mlx4_dev *dev;
 	unsigned sum = 0;
 	int err;
@@ -3629,6 +3630,7 @@ static int mlx4_load_one(struct pci_dev *pdev, int pci_dev_data,
 		}
 	}
 
+	devl_lock(devlink);
 	for (port = 1; port <= dev->caps.num_ports; port++) {
 		err = mlx4_init_port_info(dev, port);
 		if (err)
@@ -3642,6 +3644,7 @@ static int mlx4_load_one(struct pci_dev *pdev, int pci_dev_data,
 	if (err)
 		goto err_port;
 
+	devl_unlock(devlink);
 	mlx4_request_modules(dev);
 
 	mlx4_sense_init(dev);
@@ -3658,6 +3661,7 @@ static int mlx4_load_one(struct pci_dev *pdev, int pci_dev_data,
 err_port:
 	for (--port; port >= 1; --port)
 		mlx4_cleanup_port_info(&priv->port[port]);
+	devl_unlock(devlink);
 
 	mlx4_cleanup_default_counters(dev);
 	if (!mlx4_is_slave(dev))
@@ -4061,8 +4065,10 @@ static void mlx4_unload_one(struct pci_dev *pdev)
 	struct mlx4_dev  *dev  = persist->dev;
 	struct mlx4_priv *priv = mlx4_priv(dev);
 	int               pci_dev_data;
+	struct devlink *devlink;
 	int p, i;
 
+	devlink = priv_to_devlink(priv);
 	if (priv->removed)
 		return;
 
@@ -4078,10 +4084,12 @@ static void mlx4_unload_one(struct pci_dev *pdev)
 	mlx4_stop_sense(dev);
 	mlx4_unregister_device(dev);
 
+	devl_lock(devlink);
 	for (p = 1; p <= dev->caps.num_ports; p++) {
 		mlx4_cleanup_port_info(&priv->port[p]);
 		mlx4_CLOSE_PORT(dev, p);
 	}
+	devl_unlock(devlink);
 
 	if (mlx4_is_master(dev))
 		mlx4_free_resource_tracker(dev,
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net-next v2 7/9] net/mlx4: Lock mlx4 devlink reload callback
  2022-07-28 15:53 [PATCH net-next v2 0/9] Take devlink lock on mlx4 and mlx5 callbacks Moshe Shemesh
                   ` (5 preceding siblings ...)
  2022-07-28 15:53 ` [PATCH net-next v2 6/9] net/mlx4: Use devl_ API for devlink port register / unregister Moshe Shemesh
@ 2022-07-28 15:53 ` Moshe Shemesh
  2022-07-28 15:53 ` [PATCH net-next v2 8/9] net/mlx5: Lock mlx5 devlink health recovery callback Moshe Shemesh
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Moshe Shemesh @ 2022-07-28 15:53 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Eric Dumazet, Paolo Abeni
  Cc: Jiri Pirko, netdev, Moshe Shemesh

Change devlink instance locks in mlx4 driver to have devlink reload
callback locked, while keeping all driver paths which leads to devl_ API
functions called by the mlx4 driver locked.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx4/catas.c |  5 +++
 drivers/net/ethernet/mellanox/mlx4/main.c  | 45 +++++++++++++++-------
 2 files changed, 37 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/catas.c b/drivers/net/ethernet/mellanox/mlx4/catas.c
index 5b11557f1ae4..0eb7b83637d8 100644
--- a/drivers/net/ethernet/mellanox/mlx4/catas.c
+++ b/drivers/net/ethernet/mellanox/mlx4/catas.c
@@ -204,9 +204,13 @@ void mlx4_enter_error_state(struct mlx4_dev_persistent *persist)
 
 static void mlx4_handle_error_state(struct mlx4_dev_persistent *persist)
 {
+	struct mlx4_dev *dev = persist->dev;
+	struct devlink *devlink;
 	int err = 0;
 
 	mlx4_enter_error_state(persist);
+	devlink = priv_to_devlink(mlx4_priv(dev));
+	devl_lock(devlink);
 	mutex_lock(&persist->interface_state_mutex);
 	if (persist->interface_state & MLX4_INTERFACE_STATE_UP &&
 	    !(persist->interface_state & MLX4_INTERFACE_STATE_DELETION)) {
@@ -215,6 +219,7 @@ static void mlx4_handle_error_state(struct mlx4_dev_persistent *persist)
 			  err);
 	}
 	mutex_unlock(&persist->interface_state_mutex);
+	devl_unlock(devlink);
 }
 
 static void dump_err_buf(struct mlx4_dev *dev)
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 0166d003f22c..2c764d1d897d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -3342,6 +3342,7 @@ static int mlx4_load_one(struct pci_dev *pdev, int pci_dev_data,
 	struct mlx4_dev_cap *dev_cap = NULL;
 	int existing_vfs = 0;
 
+	devl_assert_locked(devlink);
 	dev = &priv->dev;
 
 	INIT_LIST_HEAD(&priv->ctx_list);
@@ -3630,7 +3631,6 @@ static int mlx4_load_one(struct pci_dev *pdev, int pci_dev_data,
 		}
 	}
 
-	devl_lock(devlink);
 	for (port = 1; port <= dev->caps.num_ports; port++) {
 		err = mlx4_init_port_info(dev, port);
 		if (err)
@@ -3644,7 +3644,6 @@ static int mlx4_load_one(struct pci_dev *pdev, int pci_dev_data,
 	if (err)
 		goto err_port;
 
-	devl_unlock(devlink);
 	mlx4_request_modules(dev);
 
 	mlx4_sense_init(dev);
@@ -3661,7 +3660,6 @@ static int mlx4_load_one(struct pci_dev *pdev, int pci_dev_data,
 err_port:
 	for (--port; port >= 1; --port)
 		mlx4_cleanup_port_info(&priv->port[port]);
-	devl_unlock(devlink);
 
 	mlx4_cleanup_default_counters(dev);
 	if (!mlx4_is_slave(dev))
@@ -3736,7 +3734,6 @@ static int __mlx4_init_one(struct pci_dev *pdev, int pci_dev_data,
 	int prb_vf[MLX4_MAX_PORTS + 1] = {0, 0, 0};
 	const int param_map[MLX4_MAX_PORTS + 1][MLX4_MAX_PORTS + 1] = {
 		{2, 0, 0}, {0, 1, 2}, {0, 1, 2} };
-	struct devlink *devlink = priv_to_devlink(priv);
 	unsigned total_vfs = 0;
 	unsigned int i;
 
@@ -3849,9 +3846,7 @@ static int __mlx4_init_one(struct pci_dev *pdev, int pci_dev_data,
 		}
 	}
 
-	devl_lock(devlink);
 	err = mlx4_crdump_init(&priv->dev);
-	devl_unlock(devlink);
 	if (err)
 		goto err_release_regions;
 
@@ -3869,9 +3864,7 @@ static int __mlx4_init_one(struct pci_dev *pdev, int pci_dev_data,
 	mlx4_catas_end(&priv->dev);
 
 err_crdump:
-	devl_lock(devlink);
 	mlx4_crdump_end(&priv->dev);
-	devl_unlock(devlink);
 
 err_release_regions:
 	pci_release_regions(pdev);
@@ -3965,9 +3958,11 @@ static int mlx4_devlink_reload_down(struct devlink *devlink, bool netns_change,
 		NL_SET_ERR_MSG_MOD(extack, "Namespace change is not supported");
 		return -EOPNOTSUPP;
 	}
+	devl_lock(devlink);
 	if (persist->num_vfs)
 		mlx4_warn(persist->dev, "Reload performed on PF, will cause reset on operating Virtual Functions\n");
 	mlx4_restart_one_down(persist->pdev);
+	devl_unlock(devlink);
 	return 0;
 }
 
@@ -3980,8 +3975,10 @@ static int mlx4_devlink_reload_up(struct devlink *devlink, enum devlink_reload_a
 	struct mlx4_dev_persistent *persist = dev->persist;
 	int err;
 
+	devl_lock(devlink);
 	*actions_performed = BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT);
 	err = mlx4_restart_one_up(persist->pdev, true, devlink);
+	devl_unlock(devlink);
 	if (err)
 		mlx4_err(persist->dev, "mlx4_restart_one_up failed, ret=%d\n",
 			 err);
@@ -4008,6 +4005,7 @@ static int mlx4_init_one(struct pci_dev *pdev, const struct pci_device_id *id)
 	devlink = devlink_alloc(&mlx4_devlink_ops, sizeof(*priv), &pdev->dev);
 	if (!devlink)
 		return -ENOMEM;
+	devl_lock(devlink);
 	priv = devlink_priv(devlink);
 
 	dev       = &priv->dev;
@@ -4035,6 +4033,7 @@ static int mlx4_init_one(struct pci_dev *pdev, const struct pci_device_id *id)
 
 	pci_save_state(pdev);
 	devlink_set_features(devlink, DEVLINK_F_RELOAD);
+	devl_unlock(devlink);
 	devlink_register(devlink);
 	return 0;
 
@@ -4044,6 +4043,7 @@ static int mlx4_init_one(struct pci_dev *pdev, const struct pci_device_id *id)
 err_devlink_unregister:
 	kfree(dev->persist);
 err_devlink_free:
+	devl_unlock(devlink);
 	devlink_free(devlink);
 	return ret;
 }
@@ -4069,6 +4069,7 @@ static void mlx4_unload_one(struct pci_dev *pdev)
 	int p, i;
 
 	devlink = priv_to_devlink(priv);
+	devl_assert_locked(devlink);
 	if (priv->removed)
 		return;
 
@@ -4084,12 +4085,10 @@ static void mlx4_unload_one(struct pci_dev *pdev)
 	mlx4_stop_sense(dev);
 	mlx4_unregister_device(dev);
 
-	devl_lock(devlink);
 	for (p = 1; p <= dev->caps.num_ports; p++) {
 		mlx4_cleanup_port_info(&priv->port[p]);
 		mlx4_CLOSE_PORT(dev, p);
 	}
-	devl_unlock(devlink);
 
 	if (mlx4_is_master(dev))
 		mlx4_free_resource_tracker(dev,
@@ -4150,6 +4149,7 @@ static void mlx4_remove_one(struct pci_dev *pdev)
 
 	devlink_unregister(devlink);
 
+	devl_lock(devlink);
 	if (mlx4_is_slave(dev))
 		persist->interface_state |= MLX4_INTERFACE_STATE_NOWAIT;
 
@@ -4174,9 +4174,7 @@ static void mlx4_remove_one(struct pci_dev *pdev)
 	else
 		mlx4_info(dev, "%s: interface is down\n", __func__);
 	mlx4_catas_end(dev);
-	devl_lock(devlink);
 	mlx4_crdump_end(dev);
-	devl_unlock(devlink);
 	if (dev->flags & MLX4_FLAG_SRIOV && !active_vfs) {
 		mlx4_warn(dev, "Disabling SR-IOV\n");
 		pci_disable_sriov(pdev);
@@ -4187,6 +4185,7 @@ static void mlx4_remove_one(struct pci_dev *pdev)
 	devlink_params_unregister(devlink, mlx4_devlink_params,
 				  ARRAY_SIZE(mlx4_devlink_params));
 	kfree(dev->persist);
+	devl_unlock(devlink);
 	devlink_free(devlink);
 }
 
@@ -4307,15 +4306,20 @@ static pci_ers_result_t mlx4_pci_err_detected(struct pci_dev *pdev,
 					      pci_channel_state_t state)
 {
 	struct mlx4_dev_persistent *persist = pci_get_drvdata(pdev);
+	struct mlx4_dev *dev = persist->dev;
+	struct devlink *devlink;
 
 	mlx4_err(persist->dev, "mlx4_pci_err_detected was called\n");
 	mlx4_enter_error_state(persist);
 
+	devlink = priv_to_devlink(mlx4_priv(dev));
+	devl_lock(devlink);
 	mutex_lock(&persist->interface_state_mutex);
 	if (persist->interface_state & MLX4_INTERFACE_STATE_UP)
 		mlx4_unload_one(pdev);
 
 	mutex_unlock(&persist->interface_state_mutex);
+	devl_unlock(devlink);
 	if (state == pci_channel_io_perm_failure)
 		return PCI_ERS_RESULT_DISCONNECT;
 
@@ -4348,6 +4352,7 @@ static void mlx4_pci_resume(struct pci_dev *pdev)
 	struct mlx4_dev	 *dev  = persist->dev;
 	struct mlx4_priv *priv = mlx4_priv(dev);
 	int nvfs[MLX4_MAX_PORTS + 1] = {0, 0, 0};
+	struct devlink *devlink;
 	int total_vfs;
 	int err;
 
@@ -4355,6 +4360,8 @@ static void mlx4_pci_resume(struct pci_dev *pdev)
 	total_vfs = dev->persist->num_vfs;
 	memcpy(nvfs, dev->persist->nvfs, sizeof(dev->persist->nvfs));
 
+	devlink = priv_to_devlink(priv);
+	devl_lock(devlink);
 	mutex_lock(&persist->interface_state_mutex);
 	if (!(persist->interface_state & MLX4_INTERFACE_STATE_UP)) {
 		err = mlx4_load_one(pdev, priv->pci_dev_data, total_vfs, nvfs,
@@ -4373,19 +4380,23 @@ static void mlx4_pci_resume(struct pci_dev *pdev)
 	}
 end:
 	mutex_unlock(&persist->interface_state_mutex);
-
+	devl_unlock(devlink);
 }
 
 static void mlx4_shutdown(struct pci_dev *pdev)
 {
 	struct mlx4_dev_persistent *persist = pci_get_drvdata(pdev);
 	struct mlx4_dev *dev = persist->dev;
+	struct devlink *devlink;
 
 	mlx4_info(persist->dev, "mlx4_shutdown was called\n");
+	devlink = priv_to_devlink(mlx4_priv(dev));
+	devl_lock(devlink);
 	mutex_lock(&persist->interface_state_mutex);
 	if (persist->interface_state & MLX4_INTERFACE_STATE_UP)
 		mlx4_unload_one(pdev);
 	mutex_unlock(&persist->interface_state_mutex);
+	devl_unlock(devlink);
 	mlx4_pci_disable_device(dev);
 }
 
@@ -4400,12 +4411,16 @@ static int __maybe_unused mlx4_suspend(struct device *dev_d)
 	struct pci_dev *pdev = to_pci_dev(dev_d);
 	struct mlx4_dev_persistent *persist = pci_get_drvdata(pdev);
 	struct mlx4_dev	*dev = persist->dev;
+	struct devlink *devlink;
 
 	mlx4_err(dev, "suspend was called\n");
+	devlink = priv_to_devlink(mlx4_priv(dev));
+	devl_lock(devlink);
 	mutex_lock(&persist->interface_state_mutex);
 	if (persist->interface_state & MLX4_INTERFACE_STATE_UP)
 		mlx4_unload_one(pdev);
 	mutex_unlock(&persist->interface_state_mutex);
+	devl_unlock(devlink);
 
 	return 0;
 }
@@ -4417,6 +4432,7 @@ static int __maybe_unused mlx4_resume(struct device *dev_d)
 	struct mlx4_dev	*dev = persist->dev;
 	struct mlx4_priv *priv = mlx4_priv(dev);
 	int nvfs[MLX4_MAX_PORTS + 1] = {0, 0, 0};
+	struct devlink *devlink;
 	int total_vfs;
 	int ret = 0;
 
@@ -4424,6 +4440,8 @@ static int __maybe_unused mlx4_resume(struct device *dev_d)
 	total_vfs = dev->persist->num_vfs;
 	memcpy(nvfs, dev->persist->nvfs, sizeof(dev->persist->nvfs));
 
+	devlink = priv_to_devlink(priv);
+	devl_lock(devlink);
 	mutex_lock(&persist->interface_state_mutex);
 	if (!(persist->interface_state & MLX4_INTERFACE_STATE_UP)) {
 		ret = mlx4_load_one(pdev, priv->pci_dev_data, total_vfs,
@@ -4437,6 +4455,7 @@ static int __maybe_unused mlx4_resume(struct device *dev_d)
 		}
 	}
 	mutex_unlock(&persist->interface_state_mutex);
+	devl_unlock(devlink);
 
 	return ret;
 }
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net-next v2 8/9] net/mlx5: Lock mlx5 devlink health recovery callback
  2022-07-28 15:53 [PATCH net-next v2 0/9] Take devlink lock on mlx4 and mlx5 callbacks Moshe Shemesh
                   ` (6 preceding siblings ...)
  2022-07-28 15:53 ` [PATCH net-next v2 7/9] net/mlx4: Lock mlx4 devlink reload callback Moshe Shemesh
@ 2022-07-28 15:53 ` Moshe Shemesh
  2022-07-28 15:53 ` [PATCH net-next v2 9/9] devlink: Hold the instance lock in health callbacks Moshe Shemesh
  2022-07-29  5:30 ` [PATCH net-next v2 0/9] Take devlink lock on mlx4 and mlx5 callbacks patchwork-bot+netdevbpf
  9 siblings, 0 replies; 11+ messages in thread
From: Moshe Shemesh @ 2022-07-28 15:53 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Eric Dumazet, Paolo Abeni
  Cc: Jiri Pirko, netdev, Moshe Shemesh

Change devlink instance locks in mlx5 driver to have devlink health
recovery callback locked, while keeping all driver paths which lead to
devl_ API functions called by the driver locked.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/health.c | 12 +++++++++++-
 drivers/net/ethernet/mellanox/mlx5/core/main.c   |  4 ++--
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index 659021c31cbd..6e154b5c2bc6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -622,8 +622,14 @@ mlx5_fw_fatal_reporter_recover(struct devlink_health_reporter *reporter,
 			       struct netlink_ext_ack *extack)
 {
 	struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
+	struct devlink *devlink = priv_to_devlink(dev);
+	int ret;
 
-	return mlx5_health_try_recover(dev);
+	devl_lock(devlink);
+	ret = mlx5_health_try_recover(dev);
+	devl_unlock(devlink);
+
+	return ret;
 }
 
 static int
@@ -666,16 +672,20 @@ static void mlx5_fw_fatal_reporter_err_work(struct work_struct *work)
 	struct mlx5_fw_reporter_ctx fw_reporter_ctx;
 	struct mlx5_core_health *health;
 	struct mlx5_core_dev *dev;
+	struct devlink *devlink;
 	struct mlx5_priv *priv;
 
 	health = container_of(work, struct mlx5_core_health, fatal_report_work);
 	priv = container_of(health, struct mlx5_priv, health);
 	dev = container_of(priv, struct mlx5_core_dev, priv);
+	devlink = priv_to_devlink(dev);
 
 	enter_error_state(dev, false);
 	if (IS_ERR_OR_NULL(health->fw_fatal_reporter)) {
+		devl_lock(devlink);
 		if (mlx5_health_try_recover(dev))
 			mlx5_core_err(dev, "health recovery failed\n");
+		devl_unlock(devlink);
 		return;
 	}
 	fw_reporter_ctx.err_synd = health->synd;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 01fcb23eb69a..1de9b39a6359 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1932,7 +1932,7 @@ MODULE_DEVICE_TABLE(pci, mlx5_core_pci_table);
 void mlx5_disable_device(struct mlx5_core_dev *dev)
 {
 	mlx5_error_sw_reset(dev);
-	mlx5_unload_one(dev);
+	mlx5_unload_one_devl_locked(dev);
 }
 
 int mlx5_recover_device(struct mlx5_core_dev *dev)
@@ -1943,7 +1943,7 @@ int mlx5_recover_device(struct mlx5_core_dev *dev)
 			return -EIO;
 	}
 
-	return mlx5_load_one(dev, true);
+	return mlx5_load_one_devl_locked(dev, true);
 }
 
 static struct pci_driver mlx5_core_driver = {
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net-next v2 9/9] devlink: Hold the instance lock in health callbacks
  2022-07-28 15:53 [PATCH net-next v2 0/9] Take devlink lock on mlx4 and mlx5 callbacks Moshe Shemesh
                   ` (7 preceding siblings ...)
  2022-07-28 15:53 ` [PATCH net-next v2 8/9] net/mlx5: Lock mlx5 devlink health recovery callback Moshe Shemesh
@ 2022-07-28 15:53 ` Moshe Shemesh
  2022-07-29  5:30 ` [PATCH net-next v2 0/9] Take devlink lock on mlx4 and mlx5 callbacks patchwork-bot+netdevbpf
  9 siblings, 0 replies; 11+ messages in thread
From: Moshe Shemesh @ 2022-07-28 15:53 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Eric Dumazet, Paolo Abeni
  Cc: Jiri Pirko, netdev, Moshe Shemesh

Let the core take the devlink instance lock around health callbacks and
remove the now redundant locking in the drivers.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
---
v1->v2:
- added lock while calling devlink_health_reporter_recover() from
  devlink core (bug found by test)
---
 .../net/ethernet/mellanox/mlx5/core/health.c  |  8 +----
 net/core/devlink.c                            | 30 +++++++++----------
 2 files changed, 15 insertions(+), 23 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index 6e154b5c2bc6..2cf2c9948446 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -622,14 +622,8 @@ mlx5_fw_fatal_reporter_recover(struct devlink_health_reporter *reporter,
 			       struct netlink_ext_ack *extack)
 {
 	struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
-	struct devlink *devlink = priv_to_devlink(dev);
-	int ret;
 
-	devl_lock(devlink);
-	ret = mlx5_health_try_recover(dev);
-	devl_unlock(devlink);
-
-	return ret;
+	return mlx5_health_try_recover(dev);
 }
 
 static int
diff --git a/net/core/devlink.c b/net/core/devlink.c
index 274dafd8a594..ee9cbc4f755e 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -7553,6 +7553,7 @@ int devlink_health_report(struct devlink_health_reporter *reporter,
 	enum devlink_health_reporter_state prev_health_state;
 	struct devlink *devlink = reporter->devlink;
 	unsigned long recover_ts_threshold;
+	int ret;
 
 	/* write a log message of the current error */
 	WARN_ON(!msg);
@@ -7586,11 +7587,14 @@ int devlink_health_report(struct devlink_health_reporter *reporter,
 		mutex_unlock(&reporter->dump_lock);
 	}
 
-	if (reporter->auto_recover)
-		return devlink_health_reporter_recover(reporter,
-						       priv_ctx, NULL);
+	if (!reporter->auto_recover)
+		return 0;
 
-	return 0;
+	devl_lock(devlink);
+	ret = devlink_health_reporter_recover(reporter, priv_ctx, NULL);
+	devl_unlock(devlink);
+
+	return ret;
 }
 EXPORT_SYMBOL_GPL(devlink_health_report);
 
@@ -9264,8 +9268,7 @@ static const struct genl_small_ops devlink_nl_ops[] = {
 		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
 		.doit = devlink_nl_cmd_health_reporter_get_doit,
 		.dumpit = devlink_nl_cmd_health_reporter_get_dumpit,
-		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK_OR_PORT |
-				  DEVLINK_NL_FLAG_NO_LOCK,
+		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK_OR_PORT,
 		/* can be retrieved by unprivileged users */
 	},
 	{
@@ -9273,24 +9276,21 @@ static const struct genl_small_ops devlink_nl_ops[] = {
 		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
 		.doit = devlink_nl_cmd_health_reporter_set_doit,
 		.flags = GENL_ADMIN_PERM,
-		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK_OR_PORT |
-				  DEVLINK_NL_FLAG_NO_LOCK,
+		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK_OR_PORT,
 	},
 	{
 		.cmd = DEVLINK_CMD_HEALTH_REPORTER_RECOVER,
 		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
 		.doit = devlink_nl_cmd_health_reporter_recover_doit,
 		.flags = GENL_ADMIN_PERM,
-		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK_OR_PORT |
-				  DEVLINK_NL_FLAG_NO_LOCK,
+		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK_OR_PORT,
 	},
 	{
 		.cmd = DEVLINK_CMD_HEALTH_REPORTER_DIAGNOSE,
 		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
 		.doit = devlink_nl_cmd_health_reporter_diagnose_doit,
 		.flags = GENL_ADMIN_PERM,
-		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK_OR_PORT |
-				  DEVLINK_NL_FLAG_NO_LOCK,
+		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK_OR_PORT,
 	},
 	{
 		.cmd = DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET,
@@ -9304,16 +9304,14 @@ static const struct genl_small_ops devlink_nl_ops[] = {
 		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
 		.doit = devlink_nl_cmd_health_reporter_dump_clear_doit,
 		.flags = GENL_ADMIN_PERM,
-		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK_OR_PORT |
-				  DEVLINK_NL_FLAG_NO_LOCK,
+		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK_OR_PORT,
 	},
 	{
 		.cmd = DEVLINK_CMD_HEALTH_REPORTER_TEST,
 		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
 		.doit = devlink_nl_cmd_health_reporter_test_doit,
 		.flags = GENL_ADMIN_PERM,
-		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK_OR_PORT |
-				  DEVLINK_NL_FLAG_NO_LOCK,
+		.internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK_OR_PORT,
 	},
 	{
 		.cmd = DEVLINK_CMD_FLASH_UPDATE,
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next v2 0/9] Take devlink lock on mlx4 and mlx5 callbacks
  2022-07-28 15:53 [PATCH net-next v2 0/9] Take devlink lock on mlx4 and mlx5 callbacks Moshe Shemesh
                   ` (8 preceding siblings ...)
  2022-07-28 15:53 ` [PATCH net-next v2 9/9] devlink: Hold the instance lock in health callbacks Moshe Shemesh
@ 2022-07-29  5:30 ` patchwork-bot+netdevbpf
  9 siblings, 0 replies; 11+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-07-29  5:30 UTC (permalink / raw)
  To: Moshe Shemesh
  Cc: davem, kuba, saeedm, leon, tariqt, edumazet, pabeni, jiri, netdev

Hello:

This series was applied to netdev/net-next.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Thu, 28 Jul 2022 18:53:41 +0300 you wrote:
> Prepare mlx4 and mlx5 drivers to have all devlink callbacks called with
> devlink instance locked. Change mlx4 driver to use devl_ API where
> needed to have devlink reload callbacks locked. Change mlx5 driver to
> use devl_ API where needed to have devlink reload and devlink health
> callbacks locked.
> 
> As mlx5 is the only driver which needed changes to enable calling health
> callbacks with devlink instance locked, this patchset also removes
> DEVLINK_NL_FLAG_NO_LOCK flag from devlink health callbacks.
> 
> [...]

Here is the summary with links:
  - [net-next,v2,1/9] net: devlink: remove region snapshot ID tracking dependency on devlink->lock
    https://git.kernel.org/netdev/net-next/c/5502e8712c9b
  - [net-next,v2,2/9] net: devlink: remove region snapshots list dependency on devlink->lock
    https://git.kernel.org/netdev/net-next/c/2dec18ad826f
  - [net-next,v2,3/9] net/mlx5: Move fw reset unload to mlx5_fw_reset_complete_reload
    https://git.kernel.org/netdev/net-next/c/c12f4c6ac3b4
  - [net-next,v2,4/9] net/mlx5: Lock mlx5 devlink reload callbacks
    https://git.kernel.org/netdev/net-next/c/84a433a40d0e
  - [net-next,v2,5/9] net/mlx4: Use devl_ API for devlink region create / destroy
    https://git.kernel.org/netdev/net-next/c/9cb7e94a78b5
  - [net-next,v2,6/9] net/mlx4: Use devl_ API for devlink port register / unregister
    https://git.kernel.org/netdev/net-next/c/a8c05514b2f8
  - [net-next,v2,7/9] net/mlx4: Lock mlx4 devlink reload callback
    https://git.kernel.org/netdev/net-next/c/60d7ceea4b2a
  - [net-next,v2,8/9] net/mlx5: Lock mlx5 devlink health recovery callback
    https://git.kernel.org/netdev/net-next/c/d3dbdc9f8ddc
  - [net-next,v2,9/9] devlink: Hold the instance lock in health callbacks
    https://git.kernel.org/netdev/net-next/c/c90005b5f75c

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-07-29  5:30 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-28 15:53 [PATCH net-next v2 0/9] Take devlink lock on mlx4 and mlx5 callbacks Moshe Shemesh
2022-07-28 15:53 ` [PATCH net-next v2 1/9] net: devlink: remove region snapshot ID tracking dependency on devlink->lock Moshe Shemesh
2022-07-28 15:53 ` [PATCH net-next v2 2/9] net: devlink: remove region snapshots list " Moshe Shemesh
2022-07-28 15:53 ` [PATCH net-next v2 3/9] net/mlx5: Move fw reset unload to mlx5_fw_reset_complete_reload Moshe Shemesh
2022-07-28 15:53 ` [PATCH net-next v2 4/9] net/mlx5: Lock mlx5 devlink reload callbacks Moshe Shemesh
2022-07-28 15:53 ` [PATCH net-next v2 5/9] net/mlx4: Use devl_ API for devlink region create / destroy Moshe Shemesh
2022-07-28 15:53 ` [PATCH net-next v2 6/9] net/mlx4: Use devl_ API for devlink port register / unregister Moshe Shemesh
2022-07-28 15:53 ` [PATCH net-next v2 7/9] net/mlx4: Lock mlx4 devlink reload callback Moshe Shemesh
2022-07-28 15:53 ` [PATCH net-next v2 8/9] net/mlx5: Lock mlx5 devlink health recovery callback Moshe Shemesh
2022-07-28 15:53 ` [PATCH net-next v2 9/9] devlink: Hold the instance lock in health callbacks Moshe Shemesh
2022-07-29  5:30 ` [PATCH net-next v2 0/9] Take devlink lock on mlx4 and mlx5 callbacks patchwork-bot+netdevbpf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.