netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/9] devlink: remove the wait-for-references on unregister
@ 2023-01-06  6:33 Jakub Kicinski
  2023-01-06  6:33 ` [PATCH net-next 1/9] devlink: bump the instance index directly when iterating Jakub Kicinski
                   ` (11 more replies)
  0 siblings, 12 replies; 45+ messages in thread
From: Jakub Kicinski @ 2023-01-06  6:33 UTC (permalink / raw)
  To: davem; +Cc: netdev, edumazet, pabeni, jacob.e.keller, jiri, Jakub Kicinski

Move the registration and unregistration of the devlink instances
under their instance locks. Don't perform the netdev-style wait
for all references when unregistering the instance.

Instead the devlink instance refcount will only ensure that
the memory of the instance is not freed. All places which acquire
access to devlink instances via a reference must check that the
instance is still registered under the instance lock.

This fixes the problem of the netdev code accessing devlink
instances before they are registered.

RFC: https://lore.kernel.org/all/20221217011953.152487-1-kuba@kernel.org/
 - rewrite the cover letter
 - rewrite the commit message for patch 1
 - un-export and rename devl_is_alive
 - squash the netdevsim patches

Jakub Kicinski (9):
  devlink: bump the instance index directly when iterating
  devlink: update the code in netns move to latest helpers
  devlink: protect devlink->dev by the instance lock
  devlink: always check if the devlink instance is registered
  devlink: remove the registration guarantee of references
  devlink: don't require setting features before registration
  devlink: allow registering parameters after the instance
  netdevsim: rename a label
  netdevsim: move devlink registration under the instance lock

 drivers/net/netdevsim/dev.c |  15 +++--
 include/net/devlink.h       |   2 +
 net/devlink/core.c          | 121 ++++++++++++++++--------------------
 net/devlink/devl_internal.h |  28 ++++-----
 net/devlink/leftover.c      |  64 ++++++++++++-------
 net/devlink/netlink.c       |  19 ++++--
 6 files changed, 137 insertions(+), 112 deletions(-)

-- 
2.38.1


^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH net-next 1/9] devlink: bump the instance index directly when iterating
  2023-01-06  6:33 [PATCH net-next 0/9] devlink: remove the wait-for-references on unregister Jakub Kicinski
@ 2023-01-06  6:33 ` Jakub Kicinski
  2023-01-06 12:17   ` Jiri Pirko
  2023-01-06  6:33 ` [PATCH net-next 2/9] devlink: update the code in netns move to latest helpers Jakub Kicinski
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Jakub Kicinski @ 2023-01-06  6:33 UTC (permalink / raw)
  To: davem; +Cc: netdev, edumazet, pabeni, jacob.e.keller, jiri, Jakub Kicinski

xa_find_after() is designed to handle multi-index entries correctly.
If a xarray has two entries one which spans indexes 0-3 and one at
index 4 xa_find_after(0) will return the entry at index 4.

Having to juggle the two callbacks, however, is unnecessary in case
of the devlink xarray, as there is 1:1 relationship with indexes.

Always use xa_find() and increment the index manually.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 net/devlink/core.c          | 31 +++++++++----------------------
 net/devlink/devl_internal.h | 17 ++++-------------
 2 files changed, 13 insertions(+), 35 deletions(-)

diff --git a/net/devlink/core.c b/net/devlink/core.c
index 371d6821315d..88c88b8053e2 100644
--- a/net/devlink/core.c
+++ b/net/devlink/core.c
@@ -91,16 +91,13 @@ void devlink_put(struct devlink *devlink)
 		call_rcu(&devlink->rcu, __devlink_put_rcu);
 }
 
-struct devlink *
-devlinks_xa_find_get(struct net *net, unsigned long *indexp,
-		     void * (*xa_find_fn)(struct xarray *, unsigned long *,
-					  unsigned long, xa_mark_t))
+struct devlink *devlinks_xa_find_get(struct net *net, unsigned long *indexp)
 {
-	struct devlink *devlink;
+	struct devlink *devlink = NULL;
 
 	rcu_read_lock();
 retry:
-	devlink = xa_find_fn(&devlinks, indexp, ULONG_MAX, DEVLINK_REGISTERED);
+	devlink = xa_find(&devlinks, indexp, ULONG_MAX, DEVLINK_REGISTERED);
 	if (!devlink)
 		goto unlock;
 
@@ -109,31 +106,21 @@ devlinks_xa_find_get(struct net *net, unsigned long *indexp,
 	 * This prevents live-lock of devlink_unregister() wait for completion.
 	 */
 	if (xa_get_mark(&devlinks, *indexp, DEVLINK_UNREGISTERING))
-		goto retry;
+		goto next;
 
-	/* For a possible retry, the xa_find_after() should be always used */
-	xa_find_fn = xa_find_after;
 	if (!devlink_try_get(devlink))
-		goto retry;
+		goto next;
 	if (!net_eq(devlink_net(devlink), net)) {
 		devlink_put(devlink);
-		goto retry;
+		goto next;
 	}
 unlock:
 	rcu_read_unlock();
 	return devlink;
-}
-
-struct devlink *
-devlinks_xa_find_get_first(struct net *net, unsigned long *indexp)
-{
-	return devlinks_xa_find_get(net, indexp, xa_find);
-}
 
-struct devlink *
-devlinks_xa_find_get_next(struct net *net, unsigned long *indexp)
-{
-	return devlinks_xa_find_get(net, indexp, xa_find_after);
+next:
+	(*indexp)++;
+	goto retry;
 }
 
 /**
diff --git a/net/devlink/devl_internal.h b/net/devlink/devl_internal.h
index adf9f6c177db..14767e809178 100644
--- a/net/devlink/devl_internal.h
+++ b/net/devlink/devl_internal.h
@@ -82,18 +82,9 @@ extern struct genl_family devlink_nl_family;
  * in loop body in order to release the reference.
  */
 #define devlinks_xa_for_each_registered_get(net, index, devlink)	\
-	for (index = 0,							\
-	     devlink = devlinks_xa_find_get_first(net, &index);	\
-	     devlink; devlink = devlinks_xa_find_get_next(net, &index))
-
-struct devlink *
-devlinks_xa_find_get(struct net *net, unsigned long *indexp,
-		     void * (*xa_find_fn)(struct xarray *, unsigned long *,
-					  unsigned long, xa_mark_t));
-struct devlink *
-devlinks_xa_find_get_first(struct net *net, unsigned long *indexp);
-struct devlink *
-devlinks_xa_find_get_next(struct net *net, unsigned long *indexp);
+	for (index = 0; (devlink = devlinks_xa_find_get(net, &index)); index++)
+
+struct devlink *devlinks_xa_find_get(struct net *net, unsigned long *indexp);
 
 /* Netlink */
 #define DEVLINK_NL_FLAG_NEED_PORT		BIT(0)
@@ -135,7 +126,7 @@ struct devlink_gen_cmd {
  */
 #define devlink_dump_for_each_instance_get(msg, state, devlink)		\
 	for (; (devlink = devlinks_xa_find_get(sock_net(msg->sk),	\
-					       &state->instance, xa_find)); \
+					       &state->instance));	\
 	     state->instance++, state->idx = 0)
 
 extern const struct genl_small_ops devlink_nl_ops[56];
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH net-next 2/9] devlink: update the code in netns move to latest helpers
  2023-01-06  6:33 [PATCH net-next 0/9] devlink: remove the wait-for-references on unregister Jakub Kicinski
  2023-01-06  6:33 ` [PATCH net-next 1/9] devlink: bump the instance index directly when iterating Jakub Kicinski
@ 2023-01-06  6:33 ` Jakub Kicinski
  2023-01-06  6:33 ` [PATCH net-next 3/9] devlink: protect devlink->dev by the instance lock Jakub Kicinski
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 45+ messages in thread
From: Jakub Kicinski @ 2023-01-06  6:33 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, jacob.e.keller, jiri, Jakub Kicinski,
	Jiri Pirko

devlink_pernet_pre_exit() is the only obvious place which takes
the instance lock without using the devl_ helpers. Update the code
and move the error print after releasing the reference
(having unlock and put together feels slightly idiomatic).

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 net/devlink/core.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/devlink/core.c b/net/devlink/core.c
index 88c88b8053e2..d3b8336946fd 100644
--- a/net/devlink/core.c
+++ b/net/devlink/core.c
@@ -299,15 +299,16 @@ static void __net_exit devlink_pernet_pre_exit(struct net *net)
 	 */
 	devlinks_xa_for_each_registered_get(net, index, devlink) {
 		WARN_ON(!(devlink->features & DEVLINK_F_RELOAD));
-		mutex_lock(&devlink->lock);
+		devl_lock(devlink);
 		err = devlink_reload(devlink, &init_net,
 				     DEVLINK_RELOAD_ACTION_DRIVER_REINIT,
 				     DEVLINK_RELOAD_LIMIT_UNSPEC,
 				     &actions_performed, NULL);
-		mutex_unlock(&devlink->lock);
+		devl_unlock(devlink);
+		devlink_put(devlink);
+
 		if (err && err != -EOPNOTSUPP)
 			pr_warn("Failed to reload devlink instance into init_net\n");
-		devlink_put(devlink);
 	}
 }
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH net-next 3/9] devlink: protect devlink->dev by the instance lock
  2023-01-06  6:33 [PATCH net-next 0/9] devlink: remove the wait-for-references on unregister Jakub Kicinski
  2023-01-06  6:33 ` [PATCH net-next 1/9] devlink: bump the instance index directly when iterating Jakub Kicinski
  2023-01-06  6:33 ` [PATCH net-next 2/9] devlink: update the code in netns move to latest helpers Jakub Kicinski
@ 2023-01-06  6:33 ` Jakub Kicinski
  2023-01-06 12:18   ` Jiri Pirko
  2023-01-06  6:33 ` [PATCH net-next 4/9] devlink: always check if the devlink instance is registered Jakub Kicinski
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Jakub Kicinski @ 2023-01-06  6:33 UTC (permalink / raw)
  To: davem; +Cc: netdev, edumazet, pabeni, jacob.e.keller, jiri, Jakub Kicinski

devlink->dev is assumed to be always valid as long as any
outstanding reference to the devlink instance exists.

In prep for weakening of the references take the instance lock.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 net/devlink/devl_internal.h | 3 ++-
 net/devlink/leftover.c      | 7 +++----
 net/devlink/netlink.c       | 9 ++++++---
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/net/devlink/devl_internal.h b/net/devlink/devl_internal.h
index 14767e809178..6342552e5f99 100644
--- a/net/devlink/devl_internal.h
+++ b/net/devlink/devl_internal.h
@@ -131,7 +131,8 @@ struct devlink_gen_cmd {
 
 extern const struct genl_small_ops devlink_nl_ops[56];
 
-struct devlink *devlink_get_from_attrs(struct net *net, struct nlattr **attrs);
+struct devlink *
+devlink_get_from_attrs_lock(struct net *net, struct nlattr **attrs);
 
 void devlink_notify_unregister(struct devlink *devlink);
 void devlink_notify_register(struct devlink *devlink);
diff --git a/net/devlink/leftover.c b/net/devlink/leftover.c
index e6d6c7f74ae7..bec408da4dbe 100644
--- a/net/devlink/leftover.c
+++ b/net/devlink/leftover.c
@@ -6314,12 +6314,10 @@ static int devlink_nl_cmd_region_read_dumpit(struct sk_buff *skb,
 
 	start_offset = state->start_offset;
 
-	devlink = devlink_get_from_attrs(sock_net(cb->skb->sk), attrs);
+	devlink = devlink_get_from_attrs_lock(sock_net(cb->skb->sk), attrs);
 	if (IS_ERR(devlink))
 		return PTR_ERR(devlink);
 
-	devl_lock(devlink);
-
 	if (!attrs[DEVLINK_ATTR_REGION_NAME]) {
 		NL_SET_ERR_MSG(cb->extack, "No region name provided");
 		err = -EINVAL;
@@ -7735,9 +7733,10 @@ devlink_health_reporter_get_from_cb(struct netlink_callback *cb)
 	struct nlattr **attrs = info->attrs;
 	struct devlink *devlink;
 
-	devlink = devlink_get_from_attrs(sock_net(cb->skb->sk), attrs);
+	devlink = devlink_get_from_attrs_lock(sock_net(cb->skb->sk), attrs);
 	if (IS_ERR(devlink))
 		return NULL;
+	devl_unlock(devlink);
 
 	reporter = devlink_health_reporter_get_from_attrs(devlink, attrs);
 	devlink_put(devlink);
diff --git a/net/devlink/netlink.c b/net/devlink/netlink.c
index a552e723f4a6..69111746f5d9 100644
--- a/net/devlink/netlink.c
+++ b/net/devlink/netlink.c
@@ -82,7 +82,8 @@ static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
 	[DEVLINK_ATTR_REGION_DIRECT] = { .type = NLA_FLAG },
 };
 
-struct devlink *devlink_get_from_attrs(struct net *net, struct nlattr **attrs)
+struct devlink *
+devlink_get_from_attrs_lock(struct net *net, struct nlattr **attrs)
 {
 	struct devlink *devlink;
 	unsigned long index;
@@ -96,9 +97,11 @@ struct devlink *devlink_get_from_attrs(struct net *net, struct nlattr **attrs)
 	devname = nla_data(attrs[DEVLINK_ATTR_DEV_NAME]);
 
 	devlinks_xa_for_each_registered_get(net, index, devlink) {
+		devl_lock(devlink);
 		if (strcmp(devlink->dev->bus->name, busname) == 0 &&
 		    strcmp(dev_name(devlink->dev), devname) == 0)
 			return devlink;
+		devl_unlock(devlink);
 		devlink_put(devlink);
 	}
 
@@ -113,10 +116,10 @@ static int devlink_nl_pre_doit(const struct genl_split_ops *ops,
 	struct devlink *devlink;
 	int err;
 
-	devlink = devlink_get_from_attrs(genl_info_net(info), info->attrs);
+	devlink = devlink_get_from_attrs_lock(genl_info_net(info), info->attrs);
 	if (IS_ERR(devlink))
 		return PTR_ERR(devlink);
-	devl_lock(devlink);
+
 	info->user_ptr[0] = devlink;
 	if (ops->internal_flags & DEVLINK_NL_FLAG_NEED_PORT) {
 		devlink_port = devlink_port_get_from_info(devlink, info);
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH net-next 4/9] devlink: always check if the devlink instance is registered
  2023-01-06  6:33 [PATCH net-next 0/9] devlink: remove the wait-for-references on unregister Jakub Kicinski
                   ` (2 preceding siblings ...)
  2023-01-06  6:33 ` [PATCH net-next 3/9] devlink: protect devlink->dev by the instance lock Jakub Kicinski
@ 2023-01-06  6:33 ` Jakub Kicinski
  2023-01-06 12:41   ` Jiri Pirko
  2023-01-06 17:03   ` Jacob Keller
  2023-01-06  6:33 ` [PATCH net-next 5/9] devlink: remove the registration guarantee of references Jakub Kicinski
                   ` (7 subsequent siblings)
  11 siblings, 2 replies; 45+ messages in thread
From: Jakub Kicinski @ 2023-01-06  6:33 UTC (permalink / raw)
  To: davem; +Cc: netdev, edumazet, pabeni, jacob.e.keller, jiri, Jakub Kicinski

Always check under the instance lock whether the devlink instance
is still / already registered.

This is a no-op for the most part, as the unregistration path currently
waits for all references. On the init path, however, we may temporarily
open up a race with netdev code, if netdevs are registered before the
devlink instance. This is temporary, the next change fixes it, and this
commit has been split out for the ease of review.

Note that in case of iterating over sub-objects which have their
own lock (regions and line cards) we assume an implicit dependency
between those objects existing and devlink unregistration.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 net/devlink/core.c          | 19 +++++++++++++++----
 net/devlink/devl_internal.h |  8 ++++++++
 net/devlink/leftover.c      | 35 +++++++++++++++++++++++++++++------
 net/devlink/netlink.c       | 10 ++++++++--
 4 files changed, 60 insertions(+), 12 deletions(-)

diff --git a/net/devlink/core.c b/net/devlink/core.c
index d3b8336946fd..c53c996edf1d 100644
--- a/net/devlink/core.c
+++ b/net/devlink/core.c
@@ -67,6 +67,15 @@ void devl_unlock(struct devlink *devlink)
 }
 EXPORT_SYMBOL_GPL(devl_unlock);
 
+/**
+ * devlink_try_get() - try to obtain a reference on a devlink instance
+ * @devlink: instance to reference
+ *
+ * Obtain a reference on a devlink instance. A reference on a devlink instance
+ * only implies that it's safe to take the instance lock. It does not imply
+ * that the instance is registered, use devl_is_registered() after taking
+ * the instance lock to check registration status.
+ */
 struct devlink *__must_check devlink_try_get(struct devlink *devlink)
 {
 	if (refcount_inc_not_zero(&devlink->refcount))
@@ -300,10 +309,12 @@ static void __net_exit devlink_pernet_pre_exit(struct net *net)
 	devlinks_xa_for_each_registered_get(net, index, devlink) {
 		WARN_ON(!(devlink->features & DEVLINK_F_RELOAD));
 		devl_lock(devlink);
-		err = devlink_reload(devlink, &init_net,
-				     DEVLINK_RELOAD_ACTION_DRIVER_REINIT,
-				     DEVLINK_RELOAD_LIMIT_UNSPEC,
-				     &actions_performed, NULL);
+		err = 0;
+		if (devl_is_registered(devlink))
+			err = devlink_reload(devlink, &init_net,
+					     DEVLINK_RELOAD_ACTION_DRIVER_REINIT,
+					     DEVLINK_RELOAD_LIMIT_UNSPEC,
+					     &actions_performed, NULL);
 		devl_unlock(devlink);
 		devlink_put(devlink);
 
diff --git a/net/devlink/devl_internal.h b/net/devlink/devl_internal.h
index 6342552e5f99..01a00df81d0e 100644
--- a/net/devlink/devl_internal.h
+++ b/net/devlink/devl_internal.h
@@ -86,6 +86,14 @@ extern struct genl_family devlink_nl_family;
 
 struct devlink *devlinks_xa_find_get(struct net *net, unsigned long *indexp);
 
+static inline bool devl_is_registered(struct devlink *devlink)
+{
+	/* To prevent races the caller must hold the instance lock
+	 * or another lock taken during unregistration.
+	 */
+	return xa_get_mark(&devlinks, devlink->index, DEVLINK_REGISTERED);
+}
+
 /* Netlink */
 #define DEVLINK_NL_FLAG_NEED_PORT		BIT(0)
 #define DEVLINK_NL_FLAG_NEED_DEVLINK_OR_PORT	BIT(1)
diff --git a/net/devlink/leftover.c b/net/devlink/leftover.c
index bec408da4dbe..491f821c8b77 100644
--- a/net/devlink/leftover.c
+++ b/net/devlink/leftover.c
@@ -2130,6 +2130,9 @@ static int devlink_nl_cmd_linecard_get_dumpit(struct sk_buff *msg,
 		int idx = 0;
 
 		mutex_lock(&devlink->linecards_lock);
+		if (!devl_is_registered(devlink))
+			goto next_devlink;
+
 		list_for_each_entry(linecard, &devlink->linecard_list, list) {
 			if (idx < state->idx) {
 				idx++;
@@ -2151,6 +2154,7 @@ static int devlink_nl_cmd_linecard_get_dumpit(struct sk_buff *msg,
 			}
 			idx++;
 		}
+next_devlink:
 		mutex_unlock(&devlink->linecards_lock);
 		devlink_put(devlink);
 	}
@@ -7809,6 +7813,12 @@ devlink_nl_cmd_health_reporter_get_dumpit(struct sk_buff *msg,
 		int idx = 0;
 
 		mutex_lock(&devlink->reporters_lock);
+		if (!devl_is_registered(devlink)) {
+			mutex_unlock(&devlink->reporters_lock);
+			devlink_put(devlink);
+			continue;
+		}
+
 		list_for_each_entry(reporter, &devlink->reporter_list,
 				    list) {
 			if (idx < state->idx) {
@@ -7830,6 +7840,9 @@ devlink_nl_cmd_health_reporter_get_dumpit(struct sk_buff *msg,
 		mutex_unlock(&devlink->reporters_lock);
 
 		devl_lock(devlink);
+		if (!devl_is_registered(devlink))
+			goto next_devlink;
+
 		xa_for_each(&devlink->ports, port_index, port) {
 			mutex_lock(&port->reporters_lock);
 			list_for_each_entry(reporter, &port->reporter_list, list) {
@@ -7853,6 +7866,7 @@ devlink_nl_cmd_health_reporter_get_dumpit(struct sk_buff *msg,
 			}
 			mutex_unlock(&port->reporters_lock);
 		}
+next_devlink:
 		devl_unlock(devlink);
 		devlink_put(devlink);
 	}
@@ -12218,7 +12232,8 @@ void devlink_compat_running_version(struct devlink *devlink,
 		return;
 
 	devl_lock(devlink);
-	__devlink_compat_running_version(devlink, buf, len);
+	if (devl_is_registered(devlink))
+		__devlink_compat_running_version(devlink, buf, len);
 	devl_unlock(devlink);
 }
 
@@ -12227,20 +12242,28 @@ int devlink_compat_flash_update(struct devlink *devlink, const char *file_name)
 	struct devlink_flash_update_params params = {};
 	int ret;
 
-	if (!devlink->ops->flash_update)
-		return -EOPNOTSUPP;
+	devl_lock(devlink);
+	if (!devl_is_registered(devlink)) {
+		ret = -ENODEV;
+		goto out_unlock;
+	}
+
+	if (!devlink->ops->flash_update) {
+		ret = -EOPNOTSUPP;
+		goto out_unlock;
+	}
 
 	ret = request_firmware(&params.fw, file_name, devlink->dev);
 	if (ret)
-		return ret;
+		goto out_unlock;
 
-	devl_lock(devlink);
 	devlink_flash_update_begin_notify(devlink);
 	ret = devlink->ops->flash_update(devlink, &params, NULL);
 	devlink_flash_update_end_notify(devlink);
-	devl_unlock(devlink);
 
 	release_firmware(params.fw);
+out_unlock:
+	devl_unlock(devlink);
 
 	return ret;
 }
diff --git a/net/devlink/netlink.c b/net/devlink/netlink.c
index 69111746f5d9..b5b8ac6db2d1 100644
--- a/net/devlink/netlink.c
+++ b/net/devlink/netlink.c
@@ -98,7 +98,8 @@ devlink_get_from_attrs_lock(struct net *net, struct nlattr **attrs)
 
 	devlinks_xa_for_each_registered_get(net, index, devlink) {
 		devl_lock(devlink);
-		if (strcmp(devlink->dev->bus->name, busname) == 0 &&
+		if (devl_is_registered(devlink) &&
+		    strcmp(devlink->dev->bus->name, busname) == 0 &&
 		    strcmp(dev_name(devlink->dev), devname) == 0)
 			return devlink;
 		devl_unlock(devlink);
@@ -211,7 +212,12 @@ int devlink_nl_instance_iter_dump(struct sk_buff *msg,
 
 	devlink_dump_for_each_instance_get(msg, state, devlink) {
 		devl_lock(devlink);
-		err = cmd->dump_one(msg, devlink, cb);
+
+		if (devl_is_registered(devlink))
+			err = cmd->dump_one(msg, devlink, cb);
+		else
+			err = 0;
+
 		devl_unlock(devlink);
 		devlink_put(devlink);
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH net-next 5/9] devlink: remove the registration guarantee of references
  2023-01-06  6:33 [PATCH net-next 0/9] devlink: remove the wait-for-references on unregister Jakub Kicinski
                   ` (3 preceding siblings ...)
  2023-01-06  6:33 ` [PATCH net-next 4/9] devlink: always check if the devlink instance is registered Jakub Kicinski
@ 2023-01-06  6:33 ` Jakub Kicinski
  2023-01-06 12:42   ` Jiri Pirko
  2023-01-06  6:33 ` [PATCH net-next 6/9] devlink: don't require setting features before registration Jakub Kicinski
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Jakub Kicinski @ 2023-01-06  6:33 UTC (permalink / raw)
  To: davem; +Cc: netdev, edumazet, pabeni, jacob.e.keller, jiri, Jakub Kicinski

The objective of exposing the devlink instance locks to
drivers was to let them use these locks to prevent user space
from accessing the device before it's fully initialized.
This is difficult because devlink_unregister() waits for all
references to be released, meaning that devlink_unregister()
can't itself be called under the instance lock.

To avoid this issue devlink_register() was moved after subobject
registration a while ago. Unfortunately the netdev paths get
a hold of the devlink instances _before_ they are registered.
Ideally netdev should wait for devlink init to finish (synchronizing
on the instance lock). This can't work because we don't know if the
instance will _ever_ be registered (in case of failures it may not).
The other option of returning an error until devlink_register()
is called is unappealing (user space would get a notification
netdev exist but would have to wait arbitrary amount of time
before accessing some of its attributes).

Weaken the guarantees of the devlink references.

Holding a reference will now only guarantee that the memory
of the object is around. Another way of looking at it is that
the reference now protects the object not its "registered" status.
Use devlink instance lock to synchronize unregistration.

This implies that releasing of the "main" reference of the devlink
instance moves from devlink_unregister() to devlink_free().

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 include/net/devlink.h       |  2 ++
 net/devlink/core.c          | 64 ++++++++++++++++---------------------
 net/devlink/devl_internal.h |  2 --
 3 files changed, 30 insertions(+), 38 deletions(-)

diff --git a/include/net/devlink.h b/include/net/devlink.h
index 6a2e4f21779f..425ecef431b7 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -1647,6 +1647,8 @@ static inline struct devlink *devlink_alloc(const struct devlink_ops *ops,
 	return devlink_alloc_ns(ops, priv_size, &init_net, dev);
 }
 void devlink_set_features(struct devlink *devlink, u64 features);
+int devl_register(struct devlink *devlink);
+void devl_unregister(struct devlink *devlink);
 void devlink_register(struct devlink *devlink);
 void devlink_unregister(struct devlink *devlink);
 void devlink_free(struct devlink *devlink);
diff --git a/net/devlink/core.c b/net/devlink/core.c
index c53c996edf1d..7cf0b3efbb2f 100644
--- a/net/devlink/core.c
+++ b/net/devlink/core.c
@@ -83,21 +83,10 @@ struct devlink *__must_check devlink_try_get(struct devlink *devlink)
 	return NULL;
 }
 
-static void __devlink_put_rcu(struct rcu_head *head)
-{
-	struct devlink *devlink = container_of(head, struct devlink, rcu);
-
-	complete(&devlink->comp);
-}
-
 void devlink_put(struct devlink *devlink)
 {
 	if (refcount_dec_and_test(&devlink->refcount))
-		/* Make sure unregister operation that may await the completion
-		 * is unblocked only after all users are after the end of
-		 * RCU grace period.
-		 */
-		call_rcu(&devlink->rcu, __devlink_put_rcu);
+		kfree_rcu(devlink, rcu);
 }
 
 struct devlink *devlinks_xa_find_get(struct net *net, unsigned long *indexp)
@@ -110,13 +99,6 @@ struct devlink *devlinks_xa_find_get(struct net *net, unsigned long *indexp)
 	if (!devlink)
 		goto unlock;
 
-	/* In case devlink_unregister() was already called and "unregistering"
-	 * mark was set, do not allow to get a devlink reference here.
-	 * This prevents live-lock of devlink_unregister() wait for completion.
-	 */
-	if (xa_get_mark(&devlinks, *indexp, DEVLINK_UNREGISTERING))
-		goto next;
-
 	if (!devlink_try_get(devlink))
 		goto next;
 	if (!net_eq(devlink_net(devlink), net)) {
@@ -152,37 +134,48 @@ void devlink_set_features(struct devlink *devlink, u64 features)
 EXPORT_SYMBOL_GPL(devlink_set_features);
 
 /**
- *	devlink_register - Register devlink instance
- *
- *	@devlink: devlink
+ * devl_register - Register devlink instance
+ * @devlink: devlink
  */
-void devlink_register(struct devlink *devlink)
+int devl_register(struct devlink *devlink)
 {
 	ASSERT_DEVLINK_NOT_REGISTERED(devlink);
-	/* Make sure that we are in .probe() routine */
+	devl_assert_locked(devlink);
 
 	xa_set_mark(&devlinks, devlink->index, DEVLINK_REGISTERED);
 	devlink_notify_register(devlink);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(devl_register);
+
+void devlink_register(struct devlink *devlink)
+{
+	devl_lock(devlink);
+	devl_register(devlink);
+	devl_unlock(devlink);
 }
 EXPORT_SYMBOL_GPL(devlink_register);
 
 /**
- *	devlink_unregister - Unregister devlink instance
- *
- *	@devlink: devlink
+ * devl_unregister - Unregister devlink instance
+ * @devlink: devlink
  */
-void devlink_unregister(struct devlink *devlink)
+void devl_unregister(struct devlink *devlink)
 {
 	ASSERT_DEVLINK_REGISTERED(devlink);
-	/* Make sure that we are in .remove() routine */
-
-	xa_set_mark(&devlinks, devlink->index, DEVLINK_UNREGISTERING);
-	devlink_put(devlink);
-	wait_for_completion(&devlink->comp);
+	devl_assert_locked(devlink);
 
 	devlink_notify_unregister(devlink);
 	xa_clear_mark(&devlinks, devlink->index, DEVLINK_REGISTERED);
-	xa_clear_mark(&devlinks, devlink->index, DEVLINK_UNREGISTERING);
+}
+EXPORT_SYMBOL_GPL(devl_unregister);
+
+void devlink_unregister(struct devlink *devlink)
+{
+	devl_lock(devlink);
+	devl_unregister(devlink);
+	devl_unlock(devlink);
 }
 EXPORT_SYMBOL_GPL(devlink_unregister);
 
@@ -246,7 +239,6 @@ struct devlink *devlink_alloc_ns(const struct devlink_ops *ops,
 	mutex_init(&devlink->reporters_lock);
 	mutex_init(&devlink->linecards_lock);
 	refcount_set(&devlink->refcount, 1);
-	init_completion(&devlink->comp);
 
 	return devlink;
 
@@ -292,7 +284,7 @@ void devlink_free(struct devlink *devlink)
 
 	xa_erase(&devlinks, devlink->index);
 
-	kfree(devlink);
+	devlink_put(devlink);
 }
 EXPORT_SYMBOL_GPL(devlink_free);
 
diff --git a/net/devlink/devl_internal.h b/net/devlink/devl_internal.h
index 01a00df81d0e..5d2bbe295659 100644
--- a/net/devlink/devl_internal.h
+++ b/net/devlink/devl_internal.h
@@ -12,7 +12,6 @@
 #include <net/net_namespace.h>
 
 #define DEVLINK_REGISTERED XA_MARK_1
-#define DEVLINK_UNREGISTERING XA_MARK_2
 
 #define DEVLINK_RELOAD_STATS_ARRAY_SIZE \
 	(__DEVLINK_RELOAD_LIMIT_MAX * __DEVLINK_RELOAD_ACTION_MAX)
@@ -52,7 +51,6 @@ struct devlink {
 	struct lock_class_key lock_key;
 	u8 reload_failed:1;
 	refcount_t refcount;
-	struct completion comp;
 	struct rcu_head rcu;
 	struct notifier_block netdevice_nb;
 	char priv[] __aligned(NETDEV_ALIGN);
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH net-next 6/9] devlink: don't require setting features before registration
  2023-01-06  6:33 [PATCH net-next 0/9] devlink: remove the wait-for-references on unregister Jakub Kicinski
                   ` (4 preceding siblings ...)
  2023-01-06  6:33 ` [PATCH net-next 5/9] devlink: remove the registration guarantee of references Jakub Kicinski
@ 2023-01-06  6:33 ` Jakub Kicinski
  2023-01-06 12:43   ` Jiri Pirko
  2023-01-06  6:34 ` [PATCH net-next 7/9] devlink: allow registering parameters after the instance Jakub Kicinski
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Jakub Kicinski @ 2023-01-06  6:33 UTC (permalink / raw)
  To: davem; +Cc: netdev, edumazet, pabeni, jacob.e.keller, jiri, Jakub Kicinski

Requiring devlink_set_features() to be run before devlink is
registered is overzealous. devlink_set_features() itself is
a leftover from old workarounds which were trying to prevent
initiating reload before probe was complete.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 net/devlink/core.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/net/devlink/core.c b/net/devlink/core.c
index 7cf0b3efbb2f..a31a317626d7 100644
--- a/net/devlink/core.c
+++ b/net/devlink/core.c
@@ -125,8 +125,6 @@ struct devlink *devlinks_xa_find_get(struct net *net, unsigned long *indexp)
  */
 void devlink_set_features(struct devlink *devlink, u64 features)
 {
-	ASSERT_DEVLINK_NOT_REGISTERED(devlink);
-
 	WARN_ON(features & DEVLINK_F_RELOAD &&
 		!devlink_reload_supported(devlink->ops));
 	devlink->features = features;
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-06  6:33 [PATCH net-next 0/9] devlink: remove the wait-for-references on unregister Jakub Kicinski
                   ` (5 preceding siblings ...)
  2023-01-06  6:33 ` [PATCH net-next 6/9] devlink: don't require setting features before registration Jakub Kicinski
@ 2023-01-06  6:34 ` Jakub Kicinski
  2023-01-06 12:55   ` Jiri Pirko
  2023-01-11 13:21   ` Jiri Pirko
  2023-01-06  6:34 ` [PATCH net-next 8/9] netdevsim: rename a label Jakub Kicinski
                   ` (4 subsequent siblings)
  11 siblings, 2 replies; 45+ messages in thread
From: Jakub Kicinski @ 2023-01-06  6:34 UTC (permalink / raw)
  To: davem; +Cc: netdev, edumazet, pabeni, jacob.e.keller, jiri, Jakub Kicinski

It's most natural to register the instance first and then its
subobjects. Now that we can use the instance lock to protect
the atomicity of all init - it should also be safe.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 net/devlink/leftover.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/net/devlink/leftover.c b/net/devlink/leftover.c
index 491f821c8b77..1e23b2da78cc 100644
--- a/net/devlink/leftover.c
+++ b/net/devlink/leftover.c
@@ -5263,7 +5263,13 @@ static void devlink_param_notify(struct devlink *devlink,
 	WARN_ON(cmd != DEVLINK_CMD_PARAM_NEW && cmd != DEVLINK_CMD_PARAM_DEL &&
 		cmd != DEVLINK_CMD_PORT_PARAM_NEW &&
 		cmd != DEVLINK_CMD_PORT_PARAM_DEL);
-	ASSERT_DEVLINK_REGISTERED(devlink);
+
+	/* devlink_notify_register() / devlink_notify_unregister()
+	 * will replay the notifications if the params are added/removed
+	 * outside of the lifetime of the instance.
+	 */
+	if (!devl_is_registered(devlink))
+		return;
 
 	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
 	if (!msg)
@@ -10915,8 +10921,6 @@ int devlink_params_register(struct devlink *devlink,
 	const struct devlink_param *param = params;
 	int i, err;
 
-	ASSERT_DEVLINK_NOT_REGISTERED(devlink);
-
 	for (i = 0; i < params_count; i++, param++) {
 		err = devlink_param_register(devlink, param);
 		if (err)
@@ -10947,8 +10951,6 @@ void devlink_params_unregister(struct devlink *devlink,
 	const struct devlink_param *param = params;
 	int i;
 
-	ASSERT_DEVLINK_NOT_REGISTERED(devlink);
-
 	for (i = 0; i < params_count; i++, param++)
 		devlink_param_unregister(devlink, param);
 }
@@ -10968,8 +10970,6 @@ int devlink_param_register(struct devlink *devlink,
 {
 	struct devlink_param_item *param_item;
 
-	ASSERT_DEVLINK_NOT_REGISTERED(devlink);
-
 	WARN_ON(devlink_param_verify(param));
 	WARN_ON(devlink_param_find_by_name(&devlink->param_list, param->name));
 
@@ -10985,6 +10985,7 @@ int devlink_param_register(struct devlink *devlink,
 	param_item->param = param;
 
 	list_add_tail(&param_item->list, &devlink->param_list);
+	devlink_param_notify(devlink, 0, param_item, DEVLINK_CMD_PARAM_NEW);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(devlink_param_register);
@@ -10999,11 +11000,10 @@ void devlink_param_unregister(struct devlink *devlink,
 {
 	struct devlink_param_item *param_item;
 
-	ASSERT_DEVLINK_NOT_REGISTERED(devlink);
-
 	param_item =
 		devlink_param_find_by_name(&devlink->param_list, param->name);
 	WARN_ON(!param_item);
+	devlink_param_notify(devlink, 0, param_item, DEVLINK_CMD_PARAM_DEL);
 	list_del(&param_item->list);
 	kfree(param_item);
 }
@@ -11063,8 +11063,6 @@ int devlink_param_driverinit_value_set(struct devlink *devlink, u32 param_id,
 {
 	struct devlink_param_item *param_item;
 
-	ASSERT_DEVLINK_NOT_REGISTERED(devlink);
-
 	param_item = devlink_param_find_by_id(&devlink->param_list, param_id);
 	if (!param_item)
 		return -EINVAL;
@@ -11078,6 +11076,8 @@ int devlink_param_driverinit_value_set(struct devlink *devlink, u32 param_id,
 	else
 		param_item->driverinit_value = init_val;
 	param_item->driverinit_value_valid = true;
+
+	devlink_param_notify(devlink, 0, param_item, DEVLINK_CMD_PARAM_NEW);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(devlink_param_driverinit_value_set);
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH net-next 8/9] netdevsim: rename a label
  2023-01-06  6:33 [PATCH net-next 0/9] devlink: remove the wait-for-references on unregister Jakub Kicinski
                   ` (6 preceding siblings ...)
  2023-01-06  6:34 ` [PATCH net-next 7/9] devlink: allow registering parameters after the instance Jakub Kicinski
@ 2023-01-06  6:34 ` Jakub Kicinski
  2023-01-06 12:56   ` Jiri Pirko
  2023-01-06  6:34 ` [PATCH net-next 9/9] netdevsim: move devlink registration under the instance lock Jakub Kicinski
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Jakub Kicinski @ 2023-01-06  6:34 UTC (permalink / raw)
  To: davem; +Cc: netdev, edumazet, pabeni, jacob.e.keller, jiri, Jakub Kicinski

err_dl_unregister should unregister the devlink instance.
Looks like renaming it was missed in one of the reshufflings.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 drivers/net/netdevsim/dev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
index b962fc8e1397..d25f6e86d901 100644
--- a/drivers/net/netdevsim/dev.c
+++ b/drivers/net/netdevsim/dev.c
@@ -1563,7 +1563,7 @@ int nsim_drv_probe(struct nsim_bus_dev *nsim_bus_dev)
 	err = devlink_params_register(devlink, nsim_devlink_params,
 				      ARRAY_SIZE(nsim_devlink_params));
 	if (err)
-		goto err_dl_unregister;
+		goto err_resource_unregister;
 	nsim_devlink_set_params_init_values(nsim_dev, devlink);
 
 	err = nsim_dev_dummy_region_init(nsim_dev, devlink);
@@ -1629,7 +1629,7 @@ int nsim_drv_probe(struct nsim_bus_dev *nsim_bus_dev)
 err_params_unregister:
 	devlink_params_unregister(devlink, nsim_devlink_params,
 				  ARRAY_SIZE(nsim_devlink_params));
-err_dl_unregister:
+err_resource_unregister:
 	devl_resources_unregister(devlink);
 err_vfc_free:
 	kfree(nsim_dev->vfconfigs);
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH net-next 9/9] netdevsim: move devlink registration under the instance lock
  2023-01-06  6:33 [PATCH net-next 0/9] devlink: remove the wait-for-references on unregister Jakub Kicinski
                   ` (7 preceding siblings ...)
  2023-01-06  6:34 ` [PATCH net-next 8/9] netdevsim: rename a label Jakub Kicinski
@ 2023-01-06  6:34 ` Jakub Kicinski
  2023-01-06 15:49   ` Jiri Pirko
  2023-01-06 13:10 ` [PATCH net-next 0/9] devlink: remove the wait-for-references on unregister patchwork-bot+netdevbpf
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Jakub Kicinski @ 2023-01-06  6:34 UTC (permalink / raw)
  To: davem; +Cc: netdev, edumazet, pabeni, jacob.e.keller, jiri, Jakub Kicinski

To prevent races with netdev code accessing free devlink instances
move the registration under the devlink instance lock.
Core now waits for the instance to be registered before accessing it.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 drivers/net/netdevsim/dev.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
index d25f6e86d901..738784fda117 100644
--- a/drivers/net/netdevsim/dev.c
+++ b/drivers/net/netdevsim/dev.c
@@ -1556,10 +1556,14 @@ int nsim_drv_probe(struct nsim_bus_dev *nsim_bus_dev)
 		goto err_devlink_unlock;
 	}
 
-	err = nsim_dev_resources_register(devlink);
+	err = devl_register(devlink);
 	if (err)
 		goto err_vfc_free;
 
+	err = nsim_dev_resources_register(devlink);
+	if (err)
+		goto err_dl_unregister;
+
 	err = devlink_params_register(devlink, nsim_devlink_params,
 				      ARRAY_SIZE(nsim_devlink_params));
 	if (err)
@@ -1607,7 +1611,6 @@ int nsim_drv_probe(struct nsim_bus_dev *nsim_bus_dev)
 	nsim_dev->esw_mode = DEVLINK_ESWITCH_MODE_LEGACY;
 	devlink_set_features(devlink, DEVLINK_F_RELOAD);
 	devl_unlock(devlink);
-	devlink_register(devlink);
 	return 0;
 
 err_hwstats_exit:
@@ -1631,6 +1634,8 @@ int nsim_drv_probe(struct nsim_bus_dev *nsim_bus_dev)
 				  ARRAY_SIZE(nsim_devlink_params));
 err_resource_unregister:
 	devl_resources_unregister(devlink);
+err_dl_unregister:
+	devl_unregister(devlink);
 err_vfc_free:
 	kfree(nsim_dev->vfconfigs);
 err_devlink_unlock:
@@ -1668,7 +1673,6 @@ void nsim_drv_remove(struct nsim_bus_dev *nsim_bus_dev)
 	struct nsim_dev *nsim_dev = dev_get_drvdata(&nsim_bus_dev->dev);
 	struct devlink *devlink = priv_to_devlink(nsim_dev);
 
-	devlink_unregister(devlink);
 	devl_lock(devlink);
 	nsim_dev_reload_destroy(nsim_dev);
 
@@ -1677,6 +1681,7 @@ void nsim_drv_remove(struct nsim_bus_dev *nsim_bus_dev)
 	devlink_params_unregister(devlink, nsim_devlink_params,
 				  ARRAY_SIZE(nsim_devlink_params));
 	devl_resources_unregister(devlink);
+	devl_unregister(devlink);
 	kfree(nsim_dev->vfconfigs);
 	kfree(nsim_dev->fa_cookie);
 	devl_unlock(devlink);
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 1/9] devlink: bump the instance index directly when iterating
  2023-01-06  6:33 ` [PATCH net-next 1/9] devlink: bump the instance index directly when iterating Jakub Kicinski
@ 2023-01-06 12:17   ` Jiri Pirko
  0 siblings, 0 replies; 45+ messages in thread
From: Jiri Pirko @ 2023-01-06 12:17 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, edumazet, pabeni, jacob.e.keller

Fri, Jan 06, 2023 at 07:33:54AM CET, kuba@kernel.org wrote:
>xa_find_after() is designed to handle multi-index entries correctly.
>If a xarray has two entries one which spans indexes 0-3 and one at
>index 4 xa_find_after(0) will return the entry at index 4.
>
>Having to juggle the two callbacks, however, is unnecessary in case
>of the devlink xarray, as there is 1:1 relationship with indexes.
>
>Always use xa_find() and increment the index manually.
>
>Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Reviewed-by: Jiri Pirko <jiri@nvidia.com>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 3/9] devlink: protect devlink->dev by the instance lock
  2023-01-06  6:33 ` [PATCH net-next 3/9] devlink: protect devlink->dev by the instance lock Jakub Kicinski
@ 2023-01-06 12:18   ` Jiri Pirko
  0 siblings, 0 replies; 45+ messages in thread
From: Jiri Pirko @ 2023-01-06 12:18 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, edumazet, pabeni, jacob.e.keller

Fri, Jan 06, 2023 at 07:33:56AM CET, kuba@kernel.org wrote:
>devlink->dev is assumed to be always valid as long as any
>outstanding reference to the devlink instance exists.
>
>In prep for weakening of the references take the instance lock.
>
>Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Reviewed-by: Jiri Pirko <jiri@nvidia.com>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 4/9] devlink: always check if the devlink instance is registered
  2023-01-06  6:33 ` [PATCH net-next 4/9] devlink: always check if the devlink instance is registered Jakub Kicinski
@ 2023-01-06 12:41   ` Jiri Pirko
  2023-01-06 17:03   ` Jacob Keller
  1 sibling, 0 replies; 45+ messages in thread
From: Jiri Pirko @ 2023-01-06 12:41 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, edumazet, pabeni, jacob.e.keller

Fri, Jan 06, 2023 at 07:33:57AM CET, kuba@kernel.org wrote:
>Always check under the instance lock whether the devlink instance
>is still / already registered.
>
>This is a no-op for the most part, as the unregistration path currently
>waits for all references. On the init path, however, we may temporarily
>open up a race with netdev code, if netdevs are registered before the
>devlink instance. This is temporary, the next change fixes it, and this
>commit has been split out for the ease of review.
>
>Note that in case of iterating over sub-objects which have their
>own lock (regions and line cards) we assume an implicit dependency
>between those objects existing and devlink unregistration.
>
>Signed-off-by: Jakub Kicinski <kuba@kernel.org>
>---
> net/devlink/core.c          | 19 +++++++++++++++----
> net/devlink/devl_internal.h |  8 ++++++++
> net/devlink/leftover.c      | 35 +++++++++++++++++++++++++++++------
> net/devlink/netlink.c       | 10 ++++++++--
> 4 files changed, 60 insertions(+), 12 deletions(-)
>
>diff --git a/net/devlink/core.c b/net/devlink/core.c
>index d3b8336946fd..c53c996edf1d 100644
>--- a/net/devlink/core.c
>+++ b/net/devlink/core.c
>@@ -67,6 +67,15 @@ void devl_unlock(struct devlink *devlink)
> }
> EXPORT_SYMBOL_GPL(devl_unlock);
> 
>+/**
>+ * devlink_try_get() - try to obtain a reference on a devlink instance
>+ * @devlink: instance to reference
>+ *
>+ * Obtain a reference on a devlink instance. A reference on a devlink instance
>+ * only implies that it's safe to take the instance lock. It does not imply
>+ * that the instance is registered, use devl_is_registered() after taking
>+ * the instance lock to check registration status.
>+ */
> struct devlink *__must_check devlink_try_get(struct devlink *devlink)
> {
> 	if (refcount_inc_not_zero(&devlink->refcount))
>@@ -300,10 +309,12 @@ static void __net_exit devlink_pernet_pre_exit(struct net *net)
> 	devlinks_xa_for_each_registered_get(net, index, devlink) {
> 		WARN_ON(!(devlink->features & DEVLINK_F_RELOAD));
> 		devl_lock(devlink);
>-		err = devlink_reload(devlink, &init_net,
>-				     DEVLINK_RELOAD_ACTION_DRIVER_REINIT,
>-				     DEVLINK_RELOAD_LIMIT_UNSPEC,
>-				     &actions_performed, NULL);
>+		err = 0;
>+		if (devl_is_registered(devlink))
>+			err = devlink_reload(devlink, &init_net,
>+					     DEVLINK_RELOAD_ACTION_DRIVER_REINIT,
>+					     DEVLINK_RELOAD_LIMIT_UNSPEC,
>+					     &actions_performed, NULL);
> 		devl_unlock(devlink);
> 		devlink_put(devlink);
> 
>diff --git a/net/devlink/devl_internal.h b/net/devlink/devl_internal.h
>index 6342552e5f99..01a00df81d0e 100644
>--- a/net/devlink/devl_internal.h
>+++ b/net/devlink/devl_internal.h
>@@ -86,6 +86,14 @@ extern struct genl_family devlink_nl_family;
> 
> struct devlink *devlinks_xa_find_get(struct net *net, unsigned long *indexp);
> 
>+static inline bool devl_is_registered(struct devlink *devlink)
>+{
>+	/* To prevent races the caller must hold the instance lock
>+	 * or another lock taken during unregistration.
>+	 */
>+	return xa_get_mark(&devlinks, devlink->index, DEVLINK_REGISTERED);
>+}
>+
> /* Netlink */
> #define DEVLINK_NL_FLAG_NEED_PORT		BIT(0)
> #define DEVLINK_NL_FLAG_NEED_DEVLINK_OR_PORT	BIT(1)
>diff --git a/net/devlink/leftover.c b/net/devlink/leftover.c
>index bec408da4dbe..491f821c8b77 100644
>--- a/net/devlink/leftover.c
>+++ b/net/devlink/leftover.c
>@@ -2130,6 +2130,9 @@ static int devlink_nl_cmd_linecard_get_dumpit(struct sk_buff *msg,
> 		int idx = 0;
> 
> 		mutex_lock(&devlink->linecards_lock);
>+		if (!devl_is_registered(devlink))
>+			goto next_devlink;
>+
> 		list_for_each_entry(linecard, &devlink->linecard_list, list) {
> 			if (idx < state->idx) {
> 				idx++;
>@@ -2151,6 +2154,7 @@ static int devlink_nl_cmd_linecard_get_dumpit(struct sk_buff *msg,
> 			}
> 			idx++;
> 		}
>+next_devlink:
> 		mutex_unlock(&devlink->linecards_lock);
> 		devlink_put(devlink);
> 	}
>@@ -7809,6 +7813,12 @@ devlink_nl_cmd_health_reporter_get_dumpit(struct sk_buff *msg,
> 		int idx = 0;
> 
> 		mutex_lock(&devlink->reporters_lock);
>+		if (!devl_is_registered(devlink)) {

Good. I have patchset to remove this and linecard lock prepared. That
makes things smoother.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>



>+			mutex_unlock(&devlink->reporters_lock);
>+			devlink_put(devlink);
>+			continue;
>+		}
>+
> 		list_for_each_entry(reporter, &devlink->reporter_list,
> 				    list) {
> 			if (idx < state->idx) {
>@@ -7830,6 +7840,9 @@ devlink_nl_cmd_health_reporter_get_dumpit(struct sk_buff *msg,
> 		mutex_unlock(&devlink->reporters_lock);
> 
> 		devl_lock(devlink);
>+		if (!devl_is_registered(devlink))
>+			goto next_devlink;
>+
> 		xa_for_each(&devlink->ports, port_index, port) {
> 			mutex_lock(&port->reporters_lock);
> 			list_for_each_entry(reporter, &port->reporter_list, list) {
>@@ -7853,6 +7866,7 @@ devlink_nl_cmd_health_reporter_get_dumpit(struct sk_buff *msg,
> 			}
> 			mutex_unlock(&port->reporters_lock);
> 		}
>+next_devlink:
> 		devl_unlock(devlink);
> 		devlink_put(devlink);
> 	}
>@@ -12218,7 +12232,8 @@ void devlink_compat_running_version(struct devlink *devlink,
> 		return;
> 
> 	devl_lock(devlink);
>-	__devlink_compat_running_version(devlink, buf, len);
>+	if (devl_is_registered(devlink))
>+		__devlink_compat_running_version(devlink, buf, len);
> 	devl_unlock(devlink);
> }
> 
>@@ -12227,20 +12242,28 @@ int devlink_compat_flash_update(struct devlink *devlink, const char *file_name)
> 	struct devlink_flash_update_params params = {};
> 	int ret;
> 
>-	if (!devlink->ops->flash_update)
>-		return -EOPNOTSUPP;
>+	devl_lock(devlink);
>+	if (!devl_is_registered(devlink)) {
>+		ret = -ENODEV;
>+		goto out_unlock;
>+	}
>+
>+	if (!devlink->ops->flash_update) {
>+		ret = -EOPNOTSUPP;
>+		goto out_unlock;
>+	}
> 
> 	ret = request_firmware(&params.fw, file_name, devlink->dev);
> 	if (ret)
>-		return ret;
>+		goto out_unlock;
> 
>-	devl_lock(devlink);
> 	devlink_flash_update_begin_notify(devlink);
> 	ret = devlink->ops->flash_update(devlink, &params, NULL);
> 	devlink_flash_update_end_notify(devlink);
>-	devl_unlock(devlink);
> 
> 	release_firmware(params.fw);
>+out_unlock:
>+	devl_unlock(devlink);
> 
> 	return ret;
> }
>diff --git a/net/devlink/netlink.c b/net/devlink/netlink.c
>index 69111746f5d9..b5b8ac6db2d1 100644
>--- a/net/devlink/netlink.c
>+++ b/net/devlink/netlink.c
>@@ -98,7 +98,8 @@ devlink_get_from_attrs_lock(struct net *net, struct nlattr **attrs)
> 
> 	devlinks_xa_for_each_registered_get(net, index, devlink) {
> 		devl_lock(devlink);
>-		if (strcmp(devlink->dev->bus->name, busname) == 0 &&
>+		if (devl_is_registered(devlink) &&
>+		    strcmp(devlink->dev->bus->name, busname) == 0 &&
> 		    strcmp(dev_name(devlink->dev), devname) == 0)
> 			return devlink;
> 		devl_unlock(devlink);
>@@ -211,7 +212,12 @@ int devlink_nl_instance_iter_dump(struct sk_buff *msg,
> 
> 	devlink_dump_for_each_instance_get(msg, state, devlink) {
> 		devl_lock(devlink);
>-		err = cmd->dump_one(msg, devlink, cb);
>+
>+		if (devl_is_registered(devlink))
>+			err = cmd->dump_one(msg, devlink, cb);
>+		else
>+			err = 0;
>+
> 		devl_unlock(devlink);
> 		devlink_put(devlink);
> 
>-- 
>2.38.1
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 5/9] devlink: remove the registration guarantee of references
  2023-01-06  6:33 ` [PATCH net-next 5/9] devlink: remove the registration guarantee of references Jakub Kicinski
@ 2023-01-06 12:42   ` Jiri Pirko
  0 siblings, 0 replies; 45+ messages in thread
From: Jiri Pirko @ 2023-01-06 12:42 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, edumazet, pabeni, jacob.e.keller

Fri, Jan 06, 2023 at 07:33:58AM CET, kuba@kernel.org wrote:
>The objective of exposing the devlink instance locks to
>drivers was to let them use these locks to prevent user space
>from accessing the device before it's fully initialized.
>This is difficult because devlink_unregister() waits for all
>references to be released, meaning that devlink_unregister()
>can't itself be called under the instance lock.
>
>To avoid this issue devlink_register() was moved after subobject
>registration a while ago. Unfortunately the netdev paths get
>a hold of the devlink instances _before_ they are registered.
>Ideally netdev should wait for devlink init to finish (synchronizing
>on the instance lock). This can't work because we don't know if the
>instance will _ever_ be registered (in case of failures it may not).
>The other option of returning an error until devlink_register()
>is called is unappealing (user space would get a notification
>netdev exist but would have to wait arbitrary amount of time
>before accessing some of its attributes).
>
>Weaken the guarantees of the devlink references.
>
>Holding a reference will now only guarantee that the memory
>of the object is around. Another way of looking at it is that
>the reference now protects the object not its "registered" status.
>Use devlink instance lock to synchronize unregistration.
>
>This implies that releasing of the "main" reference of the devlink
>instance moves from devlink_unregister() to devlink_free().
>
>Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
>Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Reviewed-by: Jiri Pirko <jiri@nvidia.com>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 6/9] devlink: don't require setting features before registration
  2023-01-06  6:33 ` [PATCH net-next 6/9] devlink: don't require setting features before registration Jakub Kicinski
@ 2023-01-06 12:43   ` Jiri Pirko
  0 siblings, 0 replies; 45+ messages in thread
From: Jiri Pirko @ 2023-01-06 12:43 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, edumazet, pabeni, jacob.e.keller

Fri, Jan 06, 2023 at 07:33:59AM CET, kuba@kernel.org wrote:
>Requiring devlink_set_features() to be run before devlink is
>registered is overzealous. devlink_set_features() itself is
>a leftover from old workarounds which were trying to prevent
>initiating reload before probe was complete.
>
>Signed-off-by: Jakub Kicinski <kuba@kernel.org>

I have a patch prepared to remove this entirely. Until then, this is
fine.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-06  6:34 ` [PATCH net-next 7/9] devlink: allow registering parameters after the instance Jakub Kicinski
@ 2023-01-06 12:55   ` Jiri Pirko
  2023-01-06 21:22     ` Jakub Kicinski
  2023-01-11 13:21   ` Jiri Pirko
  1 sibling, 1 reply; 45+ messages in thread
From: Jiri Pirko @ 2023-01-06 12:55 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, edumazet, pabeni, jacob.e.keller

Fri, Jan 06, 2023 at 07:34:00AM CET, kuba@kernel.org wrote:
>It's most natural to register the instance first and then its
>subobjects. Now that we can use the instance lock to protect
>the atomicity of all init - it should also be safe.
>
>Signed-off-by: Jakub Kicinski <kuba@kernel.org>
>---
> net/devlink/leftover.c | 22 +++++++++++-----------
> 1 file changed, 11 insertions(+), 11 deletions(-)
>
>diff --git a/net/devlink/leftover.c b/net/devlink/leftover.c
>index 491f821c8b77..1e23b2da78cc 100644
>--- a/net/devlink/leftover.c
>+++ b/net/devlink/leftover.c
>@@ -5263,7 +5263,13 @@ static void devlink_param_notify(struct devlink *devlink,
> 	WARN_ON(cmd != DEVLINK_CMD_PARAM_NEW && cmd != DEVLINK_CMD_PARAM_DEL &&
> 		cmd != DEVLINK_CMD_PORT_PARAM_NEW &&
> 		cmd != DEVLINK_CMD_PORT_PARAM_DEL);
>-	ASSERT_DEVLINK_REGISTERED(devlink);
>+
>+	/* devlink_notify_register() / devlink_notify_unregister()
>+	 * will replay the notifications if the params are added/removed
>+	 * outside of the lifetime of the instance.
>+	 */
>+	if (!devl_is_registered(devlink))
>+		return;

This helper would be nice to use on other places as well.
Like devlink_trap_group_notify(), devlink_trap_notify() and others. I
will take care of that in a follow-up.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 8/9] netdevsim: rename a label
  2023-01-06  6:34 ` [PATCH net-next 8/9] netdevsim: rename a label Jakub Kicinski
@ 2023-01-06 12:56   ` Jiri Pirko
  0 siblings, 0 replies; 45+ messages in thread
From: Jiri Pirko @ 2023-01-06 12:56 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, edumazet, pabeni, jacob.e.keller

Fri, Jan 06, 2023 at 07:34:01AM CET, kuba@kernel.org wrote:
>err_dl_unregister should unregister the devlink instance.
>Looks like renaming it was missed in one of the reshufflings.
>
>Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
>Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Reviewed-by: Jiri Pirko <jiri@nvidia.com>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 0/9] devlink: remove the wait-for-references on unregister
  2023-01-06  6:33 [PATCH net-next 0/9] devlink: remove the wait-for-references on unregister Jakub Kicinski
                   ` (8 preceding siblings ...)
  2023-01-06  6:34 ` [PATCH net-next 9/9] netdevsim: move devlink registration under the instance lock Jakub Kicinski
@ 2023-01-06 13:10 ` patchwork-bot+netdevbpf
  2023-01-06 15:49 ` Jiri Pirko
  2023-01-06 17:06 ` Jacob Keller
  11 siblings, 0 replies; 45+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-01-06 13:10 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, edumazet, pabeni, jacob.e.keller, jiri

Hello:

This series was applied to netdev/net-next.git (master)
by David S. Miller <davem@davemloft.net>:

On Thu,  5 Jan 2023 22:33:53 -0800 you wrote:
> Move the registration and unregistration of the devlink instances
> under their instance locks. Don't perform the netdev-style wait
> for all references when unregistering the instance.
> 
> Instead the devlink instance refcount will only ensure that
> the memory of the instance is not freed. All places which acquire
> access to devlink instances via a reference must check that the
> instance is still registered under the instance lock.
> 
> [...]

Here is the summary with links:
  - [net-next,1/9] devlink: bump the instance index directly when iterating
    https://git.kernel.org/netdev/net-next/c/d77278196441
  - [net-next,2/9] devlink: update the code in netns move to latest helpers
    https://git.kernel.org/netdev/net-next/c/7a54a5195b2a
  - [net-next,3/9] devlink: protect devlink->dev by the instance lock
    https://git.kernel.org/netdev/net-next/c/870c7ad4a52b
  - [net-next,4/9] devlink: always check if the devlink instance is registered
    https://git.kernel.org/netdev/net-next/c/ed539ba614a0
  - [net-next,5/9] devlink: remove the registration guarantee of references
    https://git.kernel.org/netdev/net-next/c/9053637e0da7
  - [net-next,6/9] devlink: don't require setting features before registration
    https://git.kernel.org/netdev/net-next/c/6ef8f7da9275
  - [net-next,7/9] devlink: allow registering parameters after the instance
    https://git.kernel.org/netdev/net-next/c/1d18bb1a4ddd
  - [net-next,8/9] netdevsim: rename a label
    https://git.kernel.org/netdev/net-next/c/5c5ea1d09fd8
  - [net-next,9/9] netdevsim: move devlink registration under the instance lock
    https://git.kernel.org/netdev/net-next/c/82a3aef2e6af

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 9/9] netdevsim: move devlink registration under the instance lock
  2023-01-06  6:34 ` [PATCH net-next 9/9] netdevsim: move devlink registration under the instance lock Jakub Kicinski
@ 2023-01-06 15:49   ` Jiri Pirko
  0 siblings, 0 replies; 45+ messages in thread
From: Jiri Pirko @ 2023-01-06 15:49 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, edumazet, pabeni, jacob.e.keller

Fri, Jan 06, 2023 at 07:34:02AM CET, kuba@kernel.org wrote:
>To prevent races with netdev code accessing free devlink instances
>move the registration under the devlink instance lock.
>Core now waits for the instance to be registered before accessing it.

This sentense sounds a bit confusing to me. "to be registered" does not
sound correct.

"Core now waits for the instance lock to be unlocked before executing
 access the instance" perhaps would be more accurate?


>
>Signed-off-by: Jakub Kicinski <kuba@kernel.org>

One way or another. Code looks fine:
Reviewed-by: Jiri Pirko <jiri@nvidia.com>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 0/9] devlink: remove the wait-for-references on unregister
  2023-01-06  6:33 [PATCH net-next 0/9] devlink: remove the wait-for-references on unregister Jakub Kicinski
                   ` (9 preceding siblings ...)
  2023-01-06 13:10 ` [PATCH net-next 0/9] devlink: remove the wait-for-references on unregister patchwork-bot+netdevbpf
@ 2023-01-06 15:49 ` Jiri Pirko
  2023-01-06 17:06 ` Jacob Keller
  11 siblings, 0 replies; 45+ messages in thread
From: Jiri Pirko @ 2023-01-06 15:49 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, edumazet, pabeni, jacob.e.keller

Fri, Jan 06, 2023 at 07:33:53AM CET, kuba@kernel.org wrote:
>Move the registration and unregistration of the devlink instances
>under their instance locks. Don't perform the netdev-style wait
>for all references when unregistering the instance.
>
>Instead the devlink instance refcount will only ensure that
>the memory of the instance is not freed. All places which acquire
>access to devlink instances via a reference must check that the
>instance is still registered under the instance lock.
>
>This fixes the problem of the netdev code accessing devlink
>instances before they are registered.

Nice work. Thanks!

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 4/9] devlink: always check if the devlink instance is registered
  2023-01-06  6:33 ` [PATCH net-next 4/9] devlink: always check if the devlink instance is registered Jakub Kicinski
  2023-01-06 12:41   ` Jiri Pirko
@ 2023-01-06 17:03   ` Jacob Keller
  2023-01-06 21:19     ` Jakub Kicinski
  1 sibling, 1 reply; 45+ messages in thread
From: Jacob Keller @ 2023-01-06 17:03 UTC (permalink / raw)
  To: Jakub Kicinski, davem; +Cc: netdev, edumazet, pabeni, jiri



On 1/5/2023 10:33 PM, Jakub Kicinski wrote:
> Always check under the instance lock whether the devlink instance
> is still / already registered.
> 
> This is a no-op for the most part, as the unregistration path currently
> waits for all references. On the init path, however, we may temporarily
> open up a race with netdev code, if netdevs are registered before the
> devlink instance. This is temporary, the next change fixes it, and this
> commit has been split out for the ease of review.
> 
> Note that in case of iterating over sub-objects which have their
> own lock (regions and line cards) we assume an implicit dependency
> between those objects existing and devlink unregistration.
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
>  net/devlink/core.c          | 19 +++++++++++++++----
>  net/devlink/devl_internal.h |  8 ++++++++
>  net/devlink/leftover.c      | 35 +++++++++++++++++++++++++++++------
>  net/devlink/netlink.c       | 10 ++++++++--
>  4 files changed, 60 insertions(+), 12 deletions(-)
> 
> diff --git a/net/devlink/core.c b/net/devlink/core.c
> index d3b8336946fd..c53c996edf1d 100644
> --- a/net/devlink/core.c
> +++ b/net/devlink/core.c
> @@ -67,6 +67,15 @@ void devl_unlock(struct devlink *devlink)
>  }
>  EXPORT_SYMBOL_GPL(devl_unlock);
>  
> +/**
> + * devlink_try_get() - try to obtain a reference on a devlink instance
> + * @devlink: instance to reference
> + *
> + * Obtain a reference on a devlink instance. A reference on a devlink instance
> + * only implies that it's safe to take the instance lock. It does not imply
> + * that the instance is registered, use devl_is_registered() after taking
> + * the instance lock to check registration status.
> + */
>  struct devlink *__must_check devlink_try_get(struct devlink *devlink)
>  {
>  	if (refcount_inc_not_zero(&devlink->refcount))
> @@ -300,10 +309,12 @@ static void __net_exit devlink_pernet_pre_exit(struct net *net)
>  	devlinks_xa_for_each_registered_get(net, index, devlink) {
>  		WARN_ON(!(devlink->features & DEVLINK_F_RELOAD));
>  		devl_lock(devlink);
> -		err = devlink_reload(devlink, &init_net,
> -				     DEVLINK_RELOAD_ACTION_DRIVER_REINIT,
> -				     DEVLINK_RELOAD_LIMIT_UNSPEC,
> -				     &actions_performed, NULL);
> +		err = 0;
> +		if (devl_is_registered(devlink))
> +			err = devlink_reload(devlink, &init_net,
> +					     DEVLINK_RELOAD_ACTION_DRIVER_REINIT,
> +					     DEVLINK_RELOAD_LIMIT_UNSPEC,
> +					     &actions_performed, NULL);
>  		devl_unlock(devlink);
>  		devlink_put(devlink);
>  
> diff --git a/net/devlink/devl_internal.h b/net/devlink/devl_internal.h
> index 6342552e5f99..01a00df81d0e 100644
> --- a/net/devlink/devl_internal.h
> +++ b/net/devlink/devl_internal.h
> @@ -86,6 +86,14 @@ extern struct genl_family devlink_nl_family;
>  
>  struct devlink *devlinks_xa_find_get(struct net *net, unsigned long *indexp);
>  
> +static inline bool devl_is_registered(struct devlink *devlink)
> +{
> +	/* To prevent races the caller must hold the instance lock
> +	 * or another lock taken during unregistration.
> +	 */

Why not just lockdep_assert here on the instance lock? I guess this
comment implies that another lock could be used instead but it seems
weird to allow that? I guess because of things like the linecards_lock
as opposed to the instance lock?

Thanks,
Jake

> +	return xa_get_mark(&devlinks, devlink->index, DEVLINK_REGISTERED);
> +}
> +
>  /* Netlink */
>  #define DEVLINK_NL_FLAG_NEED_PORT		BIT(0)
>  #define DEVLINK_NL_FLAG_NEED_DEVLINK_OR_PORT	BIT(1)
> diff --git a/net/devlink/leftover.c b/net/devlink/leftover.c
> index bec408da4dbe..491f821c8b77 100644
> --- a/net/devlink/leftover.c
> +++ b/net/devlink/leftover.c
> @@ -2130,6 +2130,9 @@ static int devlink_nl_cmd_linecard_get_dumpit(struct sk_buff *msg,
>  		int idx = 0;
>  
>  		mutex_lock(&devlink->linecards_lock);
> +		if (!devl_is_registered(devlink))
> +			goto next_devlink;
> +
>  		list_for_each_entry(linecard, &devlink->linecard_list, list) {
>  			if (idx < state->idx) {
>  				idx++;
> @@ -2151,6 +2154,7 @@ static int devlink_nl_cmd_linecard_get_dumpit(struct sk_buff *msg,
>  			}
>  			idx++;
>  		}
> +next_devlink:
>  		mutex_unlock(&devlink->linecards_lock);
>  		devlink_put(devlink);
>  	}
> @@ -7809,6 +7813,12 @@ devlink_nl_cmd_health_reporter_get_dumpit(struct sk_buff *msg,
>  		int idx = 0;
>  
>  		mutex_lock(&devlink->reporters_lock);
> +		if (!devl_is_registered(devlink)) {
> +			mutex_unlock(&devlink->reporters_lock);
> +			devlink_put(devlink);
> +			continue;
> +		}
> +
>  		list_for_each_entry(reporter, &devlink->reporter_list,
>  				    list) {
>  			if (idx < state->idx) {
> @@ -7830,6 +7840,9 @@ devlink_nl_cmd_health_reporter_get_dumpit(struct sk_buff *msg,
>  		mutex_unlock(&devlink->reporters_lock);
>  
>  		devl_lock(devlink);
> +		if (!devl_is_registered(devlink))
> +			goto next_devlink;
> +
>  		xa_for_each(&devlink->ports, port_index, port) {
>  			mutex_lock(&port->reporters_lock);
>  			list_for_each_entry(reporter, &port->reporter_list, list) {
> @@ -7853,6 +7866,7 @@ devlink_nl_cmd_health_reporter_get_dumpit(struct sk_buff *msg,
>  			}
>  			mutex_unlock(&port->reporters_lock);
>  		}
> +next_devlink:
>  		devl_unlock(devlink);
>  		devlink_put(devlink);
>  	}
> @@ -12218,7 +12232,8 @@ void devlink_compat_running_version(struct devlink *devlink,
>  		return;
>  
>  	devl_lock(devlink);
> -	__devlink_compat_running_version(devlink, buf, len);
> +	if (devl_is_registered(devlink))
> +		__devlink_compat_running_version(devlink, buf, len);
>  	devl_unlock(devlink);
>  }
>  
> @@ -12227,20 +12242,28 @@ int devlink_compat_flash_update(struct devlink *devlink, const char *file_name)
>  	struct devlink_flash_update_params params = {};
>  	int ret;
>  
> -	if (!devlink->ops->flash_update)
> -		return -EOPNOTSUPP;
> +	devl_lock(devlink);
> +	if (!devl_is_registered(devlink)) {
> +		ret = -ENODEV;
> +		goto out_unlock;
> +	}
> +
> +	if (!devlink->ops->flash_update) {
> +		ret = -EOPNOTSUPP;
> +		goto out_unlock;
> +	}
>  
>  	ret = request_firmware(&params.fw, file_name, devlink->dev);
>  	if (ret)
> -		return ret;
> +		goto out_unlock;
>  
> -	devl_lock(devlink);
>  	devlink_flash_update_begin_notify(devlink);
>  	ret = devlink->ops->flash_update(devlink, &params, NULL);
>  	devlink_flash_update_end_notify(devlink);
> -	devl_unlock(devlink);
>  
>  	release_firmware(params.fw);
> +out_unlock:
> +	devl_unlock(devlink);
>  
>  	return ret;
>  }
> diff --git a/net/devlink/netlink.c b/net/devlink/netlink.c
> index 69111746f5d9..b5b8ac6db2d1 100644
> --- a/net/devlink/netlink.c
> +++ b/net/devlink/netlink.c
> @@ -98,7 +98,8 @@ devlink_get_from_attrs_lock(struct net *net, struct nlattr **attrs)
>  
>  	devlinks_xa_for_each_registered_get(net, index, devlink) {
>  		devl_lock(devlink);
> -		if (strcmp(devlink->dev->bus->name, busname) == 0 &&
> +		if (devl_is_registered(devlink) &&
> +		    strcmp(devlink->dev->bus->name, busname) == 0 &&
>  		    strcmp(dev_name(devlink->dev), devname) == 0)
>  			return devlink;
>  		devl_unlock(devlink);
> @@ -211,7 +212,12 @@ int devlink_nl_instance_iter_dump(struct sk_buff *msg,
>  
>  	devlink_dump_for_each_instance_get(msg, state, devlink) {
>  		devl_lock(devlink);
> -		err = cmd->dump_one(msg, devlink, cb);
> +
> +		if (devl_is_registered(devlink))
> +			err = cmd->dump_one(msg, devlink, cb);
> +		else
> +			err = 0;
> +
>  		devl_unlock(devlink);
>  		devlink_put(devlink);
>  

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 0/9] devlink: remove the wait-for-references on unregister
  2023-01-06  6:33 [PATCH net-next 0/9] devlink: remove the wait-for-references on unregister Jakub Kicinski
                   ` (10 preceding siblings ...)
  2023-01-06 15:49 ` Jiri Pirko
@ 2023-01-06 17:06 ` Jacob Keller
  11 siblings, 0 replies; 45+ messages in thread
From: Jacob Keller @ 2023-01-06 17:06 UTC (permalink / raw)
  To: Jakub Kicinski, davem; +Cc: netdev, edumazet, pabeni, jiri



On 1/5/2023 10:33 PM, Jakub Kicinski wrote:
> Move the registration and unregistration of the devlink instances
> under their instance locks. Don't perform the netdev-style wait
> for all references when unregistering the instance.
> 
> Instead the devlink instance refcount will only ensure that
> the memory of the instance is not freed. All places which acquire
> access to devlink instances via a reference must check that the
> instance is still registered under the instance lock.
> 
> This fixes the problem of the netdev code accessing devlink
> instances before they are registered.
> 
> RFC: https://lore.kernel.org/all/20221217011953.152487-1-kuba@kernel.org/
>  - rewrite the cover letter
>  - rewrite the commit message for patch 1
>  - un-export and rename devl_is_alive
>  - squash the netdevsim patches
> 
> Jakub Kicinski (9):
>   devlink: bump the instance index directly when iterating
>   devlink: update the code in netns move to latest helpers
>   devlink: protect devlink->dev by the instance lock
>   devlink: always check if the devlink instance is registered
>   devlink: remove the registration guarantee of references
>   devlink: don't require setting features before registration
>   devlink: allow registering parameters after the instance
>   netdevsim: rename a label
>   netdevsim: move devlink registration under the instance lock
> 
>  drivers/net/netdevsim/dev.c |  15 +++--
>  include/net/devlink.h       |   2 +
>  net/devlink/core.c          | 121 ++++++++++++++++--------------------
>  net/devlink/devl_internal.h |  28 ++++-----
>  net/devlink/leftover.c      |  64 ++++++++++++-------
>  net/devlink/netlink.c       |  19 ++++--
>  6 files changed, 137 insertions(+), 112 deletions(-)
> 

The whole series looks good to me. It looks like Jiri also has some
planned followups that will clean some of this up even more, which is great.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 4/9] devlink: always check if the devlink instance is registered
  2023-01-06 17:03   ` Jacob Keller
@ 2023-01-06 21:19     ` Jakub Kicinski
  2023-01-07  9:05       ` Jiri Pirko
  0 siblings, 1 reply; 45+ messages in thread
From: Jakub Kicinski @ 2023-01-06 21:19 UTC (permalink / raw)
  To: Jacob Keller; +Cc: davem, netdev, edumazet, pabeni, jiri

On Fri, 6 Jan 2023 09:03:18 -0800 Jacob Keller wrote:
> > +static inline bool devl_is_registered(struct devlink *devlink)
> > +{
> > +	/* To prevent races the caller must hold the instance lock
> > +	 * or another lock taken during unregistration.
> > +	 */  
> 
> Why not just lockdep_assert here on the instance lock? I guess this
> comment implies that another lock could be used instead but it seems
> weird to allow that? I guess because of things like the linecards_lock
> as opposed to the instance lock?

Yup, as discussed on the RFC - removing the regions lock specifically 
is quite tricky.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-06 12:55   ` Jiri Pirko
@ 2023-01-06 21:22     ` Jakub Kicinski
  2023-01-07  9:20       ` Jiri Pirko
  2023-01-10  0:21       ` Jacob Keller
  0 siblings, 2 replies; 45+ messages in thread
From: Jakub Kicinski @ 2023-01-06 21:22 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: davem, netdev, edumazet, pabeni, jacob.e.keller

On Fri, 6 Jan 2023 13:55:53 +0100 Jiri Pirko wrote:
> >@@ -5263,7 +5263,13 @@ static void devlink_param_notify(struct devlink *devlink,
> > 	WARN_ON(cmd != DEVLINK_CMD_PARAM_NEW && cmd != DEVLINK_CMD_PARAM_DEL &&
> > 		cmd != DEVLINK_CMD_PORT_PARAM_NEW &&
> > 		cmd != DEVLINK_CMD_PORT_PARAM_DEL);
> >-	ASSERT_DEVLINK_REGISTERED(devlink);
> >+
> >+	/* devlink_notify_register() / devlink_notify_unregister()
> >+	 * will replay the notifications if the params are added/removed
> >+	 * outside of the lifetime of the instance.
> >+	 */
> >+	if (!devl_is_registered(devlink))
> >+		return;  
> 
> This helper would be nice to use on other places as well.
> Like devlink_trap_group_notify(), devlink_trap_notify() and others. I
> will take care of that in a follow-up.

Alternatively we could reorder back to registering sub-objects
after the instance and not have to worry about re-sending 
notifications :S

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 4/9] devlink: always check if the devlink instance is registered
  2023-01-06 21:19     ` Jakub Kicinski
@ 2023-01-07  9:05       ` Jiri Pirko
  0 siblings, 0 replies; 45+ messages in thread
From: Jiri Pirko @ 2023-01-07  9:05 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Jacob Keller, davem, netdev, edumazet, pabeni

Fri, Jan 06, 2023 at 10:19:34PM CET, kuba@kernel.org wrote:
>On Fri, 6 Jan 2023 09:03:18 -0800 Jacob Keller wrote:
>> > +static inline bool devl_is_registered(struct devlink *devlink)
>> > +{
>> > +	/* To prevent races the caller must hold the instance lock
>> > +	 * or another lock taken during unregistration.
>> > +	 */  
>> 
>> Why not just lockdep_assert here on the instance lock? I guess this
>> comment implies that another lock could be used instead but it seems
>> weird to allow that? I guess because of things like the linecards_lock
>> as opposed to the instance lock?
>
>Yup, as discussed on the RFC - removing the regions lock specifically 
>is quite tricky.

I will submit in a jiff. Will add the assert here as a part
of the patchset.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-06 21:22     ` Jakub Kicinski
@ 2023-01-07  9:20       ` Jiri Pirko
  2023-01-10  0:21       ` Jacob Keller
  1 sibling, 0 replies; 45+ messages in thread
From: Jiri Pirko @ 2023-01-07  9:20 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, edumazet, pabeni, jacob.e.keller

Fri, Jan 06, 2023 at 10:22:51PM CET, kuba@kernel.org wrote:
>On Fri, 6 Jan 2023 13:55:53 +0100 Jiri Pirko wrote:
>> >@@ -5263,7 +5263,13 @@ static void devlink_param_notify(struct devlink *devlink,
>> > 	WARN_ON(cmd != DEVLINK_CMD_PARAM_NEW && cmd != DEVLINK_CMD_PARAM_DEL &&
>> > 		cmd != DEVLINK_CMD_PORT_PARAM_NEW &&
>> > 		cmd != DEVLINK_CMD_PORT_PARAM_DEL);
>> >-	ASSERT_DEVLINK_REGISTERED(devlink);
>> >+
>> >+	/* devlink_notify_register() / devlink_notify_unregister()
>> >+	 * will replay the notifications if the params are added/removed
>> >+	 * outside of the lifetime of the instance.
>> >+	 */
>> >+	if (!devl_is_registered(devlink))
>> >+		return;  
>> 
>> This helper would be nice to use on other places as well.
>> Like devlink_trap_group_notify(), devlink_trap_notify() and others. I
>> will take care of that in a follow-up.
>
>Alternatively we could reorder back to registering sub-objects
>after the instance and not have to worry about re-sending 
>notifications :S

Hmm, let me explore that path. Thanks!


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-06 21:22     ` Jakub Kicinski
  2023-01-07  9:20       ` Jiri Pirko
@ 2023-01-10  0:21       ` Jacob Keller
  2023-01-10 16:35         ` Jiri Pirko
  1 sibling, 1 reply; 45+ messages in thread
From: Jacob Keller @ 2023-01-10  0:21 UTC (permalink / raw)
  To: Jakub Kicinski, Jiri Pirko; +Cc: davem, netdev, edumazet, pabeni



On 1/6/2023 1:22 PM, Jakub Kicinski wrote:
> On Fri, 6 Jan 2023 13:55:53 +0100 Jiri Pirko wrote:
>>> @@ -5263,7 +5263,13 @@ static void devlink_param_notify(struct devlink *devlink,
>>> 	WARN_ON(cmd != DEVLINK_CMD_PARAM_NEW && cmd != DEVLINK_CMD_PARAM_DEL &&
>>> 		cmd != DEVLINK_CMD_PORT_PARAM_NEW &&
>>> 		cmd != DEVLINK_CMD_PORT_PARAM_DEL);
>>> -	ASSERT_DEVLINK_REGISTERED(devlink);
>>> +
>>> +	/* devlink_notify_register() / devlink_notify_unregister()
>>> +	 * will replay the notifications if the params are added/removed
>>> +	 * outside of the lifetime of the instance.
>>> +	 */
>>> +	if (!devl_is_registered(devlink))
>>> +		return;  
>>
>> This helper would be nice to use on other places as well.
>> Like devlink_trap_group_notify(), devlink_trap_notify() and others. I
>> will take care of that in a follow-up.
> 
> Alternatively we could reorder back to registering sub-objects
> after the instance and not have to worry about re-sending 
> notifications :S

I did find it convenient to be able to do both pre and post-registering,
but of the two I'd definitely prefer doing it post-registering, as that
makes it easier to handle/allow more dynamic sub-objects.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-10  0:21       ` Jacob Keller
@ 2023-01-10 16:35         ` Jiri Pirko
  2023-01-10 20:22           ` Jakub Kicinski
  0 siblings, 1 reply; 45+ messages in thread
From: Jiri Pirko @ 2023-01-10 16:35 UTC (permalink / raw)
  To: Jacob Keller; +Cc: Jakub Kicinski, davem, netdev, edumazet, pabeni

Tue, Jan 10, 2023 at 01:21:18AM CET, jacob.e.keller@intel.com wrote:
>
>
>On 1/6/2023 1:22 PM, Jakub Kicinski wrote:
>> On Fri, 6 Jan 2023 13:55:53 +0100 Jiri Pirko wrote:
>>>> @@ -5263,7 +5263,13 @@ static void devlink_param_notify(struct devlink *devlink,
>>>> 	WARN_ON(cmd != DEVLINK_CMD_PARAM_NEW && cmd != DEVLINK_CMD_PARAM_DEL &&
>>>> 		cmd != DEVLINK_CMD_PORT_PARAM_NEW &&
>>>> 		cmd != DEVLINK_CMD_PORT_PARAM_DEL);
>>>> -	ASSERT_DEVLINK_REGISTERED(devlink);
>>>> +
>>>> +	/* devlink_notify_register() / devlink_notify_unregister()
>>>> +	 * will replay the notifications if the params are added/removed
>>>> +	 * outside of the lifetime of the instance.
>>>> +	 */
>>>> +	if (!devl_is_registered(devlink))
>>>> +		return;  
>>>
>>> This helper would be nice to use on other places as well.
>>> Like devlink_trap_group_notify(), devlink_trap_notify() and others. I
>>> will take care of that in a follow-up.
>> 
>> Alternatively we could reorder back to registering sub-objects
>> after the instance and not have to worry about re-sending 
>> notifications :S
>
>I did find it convenient to be able to do both pre and post-registering,
>but of the two I'd definitely prefer doing it post-registering, as that
>makes it easier to handle/allow more dynamic sub-objects.

I'm confused. You want to register objects after instance register?


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-10 16:35         ` Jiri Pirko
@ 2023-01-10 20:22           ` Jakub Kicinski
  2023-01-11  9:32             ` Jiri Pirko
  0 siblings, 1 reply; 45+ messages in thread
From: Jakub Kicinski @ 2023-01-10 20:22 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: Jacob Keller, davem, netdev, edumazet, pabeni

On Tue, 10 Jan 2023 17:35:35 +0100 Jiri Pirko wrote:
> >I did find it convenient to be able to do both pre and post-registering,
> >but of the two I'd definitely prefer doing it post-registering, as that
> >makes it easier to handle/allow more dynamic sub-objects.  
> 
> I'm confused. You want to register objects after instance register?

+1, I think it's an anti-pattern.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-10 20:22           ` Jakub Kicinski
@ 2023-01-11  9:32             ` Jiri Pirko
  2023-01-11 16:45               ` Jakub Kicinski
  0 siblings, 1 reply; 45+ messages in thread
From: Jiri Pirko @ 2023-01-11  9:32 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Jacob Keller, davem, netdev, edumazet, pabeni

Tue, Jan 10, 2023 at 09:22:22PM CET, kuba@kernel.org wrote:
>On Tue, 10 Jan 2023 17:35:35 +0100 Jiri Pirko wrote:
>> >I did find it convenient to be able to do both pre and post-registering,
>> >but of the two I'd definitely prefer doing it post-registering, as that
>> >makes it easier to handle/allow more dynamic sub-objects.  
>> 
>> I'm confused. You want to register objects after instance register?
>
>+1, I think it's an anti-pattern.

Could you elaborate a bit please?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-06  6:34 ` [PATCH net-next 7/9] devlink: allow registering parameters after the instance Jakub Kicinski
  2023-01-06 12:55   ` Jiri Pirko
@ 2023-01-11 13:21   ` Jiri Pirko
  1 sibling, 0 replies; 45+ messages in thread
From: Jiri Pirko @ 2023-01-11 13:21 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, edumazet, pabeni, jacob.e.keller

Fri, Jan 06, 2023 at 07:34:00AM CET, kuba@kernel.org wrote:
>It's most natural to register the instance first and then its
>subobjects. Now that we can use the instance lock to protect
>the atomicity of all init - it should also be safe.
>
>Signed-off-by: Jakub Kicinski <kuba@kernel.org>
>---
> net/devlink/leftover.c | 22 +++++++++++-----------
> 1 file changed, 11 insertions(+), 11 deletions(-)
>
>diff --git a/net/devlink/leftover.c b/net/devlink/leftover.c
>index 491f821c8b77..1e23b2da78cc 100644
>--- a/net/devlink/leftover.c
>+++ b/net/devlink/leftover.c
>@@ -5263,7 +5263,13 @@ static void devlink_param_notify(struct devlink *devlink,
> 	WARN_ON(cmd != DEVLINK_CMD_PARAM_NEW && cmd != DEVLINK_CMD_PARAM_DEL &&
> 		cmd != DEVLINK_CMD_PORT_PARAM_NEW &&
> 		cmd != DEVLINK_CMD_PORT_PARAM_DEL);
>-	ASSERT_DEVLINK_REGISTERED(devlink);
>+
>+	/* devlink_notify_register() / devlink_notify_unregister()
>+	 * will replay the notifications if the params are added/removed
>+	 * outside of the lifetime of the instance.
>+	 */
>+	if (!devl_is_registered(devlink))
>+		return;
> 
> 	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
> 	if (!msg)
>@@ -10915,8 +10921,6 @@ int devlink_params_register(struct devlink *devlink,
> 	const struct devlink_param *param = params;
> 	int i, err;
> 
>-	ASSERT_DEVLINK_NOT_REGISTERED(devlink);

Hmm, params list is not protected by any lock. The protection it used
was that it is static after registration. You changed it but didn't add
the lock. All param register/unregister functions need to be renamed to
devl_* and assert instance lock. I will fix this.

[..]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-11  9:32             ` Jiri Pirko
@ 2023-01-11 16:45               ` Jakub Kicinski
  2023-01-11 21:29                 ` Jacob Keller
  0 siblings, 1 reply; 45+ messages in thread
From: Jakub Kicinski @ 2023-01-11 16:45 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: Jacob Keller, davem, netdev, edumazet, pabeni

On Wed, 11 Jan 2023 10:32:13 +0100 Jiri Pirko wrote:
> >> I'm confused. You want to register objects after instance register?  
> >
> >+1, I think it's an anti-pattern.  
> 
> Could you elaborate a bit please?

Mixing registering sub-objects before and after the instance is a bit
of an anti-pattern. Easy to introduce bugs during reload and reset /
error recovery. I thought that's what you were saying as well.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-11 16:45               ` Jakub Kicinski
@ 2023-01-11 21:29                 ` Jacob Keller
  2023-01-12  7:07                   ` Leon Romanovsky
  0 siblings, 1 reply; 45+ messages in thread
From: Jacob Keller @ 2023-01-11 21:29 UTC (permalink / raw)
  To: Jakub Kicinski, Jiri Pirko; +Cc: davem, netdev, edumazet, pabeni



On 1/11/2023 8:45 AM, Jakub Kicinski wrote:
> On Wed, 11 Jan 2023 10:32:13 +0100 Jiri Pirko wrote:
>>>> I'm confused. You want to register objects after instance register?  
>>>
>>> +1, I think it's an anti-pattern.  
>>
>> Could you elaborate a bit please?
> 
> Mixing registering sub-objects before and after the instance is a bit
> of an anti-pattern. Easy to introduce bugs during reload and reset /
> error recovery. I thought that's what you were saying as well.

I was thinking of a case where an object is dynamic and might get added
based on events occurring after the devlink was registered.

But the more I think about it the less that makes sense. What events
would cause a whole subobject to be registerd which we wouldn't already
know about during initialization of devlink?

We do need some dynamic support because situations like "add port" will
add a port and then the ports subresources after the main devlink, but I
think that is already supported well and we'd add the port sub-resources
at the same time as the port.

But thinking more on this, there isn't really another good example since
we'd register things like health reporters, regions, resources, etc all
during initialization. Each of these sub objects may have dynamic
portions (ex: region captures, health events, etc) but the need for the
object should be known about during init time if its supported by the
device driver.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-11 21:29                 ` Jacob Keller
@ 2023-01-12  7:07                   ` Leon Romanovsky
  2023-01-12 14:59                     ` Jiri Pirko
  2023-01-12 19:20                     ` Jakub Kicinski
  0 siblings, 2 replies; 45+ messages in thread
From: Leon Romanovsky @ 2023-01-12  7:07 UTC (permalink / raw)
  To: Jacob Keller; +Cc: Jakub Kicinski, Jiri Pirko, davem, netdev, edumazet, pabeni

On Wed, Jan 11, 2023 at 01:29:03PM -0800, Jacob Keller wrote:
> 
> 
> On 1/11/2023 8:45 AM, Jakub Kicinski wrote:
> > On Wed, 11 Jan 2023 10:32:13 +0100 Jiri Pirko wrote:
> >>>> I'm confused. You want to register objects after instance register?  
> >>>
> >>> +1, I think it's an anti-pattern.  
> >>
> >> Could you elaborate a bit please?
> > 
> > Mixing registering sub-objects before and after the instance is a bit
> > of an anti-pattern. Easy to introduce bugs during reload and reset /
> > error recovery. I thought that's what you were saying as well.
> 
> I was thinking of a case where an object is dynamic and might get added
> based on events occurring after the devlink was registered.
> 
> But the more I think about it the less that makes sense. What events
> would cause a whole subobject to be registerd which we wouldn't already
> know about during initialization of devlink?
> 
> We do need some dynamic support because situations like "add port" will
> add a port and then the ports subresources after the main devlink, but I
> think that is already supported well and we'd add the port sub-resources
> at the same time as the port.
> 
> But thinking more on this, there isn't really another good example since
> we'd register things like health reporters, regions, resources, etc all
> during initialization. Each of these sub objects may have dynamic
> portions (ex: region captures, health events, etc) but the need for the
> object should be known about during init time if its supported by the
> device driver.

As a user, I don't want to see any late dynamic object addition which is
not triggered by me explicitly. As it doesn't make any sense to add
various delays per-vendor/kernel in configuration scripts just because
not everything is ready. Users need predictability, lazy addition of
objects adds chaos instead.

Agree with Jakub, it is anti-pattern.

Thanks

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-12  7:07                   ` Leon Romanovsky
@ 2023-01-12 14:59                     ` Jiri Pirko
  2023-01-12 19:58                       ` Leon Romanovsky
  2023-01-12 19:20                     ` Jakub Kicinski
  1 sibling, 1 reply; 45+ messages in thread
From: Jiri Pirko @ 2023-01-12 14:59 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Jacob Keller, Jakub Kicinski, davem, netdev, edumazet, pabeni

Thu, Jan 12, 2023 at 08:07:43AM CET, leon@kernel.org wrote:
>On Wed, Jan 11, 2023 at 01:29:03PM -0800, Jacob Keller wrote:
>> 
>> 
>> On 1/11/2023 8:45 AM, Jakub Kicinski wrote:
>> > On Wed, 11 Jan 2023 10:32:13 +0100 Jiri Pirko wrote:
>> >>>> I'm confused. You want to register objects after instance register?  
>> >>>
>> >>> +1, I think it's an anti-pattern.  
>> >>
>> >> Could you elaborate a bit please?
>> > 
>> > Mixing registering sub-objects before and after the instance is a bit
>> > of an anti-pattern. Easy to introduce bugs during reload and reset /
>> > error recovery. I thought that's what you were saying as well.
>> 
>> I was thinking of a case where an object is dynamic and might get added
>> based on events occurring after the devlink was registered.
>> 
>> But the more I think about it the less that makes sense. What events
>> would cause a whole subobject to be registerd which we wouldn't already
>> know about during initialization of devlink?
>> 
>> We do need some dynamic support because situations like "add port" will
>> add a port and then the ports subresources after the main devlink, but I
>> think that is already supported well and we'd add the port sub-resources
>> at the same time as the port.
>> 
>> But thinking more on this, there isn't really another good example since
>> we'd register things like health reporters, regions, resources, etc all
>> during initialization. Each of these sub objects may have dynamic
>> portions (ex: region captures, health events, etc) but the need for the
>> object should be known about during init time if its supported by the
>> device driver.
>
>As a user, I don't want to see any late dynamic object addition which is
>not triggered by me explicitly. As it doesn't make any sense to add
>various delays per-vendor/kernel in configuration scripts just because
>not everything is ready. Users need predictability, lazy addition of
>objects adds chaos instead.
>
>Agree with Jakub, it is anti-pattern.

Yeah, but, we have reload. And during reload, instance is still
registered yet the subobject disappear and reappear. So that would be
inconsistent with the init/fini flow.

Perhaps during reload we should emulate complete fini/init notification
flow to the user?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-12  7:07                   ` Leon Romanovsky
  2023-01-12 14:59                     ` Jiri Pirko
@ 2023-01-12 19:20                     ` Jakub Kicinski
  2023-01-12 20:09                       ` Leon Romanovsky
  1 sibling, 1 reply; 45+ messages in thread
From: Jakub Kicinski @ 2023-01-12 19:20 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Jacob Keller, Jiri Pirko, davem, netdev, edumazet, pabeni

On Thu, 12 Jan 2023 09:07:43 +0200 Leon Romanovsky wrote:
> As a user, I don't want to see any late dynamic object addition which is
> not triggered by me explicitly. As it doesn't make any sense to add
> various delays per-vendor/kernel in configuration scripts just because
> not everything is ready. Users need predictability, lazy addition of
> objects adds chaos instead.
> 
> Agree with Jakub, it is anti-pattern.

To be clear my preference would be to always construct the three from
the root. Register the main instance, then sub-objects. I mean - you
tried forcing the opposite order and it only succeeded in 90-something
percent of cases. There's always special cases.

I don't understand your concern about user experience here. We have
notifications for each sub-object. Plus I think drivers should hold 
the instance lock throughout the probe routine. I don't see a scenario
in which registering the main instance first would lead to retry/sleep
hacks in user space, do you? I'm talking about devlink and the subobjs
we have specifically.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-12 14:59                     ` Jiri Pirko
@ 2023-01-12 19:58                       ` Leon Romanovsky
  2023-01-13  7:50                         ` Jiri Pirko
  0 siblings, 1 reply; 45+ messages in thread
From: Leon Romanovsky @ 2023-01-12 19:58 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: Jacob Keller, Jakub Kicinski, davem, netdev, edumazet, pabeni

On Thu, Jan 12, 2023 at 03:59:53PM +0100, Jiri Pirko wrote:
> Thu, Jan 12, 2023 at 08:07:43AM CET, leon@kernel.org wrote:
> >On Wed, Jan 11, 2023 at 01:29:03PM -0800, Jacob Keller wrote:
> >> 
> >> 
> >> On 1/11/2023 8:45 AM, Jakub Kicinski wrote:
> >> > On Wed, 11 Jan 2023 10:32:13 +0100 Jiri Pirko wrote:
> >> >>>> I'm confused. You want to register objects after instance register?  
> >> >>>
> >> >>> +1, I think it's an anti-pattern.  
> >> >>
> >> >> Could you elaborate a bit please?
> >> > 
> >> > Mixing registering sub-objects before and after the instance is a bit
> >> > of an anti-pattern. Easy to introduce bugs during reload and reset /
> >> > error recovery. I thought that's what you were saying as well.
> >> 
> >> I was thinking of a case where an object is dynamic and might get added
> >> based on events occurring after the devlink was registered.
> >> 
> >> But the more I think about it the less that makes sense. What events
> >> would cause a whole subobject to be registerd which we wouldn't already
> >> know about during initialization of devlink?
> >> 
> >> We do need some dynamic support because situations like "add port" will
> >> add a port and then the ports subresources after the main devlink, but I
> >> think that is already supported well and we'd add the port sub-resources
> >> at the same time as the port.
> >> 
> >> But thinking more on this, there isn't really another good example since
> >> we'd register things like health reporters, regions, resources, etc all
> >> during initialization. Each of these sub objects may have dynamic
> >> portions (ex: region captures, health events, etc) but the need for the
> >> object should be known about during init time if its supported by the
> >> device driver.
> >
> >As a user, I don't want to see any late dynamic object addition which is
> >not triggered by me explicitly. As it doesn't make any sense to add
> >various delays per-vendor/kernel in configuration scripts just because
> >not everything is ready. Users need predictability, lazy addition of
> >objects adds chaos instead.
> >
> >Agree with Jakub, it is anti-pattern.
> 
> Yeah, but, we have reload. And during reload, instance is still
> registered yet the subobject disappear and reappear. So that would be
> inconsistent with the init/fini flow.
> 
> Perhaps during reload we should emulate complete fini/init notification
> flow to the user?

"reload" is triggered by me explicitly and I will get success/fail result
at the end. There is no much meaning in subobject notifications during
that operation.

Thanks

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-12 19:20                     ` Jakub Kicinski
@ 2023-01-12 20:09                       ` Leon Romanovsky
  2023-01-12 22:44                         ` Jacob Keller
  0 siblings, 1 reply; 45+ messages in thread
From: Leon Romanovsky @ 2023-01-12 20:09 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Jacob Keller, Jiri Pirko, davem, netdev, edumazet, pabeni

On Thu, Jan 12, 2023 at 11:20:21AM -0800, Jakub Kicinski wrote:
> On Thu, 12 Jan 2023 09:07:43 +0200 Leon Romanovsky wrote:
> > As a user, I don't want to see any late dynamic object addition which is
> > not triggered by me explicitly. As it doesn't make any sense to add
> > various delays per-vendor/kernel in configuration scripts just because
> > not everything is ready. Users need predictability, lazy addition of
> > objects adds chaos instead.
> > 
> > Agree with Jakub, it is anti-pattern.
> 
> To be clear my preference would be to always construct the three from
> the root. Register the main instance, then sub-objects. I mean - you
> tried forcing the opposite order and it only succeeded in 90-something
> percent of cases. There's always special cases.
> 
> I don't understand your concern about user experience here. We have
> notifications for each sub-object. Plus I think drivers should hold 
> the instance lock throughout the probe routine. I don't see a scenario
> in which registering the main instance first would lead to retry/sleep
> hacks in user space, do you? I'm talking about devlink and the subobjs
> we have specifically.

The term "dynamic object addition" means for me what driver authors will
be able to add objects anytime in lifetime of the driver. I'm pretty sure
that once you allow that, we will see zoo here. Over time, you will get
everything from .probe() to workqueues. The latter caused me to write
about retry/sleep hacks.

If you success to force everyone to add objects in .probe() only, it
will be very close to what I tried to achieve.

Thanks

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-12 20:09                       ` Leon Romanovsky
@ 2023-01-12 22:44                         ` Jacob Keller
  2023-01-13  6:45                           ` Leon Romanovsky
  0 siblings, 1 reply; 45+ messages in thread
From: Jacob Keller @ 2023-01-12 22:44 UTC (permalink / raw)
  To: Leon Romanovsky, Jakub Kicinski
  Cc: Jiri Pirko, davem, netdev, edumazet, pabeni



On 1/12/2023 12:09 PM, Leon Romanovsky wrote:
> On Thu, Jan 12, 2023 at 11:20:21AM -0800, Jakub Kicinski wrote:
>> On Thu, 12 Jan 2023 09:07:43 +0200 Leon Romanovsky wrote:
>>> As a user, I don't want to see any late dynamic object addition which is
>>> not triggered by me explicitly. As it doesn't make any sense to add
>>> various delays per-vendor/kernel in configuration scripts just because
>>> not everything is ready. Users need predictability, lazy addition of
>>> objects adds chaos instead.
>>>
>>> Agree with Jakub, it is anti-pattern.
>>
>> To be clear my preference would be to always construct the three from
>> the root. Register the main instance, then sub-objects. I mean - you
>> tried forcing the opposite order and it only succeeded in 90-something
>> percent of cases. There's always special cases.
>>

Right. I think its easier to simply require devlink to be registered first.

>> I don't understand your concern about user experience here. We have
>> notifications for each sub-object. Plus I think drivers should hold 
>> the instance lock throughout the probe routine. I don't see a scenario
>> in which registering the main instance first would lead to retry/sleep
>> hacks in user space, do you? I'm talking about devlink and the subobjs
>> we have specifically.
> 
> The term "dynamic object addition" means for me what driver authors will
> be able to add objects anytime in lifetime of the driver. I'm pretty sure
> that once you allow that, we will see zoo here. Over time, you will get
> everything from .probe() to workqueues. The latter caused me to write
> about retry/sleep hacks.
> 
> If you success to force everyone to add objects in .probe() only, it
> will be very close to what I tried to achieve.
> 
> Thanks

Yea. I was initially thinking of something like that, but I've convinced
myself that its a bad idea. The only "dynamic" objects (added after the
initialization phase of devlink) should be those which are triggered via
user space request (i.e. "devlink port add").

Thanks,
Jake

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-12 22:44                         ` Jacob Keller
@ 2023-01-13  6:45                           ` Leon Romanovsky
  2023-01-13  7:53                             ` Jiri Pirko
  0 siblings, 1 reply; 45+ messages in thread
From: Leon Romanovsky @ 2023-01-13  6:45 UTC (permalink / raw)
  To: Jacob Keller; +Cc: Jakub Kicinski, Jiri Pirko, davem, netdev, edumazet, pabeni

On Thu, Jan 12, 2023 at 02:44:43PM -0800, Jacob Keller wrote:
> 
> 
> On 1/12/2023 12:09 PM, Leon Romanovsky wrote:
> > On Thu, Jan 12, 2023 at 11:20:21AM -0800, Jakub Kicinski wrote:
> >> On Thu, 12 Jan 2023 09:07:43 +0200 Leon Romanovsky wrote:
> >>> As a user, I don't want to see any late dynamic object addition which is
> >>> not triggered by me explicitly. As it doesn't make any sense to add
> >>> various delays per-vendor/kernel in configuration scripts just because
> >>> not everything is ready. Users need predictability, lazy addition of
> >>> objects adds chaos instead.
> >>>
> >>> Agree with Jakub, it is anti-pattern.
> >>
> >> To be clear my preference would be to always construct the three from
> >> the root. Register the main instance, then sub-objects. I mean - you
> >> tried forcing the opposite order and it only succeeded in 90-something
> >> percent of cases. There's always special cases.

Back then, we had only one special case - netdevsim. I still think that
all recent complexity that was brought to the devlink could be avoided
if we would change netdevsim to behave as HW driver (remove sysfs).

> Right. I think its easier to simply require devlink to be registered first.

devlink_register() is no more than a fancy way to say to the world: "I'm
ready to accept commands". Right now, when the need_lock flag is removed
from all devlink commands, we can place devlink_register() at any place.

> 
> >> I don't understand your concern about user experience here. We have
> >> notifications for each sub-object. Plus I think drivers should hold 
> >> the instance lock throughout the probe routine. I don't see a scenario
> >> in which registering the main instance first would lead to retry/sleep
> >> hacks in user space, do you? I'm talking about devlink and the subobjs
> >> we have specifically.
> > 
> > The term "dynamic object addition" means for me what driver authors will
> > be able to add objects anytime in lifetime of the driver. I'm pretty sure
> > that once you allow that, we will see zoo here. Over time, you will get
> > everything from .probe() to workqueues. The latter caused me to write
> > about retry/sleep hacks.
> > 
> > If you success to force everyone to add objects in .probe() only, it
> > will be very close to what I tried to achieve.
> > 
> > Thanks
> 
> Yea. I was initially thinking of something like that, but I've convinced
> myself that its a bad idea. The only "dynamic" objects (added after the
> initialization phase of devlink) should be those which are triggered via
> user space request (i.e. "devlink port add").

Exactly.

> 
> Thanks,
> Jake

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-12 19:58                       ` Leon Romanovsky
@ 2023-01-13  7:50                         ` Jiri Pirko
  2023-01-15  8:35                           ` Leon Romanovsky
  0 siblings, 1 reply; 45+ messages in thread
From: Jiri Pirko @ 2023-01-13  7:50 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Jacob Keller, Jakub Kicinski, davem, netdev, edumazet, pabeni

Thu, Jan 12, 2023 at 08:58:58PM CET, leon@kernel.org wrote:
>On Thu, Jan 12, 2023 at 03:59:53PM +0100, Jiri Pirko wrote:
>> Thu, Jan 12, 2023 at 08:07:43AM CET, leon@kernel.org wrote:
>> >On Wed, Jan 11, 2023 at 01:29:03PM -0800, Jacob Keller wrote:
>> >> 
>> >> 
>> >> On 1/11/2023 8:45 AM, Jakub Kicinski wrote:
>> >> > On Wed, 11 Jan 2023 10:32:13 +0100 Jiri Pirko wrote:
>> >> >>>> I'm confused. You want to register objects after instance register?  
>> >> >>>
>> >> >>> +1, I think it's an anti-pattern.  
>> >> >>
>> >> >> Could you elaborate a bit please?
>> >> > 
>> >> > Mixing registering sub-objects before and after the instance is a bit
>> >> > of an anti-pattern. Easy to introduce bugs during reload and reset /
>> >> > error recovery. I thought that's what you were saying as well.
>> >> 
>> >> I was thinking of a case where an object is dynamic and might get added
>> >> based on events occurring after the devlink was registered.
>> >> 
>> >> But the more I think about it the less that makes sense. What events
>> >> would cause a whole subobject to be registerd which we wouldn't already
>> >> know about during initialization of devlink?
>> >> 
>> >> We do need some dynamic support because situations like "add port" will
>> >> add a port and then the ports subresources after the main devlink, but I
>> >> think that is already supported well and we'd add the port sub-resources
>> >> at the same time as the port.
>> >> 
>> >> But thinking more on this, there isn't really another good example since
>> >> we'd register things like health reporters, regions, resources, etc all
>> >> during initialization. Each of these sub objects may have dynamic
>> >> portions (ex: region captures, health events, etc) but the need for the
>> >> object should be known about during init time if its supported by the
>> >> device driver.
>> >
>> >As a user, I don't want to see any late dynamic object addition which is
>> >not triggered by me explicitly. As it doesn't make any sense to add
>> >various delays per-vendor/kernel in configuration scripts just because
>> >not everything is ready. Users need predictability, lazy addition of
>> >objects adds chaos instead.
>> >
>> >Agree with Jakub, it is anti-pattern.
>> 
>> Yeah, but, we have reload. And during reload, instance is still
>> registered yet the subobject disappear and reappear. So that would be
>> inconsistent with the init/fini flow.
>> 
>> Perhaps during reload we should emulate complete fini/init notification
>> flow to the user?
>
>"reload" is triggered by me explicitly and I will get success/fail result
>at the end. There is no much meaning in subobject notifications during
>that operation.

Definitelly not. User would trigger reload, however another entity
(systemd for example) would listen to the notifications and react
if necessary.

>
>Thanks

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-13  6:45                           ` Leon Romanovsky
@ 2023-01-13  7:53                             ` Jiri Pirko
  0 siblings, 0 replies; 45+ messages in thread
From: Jiri Pirko @ 2023-01-13  7:53 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Jacob Keller, Jakub Kicinski, davem, netdev, edumazet, pabeni

Fri, Jan 13, 2023 at 07:45:46AM CET, leon@kernel.org wrote:
>On Thu, Jan 12, 2023 at 02:44:43PM -0800, Jacob Keller wrote:
>> 
>> 
>> On 1/12/2023 12:09 PM, Leon Romanovsky wrote:
>> > On Thu, Jan 12, 2023 at 11:20:21AM -0800, Jakub Kicinski wrote:
>> >> On Thu, 12 Jan 2023 09:07:43 +0200 Leon Romanovsky wrote:
>> >>> As a user, I don't want to see any late dynamic object addition which is
>> >>> not triggered by me explicitly. As it doesn't make any sense to add
>> >>> various delays per-vendor/kernel in configuration scripts just because
>> >>> not everything is ready. Users need predictability, lazy addition of
>> >>> objects adds chaos instead.
>> >>>
>> >>> Agree with Jakub, it is anti-pattern.
>> >>
>> >> To be clear my preference would be to always construct the three from
>> >> the root. Register the main instance, then sub-objects. I mean - you
>> >> tried forcing the opposite order and it only succeeded in 90-something
>> >> percent of cases. There's always special cases.
>
>Back then, we had only one special case - netdevsim. I still think that
>all recent complexity that was brought to the devlink could be avoided
>if we would change netdevsim to behave as HW driver (remove sysfs).
>
>> Right. I think its easier to simply require devlink to be registered first.
>
>devlink_register() is no more than a fancy way to say to the world: "I'm
>ready to accept commands". Right now, when the need_lock flag is removed
>from all devlink commands, we can place devlink_register() at any place.
>
>> 
>> >> I don't understand your concern about user experience here. We have
>> >> notifications for each sub-object. Plus I think drivers should hold 
>> >> the instance lock throughout the probe routine. I don't see a scenario
>> >> in which registering the main instance first would lead to retry/sleep
>> >> hacks in user space, do you? I'm talking about devlink and the subobjs
>> >> we have specifically.
>> > 
>> > The term "dynamic object addition" means for me what driver authors will
>> > be able to add objects anytime in lifetime of the driver. I'm pretty sure
>> > that once you allow that, we will see zoo here. Over time, you will get
>> > everything from .probe() to workqueues. The latter caused me to write
>> > about retry/sleep hacks.
>> > 
>> > If you success to force everyone to add objects in .probe() only, it
>> > will be very close to what I tried to achieve.
>> > 
>> > Thanks
>> 
>> Yea. I was initially thinking of something like that, but I've convinced
>> myself that its a bad idea. The only "dynamic" objects (added after the
>> initialization phase of devlink) should be those which are triggered via
>> user space request (i.e. "devlink port add").
>
>Exactly.

And reload as well.

>
>> 
>> Thanks,
>> Jake

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-13  7:50                         ` Jiri Pirko
@ 2023-01-15  8:35                           ` Leon Romanovsky
  2023-01-16 10:33                             ` Jiri Pirko
  0 siblings, 1 reply; 45+ messages in thread
From: Leon Romanovsky @ 2023-01-15  8:35 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: Jacob Keller, Jakub Kicinski, davem, netdev, edumazet, pabeni

On Fri, Jan 13, 2023 at 08:50:33AM +0100, Jiri Pirko wrote:
> Thu, Jan 12, 2023 at 08:58:58PM CET, leon@kernel.org wrote:
> >On Thu, Jan 12, 2023 at 03:59:53PM +0100, Jiri Pirko wrote:
> >> Thu, Jan 12, 2023 at 08:07:43AM CET, leon@kernel.org wrote:
> >> >On Wed, Jan 11, 2023 at 01:29:03PM -0800, Jacob Keller wrote:
> >> >> 
> >> >> 
> >> >> On 1/11/2023 8:45 AM, Jakub Kicinski wrote:
> >> >> > On Wed, 11 Jan 2023 10:32:13 +0100 Jiri Pirko wrote:
> >> >> >>>> I'm confused. You want to register objects after instance register?  
> >> >> >>>
> >> >> >>> +1, I think it's an anti-pattern.  
> >> >> >>
> >> >> >> Could you elaborate a bit please?
> >> >> > 
> >> >> > Mixing registering sub-objects before and after the instance is a bit
> >> >> > of an anti-pattern. Easy to introduce bugs during reload and reset /
> >> >> > error recovery. I thought that's what you were saying as well.
> >> >> 
> >> >> I was thinking of a case where an object is dynamic and might get added
> >> >> based on events occurring after the devlink was registered.
> >> >> 
> >> >> But the more I think about it the less that makes sense. What events
> >> >> would cause a whole subobject to be registerd which we wouldn't already
> >> >> know about during initialization of devlink?
> >> >> 
> >> >> We do need some dynamic support because situations like "add port" will
> >> >> add a port and then the ports subresources after the main devlink, but I
> >> >> think that is already supported well and we'd add the port sub-resources
> >> >> at the same time as the port.
> >> >> 
> >> >> But thinking more on this, there isn't really another good example since
> >> >> we'd register things like health reporters, regions, resources, etc all
> >> >> during initialization. Each of these sub objects may have dynamic
> >> >> portions (ex: region captures, health events, etc) but the need for the
> >> >> object should be known about during init time if its supported by the
> >> >> device driver.
> >> >
> >> >As a user, I don't want to see any late dynamic object addition which is
> >> >not triggered by me explicitly. As it doesn't make any sense to add
> >> >various delays per-vendor/kernel in configuration scripts just because
> >> >not everything is ready. Users need predictability, lazy addition of
> >> >objects adds chaos instead.
> >> >
> >> >Agree with Jakub, it is anti-pattern.
> >> 
> >> Yeah, but, we have reload. And during reload, instance is still
> >> registered yet the subobject disappear and reappear. So that would be
> >> inconsistent with the init/fini flow.
> >> 
> >> Perhaps during reload we should emulate complete fini/init notification
> >> flow to the user?
> >
> >"reload" is triggered by me explicitly and I will get success/fail result
> >at the end. There is no much meaning in subobject notifications during
> >that operation.
> 
> Definitelly not. User would trigger reload, however another entity
> (systemd for example) would listen to the notifications and react
> if necessary.

Listen yes, however it is not clear if notification sequence should
mimic fini/init flow.

Thanks

> 
> >
> >Thanks

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-15  8:35                           ` Leon Romanovsky
@ 2023-01-16 10:33                             ` Jiri Pirko
  2023-01-16 11:25                               ` Leon Romanovsky
  0 siblings, 1 reply; 45+ messages in thread
From: Jiri Pirko @ 2023-01-16 10:33 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Jacob Keller, Jakub Kicinski, davem, netdev, edumazet, pabeni

Sun, Jan 15, 2023 at 09:35:57AM CET, leon@kernel.org wrote:
>On Fri, Jan 13, 2023 at 08:50:33AM +0100, Jiri Pirko wrote:
>> Thu, Jan 12, 2023 at 08:58:58PM CET, leon@kernel.org wrote:
>> >On Thu, Jan 12, 2023 at 03:59:53PM +0100, Jiri Pirko wrote:
>> >> Thu, Jan 12, 2023 at 08:07:43AM CET, leon@kernel.org wrote:
>> >> >On Wed, Jan 11, 2023 at 01:29:03PM -0800, Jacob Keller wrote:
>> >> >> 
>> >> >> 
>> >> >> On 1/11/2023 8:45 AM, Jakub Kicinski wrote:
>> >> >> > On Wed, 11 Jan 2023 10:32:13 +0100 Jiri Pirko wrote:
>> >> >> >>>> I'm confused. You want to register objects after instance register?  
>> >> >> >>>
>> >> >> >>> +1, I think it's an anti-pattern.  
>> >> >> >>
>> >> >> >> Could you elaborate a bit please?
>> >> >> > 
>> >> >> > Mixing registering sub-objects before and after the instance is a bit
>> >> >> > of an anti-pattern. Easy to introduce bugs during reload and reset /
>> >> >> > error recovery. I thought that's what you were saying as well.
>> >> >> 
>> >> >> I was thinking of a case where an object is dynamic and might get added
>> >> >> based on events occurring after the devlink was registered.
>> >> >> 
>> >> >> But the more I think about it the less that makes sense. What events
>> >> >> would cause a whole subobject to be registerd which we wouldn't already
>> >> >> know about during initialization of devlink?
>> >> >> 
>> >> >> We do need some dynamic support because situations like "add port" will
>> >> >> add a port and then the ports subresources after the main devlink, but I
>> >> >> think that is already supported well and we'd add the port sub-resources
>> >> >> at the same time as the port.
>> >> >> 
>> >> >> But thinking more on this, there isn't really another good example since
>> >> >> we'd register things like health reporters, regions, resources, etc all
>> >> >> during initialization. Each of these sub objects may have dynamic
>> >> >> portions (ex: region captures, health events, etc) but the need for the
>> >> >> object should be known about during init time if its supported by the
>> >> >> device driver.
>> >> >
>> >> >As a user, I don't want to see any late dynamic object addition which is
>> >> >not triggered by me explicitly. As it doesn't make any sense to add
>> >> >various delays per-vendor/kernel in configuration scripts just because
>> >> >not everything is ready. Users need predictability, lazy addition of
>> >> >objects adds chaos instead.
>> >> >
>> >> >Agree with Jakub, it is anti-pattern.
>> >> 
>> >> Yeah, but, we have reload. And during reload, instance is still
>> >> registered yet the subobject disappear and reappear. So that would be
>> >> inconsistent with the init/fini flow.
>> >> 
>> >> Perhaps during reload we should emulate complete fini/init notification
>> >> flow to the user?
>> >
>> >"reload" is triggered by me explicitly and I will get success/fail result
>> >at the end. There is no much meaning in subobject notifications during
>> >that operation.
>> 
>> Definitelly not. User would trigger reload, however another entity
>> (systemd for example) would listen to the notifications and react
>> if necessary.
>
>Listen yes, however it is not clear if notification sequence should
>mimic fini/init flow.

Well, it makes sense to me. Why do you think it should not?

>
>Thanks
>
>> 
>> >
>> >Thanks

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH net-next 7/9] devlink: allow registering parameters after the instance
  2023-01-16 10:33                             ` Jiri Pirko
@ 2023-01-16 11:25                               ` Leon Romanovsky
  0 siblings, 0 replies; 45+ messages in thread
From: Leon Romanovsky @ 2023-01-16 11:25 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: Jacob Keller, Jakub Kicinski, davem, netdev, edumazet, pabeni

On Mon, Jan 16, 2023 at 11:33:05AM +0100, Jiri Pirko wrote:
> Sun, Jan 15, 2023 at 09:35:57AM CET, leon@kernel.org wrote:
> >On Fri, Jan 13, 2023 at 08:50:33AM +0100, Jiri Pirko wrote:
> >> Thu, Jan 12, 2023 at 08:58:58PM CET, leon@kernel.org wrote:
> >> >On Thu, Jan 12, 2023 at 03:59:53PM +0100, Jiri Pirko wrote:
> >> >> Thu, Jan 12, 2023 at 08:07:43AM CET, leon@kernel.org wrote:
> >> >> >On Wed, Jan 11, 2023 at 01:29:03PM -0800, Jacob Keller wrote:
> >> >> >> 
> >> >> >> 
> >> >> >> On 1/11/2023 8:45 AM, Jakub Kicinski wrote:
> >> >> >> > On Wed, 11 Jan 2023 10:32:13 +0100 Jiri Pirko wrote:
> >> >> >> >>>> I'm confused. You want to register objects after instance register?  
> >> >> >> >>>
> >> >> >> >>> +1, I think it's an anti-pattern.  
> >> >> >> >>
> >> >> >> >> Could you elaborate a bit please?
> >> >> >> > 
> >> >> >> > Mixing registering sub-objects before and after the instance is a bit
> >> >> >> > of an anti-pattern. Easy to introduce bugs during reload and reset /
> >> >> >> > error recovery. I thought that's what you were saying as well.
> >> >> >> 
> >> >> >> I was thinking of a case where an object is dynamic and might get added
> >> >> >> based on events occurring after the devlink was registered.
> >> >> >> 
> >> >> >> But the more I think about it the less that makes sense. What events
> >> >> >> would cause a whole subobject to be registerd which we wouldn't already
> >> >> >> know about during initialization of devlink?
> >> >> >> 
> >> >> >> We do need some dynamic support because situations like "add port" will
> >> >> >> add a port and then the ports subresources after the main devlink, but I
> >> >> >> think that is already supported well and we'd add the port sub-resources
> >> >> >> at the same time as the port.
> >> >> >> 
> >> >> >> But thinking more on this, there isn't really another good example since
> >> >> >> we'd register things like health reporters, regions, resources, etc all
> >> >> >> during initialization. Each of these sub objects may have dynamic
> >> >> >> portions (ex: region captures, health events, etc) but the need for the
> >> >> >> object should be known about during init time if its supported by the
> >> >> >> device driver.
> >> >> >
> >> >> >As a user, I don't want to see any late dynamic object addition which is
> >> >> >not triggered by me explicitly. As it doesn't make any sense to add
> >> >> >various delays per-vendor/kernel in configuration scripts just because
> >> >> >not everything is ready. Users need predictability, lazy addition of
> >> >> >objects adds chaos instead.
> >> >> >
> >> >> >Agree with Jakub, it is anti-pattern.
> >> >> 
> >> >> Yeah, but, we have reload. And during reload, instance is still
> >> >> registered yet the subobject disappear and reappear. So that would be
> >> >> inconsistent with the init/fini flow.
> >> >> 
> >> >> Perhaps during reload we should emulate complete fini/init notification
> >> >> flow to the user?
> >> >
> >> >"reload" is triggered by me explicitly and I will get success/fail result
> >> >at the end. There is no much meaning in subobject notifications during
> >> >that operation.
> >> 
> >> Definitelly not. User would trigger reload, however another entity
> >> (systemd for example) would listen to the notifications and react
> >> if necessary.
> >
> >Listen yes, however it is not clear if notification sequence should
> >mimic fini/init flow.
> 
> Well, it makes sense to me. Why do you think it should not?

After all this years, I still don't understand the mandate of devlink
reload. It doesn't load/unload driver completely and as such not really
performs probe/remove sequences. There is no requirement from the driver
to do anything even close to fini/init too. 

Sometimes, devlink reload behaves as fini/init, but not always.

This is why I'm not sure.

Thanks

> 
> >
> >Thanks
> >
> >> 
> >> >
> >> >Thanks

^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2023-01-16 11:25 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-06  6:33 [PATCH net-next 0/9] devlink: remove the wait-for-references on unregister Jakub Kicinski
2023-01-06  6:33 ` [PATCH net-next 1/9] devlink: bump the instance index directly when iterating Jakub Kicinski
2023-01-06 12:17   ` Jiri Pirko
2023-01-06  6:33 ` [PATCH net-next 2/9] devlink: update the code in netns move to latest helpers Jakub Kicinski
2023-01-06  6:33 ` [PATCH net-next 3/9] devlink: protect devlink->dev by the instance lock Jakub Kicinski
2023-01-06 12:18   ` Jiri Pirko
2023-01-06  6:33 ` [PATCH net-next 4/9] devlink: always check if the devlink instance is registered Jakub Kicinski
2023-01-06 12:41   ` Jiri Pirko
2023-01-06 17:03   ` Jacob Keller
2023-01-06 21:19     ` Jakub Kicinski
2023-01-07  9:05       ` Jiri Pirko
2023-01-06  6:33 ` [PATCH net-next 5/9] devlink: remove the registration guarantee of references Jakub Kicinski
2023-01-06 12:42   ` Jiri Pirko
2023-01-06  6:33 ` [PATCH net-next 6/9] devlink: don't require setting features before registration Jakub Kicinski
2023-01-06 12:43   ` Jiri Pirko
2023-01-06  6:34 ` [PATCH net-next 7/9] devlink: allow registering parameters after the instance Jakub Kicinski
2023-01-06 12:55   ` Jiri Pirko
2023-01-06 21:22     ` Jakub Kicinski
2023-01-07  9:20       ` Jiri Pirko
2023-01-10  0:21       ` Jacob Keller
2023-01-10 16:35         ` Jiri Pirko
2023-01-10 20:22           ` Jakub Kicinski
2023-01-11  9:32             ` Jiri Pirko
2023-01-11 16:45               ` Jakub Kicinski
2023-01-11 21:29                 ` Jacob Keller
2023-01-12  7:07                   ` Leon Romanovsky
2023-01-12 14:59                     ` Jiri Pirko
2023-01-12 19:58                       ` Leon Romanovsky
2023-01-13  7:50                         ` Jiri Pirko
2023-01-15  8:35                           ` Leon Romanovsky
2023-01-16 10:33                             ` Jiri Pirko
2023-01-16 11:25                               ` Leon Romanovsky
2023-01-12 19:20                     ` Jakub Kicinski
2023-01-12 20:09                       ` Leon Romanovsky
2023-01-12 22:44                         ` Jacob Keller
2023-01-13  6:45                           ` Leon Romanovsky
2023-01-13  7:53                             ` Jiri Pirko
2023-01-11 13:21   ` Jiri Pirko
2023-01-06  6:34 ` [PATCH net-next 8/9] netdevsim: rename a label Jakub Kicinski
2023-01-06 12:56   ` Jiri Pirko
2023-01-06  6:34 ` [PATCH net-next 9/9] netdevsim: move devlink registration under the instance lock Jakub Kicinski
2023-01-06 15:49   ` Jiri Pirko
2023-01-06 13:10 ` [PATCH net-next 0/9] devlink: remove the wait-for-references on unregister patchwork-bot+netdevbpf
2023-01-06 15:49 ` Jiri Pirko
2023-01-06 17:06 ` Jacob Keller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).