All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next 0/2] Clean devlink net namespace operations
@ 2021-07-29  8:15 Leon Romanovsky
  2021-07-29  8:15 ` [PATCH net-next 1/2] devlink: Break parameter notification sequence to be before/after unload/load driver Leon Romanovsky
  2021-07-29  8:15 ` [PATCH net-next 2/2] devlink: Allocate devlink directly in requested net namespace Leon Romanovsky
  0 siblings, 2 replies; 9+ messages in thread
From: Leon Romanovsky @ 2021-07-29  8:15 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Jiri Pirko
  Cc: Leon Romanovsky, linux-kernel, netdev, Parav Pandit

From: Leon Romanovsky <leonro@nvidia.com>

Hi Dave, Jakub and Jiri

This short series continues my work on devlink core code to make devlink
reload less prone to errors and harden it from API abuse.

Despite first patch being a clear fix, I would ask you to apply it to net-next
anyway, because the fixed patch is anyway old and it will help us to eliminate
merge conflicts that will arise for following patches or even for the second one.

Thanks

Leon Romanovsky (2):
  devlink: Break parameter notification sequence to be before/after
    unload/load driver
  devlink: Allocate devlink directly in requested net namespace

 drivers/net/netdevsim/dev.c |  4 +--
 include/net/devlink.h       | 14 ++++++++--
 net/core/devlink.c          | 56 ++++++++++++++++++-------------------
 3 files changed, 41 insertions(+), 33 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH net-next 1/2] devlink: Break parameter notification sequence to be before/after unload/load driver
  2021-07-29  8:15 [PATCH net-next 0/2] Clean devlink net namespace operations Leon Romanovsky
@ 2021-07-29  8:15 ` Leon Romanovsky
  2021-07-29 11:22   ` Jiri Pirko
  2021-07-29  8:15 ` [PATCH net-next 2/2] devlink: Allocate devlink directly in requested net namespace Leon Romanovsky
  1 sibling, 1 reply; 9+ messages in thread
From: Leon Romanovsky @ 2021-07-29  8:15 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Jiri Pirko
  Cc: Leon Romanovsky, linux-kernel, netdev, Parav Pandit

From: Leon Romanovsky <leonro@nvidia.com>

The change of namespaces during devlink reload calls to driver unload
before it accesses devlink parameters. The commands below causes to
use-after-free bug when trying to get flow steering mode.

 * ip netns add n1
 * devlink dev reload pci/0000:00:09.0 netns n1

 ==================================================================
 BUG: KASAN: use-after-free in mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core]
 Read of size 4 at addr ffff888009d04308 by task devlink/275

 CPU: 6 PID: 275 Comm: devlink Not tainted 5.12.0-rc2+ #2853
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 Call Trace:
  dump_stack+0x93/0xc2
  print_address_description.constprop.0+0x18/0x140
  ? mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core]
  ? mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core]
  kasan_report.cold+0x7c/0xd8
  ? mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core]
  mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core]
  devlink_nl_param_fill+0x1c8/0xe80
  ? __free_pages_ok+0x37a/0x8a0
  ? devlink_flash_update_timeout_notify+0xd0/0xd0
  ? lock_acquire+0x1a9/0x6d0
  ? fs_reclaim_acquire+0xb7/0x160
  ? lock_is_held_type+0x98/0x110
  ? 0xffffffff81000000
  ? lock_release+0x1f9/0x6c0
  ? fs_reclaim_release+0xa1/0xf0
  ? lock_downgrade+0x6d0/0x6d0
  ? lock_is_held_type+0x98/0x110
  ? lock_is_held_type+0x98/0x110
  ? memset+0x20/0x40
  ? __build_skb_around+0x1f8/0x2b0
  devlink_param_notify+0x6d/0x180
  devlink_reload+0x1c3/0x520
  ? devlink_remote_reload_actions_performed+0x30/0x30
  ? mutex_trylock+0x24b/0x2d0
  ? devlink_nl_cmd_reload+0x62b/0x1070
  devlink_nl_cmd_reload+0x66d/0x1070
  ? devlink_reload+0x520/0x520
  ? devlink_get_from_attrs+0x1bc/0x260
  ? devlink_nl_pre_doit+0x64/0x4d0
  genl_family_rcv_msg_doit+0x1e9/0x2f0
  ? mutex_lock_io_nested+0x1130/0x1130
  ? genl_family_rcv_msg_attrs_parse.constprop.0+0x240/0x240
  ? security_capable+0x51/0x90
  genl_rcv_msg+0x27f/0x4a0
  ? genl_get_cmd+0x3c0/0x3c0
  ? lock_acquire+0x1a9/0x6d0
  ? devlink_reload+0x520/0x520
  ? lock_release+0x6c0/0x6c0
  netlink_rcv_skb+0x11d/0x340
  ? genl_get_cmd+0x3c0/0x3c0
  ? netlink_ack+0x9f0/0x9f0
  ? lock_release+0x1f9/0x6c0
  genl_rcv+0x24/0x40
  netlink_unicast+0x433/0x700
  ? netlink_attachskb+0x730/0x730
  ? _copy_from_iter_full+0x178/0x650
  ? __alloc_skb+0x113/0x2b0
  netlink_sendmsg+0x6f1/0xbd0
  ? netlink_unicast+0x700/0x700
  ? lock_is_held_type+0x98/0x110
  ? netlink_unicast+0x700/0x700
  sock_sendmsg+0xb0/0xe0
  __sys_sendto+0x193/0x240
  ? __x64_sys_getpeername+0xb0/0xb0
  ? do_sys_openat2+0x10b/0x370
  ? __up_read+0x1a1/0x7b0
  ? do_user_addr_fault+0x219/0xdc0
  ? __x64_sys_openat+0x120/0x1d0
  ? __x64_sys_open+0x1a0/0x1a0
  __x64_sys_sendto+0xdd/0x1b0
  ? syscall_enter_from_user_mode+0x1d/0x50
  do_syscall_64+0x2d/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7fc69d0af14a
 Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 c3 0f 1f 44 00 00 55 48 83 ec 30 44 89 4c
 RSP: 002b:00007ffc1d8292f8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
 RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fc69d0af14a
 RDX: 0000000000000038 RSI: 0000555f57c56440 RDI: 0000000000000003
 RBP: 0000555f57c56410 R08: 00007fc69d17b200 R09: 000000000000000c
 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000

 Allocated by task 146:
  kasan_save_stack+0x1b/0x40
  __kasan_kmalloc+0x99/0xc0
  mlx5_init_fs+0xf0/0x1c50 [mlx5_core]
  mlx5_load+0xd2/0x180 [mlx5_core]
  mlx5_init_one+0x2f6/0x450 [mlx5_core]
  probe_one+0x47d/0x6e0 [mlx5_core]
  pci_device_probe+0x2a0/0x4a0
  really_probe+0x20a/0xc90
  driver_probe_device+0xd8/0x380
  device_driver_attach+0x1df/0x250
  __driver_attach+0xff/0x240
  bus_for_each_dev+0x11e/0x1a0
  bus_add_driver+0x309/0x570
  driver_register+0x1ee/0x380
  0xffffffffa06b8062
  do_one_initcall+0xd5/0x410
  do_init_module+0x1c8/0x760
  load_module+0x6d8b/0x9650
  __do_sys_finit_module+0x118/0x1b0
  do_syscall_64+0x2d/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xae

 Freed by task 275:
  kasan_save_stack+0x1b/0x40
  kasan_set_track+0x1c/0x30
  kasan_set_free_info+0x20/0x30
  __kasan_slab_free+0x102/0x140
  slab_free_freelist_hook+0x74/0x1b0
  kfree+0xd7/0x2a0
  mlx5_unload+0x16/0xb0 [mlx5_core]
  mlx5_unload_one+0xae/0x120 [mlx5_core]
  mlx5_devlink_reload_down+0x1bc/0x380 [mlx5_core]
  devlink_reload+0x141/0x520
  devlink_nl_cmd_reload+0x66d/0x1070
  genl_family_rcv_msg_doit+0x1e9/0x2f0
  genl_rcv_msg+0x27f/0x4a0
  netlink_rcv_skb+0x11d/0x340
  genl_rcv+0x24/0x40
  netlink_unicast+0x433/0x700
  netlink_sendmsg+0x6f1/0xbd0
  sock_sendmsg+0xb0/0xe0
  __sys_sendto+0x193/0x240
  __x64_sys_sendto+0xdd/0x1b0
  do_syscall_64+0x2d/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xae

 The buggy address belongs to the object at ffff888009d04300
  which belongs to the cache kmalloc-128 of size 128
 The buggy address is located 8 bytes inside of
  128-byte region [ffff888009d04300, ffff888009d04380)
 The buggy address belongs to the page:
 page:0000000086a64ecc refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888009d04000 pfn:0x9d04
 head:0000000086a64ecc order:1 compound_mapcount:0
 flags: 0x4000000000010200(slab|head)
 raw: 4000000000010200 ffffea0000203980 0000000200000002 ffff8880050428c0
 raw: ffff888009d04000 000000008020001d 00000001ffffffff 0000000000000000
 page dumped because: kasan: bad access detected

 Memory state around the buggy address:
  ffff888009d04200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff888009d04280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 >ffff888009d04300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                       ^
  ffff888009d04380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  ffff888009d04400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ==================================================================

The right solution to devlink reload is to notify about deletion of
parameters, unload driver, change net namespaces, load driver and notify
about addition of parameters.

Fixes: 070c63f20f6c ("net: devlink: allow to change namespaces during reload")
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 net/core/devlink.c | 32 ++++++++++++++++++++------------
 1 file changed, 20 insertions(+), 12 deletions(-)

diff --git a/net/core/devlink.c b/net/core/devlink.c
index b596a971b473..54e2a0375539 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -3801,8 +3801,9 @@ static void devlink_param_notify(struct devlink *devlink,
 				 struct devlink_param_item *param_item,
 				 enum devlink_command cmd);
 
-static void devlink_reload_netns_change(struct devlink *devlink,
-					struct net *dest_net)
+static void devlink_params_notify(struct devlink *devlink, struct net *dest_net,
+				  struct net *curr_net,
+				  enum devlink_command cmd)
 {
 	struct devlink_param_item *param_item;
 
@@ -3812,17 +3813,17 @@ static void devlink_reload_netns_change(struct devlink *devlink,
 	 * reload process so the notifications are generated separatelly.
 	 */
 
-	list_for_each_entry(param_item, &devlink->param_list, list)
-		devlink_param_notify(devlink, 0, param_item,
-				     DEVLINK_CMD_PARAM_DEL);
-	devlink_notify(devlink, DEVLINK_CMD_DEL);
+	if (!dest_net || net_eq(dest_net, curr_net))
+		return;
 
-	__devlink_net_set(devlink, dest_net);
+	if (cmd == DEVLINK_CMD_PARAM_NEW)
+		devlink_notify(devlink, DEVLINK_CMD_NEW);
 
-	devlink_notify(devlink, DEVLINK_CMD_NEW);
 	list_for_each_entry(param_item, &devlink->param_list, list)
-		devlink_param_notify(devlink, 0, param_item,
-				     DEVLINK_CMD_PARAM_NEW);
+		devlink_param_notify(devlink, 0, param_item, cmd);
+
+	if (cmd == DEVLINK_CMD_PARAM_DEL)
+		devlink_notify(devlink, DEVLINK_CMD_DEL);
 }
 
 static bool devlink_reload_supported(const struct devlink_ops *ops)
@@ -3902,6 +3903,7 @@ static int devlink_reload(struct devlink *devlink, struct net *dest_net,
 			  u32 *actions_performed, struct netlink_ext_ack *extack)
 {
 	u32 remote_reload_stats[DEVLINK_RELOAD_STATS_ARRAY_SIZE];
+	struct net *curr_net;
 	int err;
 
 	if (!devlink->reload_enabled)
@@ -3909,18 +3911,24 @@ static int devlink_reload(struct devlink *devlink, struct net *dest_net,
 
 	memcpy(remote_reload_stats, devlink->stats.remote_reload_stats,
 	       sizeof(remote_reload_stats));
+
+	curr_net = devlink_net(devlink);
+	devlink_params_notify(devlink, dest_net, curr_net,
+			      DEVLINK_CMD_PARAM_DEL);
 	err = devlink->ops->reload_down(devlink, !!dest_net, action, limit, extack);
 	if (err)
 		return err;
 
-	if (dest_net && !net_eq(dest_net, devlink_net(devlink)))
-		devlink_reload_netns_change(devlink, dest_net);
+	if (dest_net && !net_eq(dest_net, curr_net))
+		__devlink_net_set(devlink, dest_net);
 
 	err = devlink->ops->reload_up(devlink, action, limit, actions_performed, extack);
 	devlink_reload_failed_set(devlink, !!err);
 	if (err)
 		return err;
 
+	devlink_params_notify(devlink, dest_net, curr_net,
+			      DEVLINK_CMD_PARAM_NEW);
 	WARN_ON(!(*actions_performed & BIT(action)));
 	/* Catch driver on updating the remote action within devlink reload */
 	WARN_ON(memcmp(remote_reload_stats, devlink->stats.remote_reload_stats,
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net-next 2/2] devlink: Allocate devlink directly in requested net namespace
  2021-07-29  8:15 [PATCH net-next 0/2] Clean devlink net namespace operations Leon Romanovsky
  2021-07-29  8:15 ` [PATCH net-next 1/2] devlink: Break parameter notification sequence to be before/after unload/load driver Leon Romanovsky
@ 2021-07-29  8:15 ` Leon Romanovsky
  2021-07-29 11:55   ` Jiri Pirko
  1 sibling, 1 reply; 9+ messages in thread
From: Leon Romanovsky @ 2021-07-29  8:15 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Jiri Pirko
  Cc: Leon Romanovsky, linux-kernel, netdev, Parav Pandit

From: Leon Romanovsky <leonro@nvidia.com>

There is no need in extra call indirection and check from impossible
flow where someone tries to set namespace without prior call
to devlink_alloc().

Instead of this extra logic and additional EXPORT_SYMBOL, use specialized
devlink allocation function that receives net namespace as an argument.

Such specialized API allows clear view when devlink initialized in wrong
net namespace and/or kernel users don't try to change devlink namespace
under the hood.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/net/netdevsim/dev.c |  4 ++--
 include/net/devlink.h       | 14 ++++++++++++--
 net/core/devlink.c          | 26 ++++++++------------------
 3 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
index 6348307bfa84..d538a39d4225 100644
--- a/drivers/net/netdevsim/dev.c
+++ b/drivers/net/netdevsim/dev.c
@@ -1431,10 +1431,10 @@ int nsim_dev_probe(struct nsim_bus_dev *nsim_bus_dev)
 	struct devlink *devlink;
 	int err;
 
-	devlink = devlink_alloc(&nsim_dev_devlink_ops, sizeof(*nsim_dev));
+	devlink = devlink_alloc_ns(&nsim_dev_devlink_ops, sizeof(*nsim_dev),
+				   nsim_bus_dev->initial_net);
 	if (!devlink)
 		return -ENOMEM;
-	devlink_net_set(devlink, nsim_bus_dev->initial_net);
 	nsim_dev = devlink_priv(devlink);
 	nsim_dev->nsim_bus_dev = nsim_bus_dev;
 	nsim_dev->switch_id.id_len = sizeof(nsim_dev->switch_id.id);
diff --git a/include/net/devlink.h b/include/net/devlink.h
index e48a62320407..b4691c40320f 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -1540,8 +1540,18 @@ static inline struct devlink *netdev_to_devlink(struct net_device *dev)
 struct ib_device;
 
 struct net *devlink_net(const struct devlink *devlink);
-void devlink_net_set(struct devlink *devlink, struct net *net);
-struct devlink *devlink_alloc(const struct devlink_ops *ops, size_t priv_size);
+/* This RAW call is intended for software devices that can
+ * create devlink instance in other namespaces than init_net.
+ *
+ * Drivers that operate on real HW must use devlink_alloc() instead.
+ */
+struct devlink *devlink_alloc_ns(const struct devlink_ops *ops,
+				 size_t priv_size, struct net *net);
+static inline struct devlink *devlink_alloc(const struct devlink_ops *ops,
+					    size_t priv_size)
+{
+	return devlink_alloc_ns(ops, priv_size, &init_net);
+}
 int devlink_register(struct devlink *devlink, struct device *dev);
 void devlink_unregister(struct devlink *devlink);
 void devlink_reload_enable(struct devlink *devlink);
diff --git a/net/core/devlink.c b/net/core/devlink.c
index 54e2a0375539..b419b7a6ec40 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -108,19 +108,6 @@ struct net *devlink_net(const struct devlink *devlink)
 }
 EXPORT_SYMBOL_GPL(devlink_net);
 
-static void __devlink_net_set(struct devlink *devlink, struct net *net)
-{
-	write_pnet(&devlink->_net, net);
-}
-
-void devlink_net_set(struct devlink *devlink, struct net *net)
-{
-	if (WARN_ON(devlink->dev))
-		return;
-	__devlink_net_set(devlink, net);
-}
-EXPORT_SYMBOL_GPL(devlink_net_set);
-
 static struct devlink *devlink_get_from_attrs(struct net *net,
 					      struct nlattr **attrs)
 {
@@ -3920,7 +3907,7 @@ static int devlink_reload(struct devlink *devlink, struct net *dest_net,
 		return err;
 
 	if (dest_net && !net_eq(dest_net, curr_net))
-		__devlink_net_set(devlink, dest_net);
+		write_pnet(&devlink->_net, dest_net);
 
 	err = devlink->ops->reload_up(devlink, action, limit, actions_performed, extack);
 	devlink_reload_failed_set(devlink, !!err);
@@ -8776,15 +8763,18 @@ static bool devlink_reload_actions_valid(const struct devlink_ops *ops)
 }
 
 /**
- *	devlink_alloc - Allocate new devlink instance resources
+ *	devlink_alloc_ns - Allocate new devlink instance resources
+ *	in specific namespace
  *
  *	@ops: ops
  *	@priv_size: size of user private data
+ *	@net: net namespace
  *
  *	Allocate new devlink instance resources, including devlink index
  *	and name.
  */
-struct devlink *devlink_alloc(const struct devlink_ops *ops, size_t priv_size)
+struct devlink *devlink_alloc_ns(const struct devlink_ops *ops,
+				 size_t priv_size, struct net *net)
 {
 	struct devlink *devlink;
 
@@ -8799,7 +8789,7 @@ struct devlink *devlink_alloc(const struct devlink_ops *ops, size_t priv_size)
 		return NULL;
 	devlink->ops = ops;
 	xa_init_flags(&devlink->snapshot_ids, XA_FLAGS_ALLOC);
-	__devlink_net_set(devlink, &init_net);
+	write_pnet(&devlink->_net, net);
 	INIT_LIST_HEAD(&devlink->port_list);
 	INIT_LIST_HEAD(&devlink->rate_list);
 	INIT_LIST_HEAD(&devlink->sb_list);
@@ -8815,7 +8805,7 @@ struct devlink *devlink_alloc(const struct devlink_ops *ops, size_t priv_size)
 	mutex_init(&devlink->reporters_lock);
 	return devlink;
 }
-EXPORT_SYMBOL_GPL(devlink_alloc);
+EXPORT_SYMBOL_GPL(devlink_alloc_ns);
 
 /**
  *	devlink_register - Register devlink instance
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 1/2] devlink: Break parameter notification sequence to be before/after unload/load driver
  2021-07-29  8:15 ` [PATCH net-next 1/2] devlink: Break parameter notification sequence to be before/after unload/load driver Leon Romanovsky
@ 2021-07-29 11:22   ` Jiri Pirko
  2021-07-29 11:35     ` Leon Romanovsky
  0 siblings, 1 reply; 9+ messages in thread
From: Jiri Pirko @ 2021-07-29 11:22 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: David S. Miller, Jakub Kicinski, Jiri Pirko, Leon Romanovsky,
	linux-kernel, netdev, Parav Pandit

Thu, Jul 29, 2021 at 10:15:25AM CEST, leon@kernel.org wrote:
>From: Leon Romanovsky <leonro@nvidia.com>
>
>The change of namespaces during devlink reload calls to driver unload
>before it accesses devlink parameters. The commands below causes to
>use-after-free bug when trying to get flow steering mode.
>
> * ip netns add n1
> * devlink dev reload pci/0000:00:09.0 netns n1
>
> ==================================================================
> BUG: KASAN: use-after-free in mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core]
> Read of size 4 at addr ffff888009d04308 by task devlink/275
>
> CPU: 6 PID: 275 Comm: devlink Not tainted 5.12.0-rc2+ #2853
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> Call Trace:
>  dump_stack+0x93/0xc2
>  print_address_description.constprop.0+0x18/0x140
>  ? mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core]
>  ? mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core]
>  kasan_report.cold+0x7c/0xd8
>  ? mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core]
>  mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core]
>  devlink_nl_param_fill+0x1c8/0xe80
>  ? __free_pages_ok+0x37a/0x8a0
>  ? devlink_flash_update_timeout_notify+0xd0/0xd0
>  ? lock_acquire+0x1a9/0x6d0
>  ? fs_reclaim_acquire+0xb7/0x160
>  ? lock_is_held_type+0x98/0x110
>  ? 0xffffffff81000000
>  ? lock_release+0x1f9/0x6c0
>  ? fs_reclaim_release+0xa1/0xf0
>  ? lock_downgrade+0x6d0/0x6d0
>  ? lock_is_held_type+0x98/0x110
>  ? lock_is_held_type+0x98/0x110
>  ? memset+0x20/0x40
>  ? __build_skb_around+0x1f8/0x2b0
>  devlink_param_notify+0x6d/0x180
>  devlink_reload+0x1c3/0x520
>  ? devlink_remote_reload_actions_performed+0x30/0x30
>  ? mutex_trylock+0x24b/0x2d0
>  ? devlink_nl_cmd_reload+0x62b/0x1070
>  devlink_nl_cmd_reload+0x66d/0x1070
>  ? devlink_reload+0x520/0x520
>  ? devlink_get_from_attrs+0x1bc/0x260
>  ? devlink_nl_pre_doit+0x64/0x4d0
>  genl_family_rcv_msg_doit+0x1e9/0x2f0
>  ? mutex_lock_io_nested+0x1130/0x1130
>  ? genl_family_rcv_msg_attrs_parse.constprop.0+0x240/0x240
>  ? security_capable+0x51/0x90
>  genl_rcv_msg+0x27f/0x4a0
>  ? genl_get_cmd+0x3c0/0x3c0
>  ? lock_acquire+0x1a9/0x6d0
>  ? devlink_reload+0x520/0x520
>  ? lock_release+0x6c0/0x6c0
>  netlink_rcv_skb+0x11d/0x340
>  ? genl_get_cmd+0x3c0/0x3c0
>  ? netlink_ack+0x9f0/0x9f0
>  ? lock_release+0x1f9/0x6c0
>  genl_rcv+0x24/0x40
>  netlink_unicast+0x433/0x700
>  ? netlink_attachskb+0x730/0x730
>  ? _copy_from_iter_full+0x178/0x650
>  ? __alloc_skb+0x113/0x2b0
>  netlink_sendmsg+0x6f1/0xbd0
>  ? netlink_unicast+0x700/0x700
>  ? lock_is_held_type+0x98/0x110
>  ? netlink_unicast+0x700/0x700
>  sock_sendmsg+0xb0/0xe0
>  __sys_sendto+0x193/0x240
>  ? __x64_sys_getpeername+0xb0/0xb0
>  ? do_sys_openat2+0x10b/0x370
>  ? __up_read+0x1a1/0x7b0
>  ? do_user_addr_fault+0x219/0xdc0
>  ? __x64_sys_openat+0x120/0x1d0
>  ? __x64_sys_open+0x1a0/0x1a0
>  __x64_sys_sendto+0xdd/0x1b0
>  ? syscall_enter_from_user_mode+0x1d/0x50
>  do_syscall_64+0x2d/0x40
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x7fc69d0af14a
> Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 c3 0f 1f 44 00 00 55 48 83 ec 30 44 89 4c
> RSP: 002b:00007ffc1d8292f8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
> RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fc69d0af14a
> RDX: 0000000000000038 RSI: 0000555f57c56440 RDI: 0000000000000003
> RBP: 0000555f57c56410 R08: 00007fc69d17b200 R09: 000000000000000c
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
>
> Allocated by task 146:
>  kasan_save_stack+0x1b/0x40
>  __kasan_kmalloc+0x99/0xc0
>  mlx5_init_fs+0xf0/0x1c50 [mlx5_core]
>  mlx5_load+0xd2/0x180 [mlx5_core]
>  mlx5_init_one+0x2f6/0x450 [mlx5_core]
>  probe_one+0x47d/0x6e0 [mlx5_core]
>  pci_device_probe+0x2a0/0x4a0
>  really_probe+0x20a/0xc90
>  driver_probe_device+0xd8/0x380
>  device_driver_attach+0x1df/0x250
>  __driver_attach+0xff/0x240
>  bus_for_each_dev+0x11e/0x1a0
>  bus_add_driver+0x309/0x570
>  driver_register+0x1ee/0x380
>  0xffffffffa06b8062
>  do_one_initcall+0xd5/0x410
>  do_init_module+0x1c8/0x760
>  load_module+0x6d8b/0x9650
>  __do_sys_finit_module+0x118/0x1b0
>  do_syscall_64+0x2d/0x40
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> Freed by task 275:
>  kasan_save_stack+0x1b/0x40
>  kasan_set_track+0x1c/0x30
>  kasan_set_free_info+0x20/0x30
>  __kasan_slab_free+0x102/0x140
>  slab_free_freelist_hook+0x74/0x1b0
>  kfree+0xd7/0x2a0
>  mlx5_unload+0x16/0xb0 [mlx5_core]
>  mlx5_unload_one+0xae/0x120 [mlx5_core]
>  mlx5_devlink_reload_down+0x1bc/0x380 [mlx5_core]
>  devlink_reload+0x141/0x520
>  devlink_nl_cmd_reload+0x66d/0x1070
>  genl_family_rcv_msg_doit+0x1e9/0x2f0
>  genl_rcv_msg+0x27f/0x4a0
>  netlink_rcv_skb+0x11d/0x340
>  genl_rcv+0x24/0x40
>  netlink_unicast+0x433/0x700
>  netlink_sendmsg+0x6f1/0xbd0
>  sock_sendmsg+0xb0/0xe0
>  __sys_sendto+0x193/0x240
>  __x64_sys_sendto+0xdd/0x1b0
>  do_syscall_64+0x2d/0x40
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> The buggy address belongs to the object at ffff888009d04300
>  which belongs to the cache kmalloc-128 of size 128
> The buggy address is located 8 bytes inside of
>  128-byte region [ffff888009d04300, ffff888009d04380)
> The buggy address belongs to the page:
> page:0000000086a64ecc refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888009d04000 pfn:0x9d04
> head:0000000086a64ecc order:1 compound_mapcount:0
> flags: 0x4000000000010200(slab|head)
> raw: 4000000000010200 ffffea0000203980 0000000200000002 ffff8880050428c0
> raw: ffff888009d04000 000000008020001d 00000001ffffffff 0000000000000000
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
>  ffff888009d04200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  ffff888009d04280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> >ffff888009d04300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>                       ^
>  ffff888009d04380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>  ffff888009d04400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ==================================================================
>
>The right solution to devlink reload is to notify about deletion of
>parameters, unload driver, change net namespaces, load driver and notify
>about addition of parameters.
>
>Fixes: 070c63f20f6c ("net: devlink: allow to change namespaces during reload")
>Reviewed-by: Parav Pandit <parav@nvidia.com>
>Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
>---
> net/core/devlink.c | 32 ++++++++++++++++++++------------
> 1 file changed, 20 insertions(+), 12 deletions(-)
>
>diff --git a/net/core/devlink.c b/net/core/devlink.c
>index b596a971b473..54e2a0375539 100644
>--- a/net/core/devlink.c
>+++ b/net/core/devlink.c
>@@ -3801,8 +3801,9 @@ static void devlink_param_notify(struct devlink *devlink,
> 				 struct devlink_param_item *param_item,
> 				 enum devlink_command cmd);
> 
>-static void devlink_reload_netns_change(struct devlink *devlink,
>-					struct net *dest_net)
>+static void devlink_params_notify(struct devlink *devlink, struct net *dest_net,

Please name it differently. This function notifies not only the params,
but the devlink instance itself as well.


>+				  struct net *curr_net,
>+				  enum devlink_command cmd)
> {
> 	struct devlink_param_item *param_item;
> 
>@@ -3812,17 +3813,17 @@ static void devlink_reload_netns_change(struct devlink *devlink,
> 	 * reload process so the notifications are generated separatelly.
> 	 */
> 
>-	list_for_each_entry(param_item, &devlink->param_list, list)
>-		devlink_param_notify(devlink, 0, param_item,
>-				     DEVLINK_CMD_PARAM_DEL);
>-	devlink_notify(devlink, DEVLINK_CMD_DEL);
>+	if (!dest_net || net_eq(dest_net, curr_net))
>+		return;
> 
>-	__devlink_net_set(devlink, dest_net);
>+	if (cmd == DEVLINK_CMD_PARAM_NEW)
>+		devlink_notify(devlink, DEVLINK_CMD_NEW);

This is quite odd. According to PARAMS cmd you decife devlink CMD.

Just have bool arg which would help you select both
DEVLINK_CMD_PARAM_NEW/DEL and DEVLINK_CMD_NEW/DEL.


> 
>-	devlink_notify(devlink, DEVLINK_CMD_NEW);
> 	list_for_each_entry(param_item, &devlink->param_list, list)
>-		devlink_param_notify(devlink, 0, param_item,
>-				     DEVLINK_CMD_PARAM_NEW);
>+		devlink_param_notify(devlink, 0, param_item, cmd);
>+
>+	if (cmd == DEVLINK_CMD_PARAM_DEL)
>+		devlink_notify(devlink, DEVLINK_CMD_DEL);
> }
> 
> static bool devlink_reload_supported(const struct devlink_ops *ops)
>@@ -3902,6 +3903,7 @@ static int devlink_reload(struct devlink *devlink, struct net *dest_net,
> 			  u32 *actions_performed, struct netlink_ext_ack *extack)
> {
> 	u32 remote_reload_stats[DEVLINK_RELOAD_STATS_ARRAY_SIZE];
>+	struct net *curr_net;
> 	int err;
> 
> 	if (!devlink->reload_enabled)
>@@ -3909,18 +3911,24 @@ static int devlink_reload(struct devlink *devlink, struct net *dest_net,
> 
> 	memcpy(remote_reload_stats, devlink->stats.remote_reload_stats,
> 	       sizeof(remote_reload_stats));
>+
>+	curr_net = devlink_net(devlink);
>+	devlink_params_notify(devlink, dest_net, curr_net,
>+			      DEVLINK_CMD_PARAM_DEL);
> 	err = devlink->ops->reload_down(devlink, !!dest_net, action, limit, extack);
> 	if (err)
> 		return err;
> 
>-	if (dest_net && !net_eq(dest_net, devlink_net(devlink)))
>-		devlink_reload_netns_change(devlink, dest_net);
>+	if (dest_net && !net_eq(dest_net, curr_net))
>+		__devlink_net_set(devlink, dest_net);
> 
> 	err = devlink->ops->reload_up(devlink, action, limit, actions_performed, extack);
> 	devlink_reload_failed_set(devlink, !!err);
> 	if (err)
> 		return err;
> 
>+	devlink_params_notify(devlink, dest_net, curr_net,
>+			      DEVLINK_CMD_PARAM_NEW);
> 	WARN_ON(!(*actions_performed & BIT(action)));
> 	/* Catch driver on updating the remote action within devlink reload */
> 	WARN_ON(memcmp(remote_reload_stats, devlink->stats.remote_reload_stats,
>-- 
>2.31.1
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 1/2] devlink: Break parameter notification sequence to be before/after unload/load driver
  2021-07-29 11:22   ` Jiri Pirko
@ 2021-07-29 11:35     ` Leon Romanovsky
  2021-07-29 11:57       ` Jiri Pirko
  0 siblings, 1 reply; 9+ messages in thread
From: Leon Romanovsky @ 2021-07-29 11:35 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: David S. Miller, Jakub Kicinski, Jiri Pirko, linux-kernel,
	netdev, Parav Pandit

On Thu, Jul 29, 2021 at 01:22:58PM +0200, Jiri Pirko wrote:
> Thu, Jul 29, 2021 at 10:15:25AM CEST, leon@kernel.org wrote:
> >From: Leon Romanovsky <leonro@nvidia.com>

<...>

> >diff --git a/net/core/devlink.c b/net/core/devlink.c
> >index b596a971b473..54e2a0375539 100644
> >--- a/net/core/devlink.c
> >+++ b/net/core/devlink.c
> >@@ -3801,8 +3801,9 @@ static void devlink_param_notify(struct devlink *devlink,
> > 				 struct devlink_param_item *param_item,
> > 				 enum devlink_command cmd);
> > 
> >-static void devlink_reload_netns_change(struct devlink *devlink,
> >-					struct net *dest_net)
> >+static void devlink_params_notify(struct devlink *devlink, struct net *dest_net,
> 
> Please name it differently. This function notifies not only the params,
> but the devlink instance itself as well.

I'm open for suggestion. What did you have in mind?

> 
> 
> >+				  struct net *curr_net,
> >+				  enum devlink_command cmd)
> > {
> > 	struct devlink_param_item *param_item;
> > 
> >@@ -3812,17 +3813,17 @@ static void devlink_reload_netns_change(struct devlink *devlink,
> > 	 * reload process so the notifications are generated separatelly.
> > 	 */
> > 
> >-	list_for_each_entry(param_item, &devlink->param_list, list)
> >-		devlink_param_notify(devlink, 0, param_item,
> >-				     DEVLINK_CMD_PARAM_DEL);
> >-	devlink_notify(devlink, DEVLINK_CMD_DEL);
> >+	if (!dest_net || net_eq(dest_net, curr_net))
> >+		return;
> > 
> >-	__devlink_net_set(devlink, dest_net);
> >+	if (cmd == DEVLINK_CMD_PARAM_NEW)
> >+		devlink_notify(devlink, DEVLINK_CMD_NEW);
> 
> This is quite odd. According to PARAMS cmd you decife devlink CMD.
> 
> Just have bool arg which would help you select both
> DEVLINK_CMD_PARAM_NEW/DEL and DEVLINK_CMD_NEW/DEL

The patch is quite misleading, but the final result looks neat:

   3847 static void devlink_params_notify(struct devlink *devlink, struct net *dest_net,
   3848                                   struct net *curr_net,
   3849                                   enum devlink_command cmd)
   3850 {
   3851         struct devlink_param_item *param_item;
   3852 
   3853         /* Userspace needs to be notified about devlink objects
   3854          * removed from original and entering new network namespace.
   3855          * The rest of the devlink objects are re-created during
   3856          * reload process so the notifications are generated separatelly.
   3857          */
   3858 
   3859         if (!dest_net || net_eq(dest_net, curr_net))
   3860                 return;
   3861 
   3862         if (cmd == DEVLINK_CMD_PARAM_NEW)
   3863                 devlink_notify(devlink, DEVLINK_CMD_NEW);
   3864 
   3865         list_for_each_entry(param_item, &devlink->param_list, list)
   3866                 devlink_param_notify(devlink, 0, param_item, cmd);
   3867 
   3868         if (cmd == DEVLINK_CMD_PARAM_DEL)
   3869                 devlink_notify(devlink, DEVLINK_CMD_DEL);
   3870 }


So as you can see in line 3866, we anyway will need to provide "cmd", so
do you suggest to add extra two bool variables to the function signature
to avoid "cmd == DEVLINK_CMD_PARAM_NEW" and "cmd == DEVLINK_CMD_PARAM_DEL" ifs?

Thanks

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 2/2] devlink: Allocate devlink directly in requested net namespace
  2021-07-29  8:15 ` [PATCH net-next 2/2] devlink: Allocate devlink directly in requested net namespace Leon Romanovsky
@ 2021-07-29 11:55   ` Jiri Pirko
  2021-07-29 12:06     ` Leon Romanovsky
  0 siblings, 1 reply; 9+ messages in thread
From: Jiri Pirko @ 2021-07-29 11:55 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: David S. Miller, Jakub Kicinski, Jiri Pirko, Leon Romanovsky,
	linux-kernel, netdev, Parav Pandit

Thu, Jul 29, 2021 at 10:15:26AM CEST, leon@kernel.org wrote:
>From: Leon Romanovsky <leonro@nvidia.com>
>
>There is no need in extra call indirection and check from impossible
>flow where someone tries to set namespace without prior call
>to devlink_alloc().
>
>Instead of this extra logic and additional EXPORT_SYMBOL, use specialized
>devlink allocation function that receives net namespace as an argument.
>
>Such specialized API allows clear view when devlink initialized in wrong
>net namespace and/or kernel users don't try to change devlink namespace
>under the hood.
>
>Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
>---
> drivers/net/netdevsim/dev.c |  4 ++--
> include/net/devlink.h       | 14 ++++++++++++--
> net/core/devlink.c          | 26 ++++++++------------------
> 3 files changed, 22 insertions(+), 22 deletions(-)
>
>diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
>index 6348307bfa84..d538a39d4225 100644
>--- a/drivers/net/netdevsim/dev.c
>+++ b/drivers/net/netdevsim/dev.c
>@@ -1431,10 +1431,10 @@ int nsim_dev_probe(struct nsim_bus_dev *nsim_bus_dev)
> 	struct devlink *devlink;
> 	int err;
> 
>-	devlink = devlink_alloc(&nsim_dev_devlink_ops, sizeof(*nsim_dev));
>+	devlink = devlink_alloc_ns(&nsim_dev_devlink_ops, sizeof(*nsim_dev),
>+				   nsim_bus_dev->initial_net);
> 	if (!devlink)
> 		return -ENOMEM;
>-	devlink_net_set(devlink, nsim_bus_dev->initial_net);
> 	nsim_dev = devlink_priv(devlink);
> 	nsim_dev->nsim_bus_dev = nsim_bus_dev;
> 	nsim_dev->switch_id.id_len = sizeof(nsim_dev->switch_id.id);
>diff --git a/include/net/devlink.h b/include/net/devlink.h
>index e48a62320407..b4691c40320f 100644
>--- a/include/net/devlink.h
>+++ b/include/net/devlink.h
>@@ -1540,8 +1540,18 @@ static inline struct devlink *netdev_to_devlink(struct net_device *dev)
> struct ib_device;
> 
> struct net *devlink_net(const struct devlink *devlink);
>-void devlink_net_set(struct devlink *devlink, struct net *net);
>-struct devlink *devlink_alloc(const struct devlink_ops *ops, size_t priv_size);
>+/* This RAW call is intended for software devices that can

Not sure what "RAW call" is, perhaps you can just avoid this here.

Otherwise, this patch looks fine to me:
Reviewed-by: Jiri Pirko <jiri@nvidia.com>

[...]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 1/2] devlink: Break parameter notification sequence to be before/after unload/load driver
  2021-07-29 11:35     ` Leon Romanovsky
@ 2021-07-29 11:57       ` Jiri Pirko
  2021-07-29 12:11         ` Leon Romanovsky
  0 siblings, 1 reply; 9+ messages in thread
From: Jiri Pirko @ 2021-07-29 11:57 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: David S. Miller, Jakub Kicinski, Jiri Pirko, linux-kernel,
	netdev, Parav Pandit

Thu, Jul 29, 2021 at 01:35:55PM CEST, leon@kernel.org wrote:
>On Thu, Jul 29, 2021 at 01:22:58PM +0200, Jiri Pirko wrote:
>> Thu, Jul 29, 2021 at 10:15:25AM CEST, leon@kernel.org wrote:
>> >From: Leon Romanovsky <leonro@nvidia.com>
>
><...>
>
>> >diff --git a/net/core/devlink.c b/net/core/devlink.c
>> >index b596a971b473..54e2a0375539 100644
>> >--- a/net/core/devlink.c
>> >+++ b/net/core/devlink.c
>> >@@ -3801,8 +3801,9 @@ static void devlink_param_notify(struct devlink *devlink,
>> > 				 struct devlink_param_item *param_item,
>> > 				 enum devlink_command cmd);
>> > 
>> >-static void devlink_reload_netns_change(struct devlink *devlink,
>> >-					struct net *dest_net)
>> >+static void devlink_params_notify(struct devlink *devlink, struct net *dest_net,
>> 
>> Please name it differently. This function notifies not only the params,
>> but the devlink instance itself as well.
>
>I'm open for suggestion. What did you have in mind?

devlink_ns_change_notify?

>
>> 
>> 
>> >+				  struct net *curr_net,
>> >+				  enum devlink_command cmd)
>> > {
>> > 	struct devlink_param_item *param_item;
>> > 
>> >@@ -3812,17 +3813,17 @@ static void devlink_reload_netns_change(struct devlink *devlink,
>> > 	 * reload process so the notifications are generated separatelly.
>> > 	 */
>> > 
>> >-	list_for_each_entry(param_item, &devlink->param_list, list)
>> >-		devlink_param_notify(devlink, 0, param_item,
>> >-				     DEVLINK_CMD_PARAM_DEL);
>> >-	devlink_notify(devlink, DEVLINK_CMD_DEL);
>> >+	if (!dest_net || net_eq(dest_net, curr_net))
>> >+		return;
>> > 
>> >-	__devlink_net_set(devlink, dest_net);
>> >+	if (cmd == DEVLINK_CMD_PARAM_NEW)
>> >+		devlink_notify(devlink, DEVLINK_CMD_NEW);
>> 
>> This is quite odd. According to PARAMS cmd you decife devlink CMD.
>> 
>> Just have bool arg which would help you select both
>> DEVLINK_CMD_PARAM_NEW/DEL and DEVLINK_CMD_NEW/DEL
>
>The patch is quite misleading, but the final result looks neat:
>
>   3847 static void devlink_params_notify(struct devlink *devlink, struct net *dest_net,
>   3848                                   struct net *curr_net,
>   3849                                   enum devlink_command cmd)
>   3850 {
>   3851         struct devlink_param_item *param_item;
>   3852 
>   3853         /* Userspace needs to be notified about devlink objects
>   3854          * removed from original and entering new network namespace.
>   3855          * The rest of the devlink objects are re-created during
>   3856          * reload process so the notifications are generated separatelly.
>   3857          */
>   3858 
>   3859         if (!dest_net || net_eq(dest_net, curr_net))
>   3860                 return;
>   3861 
>   3862         if (cmd == DEVLINK_CMD_PARAM_NEW)
>   3863                 devlink_notify(devlink, DEVLINK_CMD_NEW);

Nothing misleading here. This is exactly what I'm pointing out...



>   3864 
>   3865         list_for_each_entry(param_item, &devlink->param_list, list)
>   3866                 devlink_param_notify(devlink, 0, param_item, cmd);
>   3867 
>   3868         if (cmd == DEVLINK_CMD_PARAM_DEL)
>   3869                 devlink_notify(devlink, DEVLINK_CMD_DEL);
>   3870 }
>
>
>So as you can see in line 3866, we anyway will need to provide "cmd", so
>do you suggest to add extra two bool variables to the function signature
>to avoid "cmd == DEVLINK_CMD_PARAM_NEW" and "cmd == DEVLINK_CMD_PARAM_DEL" ifs?
>
>Thanks

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 2/2] devlink: Allocate devlink directly in requested net namespace
  2021-07-29 11:55   ` Jiri Pirko
@ 2021-07-29 12:06     ` Leon Romanovsky
  0 siblings, 0 replies; 9+ messages in thread
From: Leon Romanovsky @ 2021-07-29 12:06 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: David S. Miller, Jakub Kicinski, Jiri Pirko, linux-kernel,
	netdev, Parav Pandit

On Thu, Jul 29, 2021 at 01:55:29PM +0200, Jiri Pirko wrote:
> Thu, Jul 29, 2021 at 10:15:26AM CEST, leon@kernel.org wrote:
> >From: Leon Romanovsky <leonro@nvidia.com>
> >
> >There is no need in extra call indirection and check from impossible
> >flow where someone tries to set namespace without prior call
> >to devlink_alloc().
> >
> >Instead of this extra logic and additional EXPORT_SYMBOL, use specialized
> >devlink allocation function that receives net namespace as an argument.
> >
> >Such specialized API allows clear view when devlink initialized in wrong
> >net namespace and/or kernel users don't try to change devlink namespace
> >under the hood.
> >
> >Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> >---
> > drivers/net/netdevsim/dev.c |  4 ++--
> > include/net/devlink.h       | 14 ++++++++++++--
> > net/core/devlink.c          | 26 ++++++++------------------
> > 3 files changed, 22 insertions(+), 22 deletions(-)
> >
> >diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
> >index 6348307bfa84..d538a39d4225 100644
> >--- a/drivers/net/netdevsim/dev.c
> >+++ b/drivers/net/netdevsim/dev.c
> >@@ -1431,10 +1431,10 @@ int nsim_dev_probe(struct nsim_bus_dev *nsim_bus_dev)
> > 	struct devlink *devlink;
> > 	int err;
> > 
> >-	devlink = devlink_alloc(&nsim_dev_devlink_ops, sizeof(*nsim_dev));
> >+	devlink = devlink_alloc_ns(&nsim_dev_devlink_ops, sizeof(*nsim_dev),
> >+				   nsim_bus_dev->initial_net);
> > 	if (!devlink)
> > 		return -ENOMEM;
> >-	devlink_net_set(devlink, nsim_bus_dev->initial_net);
> > 	nsim_dev = devlink_priv(devlink);
> > 	nsim_dev->nsim_bus_dev = nsim_bus_dev;
> > 	nsim_dev->switch_id.id_len = sizeof(nsim_dev->switch_id.id);
> >diff --git a/include/net/devlink.h b/include/net/devlink.h
> >index e48a62320407..b4691c40320f 100644
> >--- a/include/net/devlink.h
> >+++ b/include/net/devlink.h
> >@@ -1540,8 +1540,18 @@ static inline struct devlink *netdev_to_devlink(struct net_device *dev)
> > struct ib_device;
> > 
> > struct net *devlink_net(const struct devlink *devlink);
> >-void devlink_net_set(struct devlink *devlink, struct net *net);
> >-struct devlink *devlink_alloc(const struct devlink_ops *ops, size_t priv_size);
> >+/* This RAW call is intended for software devices that can
> 
> Not sure what "RAW call" is, perhaps you can just avoid this here.

I wanted to emphasize the message that we shouldn't see this call in
regular drivers and it is very specific to SW drivers.

I'll drop "RAW".

> 
> Otherwise, this patch looks fine to me:
> Reviewed-by: Jiri Pirko <jiri@nvidia.com>

Thanks

> 
> [...]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 1/2] devlink: Break parameter notification sequence to be before/after unload/load driver
  2021-07-29 11:57       ` Jiri Pirko
@ 2021-07-29 12:11         ` Leon Romanovsky
  0 siblings, 0 replies; 9+ messages in thread
From: Leon Romanovsky @ 2021-07-29 12:11 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: David S. Miller, Jakub Kicinski, Jiri Pirko, linux-kernel,
	netdev, Parav Pandit

On Thu, Jul 29, 2021 at 01:57:01PM +0200, Jiri Pirko wrote:
> Thu, Jul 29, 2021 at 01:35:55PM CEST, leon@kernel.org wrote:
> >On Thu, Jul 29, 2021 at 01:22:58PM +0200, Jiri Pirko wrote:
> >> Thu, Jul 29, 2021 at 10:15:25AM CEST, leon@kernel.org wrote:
> >> >From: Leon Romanovsky <leonro@nvidia.com>
> >
> ><...>
> >
> >> >diff --git a/net/core/devlink.c b/net/core/devlink.c
> >> >index b596a971b473..54e2a0375539 100644
> >> >--- a/net/core/devlink.c
> >> >+++ b/net/core/devlink.c
> >> >@@ -3801,8 +3801,9 @@ static void devlink_param_notify(struct devlink *devlink,
> >> > 				 struct devlink_param_item *param_item,
> >> > 				 enum devlink_command cmd);
> >> > 
> >> >-static void devlink_reload_netns_change(struct devlink *devlink,
> >> >-					struct net *dest_net)
> >> >+static void devlink_params_notify(struct devlink *devlink, struct net *dest_net,
> >> 
> >> Please name it differently. This function notifies not only the params,
> >> but the devlink instance itself as well.
> >
> >I'm open for suggestion. What did you have in mind?
> 
> devlink_ns_change_notify?

NP, will change 

> 
> >
> >> 
> >> 
> >> >+				  struct net *curr_net,
> >> >+				  enum devlink_command cmd)
> >> > {
> >> > 	struct devlink_param_item *param_item;
> >> > 
> >> >@@ -3812,17 +3813,17 @@ static void devlink_reload_netns_change(struct devlink *devlink,
> >> > 	 * reload process so the notifications are generated separatelly.
> >> > 	 */
> >> > 
> >> >-	list_for_each_entry(param_item, &devlink->param_list, list)
> >> >-		devlink_param_notify(devlink, 0, param_item,
> >> >-				     DEVLINK_CMD_PARAM_DEL);
> >> >-	devlink_notify(devlink, DEVLINK_CMD_DEL);
> >> >+	if (!dest_net || net_eq(dest_net, curr_net))
> >> >+		return;
> >> > 
> >> >-	__devlink_net_set(devlink, dest_net);
> >> >+	if (cmd == DEVLINK_CMD_PARAM_NEW)
> >> >+		devlink_notify(devlink, DEVLINK_CMD_NEW);
> >> 
> >> This is quite odd. According to PARAMS cmd you decife devlink CMD.
> >> 
> >> Just have bool arg which would help you select both
> >> DEVLINK_CMD_PARAM_NEW/DEL and DEVLINK_CMD_NEW/DEL
> >
> >The patch is quite misleading, but the final result looks neat:
> >
> >   3847 static void devlink_params_notify(struct devlink *devlink, struct net *dest_net,
> >   3848                                   struct net *curr_net,
> >   3849                                   enum devlink_command cmd)
> >   3850 {
> >   3851         struct devlink_param_item *param_item;
> >   3852 
> >   3853         /* Userspace needs to be notified about devlink objects
> >   3854          * removed from original and entering new network namespace.
> >   3855          * The rest of the devlink objects are re-created during
> >   3856          * reload process so the notifications are generated separatelly.
> >   3857          */
> >   3858 
> >   3859         if (!dest_net || net_eq(dest_net, curr_net))
> >   3860                 return;
> >   3861 
> >   3862         if (cmd == DEVLINK_CMD_PARAM_NEW)
> >   3863                 devlink_notify(devlink, DEVLINK_CMD_NEW);
> 
> Nothing misleading here. This is exactly what I'm pointing out...

Do you ask for this change?

diff --git a/net/core/devlink.c b/net/core/devlink.c
index 077310c26a8b..ee7204787b29 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -3846,7 +3846,7 @@ static void devlink_param_notify(struct devlink *devlink,
 
 static void devlink_params_notify(struct devlink *devlink, struct net *dest_net,
                                  struct net *curr_net,
-                                 enum devlink_command cmd)
+                                 enum devlink_command cmd, bool new)
 {
        struct devlink_param_item *param_item;
 
@@ -3859,13 +3859,13 @@ static void devlink_params_notify(struct devlink *devlink, struct net *dest_net,
        if (!dest_net || net_eq(dest_net, curr_net))
                return;
 
-       if (cmd == DEVLINK_CMD_PARAM_NEW)
+       if (new)
                devlink_notify(devlink, DEVLINK_CMD_NEW);
 
        list_for_each_entry(param_item, &devlink->param_list, list)
                devlink_param_notify(devlink, 0, param_item, cmd);
 
-       if (cmd == DEVLINK_CMD_PARAM_DEL)
+       if (!new)
                devlink_notify(devlink, DEVLINK_CMD_DEL);
 }
 
@@ -3957,7 +3957,7 @@ static int devlink_reload(struct devlink *devlink, struct net *dest_net,
 
        curr_net = devlink_net(devlink);
        devlink_params_notify(devlink, dest_net, curr_net,
-                             DEVLINK_CMD_PARAM_DEL);
+                             DEVLINK_CMD_PARAM_DEL, false);
        err = devlink->ops->reload_down(devlink, !!dest_net, action, limit, extack);
        if (err)
                return err;
@@ -3971,7 +3971,7 @@ static int devlink_reload(struct devlink *devlink, struct net *dest_net,
                return err;
 
        devlink_params_notify(devlink, dest_net, curr_net,
-                             DEVLINK_CMD_PARAM_NEW);
+                             DEVLINK_CMD_PARAM_NEW, true);
        WARN_ON(!(*actions_performed & BIT(action)));
        /* Catch driver on updating the remote action within devlink reload */
        WARN_ON(memcmp(remote_reload_stats, devlink->stats.remote_reload_stats,


> 
> 
> 
> >   3864 
> >   3865         list_for_each_entry(param_item, &devlink->param_list, list)
> >   3866                 devlink_param_notify(devlink, 0, param_item, cmd);
> >   3867 
> >   3868         if (cmd == DEVLINK_CMD_PARAM_DEL)
> >   3869                 devlink_notify(devlink, DEVLINK_CMD_DEL);
> >   3870 }
> >
> >
> >So as you can see in line 3866, we anyway will need to provide "cmd", so
> >do you suggest to add extra two bool variables to the function signature
> >to avoid "cmd == DEVLINK_CMD_PARAM_NEW" and "cmd == DEVLINK_CMD_PARAM_DEL" ifs?
> >
> >Thanks

^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-07-29 12:11 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-29  8:15 [PATCH net-next 0/2] Clean devlink net namespace operations Leon Romanovsky
2021-07-29  8:15 ` [PATCH net-next 1/2] devlink: Break parameter notification sequence to be before/after unload/load driver Leon Romanovsky
2021-07-29 11:22   ` Jiri Pirko
2021-07-29 11:35     ` Leon Romanovsky
2021-07-29 11:57       ` Jiri Pirko
2021-07-29 12:11         ` Leon Romanovsky
2021-07-29  8:15 ` [PATCH net-next 2/2] devlink: Allocate devlink directly in requested net namespace Leon Romanovsky
2021-07-29 11:55   ` Jiri Pirko
2021-07-29 12:06     ` Leon Romanovsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.