All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 5.10 0/3] net: fix use-after-free of net_device's in __rtnl_newlink()
@ 2022-07-14 16:11 Fedor Pchelkin
  2022-07-14 16:11 ` [PATCH 5.10 1/3] docs: net: explain struct net_device lifetime Fedor Pchelkin
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Fedor Pchelkin @ 2022-07-14 16:11 UTC (permalink / raw)
  To: stable, Greg Kroah-Hartman
  Cc: Fedor Pchelkin, Jakub Kicinski, Alexey Khoroshilov

Syzkaller reports use-after-free for net_device's in 5.10 stable releases.
The problem has been fixed by the following patch series and
it can be cleanly applied to the 5.10 branch.

Found by Linux Verification Center (linuxtesting.org) with Syzkaller.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 5.10 1/3] docs: net: explain struct net_device lifetime
  2022-07-14 16:11 [PATCH 5.10 0/3] net: fix use-after-free of net_device's in __rtnl_newlink() Fedor Pchelkin
@ 2022-07-14 16:11 ` Fedor Pchelkin
  2022-07-15 14:12   ` Greg Kroah-Hartman
  2022-07-14 16:11 ` [PATCH 5.10 2/3] net: make free_netdev() more lenient with unregistering devices Fedor Pchelkin
  2022-07-14 16:11 ` [PATCH 5.10 3/3] net: make sure devices go through netdev_wait_all_refs Fedor Pchelkin
  2 siblings, 1 reply; 6+ messages in thread
From: Fedor Pchelkin @ 2022-07-14 16:11 UTC (permalink / raw)
  To: stable, Greg Kroah-Hartman
  Cc: Fedor Pchelkin, Jakub Kicinski, Alexey Khoroshilov

From: Jakub Kicinski <kuba@kernel.org>

commit 2b446e650b418f9a9e75f99852e2f2560cabfa17 upstream.

Explain the two basic flows of struct net_device's operation.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 Documentation/networking/netdevices.rst | 171 +++++++++++++++++++++++-
 net/core/rtnetlink.c                    |   2 +-
 2 files changed, 166 insertions(+), 7 deletions(-)

diff --git a/Documentation/networking/netdevices.rst b/Documentation/networking/netdevices.rst
index e65665c5ab50..17bdcb746dcf 100644
--- a/Documentation/networking/netdevices.rst
+++ b/Documentation/networking/netdevices.rst
@@ -10,18 +10,177 @@ Introduction
 The following is a random collection of documentation regarding
 network devices.
 
-struct net_device allocation rules
-==================================
+struct net_device lifetime rules
+================================
 Network device structures need to persist even after module is unloaded and
 must be allocated with alloc_netdev_mqs() and friends.
 If device has registered successfully, it will be freed on last use
-by free_netdev(). This is required to handle the pathologic case cleanly
-(example: rmmod mydriver </sys/class/net/myeth/mtu )
+by free_netdev(). This is required to handle the pathological case cleanly
+(example: ``rmmod mydriver </sys/class/net/myeth/mtu``)
 
-alloc_netdev_mqs()/alloc_netdev() reserve extra space for driver
+alloc_netdev_mqs() / alloc_netdev() reserve extra space for driver
 private data which gets freed when the network device is freed. If
 separately allocated data is attached to the network device
-(netdev_priv(dev)) then it is up to the module exit handler to free that.
+(netdev_priv()) then it is up to the module exit handler to free that.
+
+There are two groups of APIs for registering struct net_device.
+First group can be used in normal contexts where ``rtnl_lock`` is not already
+held: register_netdev(), unregister_netdev().
+Second group can be used when ``rtnl_lock`` is already held:
+register_netdevice(), unregister_netdevice(), free_netdevice().
+
+Simple drivers
+--------------
+
+Most drivers (especially device drivers) handle lifetime of struct net_device
+in context where ``rtnl_lock`` is not held (e.g. driver probe and remove paths).
+
+In that case the struct net_device registration is done using
+the register_netdev(), and unregister_netdev() functions:
+
+.. code-block:: c
+
+  int probe()
+  {
+    struct my_device_priv *priv;
+    int err;
+
+    dev = alloc_netdev_mqs(...);
+    if (!dev)
+      return -ENOMEM;
+    priv = netdev_priv(dev);
+
+    /* ... do all device setup before calling register_netdev() ...
+     */
+
+    err = register_netdev(dev);
+    if (err)
+      goto err_undo;
+
+    /* net_device is visible to the user! */
+
+  err_undo:
+    /* ... undo the device setup ... */
+    free_netdev(dev);
+    return err;
+  }
+
+  void remove()
+  {
+    unregister_netdev(dev);
+    free_netdev(dev);
+  }
+
+Note that after calling register_netdev() the device is visible in the system.
+Users can open it and start sending / receiving traffic immediately,
+or run any other callback, so all initialization must be done prior to
+registration.
+
+unregister_netdev() closes the device and waits for all users to be done
+with it. The memory of struct net_device itself may still be referenced
+by sysfs but all operations on that device will fail.
+
+free_netdev() can be called after unregister_netdev() returns on when
+register_netdev() failed.
+
+Device management under RTNL
+----------------------------
+
+Registering struct net_device while in context which already holds
+the ``rtnl_lock`` requires extra care. In those scenarios most drivers
+will want to make use of struct net_device's ``needs_free_netdev``
+and ``priv_destructor`` members for freeing of state.
+
+Example flow of netdev handling under ``rtnl_lock``:
+
+.. code-block:: c
+
+  static void my_setup(struct net_device *dev)
+  {
+    dev->needs_free_netdev = true;
+  }
+
+  static void my_destructor(struct net_device *dev)
+  {
+    some_obj_destroy(priv->obj);
+    some_uninit(priv);
+  }
+
+  int create_link()
+  {
+    struct my_device_priv *priv;
+    int err;
+
+    ASSERT_RTNL();
+
+    dev = alloc_netdev(sizeof(*priv), "net%d", NET_NAME_UNKNOWN, my_setup);
+    if (!dev)
+      return -ENOMEM;
+    priv = netdev_priv(dev);
+
+    /* Implicit constructor */
+    err = some_init(priv);
+    if (err)
+      goto err_free_dev;
+
+    priv->obj = some_obj_create();
+    if (!priv->obj) {
+      err = -ENOMEM;
+      goto err_some_uninit;
+    }
+    /* End of constructor, set the destructor: */
+    dev->priv_destructor = my_destructor;
+
+    err = register_netdevice(dev);
+    if (err)
+      /* register_netdevice() calls destructor on failure */
+      goto err_free_dev;
+
+    /* If anything fails now unregister_netdevice() (or unregister_netdev())
+     * will take care of calling my_destructor and free_netdev().
+     */
+
+    return 0;
+
+  err_some_uninit:
+    some_uninit(priv);
+  err_free_dev:
+    free_netdev(dev);
+    return err;
+  }
+
+If struct net_device.priv_destructor is set it will be called by the core
+some time after unregister_netdevice(), it will also be called if
+register_netdevice() fails. The callback may be invoked with or without
+``rtnl_lock`` held.
+
+There is no explicit constructor callback, driver "constructs" the private
+netdev state after allocating it and before registration.
+
+Setting struct net_device.needs_free_netdev makes core call free_netdevice()
+automatically after unregister_netdevice() when all references to the device
+are gone. It only takes effect after a successful call to register_netdevice()
+so if register_netdevice() fails driver is responsible for calling
+free_netdev().
+
+free_netdev() is safe to call on error paths right after unregister_netdevice()
+or when register_netdevice() fails. Parts of netdev (de)registration process
+happen after ``rtnl_lock`` is released, therefore in those cases free_netdev()
+will defer some of the processing until ``rtnl_lock`` is released.
+
+Devices spawned from struct rtnl_link_ops should never free the
+struct net_device directly.
+
+.ndo_init and .ndo_uninit
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+``.ndo_init`` and ``.ndo_uninit`` callbacks are called during net_device
+registration and de-registration, under ``rtnl_lock``. Drivers can use
+those e.g. when parts of their init process need to run under ``rtnl_lock``.
+
+``.ndo_init`` runs before device is visible in the system, ``.ndo_uninit``
+runs during de-registering after device is closed but other subsystems
+may still have outstanding references to the netdevice.
 
 MTU
 ===
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index bb0596c41b3e..79f514afb17d 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3441,7 +3441,7 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
 
 	if (ops->newlink) {
 		err = ops->newlink(link_net ? : net, dev, tb, data, extack);
-		/* Drivers should call free_netdev() in ->destructor
+		/* Drivers should set dev->needs_free_netdev
 		 * and unregister it on failure after registration
 		 * so that device could be finally freed in rtnl_unlock.
 		 */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 5.10 2/3] net: make free_netdev() more lenient with unregistering devices
  2022-07-14 16:11 [PATCH 5.10 0/3] net: fix use-after-free of net_device's in __rtnl_newlink() Fedor Pchelkin
  2022-07-14 16:11 ` [PATCH 5.10 1/3] docs: net: explain struct net_device lifetime Fedor Pchelkin
@ 2022-07-14 16:11 ` Fedor Pchelkin
  2022-07-14 16:11 ` [PATCH 5.10 3/3] net: make sure devices go through netdev_wait_all_refs Fedor Pchelkin
  2 siblings, 0 replies; 6+ messages in thread
From: Fedor Pchelkin @ 2022-07-14 16:11 UTC (permalink / raw)
  To: stable, Greg Kroah-Hartman
  Cc: Fedor Pchelkin, Jakub Kicinski, Alexey Khoroshilov

From: Jakub Kicinski <kuba@kernel.org>

commit c269a24ce057abfc31130960e96ab197ef6ab196 upstream.

There are two flavors of handling netdev registration:
 - ones called without holding rtnl_lock: register_netdev() and
   unregister_netdev(); and
 - those called with rtnl_lock held: register_netdevice() and
   unregister_netdevice().

While the semantics of the former are pretty clear, the same can't
be said about the latter. The netdev_todo mechanism is utilized to
perform some of the device unregistering tasks and it hooks into
rtnl_unlock() so the locked variants can't actually finish the work.
In general free_netdev() does not mix well with locked calls. Most
drivers operating under rtnl_lock set dev->needs_free_netdev to true
and expect core to make the free_netdev() call some time later.

The part where this becomes most problematic is error paths. There is
no way to unwind the state cleanly after a call to register_netdevice(),
since unreg can't be performed fully without dropping locks.

Make free_netdev() more lenient, and defer the freeing if device
is being unregistered. This allows error paths to simply call
free_netdev() both after register_netdevice() failed, and after
a call to unregister_netdevice() but before dropping rtnl_lock.

Simplify the error paths which are currently doing gymnastics
around free_netdev() handling.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 net/8021q/vlan.c     |  4 +---
 net/core/dev.c       | 11 +++++++++++
 net/core/rtnetlink.c | 23 ++++++-----------------
 3 files changed, 18 insertions(+), 20 deletions(-)

diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 15bbfaf943fd..8b644113715e 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -284,9 +284,7 @@ static int register_vlan_device(struct net_device *real_dev, u16 vlan_id)
 	return 0;
 
 out_free_newdev:
-	if (new_dev->reg_state == NETREG_UNINITIALIZED ||
-	    new_dev->reg_state == NETREG_UNREGISTERED)
-		free_netdev(new_dev);
+	free_netdev(new_dev);
 	return err;
 }
 
diff --git a/net/core/dev.c b/net/core/dev.c
index 8fa739259041..adde93cbca9f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10631,6 +10631,17 @@ void free_netdev(struct net_device *dev)
 	struct napi_struct *p, *n;
 
 	might_sleep();
+
+	/* When called immediately after register_netdevice() failed the unwind
+	 * handling may still be dismantling the device. Handle that case by
+	 * deferring the free.
+	 */
+	if (dev->reg_state == NETREG_UNREGISTERING) {
+		ASSERT_RTNL();
+		dev->needs_free_netdev = true;
+		return;
+	}
+
 	netif_free_tx_queues(dev);
 	netif_free_rx_queues(dev);
 
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 79f514afb17d..3d6ab194d0f5 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3439,26 +3439,15 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
 
 	dev->ifindex = ifm->ifi_index;
 
-	if (ops->newlink) {
+	if (ops->newlink)
 		err = ops->newlink(link_net ? : net, dev, tb, data, extack);
-		/* Drivers should set dev->needs_free_netdev
-		 * and unregister it on failure after registration
-		 * so that device could be finally freed in rtnl_unlock.
-		 */
-		if (err < 0) {
-			/* If device is not registered at all, free it now */
-			if (dev->reg_state == NETREG_UNINITIALIZED ||
-			    dev->reg_state == NETREG_UNREGISTERED)
-				free_netdev(dev);
-			goto out;
-		}
-	} else {
+	else
 		err = register_netdevice(dev);
-		if (err < 0) {
-			free_netdev(dev);
-			goto out;
-		}
+	if (err < 0) {
+		free_netdev(dev);
+		goto out;
 	}
+
 	err = rtnl_configure_link(dev, ifm);
 	if (err < 0)
 		goto out_unregister;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 5.10 3/3] net: make sure devices go through netdev_wait_all_refs
  2022-07-14 16:11 [PATCH 5.10 0/3] net: fix use-after-free of net_device's in __rtnl_newlink() Fedor Pchelkin
  2022-07-14 16:11 ` [PATCH 5.10 1/3] docs: net: explain struct net_device lifetime Fedor Pchelkin
  2022-07-14 16:11 ` [PATCH 5.10 2/3] net: make free_netdev() more lenient with unregistering devices Fedor Pchelkin
@ 2022-07-14 16:11 ` Fedor Pchelkin
  2 siblings, 0 replies; 6+ messages in thread
From: Fedor Pchelkin @ 2022-07-14 16:11 UTC (permalink / raw)
  To: stable, Greg Kroah-Hartman
  Cc: Fedor Pchelkin, Jakub Kicinski, Alexey Khoroshilov, Hulk Robot,
	Yang Yingliang

From: Jakub Kicinski <kuba@kernel.org>

commit 766b0515d5bec4b780750773ed3009b148df8c0a upstream.

If register_netdevice() fails at the very last stage - the
notifier call - some subsystems may have already seen it and
grabbed a reference. struct net_device can't be freed right
away without calling netdev_wait_all_refs().

Now that we have a clean interface in form of dev->needs_free_netdev
and lenient free_netdev() we can undo what commit 93ee31f14f6f ("[NET]:
Fix free_netdev on register_netdev failure.") has done and complete
the unregistration path by bringing the net_set_todo() call back.

After registration fails user is still expected to explicitly
free the net_device, so make sure ->needs_free_netdev is cleared,
otherwise rolling back the registration will cause the old double
free for callers who release rtnl_lock before the free.

This also solves the problem of priv_destructor not being called
on notifier error.

net_set_todo() will be moved back into unregister_netdevice_queue()
in a follow up.

Reported-by: Hulk Robot <hulkci@huawei.com>
Reported-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 net/core/dev.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index adde93cbca9f..0071a11a6dc3 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10077,17 +10077,11 @@ int register_netdevice(struct net_device *dev)
 	ret = call_netdevice_notifiers(NETDEV_REGISTER, dev);
 	ret = notifier_to_errno(ret);
 	if (ret) {
+		/* Expect explicit free_netdev() on failure */
+		dev->needs_free_netdev = false;
 		rollback_registered(dev);
-		rcu_barrier();
-
-		dev->reg_state = NETREG_UNREGISTERED;
-		/* We should put the kobject that hold in
-		 * netdev_unregister_kobject(), otherwise
-		 * the net device cannot be freed when
-		 * driver calls free_netdev(), because the
-		 * kobject is being hold.
-		 */
-		kobject_put(&dev->dev.kobj);
+		net_set_todo(dev);
+		goto out;
 	}
 	/*
 	 *	Prevent userspace races by waiting until the network
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 5.10 1/3] docs: net: explain struct net_device lifetime
  2022-07-14 16:11 ` [PATCH 5.10 1/3] docs: net: explain struct net_device lifetime Fedor Pchelkin
@ 2022-07-15 14:12   ` Greg Kroah-Hartman
  2022-07-15 16:23     ` Fedor Pchelkin
  0 siblings, 1 reply; 6+ messages in thread
From: Greg Kroah-Hartman @ 2022-07-15 14:12 UTC (permalink / raw)
  To: Fedor Pchelkin; +Cc: stable, Jakub Kicinski, Alexey Khoroshilov

On Thu, Jul 14, 2022 at 07:11:32PM +0300, Fedor Pchelkin wrote:
> From: Jakub Kicinski <kuba@kernel.org>
> 
> commit 2b446e650b418f9a9e75f99852e2f2560cabfa17 upstream.
> 
> Explain the two basic flows of struct net_device's operation.
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

If you pass on patches to others, you also must sign off on them.

Please do so for all of the patches in this series.

And why is a documentation patch needed for stable?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 5.10 1/3] docs: net: explain struct net_device lifetime
  2022-07-15 14:12   ` Greg Kroah-Hartman
@ 2022-07-15 16:23     ` Fedor Pchelkin
  0 siblings, 0 replies; 6+ messages in thread
From: Fedor Pchelkin @ 2022-07-15 16:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: stable, Jakub Kicinski, Alexey Khoroshilov

 > If you pass on patches to others, you also must sign off on them.
Sorry. I've fixed that for all the patches.

I also found out that the previous series was incomplete. It solved the
use-after-free bug but produced a new one. It turns out that several
more patches are required in order to fully solve this problem.

 > And why is a documentation patch needed for stable?
I think the documentation patch is needed because it safely fixes a docs 
file section on the issue and a comment fragment. It is important for 
consistency because that docs patch describes the problem and gives more 
clarity to the code, and it was included in the original patch series as 
well.

Fedor

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-07-15 16:23 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-14 16:11 [PATCH 5.10 0/3] net: fix use-after-free of net_device's in __rtnl_newlink() Fedor Pchelkin
2022-07-14 16:11 ` [PATCH 5.10 1/3] docs: net: explain struct net_device lifetime Fedor Pchelkin
2022-07-15 14:12   ` Greg Kroah-Hartman
2022-07-15 16:23     ` Fedor Pchelkin
2022-07-14 16:11 ` [PATCH 5.10 2/3] net: make free_netdev() more lenient with unregistering devices Fedor Pchelkin
2022-07-14 16:11 ` [PATCH 5.10 3/3] net: make sure devices go through netdev_wait_all_refs Fedor Pchelkin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.