All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net v2] net: avoid registering new queue objects after device unregistration
@ 2021-11-22 16:20 Antoine Tenart
  2021-11-22 16:31 ` Jakub Kicinski
  0 siblings, 1 reply; 5+ messages in thread
From: Antoine Tenart @ 2021-11-22 16:20 UTC (permalink / raw)
  To: davem, kuba; +Cc: Antoine Tenart, alexander.duyck, netdev

netdev_queue_update_kobjects can be called after device unregistration
started (and device_del was called) resulting in two issues: possible
registration of new queue kobjects (leading to the following trace) and
providing a wrong 'old_queue' number (because real_num_tx_queues is not
updated in the unregistration path).

  BUG: KASAN: use-after-free in kobject_get+0x14/0x90
  Read of size 1 at addr ffff88801961248c by task ethtool/755

  CPU: 0 PID: 755 Comm: ethtool Not tainted 5.15.0-rc6+ #778
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/014
  Call Trace:
   dump_stack_lvl+0x57/0x72
   print_address_description.constprop.0+0x1f/0x140
   kasan_report.cold+0x7f/0x11b
   kobject_get+0x14/0x90
   kobject_add_internal+0x3d1/0x450
   kobject_init_and_add+0xba/0xf0
   netdev_queue_update_kobjects+0xcf/0x200
   netif_set_real_num_tx_queues+0xb4/0x310
   veth_set_channels+0x1c3/0x550
   ethnl_set_channels+0x524/0x610

The fix for both is to only allow unregistering queue kobjects after a
net device started its unregistration and to ensure we know the current
Tx queue numbers (we update dev->real_num_tx_queues before returning).
This relies on the fact that dev->real_num_tx_queues is used for
'old_num' expect when firstly allocating queues.

(Rx queues are not affected as net_rx_queue_update_kobjects can't be
called after a net device started its unregistration).

Fixes: 5c56580b74e5 ("net: Adjust TX queue kobjects if number of queues changes during unregister")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
---

Since v1:
  - Rebased on latest net tree.

David asked in v1 to rename queue functions in net-sysfs[1]. I do have a
patch to prefix all Rx queue functions with net_rx_queue_ and all Tx
ones with net_tx_queue_ in net-sysfs. However this could be confusing as
for historical reasons 'netdev_queue' was always used for Tx queues and
'(net(dev)_)rx_queue' for Rx ones, including 'struct netdev_queue' and
'struct netdev_rx_queue.

- Only renaming the functions in net-sysfs can be a little confusing
  with key structures not following the same scheme.

- But renaming 'struct netdev_queue' is really invasive (380+
  occurrences all over the kernel).

WDYT?

(Anyway such a patch would target net-next while this one is a fix for
net).

Thanks!

[1] https://lore.kernel.org/all/20211026133822.949135-1-atenart@kernel.org/T/#mb0ed3e8366d92df21c35b95cecb2a7e497d25544

 net/core/net-sysfs.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 9c01c642cf9e..0d9777484576 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -1706,6 +1706,12 @@ netdev_queue_update_kobjects(struct net_device *dev, int old_num, int new_num)
 	int i;
 	int error = 0;
 
+	/* When unregistering a net device we can't register new kobjects, as
+	 * this would result in an UaF of the device's kobj.
+	 */
+	if (dev->reg_state == NETREG_UNREGISTERING && new_num > old_num)
+		return -EINVAL;
+
 	for (i = old_num; i < new_num; i++) {
 		error = netdev_queue_add_kobject(dev, i);
 		if (error) {
@@ -1725,6 +1731,7 @@ netdev_queue_update_kobjects(struct net_device *dev, int old_num, int new_num)
 		kobject_put(&queue->kobj);
 	}
 
+	dev->real_num_tx_queues = new_num;
 	return error;
 #else
 	return 0;
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH net v2] net: avoid registering new queue objects after device unregistration
  2021-11-22 16:20 [PATCH net v2] net: avoid registering new queue objects after device unregistration Antoine Tenart
@ 2021-11-22 16:31 ` Jakub Kicinski
  2021-11-22 16:56   ` Antoine Tenart
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Kicinski @ 2021-11-22 16:31 UTC (permalink / raw)
  To: Antoine Tenart; +Cc: davem, alexander.duyck, netdev, Michal Kubecek

On Mon, 22 Nov 2021 17:20:07 +0100 Antoine Tenart wrote:
>    veth_set_channels+0x1c3/0x550
>    ethnl_set_channels+0x524/0x610

The patch looks fine, but ethtool calls should not happen after
unregister IMHO. Too many drivers would be surprised. 

Maybe we should catch unregistered devices in ethnl_ops_begin()?

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net v2] net: avoid registering new queue objects after device unregistration
  2021-11-22 16:31 ` Jakub Kicinski
@ 2021-11-22 16:56   ` Antoine Tenart
  2021-11-22 19:37     ` Jakub Kicinski
  0 siblings, 1 reply; 5+ messages in thread
From: Antoine Tenart @ 2021-11-22 16:56 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, alexander.duyck, netdev, Michal Kubecek

Quoting Jakub Kicinski (2021-11-22 17:31:44)
> On Mon, 22 Nov 2021 17:20:07 +0100 Antoine Tenart wrote:
> >    veth_set_channels+0x1c3/0x550
> >    ethnl_set_channels+0x524/0x610
> 
> The patch looks fine, but ethtool calls should not happen after
> unregister IMHO. Too many drivers would be surprised. 
> 
> Maybe we should catch unregistered devices in ethnl_ops_begin()?

That might be good to have indeed, I'll have a look. I'm not sure about
other paths that could do the same: if there are, we might want to keep
the check in this patch as well. Or WARN to catch future misuses?

Thanks!
Antoine

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net v2] net: avoid registering new queue objects after device unregistration
  2021-11-22 16:56   ` Antoine Tenart
@ 2021-11-22 19:37     ` Jakub Kicinski
  2021-11-23 11:18       ` Antoine Tenart
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Kicinski @ 2021-11-22 19:37 UTC (permalink / raw)
  To: Antoine Tenart; +Cc: davem, alexander.duyck, netdev, Michal Kubecek

On Mon, 22 Nov 2021 17:56:06 +0100 Antoine Tenart wrote:
> Quoting Jakub Kicinski (2021-11-22 17:31:44)
> > On Mon, 22 Nov 2021 17:20:07 +0100 Antoine Tenart wrote:  
> > >    veth_set_channels+0x1c3/0x550
> > >    ethnl_set_channels+0x524/0x610  
> > 
> > The patch looks fine, but ethtool calls should not happen after
> > unregister IMHO. Too many drivers would be surprised. 
> > 
> > Maybe we should catch unregistered devices in ethnl_ops_begin()?  
> 
> That might be good to have indeed, I'll have a look. I'm not sure about
> other paths that could do the same: if there are, we might want to keep
> the check in this patch as well. Or WARN to catch future misuses?

My knee jerk reaction was to add a WARN() just because I can't think
why changing queue count after unregister would be needed. But it's not
really illegal and we do support it before register, so why not after
unregister..

We can venture a warning with a comment saying that this is just for
catching bad uses and see if any driver hits it on a normal path?

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net v2] net: avoid registering new queue objects after device unregistration
  2021-11-22 19:37     ` Jakub Kicinski
@ 2021-11-23 11:18       ` Antoine Tenart
  0 siblings, 0 replies; 5+ messages in thread
From: Antoine Tenart @ 2021-11-23 11:18 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, alexander.duyck, netdev, Michal Kubecek

Quoting Jakub Kicinski (2021-11-22 20:37:42)
> On Mon, 22 Nov 2021 17:56:06 +0100 Antoine Tenart wrote:
> > Quoting Jakub Kicinski (2021-11-22 17:31:44)
> > > On Mon, 22 Nov 2021 17:20:07 +0100 Antoine Tenart wrote:  
> > > >    veth_set_channels+0x1c3/0x550
> > > >    ethnl_set_channels+0x524/0x610  
> > > 
> > > The patch looks fine, but ethtool calls should not happen after
> > > unregister IMHO. Too many drivers would be surprised. 
> > > 
> > > Maybe we should catch unregistered devices in ethnl_ops_begin()?  
> > 
> > That might be good to have indeed, I'll have a look. I'm not sure about
> > other paths that could do the same: if there are, we might want to keep
> > the check in this patch as well. Or WARN to catch future misuses?
> 
> My knee jerk reaction was to add a WARN() just because I can't think
> why changing queue count after unregister would be needed. But it's not
> really illegal and we do support it before register, so why not after
> unregister..

It's a little tricky. The queue count can be changed before
registration (updating dev->real_num_rx/tx_queues only) but the queue
objects can't[1]. The symmetry would be to only allowing to update
dev->real_num_tx_queues after unregister (which seems useless).

This is what's actually done for Rx queues[2]. For Tx the issue we have
is a quirk that was added to allow qdiscs to remove their extra Tx
queues after unregister[3] as in unregister_netdevice_many the net
devices are unlisted and moved to NETREG_UNREGISTERING before the call
to dev_shutdown().

[1] https://elixir.bootlin.com/linux/v5.16-rc2/source/net/core/dev.c#L2918
[2] https://elixir.bootlin.com/linux/v5.16-rc2/source/net/core/dev.c#L2967
[3] 5c56580b74e5 ("net: Adjust TX queue kobjects if number of queues
    changes during unregister")

> We can venture a warning with a comment saying that this is just for
> catching bad uses and see if any driver hits it on a normal path?

With that a better fix seems to be what you suggested, in
ethnl_ops_begin. A WARN in netdev_queue_update_kobjects[4] detecting
misuses might be good as well as the logic is not that straightforward
here.

[4] Or in netif_set_real_num_tx_queues as it is already detecting when
    queues are being disabled. !disabling && NETREG_UNREGISTERING should
    be illegal or at least not update the kobjects.
    https://elixir.bootlin.com/linux/v5.16-rc2/source/net/core/dev.c#L2913

Thanks!
Antoine

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-11-23 11:18 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-22 16:20 [PATCH net v2] net: avoid registering new queue objects after device unregistration Antoine Tenart
2021-11-22 16:31 ` Jakub Kicinski
2021-11-22 16:56   ` Antoine Tenart
2021-11-22 19:37     ` Jakub Kicinski
2021-11-23 11:18       ` Antoine Tenart

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.