All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jakub Kicinski <kuba@kernel.org>
To: davem@davemloft.net
Cc: netdev@vger.kernel.org, linux-doc@vger.kernel.org,
	f.fainelli@gmail.com, xiyou.wangcong@gmail.com,
	Jakub Kicinski <kuba@kernel.org>
Subject: [PATCH net 0/3] net: fix issues around register_netdevice() failures
Date: Wed,  6 Jan 2021 10:40:04 -0800	[thread overview]
Message-ID: <20210106184007.1821480-1-kuba@kernel.org> (raw)

This series attempts to clean up the life cycle of struct
net_device. Dave has added dev->needs_free_netdev in the
past to fix double frees, we can lean on that mechanism
a little more to fix remaining issues with register_netdevice().

This is the next chapter of the saga which already includes:
commit 0e0eee2465df ("net: correct error path in rtnl_newlink()")
commit e51fb152318e ("rtnetlink: fix a memory leak when ->newlink fails")
commit cf124db566e6 ("net: Fix inconsistent teardown and release of private netdev state.")
commit 93ee31f14f6f ("[NET]: Fix free_netdev on register_netdev failure.")
commit 814152a89ed5 ("net: fix memleak in register_netdevice()")
commit 10cc514f451a ("net: Fix null de-reference of device refcount")

The immediate problem which gets fixed here is that calling
free_netdev() right after unregister_netdevice() is illegal
because we need to release rtnl_lock first, to let the
unregistration finish. Note that unregister_netdevice() is
just a wrapper of unregister_netdevice_queue(), it only
does half of the job.

Where this limitation becomes most problematic is in failure
modes of register_netdevice(). There is a notifier call right
at the end of it, which lets other subsystems veto the entire
thing. At which point we should really go through a full
unregister_netdevice(), but we can't because callers may
go straight to free_netdev() after the failure, and that's
no bueno (see the previous paragraph).

This set makes free_netdev() more lenient, when device
is still being unregistered free_netdev() will simply set
dev->needs_free_netdev and let the unregister process do
the freeing.

With the free_netdev() problem out of the way failures in
register_netdevice() can make use of net_todo, again.
Users are still expected to call free_netdev() right after
failure but that will only set dev->needs_free_netdev.

To prevent the pathological case of:

 dev->needs_free_netdev = true;
 if (register_netdevice(dev)) {
   rtnl_unlock();
   free_netdev(dev);
 }

make register_netdevice()'s failure clear dev->needs_free_netdev.

Problems described above are only present with register_netdevice() /
unregister_netdevice(). We have two parallel APIs for registration
of devices:
 - those called outside rtnl_lock (register_netdev(), and
   unregister_netdev());
 - and those to be used under rtnl_lock - register_netdevice()
   and unregister_netdevice().
The former is trivial and has no problems. The alternative
approach to fix the latter would be to also separate the
freeing functions - i.e. add free_netdevice(). This has been
implemented (incl. converting all relevant calls in the tree)
but it feels a little unnecessary to put the burden of choosing
the right free_netdev{,ice}() call on the programmer when we
can "just do the right thing" by default.

Jakub Kicinski (3):
  docs: net: explain struct net_device lifetime
  net: make free_netdev() more lenient with unregistering devices
  net: make sure devices go through netdev_wait_all_refs

 Documentation/networking/netdevices.rst | 171 +++++++++++++++++++++++-
 net/8021q/vlan.c                        |   4 +-
 net/core/dev.c                          |  25 ++--
 net/core/rtnetlink.c                    |  23 +---
 4 files changed, 187 insertions(+), 36 deletions(-)

-- 
2.26.2


             reply	other threads:[~2021-01-06 18:41 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-06 18:40 Jakub Kicinski [this message]
2021-01-06 18:40 ` [PATCH net 1/3] docs: net: explain struct net_device lifetime Jakub Kicinski
2021-01-06 18:40 ` [PATCH net 2/3] net: make free_netdev() more lenient with unregistering devices Jakub Kicinski
2021-01-06 18:40 ` [PATCH net 3/3] net: make sure devices go through netdev_wait_all_refs Jakub Kicinski
2021-01-09  3:40 ` [PATCH net 0/3] net: fix issues around register_netdevice() failures patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210106184007.1821480-1-kuba@kernel.org \
    --to=kuba@kernel.org \
    --cc=davem@davemloft.net \
    --cc=f.fainelli@gmail.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.