netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [BUG] deadlock in nl80211_vendor_cmd
       [not found] ` <CABYd82Z=YXmZPTQhf0K1M4nS2wk3dPBSqx91D8SoUd59AUzpHg@mail.gmail.com>
@ 2022-03-21 17:00   ` William McVicker
  0 siblings, 0 replies; 19+ messages in thread
From: William McVicker @ 2022-03-21 17:00 UTC (permalink / raw)
  To: Johannes Berg, linux-wireless
  Cc: Marek Szyprowski, Kalle Valo, David S. Miller, Jakub Kicinski,
	netdev, Amitkumar Karwar, Ganapathi Bhat, Xinming Hu,
	Cc: Android Kernel

On 03/21/2022, Will McVicker wrote:
> On Thu, Mar 17, 2022 at 10:09 AM <willmcvicker@google.com> wrote:
> 
> > Hi,
> >
> > I wanted to report a deadlock that I'm hitting as a result of the upstream
> > commit a05829a7222e ("cfg80211: avoid holding the RTNL when calling the
> > driver"). I'm using the Pixel 6 with downstream version of the 5.15 kernel,
> > but I'm pretty sure this will happen on the upstream tip-of-tree kernel as
> > well.
> >
> > Basically, my wlan driver uses the wiphy_vendor_command ops to handle
> > a number of vendor specific operations. One of them in particular deletes
> > a cfg80211 interface. The deadlock happens when thread 1 tries to take the
> > RTNL lock before calling cfg80211_unregister_device() while thread 2 is
> > inside nl80211_pre_doit(), holding the RTNL lock, and waiting on
> > wiphy_lock().
> >
> > Here is the call flow:
> >
> > Thread 1:                         Thread 2:
> >
> > nl80211_pre_doit():
> >   -> rtnl_lock()
> >                                       nl80211_pre_doit():
> >                                        -> rtnl_lock()
> >                                        -> <blocked by Thread 1>
> >   -> wiphy_lock()
> >   -> rtnl_unlock()
> >   -> <unblock Thread 1>
> > exit nl80211_pre_doit()
> >                                        <Thread 2 got the RTNL lock>
> >                                        -> wiphy_lock()
> >                                        -> <blocked by Thread 1>
> > nl80211_doit()
> >   -> nl80211_vendor_cmd()
> >       -> rtnl_lock() <DEADLOCK>
> >       -> cfg80211_unregister_device()
> >       -> rtnl_unlock()
> >
> >
> > To be complete, here are the kernel call traces when the deadlock occurs:
> >
> > Thread 1 Call trace:
> >     <Take rtnl before calling cfg80211_unregister_device()>
> >     nl80211_vendor_cmd+0x210/0x218
> >     genl_rcv_msg+0x3ac/0x45c
> >     netlink_rcv_skb+0x130/0x168
> >     genl_rcv+0x38/0x54
> >     netlink_unicast_kernel+0xe4/0x1f4
> >     netlink_unicast+0x128/0x21c
> >     netlink_sendmsg+0x2d8/0x3d8
> >
> > Thread 2 Call trace:
> >     <Take wiphy_lock>
> >     nl80211_pre_doit+0x1b0/0x250
> >     genl_rcv_msg+0x37c/0x45c
> >     netlink_rcv_skb+0x130/0x168
> >     genl_rcv+0x38/0x54
> >     netlink_unicast_kernel+0xe4/0x1f4
> >     netlink_unicast+0x128/0x21c
> >     netlink_sendmsg+0x2d8/0x3d8
> >
> > I'm not an networking expert. So my main question is if I'm allowed to take
> > the RTNL lock inside the nl80211_vendor_cmd callbacks? If so, then
> > regardless of why I take it, we shouldn't be allowing this deadlock
> > situation, right?
> >
> > I hope that helps explain the issue. Let me know if you need any more
> > details.
> >
> > Thanks,
> > Will
> >
> 
> Sorry my CC list got dropped. Adding the following:
> 
> Kalle Valo <kvalo@codeaurora.org>
> "David S. Miller" <davem@davemloft.net>
> Jakub Kicinski <kuba@kernel.org>
> netdev@vger.kernel.org
> Amitkumar Karwar <amitkarwar@gmail.com>
> Ganapathi Bhat <ganapathi.bhat@nxp.com>
> Xinming Hu <huxinming820@gmail.com>
> kernel-team@android.com

Sorry for the noise. The lists bounced due to html. Resending with mutt to make
sure everyone gets this message.

As an update, I was able to fix the deadlock by updating nl80211_pre_doit() to
not hold the RTNL lock while waiting to get the wiphy_lock. This allows us to
take the RTNL lock within nl80211_doit() and have parallel calls to
nl80211_doit(). Below is the logic I tested. Please let me know if I'm heading
in the right direction.

Thanks,
Will

diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index 686a69381731..bb4ad746509b 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -15227,7 +15227,24 @@ static int nl80211_pre_doit(const struct genl_ops *ops, struct sk_buff *skb,
 	}
 
 	if (rdev && !(ops->internal_flags & NL80211_FLAG_NO_WIPHY_MTX)) {
-		wiphy_lock(&rdev->wiphy);
+		while (!mutex_trylock(&rdev->wiphy.mtx)) {
+			/* Holding the RTNL lock while waiting for the wiphy lock can lead to
+			 * a deadlock within doit() ops that don't hold the RTNL in pre_doit. So
+			 * we need to release the RTNL lock first while we wait for the wiphy
+			 * lock.
+			 */
+			rtnl_unlock();
+			wiphy_lock(&rdev->wiphy);
+
+			/* Once we get the wiphy_lock, we need to grab the RTNL lock. If we can't
+			 * get it, then we need to unlock the wiphy to avoid a deadlock in
+			 * pre_doit and then retry taking the locks again. */
+			if (!rtnl_trylock()) {
+				wiphy_unlock(&rdev->wiphy);
+				rtnl_lock();
+			} else
+				break;
+		}
 		/* we keep the mutex locked until post_doit */
 		__release(&rdev->wiphy.mtx);
 	}

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [BUG] deadlock in nl80211_vendor_cmd
       [not found]       ` <dc556455-51a2-06e8-8ec5-b807c2901b7e@quicinc.com>
@ 2022-03-24 21:58         ` William McVicker
  2022-03-25 12:04           ` Johannes Berg
  0 siblings, 1 reply; 19+ messages in thread
From: William McVicker @ 2022-03-24 21:58 UTC (permalink / raw)
  To: Johannes Berg, linux-wireless
  Cc: Marek Szyprowski, Kalle Valo, David S. Miller, Jakub Kicinski,
	netdev, Amitkumar Karwar, Ganapathi Bhat, Xinming Hu,
	kernel-team, Paolo Abeni

On 03/23/2022, Jeff Johnson wrote:
> On 3/22/2022 2:58 PM, William McVicker wrote:
> > On 03/22/2022, Jeff Johnson wrote:
> > > On 3/21/2022 1:07 PM, Johannes Berg wrote:
> > > [..snip..]
> > > 
> > > > > I'm not an networking expert. So my main question is if I'm allowed to take
> > > > > the RTNL lock inside the nl80211_vendor_cmd callbacks?
> > > > 
> > > > Evidently, you're not. It's interesting though, it used to be that we
> > > > called these with the RTNL held, now we don't, and the driver you're
> > > > using somehow "got fixed" to take it, but whoever fixed it didn't take
> > > > into account that this is not possible?
> > > 
> > > On this point I just want to remind that prior to the locking change that a
> > > driver would specify on a per-vendor command basis whether or not it wanted
> > > the rtnl_lock to be held via NL80211_FLAG_NEED_RTNL. I'm guessing for the
> > > command in question the driver did not set this flag since the driver wanted
> > > to explicitly take the lock itself, otherwise it would have deadlocked on
> > > itself with the 5.10 kernel.
> > > 
> > > /jeff
> > 
> > On the 5.10 kernel, the core kernel sets NL80211_FLAG_NEED_RTNL as part of
> > the internal_flags for NL80211_CMD_VENDOR:
> > 
> > net/wireless/nl80211.c:
> >     {
> >        .cmd = NL80211_CMD_VENDOR,
> >        .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
> >        .doit = nl80211_vendor_cmd,
> >        .dumpit = nl80211_vendor_cmd_dump,
> >        .flags = GENL_UNS_ADMIN_PERM,
> >        .internal_flags = NL80211_FLAG_NEED_WIPHY |
> >                NL80211_FLAG_NEED_RTNL |
> >                NL80211_FLAG_CLEAR_SKB,
> >     },
> > 
> > So the 5.10 version of this driver doesn't need to directly call rtnl_lock()
> > within the vendor command doit() functions since pre_doit() handles the RTNL
> > locking.
> > 
> > It would be nice if nl80211_vendor_cmd() could support taking the RTNL lock if
> > requested via the vendor flags. That would require moving the wiphy lock to
> > nl80211_vendor_cmds() so that it could take the RTNL and wiphy lock in the
> > correct order. Is that something you'd be open to Johannes?
> > 
> > --Will
> 
> Thanks for correcting my understanding. I concur that it would be useful for
> vendor commands to be able to specify that a given command needs the RTNL
> lock to be held.
> 
> 

Hi Johannes,

I found that we can hit this same ABBA deadlock within the nl80211 code
before ever even calling into the vendor doit() function. The issue I found
is caused by the way we unlock the RTNL mutex. Here is the call flow that
leads to the deadlock:

Thread 1                         Thread 2
 nl80211_pre_doit():
   rtnl_lock()
   wiphy_lock()                   nl80211_pre_doit():
                                    rtnl_lock() // blocked by Thread 1
   rtnl_unlock():
     netdev_run_todo():
       __rtnl_unlock()
                                    <got RTNL lock>
                                    wiphy_lock() // blocked by Thread 1
       rtnl_lock(); // DEADLOCK
 doit()
 nl80211_post_doit():
   wiphy_unlock();

Basically, unlocking the RTNL within netdev_run_todo() gives another thread
that is waiting for the RTNL in nl80211_pre_doit() a chance to grab the
RTNL lock leading to the deadlock. I found that there are multiple
instances where rtnl_lock() is called within netdev_run_todo(): a couple of
times inside netdev_wait_allrefs() and directly by netdev_run_todo().

Since I'm not really familiar with all the RNTL locking requirements, I was
hoping you could take a look at netdev_run_todo() to see if it's possible
to refactor it to avoid this deadlock. If not, then I don't think we can
call rtnl_unlock() while still holding the wiphy mutex.

Thanks,
Will

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [BUG] deadlock in nl80211_vendor_cmd
  2022-03-24 21:58         ` William McVicker
@ 2022-03-25 12:04           ` Johannes Berg
  2022-03-25 12:06             ` Johannes Berg
  2022-03-25 16:49             ` Jakub Kicinski
  0 siblings, 2 replies; 19+ messages in thread
From: Johannes Berg @ 2022-03-25 12:04 UTC (permalink / raw)
  To: William McVicker, linux-wireless
  Cc: Marek Szyprowski, Kalle Valo, David S. Miller, Jakub Kicinski,
	netdev, Amitkumar Karwar, Ganapathi Bhat, Xinming Hu,
	kernel-team, Paolo Abeni

Hi,

(Jakub, can you please see below, I wonder which you prefer)

> 
> I found that we can hit this same ABBA deadlock within the nl80211 code
> before ever even calling into the vendor doit() function. 

Hmm.

> The issue I found
> is caused by the way we unlock the RTNL mutex. Here is the call flow that
> leads to the deadlock:
> 
> Thread 1                         Thread 2
>  nl80211_pre_doit():
>    rtnl_lock()
>    wiphy_lock()                   nl80211_pre_doit():
>                                     rtnl_lock() // blocked by Thread 1
>    rtnl_unlock():
>      netdev_run_todo():
>        __rtnl_unlock()
>                                     <got RTNL lock>
>                                     wiphy_lock() // blocked by Thread 1
>        rtnl_lock(); // DEADLOCK
>  doit()
>  nl80211_post_doit():
>    wiphy_unlock();
> 
> Basically, unlocking the RTNL within netdev_run_todo() gives another thread
> that is waiting for the RTNL in nl80211_pre_doit() a chance to grab the
> RTNL lock leading to the deadlock. I found that there are multiple
> instances where rtnl_lock() is called within netdev_run_todo(): a couple of
> times inside netdev_wait_allrefs() and directly by netdev_run_todo().

Have you actually observed this in practice?

It's true, but dynamically this only happens if you get into the

	while (!list_empty(&list)) {
	...
	}

code in netdev_run_todo().

Which in itself can only be true if a netdev was unregistered and
netdev_run_todo() didn't run yet.

Now, since normally we go through rtnl_unlock(), it's highly likely that
we get into rtnl_lock() with the todo list being empty (but not
impossible, read on), so then rtnl_unlock()/netdev_run_todo() won't be
getting into this branch, and then the deadlock cannot happen.

Now, it might be possible somewhere in the tree to unregister a netdev
and then unlock using __rtnl_unlock(), so the todo list isn't processed
until the next time, but __rtnl_unlock() isn't exported and all the
users I found didn't seem to cause this problem.


On the other hand, clearly people thought it was worth worrying about,
there are already *two* almost identical implementations of avoiding
this problem:
 - rtnl_lock_unregistering
 - rtnl_lock_unregistering_all

(identical except for the netns list they use, partial vs. all).

So we can avoid the potential deadlock in cfg80211 in a few ways:

 1) export rtnl_lock_unregistering_all() or maybe a variant after
    refactoring the two versions, to allow cfg80211 to use it, that way
    netdev_run_todo() can never have a non-empty todo list

 2) export __rtnl_unlock() so cfg80211 can avoid running
    netdev_run_todo() in the unlock, personally I like this less because
    it might encourage random drivers to use it

 3) completely rework cfg80211's locking, adding a separate mutex for
    the wiphy list so we don't need to acquire the RTNL at all here
    (unless the ops need it, but there's no issue if we don't drop it),
    something like https://p.sipsolutions.net/27d08e1f5881a793.txt


I think I'm happy with 3) now (even if it took a couple of hours), so I
think we can go with it, just need to go through all the possibilities.

johannes

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [BUG] deadlock in nl80211_vendor_cmd
  2022-03-25 12:04           ` Johannes Berg
@ 2022-03-25 12:06             ` Johannes Berg
  2022-03-25 16:49             ` Jakub Kicinski
  1 sibling, 0 replies; 19+ messages in thread
From: Johannes Berg @ 2022-03-25 12:06 UTC (permalink / raw)
  To: William McVicker, linux-wireless
  Cc: Marek Szyprowski, Kalle Valo, David S. Miller, Jakub Kicinski,
	netdev, Amitkumar Karwar, Ganapathi Bhat, Xinming Hu,
	kernel-team, Paolo Abeni

On Fri, 2022-03-25 at 13:04 +0100, Johannes Berg wrote:
> 
> So we can avoid the potential deadlock in cfg80211 in a few ways:
> 
>  1) export rtnl_lock_unregistering_all() or maybe a variant after
>     refactoring the two versions, to allow cfg80211 to use it, that way
>     netdev_run_todo() can never have a non-empty todo list
> 
>  2) export __rtnl_unlock() so cfg80211 can avoid running
>     netdev_run_todo() in the unlock, personally I like this less because
>     it might encourage random drivers to use it
> 
>  3) completely rework cfg80211's locking, adding a separate mutex for
>     the wiphy list so we don't need to acquire the RTNL at all here
>     (unless the ops need it, but there's no issue if we don't drop it),
>     something like https://p.sipsolutions.net/27d08e1f5881a793.txt
> 

Note that none of these actually let you do what you wanted - that is
acquiring the RTNL in the vendor op itself.

johannes

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [BUG] deadlock in nl80211_vendor_cmd
  2022-03-25 12:04           ` Johannes Berg
  2022-03-25 12:06             ` Johannes Berg
@ 2022-03-25 16:49             ` Jakub Kicinski
  2022-03-25 17:01               ` Johannes Berg
  1 sibling, 1 reply; 19+ messages in thread
From: Jakub Kicinski @ 2022-03-25 16:49 UTC (permalink / raw)
  To: Johannes Berg
  Cc: William McVicker, linux-wireless, Marek Szyprowski, Kalle Valo,
	David S. Miller, netdev, Amitkumar Karwar, Ganapathi Bhat,
	Xinming Hu, kernel-team, Paolo Abeni

On Fri, 25 Mar 2022 13:04:23 +0100 Johannes Berg wrote:
> So we can avoid the potential deadlock in cfg80211 in a few ways:
> 
>  1) export rtnl_lock_unregistering_all() or maybe a variant after
>     refactoring the two versions, to allow cfg80211 to use it, that way
>     netdev_run_todo() can never have a non-empty todo list
> 
>  2) export __rtnl_unlock() so cfg80211 can avoid running
>     netdev_run_todo() in the unlock, personally I like this less because
>     it might encourage random drivers to use it
> 
>  3) completely rework cfg80211's locking, adding a separate mutex for
>     the wiphy list so we don't need to acquire the RTNL at all here
>     (unless the ops need it, but there's no issue if we don't drop it),
>     something like https://p.sipsolutions.net/27d08e1f5881a793.txt
> 
> 
> I think I'm happy with 3) now (even if it took a couple of hours), so I
> think we can go with it, just need to go through all the possibilities.

I like 3) as well. FWIW a few places (e.g. mlx5, devlink, I think I've
seen more) had been converting to xarray for managing the "registered"
objects. It may be worth looking into if you're re-doing things, anyway.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [BUG] deadlock in nl80211_vendor_cmd
  2022-03-25 16:49             ` Jakub Kicinski
@ 2022-03-25 17:01               ` Johannes Berg
  2022-03-25 18:08                 ` William McVicker
  2022-03-25 20:40                 ` Jakub Kicinski
  0 siblings, 2 replies; 19+ messages in thread
From: Johannes Berg @ 2022-03-25 17:01 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: William McVicker, linux-wireless, Marek Szyprowski, Kalle Valo,
	David S. Miller, netdev, Amitkumar Karwar, Ganapathi Bhat,
	Xinming Hu, kernel-team, Paolo Abeni

On Fri, 2022-03-25 at 09:49 -0700, Jakub Kicinski wrote:
> On Fri, 25 Mar 2022 13:04:23 +0100 Johannes Berg wrote:
> > So we can avoid the potential deadlock in cfg80211 in a few ways:
> > 
> >  1) export rtnl_lock_unregistering_all() or maybe a variant after
> >     refactoring the two versions, to allow cfg80211 to use it, that way
> >     netdev_run_todo() can never have a non-empty todo list
> > 
> >  2) export __rtnl_unlock() so cfg80211 can avoid running
> >     netdev_run_todo() in the unlock, personally I like this less because
> >     it might encourage random drivers to use it
> > 
> >  3) completely rework cfg80211's locking, adding a separate mutex for
> >     the wiphy list so we don't need to acquire the RTNL at all here
> >     (unless the ops need it, but there's no issue if we don't drop it),
> >     something like https://p.sipsolutions.net/27d08e1f5881a793.txt
> > 
> > 
> > I think I'm happy with 3) now (even if it took a couple of hours), so I
> > think we can go with it, just need to go through all the possibilities.
> 
> I like 3) as well. FWIW a few places (e.g. mlx5, devlink, I think I've
> seen more) had been converting to xarray for managing the "registered"
> objects. It may be worth looking into if you're re-doing things, anyway.
> 

That's not a bad idea, but I think I wouldn't want to backport that, so
separately :) I don't think that fundamentally changes the locking
properties though.


Couple of more questions I guess: First, are we assuming that the
cfg80211 code *is* actually broken, even if it looks like nothing can
cause the situation, due to the empty todo list?

Given that we have rtnl_lock_unregistering() (and also
rtnl_lock_unregistering_all()), it looks like we *do* in fact at least
not want to make an assumption that no user of __rtnl_unlock() can have
added a todo item.

I mean, there's technically yet *another* thing we could do - something
like this:

[this doesn't compile, need to suitably make net_todo_list non-static]
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -95,6 +95,7 @@ void __rtnl_unlock(void)
 
        defer_kfree_skb_list = NULL;
 
+       WARN_ON(!list_empty(&net_todo_list));
        mutex_unlock(&rtnl_mutex);
 
        while (head) {

and actually that would allow us to get rid of rtnl_lock_unregistering()
and rtnl_lock_unregistering_all() simply because we'd actually guarantee
the invariant that when the RTNL is freshly locked, the list is empty
(by guaranteeing that it's always empty when it's unlocked, since it can
only be added to under RTNL).

With some suitable commentary, that might also be a reasonable thing?
__rtnl_unlock() is actually rather pretty rare, and not exported.


However, if you don't like that ...

I've been testing with this patch, to make lockdep complain:

--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -9933,6 +9933,11 @@ void netdev_run_todo(void)
        if (!list_empty(&list))
                rcu_barrier();
 
+#ifdef CONFIG_LOCKDEP
+       rtnl_lock();
+       __rtnl_unlock();
+#endif
+
        while (!list_empty(&list)) {
                struct net_device *dev
                        = list_first_entry(&list, struct net_device, todo_list);


That causes lockdep to complain for cfg80211 even if the list *is* in
fact empty.

Would you be open to adding something like that? Perhaps if I don't just
do the easy rtnl_lock/unlock, but try to find the corresponding lockdep-
only things to do there, to cause lockdep to do things without really
locking? OTOH, the locking overhead of the RTNL we just unlocked is
probably minimal, vs. the actual work *lockdep* is doing to track all
this ...

Thanks,
johannes

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [BUG] deadlock in nl80211_vendor_cmd
  2022-03-25 17:01               ` Johannes Berg
@ 2022-03-25 18:08                 ` William McVicker
  2022-03-25 20:21                   ` Johannes Berg
  2022-03-25 20:36                   ` William McVicker
  2022-03-25 20:40                 ` Jakub Kicinski
  1 sibling, 2 replies; 19+ messages in thread
From: William McVicker @ 2022-03-25 18:08 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Jakub Kicinski, linux-wireless, Marek Szyprowski, Kalle Valo,
	David S. Miller, netdev, Amitkumar Karwar, Ganapathi Bhat,
	Xinming Hu, kernel-team, Paolo Abeni

On 03/25/2022, Johannes Berg wrote:
> On Fri, 2022-03-25 at 09:49 -0700, Jakub Kicinski wrote:
> > On Fri, 25 Mar 2022 13:04:23 +0100 Johannes Berg wrote:
> > > So we can avoid the potential deadlock in cfg80211 in a few ways:
> > > 
> > >  1) export rtnl_lock_unregistering_all() or maybe a variant after
> > >     refactoring the two versions, to allow cfg80211 to use it, that way
> > >     netdev_run_todo() can never have a non-empty todo list
> > > 
> > >  2) export __rtnl_unlock() so cfg80211 can avoid running
> > >     netdev_run_todo() in the unlock, personally I like this less because
> > >     it might encourage random drivers to use it
> > > 
> > >  3) completely rework cfg80211's locking, adding a separate mutex for
> > >     the wiphy list so we don't need to acquire the RTNL at all here
> > >     (unless the ops need it, but there's no issue if we don't drop it),
> > >     something like https://p.sipsolutions.net/27d08e1f5881a793.txt
> > > 
> > > 
> > > I think I'm happy with 3) now (even if it took a couple of hours), so I
> > > think we can go with it, just need to go through all the possibilities.
> > 
> > I like 3) as well. FWIW a few places (e.g. mlx5, devlink, I think I've
> > seen more) had been converting to xarray for managing the "registered"
> > objects. It may be worth looking into if you're re-doing things, anyway.
> > 
> 
> That's not a bad idea, but I think I wouldn't want to backport that, so
> separately :) I don't think that fundamentally changes the locking
> properties though.
> 
> 
> Couple of more questions I guess: First, are we assuming that the
> cfg80211 code *is* actually broken, even if it looks like nothing can
> cause the situation, due to the empty todo list?

I'm able to reproduce this issue pretty easily with a Pixel 6 when I add
support to allow vendor commands to request for the RTNL. For this case, I just
delay unlocking the RTNL until nl80211_vendor_cmds() at which point I check the
flags to see if I should unlock before calling doit(). That allows me to run my
tests again and hit this issue. I imagine that I could hit this issue without
any changes if I re-work my vendor ops to not need the RTNL.

> 
> Given that we have rtnl_lock_unregistering() (and also
> rtnl_lock_unregistering_all()), it looks like we *do* in fact at least
> not want to make an assumption that no user of __rtnl_unlock() can have
> added a todo item.
> 
> I mean, there's technically yet *another* thing we could do - something
> like this:
> 
> [this doesn't compile, need to suitably make net_todo_list non-static]
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -95,6 +95,7 @@ void __rtnl_unlock(void)
>  
>         defer_kfree_skb_list = NULL;
>  
> +       WARN_ON(!list_empty(&net_todo_list));
>         mutex_unlock(&rtnl_mutex);
>  
>         while (head) {
> 
> and actually that would allow us to get rid of rtnl_lock_unregistering()
> and rtnl_lock_unregistering_all() simply because we'd actually guarantee
> the invariant that when the RTNL is freshly locked, the list is empty
> (by guaranteeing that it's always empty when it's unlocked, since it can
> only be added to under RTNL).
> 
> With some suitable commentary, that might also be a reasonable thing?
> __rtnl_unlock() is actually rather pretty rare, and not exported.
> 
> 
> However, if you don't like that ...
> 
> I've been testing with this patch, to make lockdep complain:
> 
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -9933,6 +9933,11 @@ void netdev_run_todo(void)
>         if (!list_empty(&list))
>                 rcu_barrier();
>  
> +#ifdef CONFIG_LOCKDEP
> +       rtnl_lock();
> +       __rtnl_unlock();
> +#endif
> +
>         while (!list_empty(&list)) {
>                 struct net_device *dev
>                         = list_first_entry(&list, struct net_device, todo_list);
> 
> 
> That causes lockdep to complain for cfg80211 even if the list *is* in
> fact empty.
> 
> Would you be open to adding something like that? Perhaps if I don't just
> do the easy rtnl_lock/unlock, but try to find the corresponding lockdep-
> only things to do there, to cause lockdep to do things without really
> locking? OTOH, the locking overhead of the RTNL we just unlocked is
> probably minimal, vs. the actual work *lockdep* is doing to track all
> this ...
> 
> Thanks,
> johannes

Let me know if you'd like me to test any patches out.

Thanks,
Will

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [BUG] deadlock in nl80211_vendor_cmd
  2022-03-25 18:08                 ` William McVicker
@ 2022-03-25 20:21                   ` Johannes Berg
  2022-03-25 20:36                   ` William McVicker
  1 sibling, 0 replies; 19+ messages in thread
From: Johannes Berg @ 2022-03-25 20:21 UTC (permalink / raw)
  To: William McVicker
  Cc: Jakub Kicinski, linux-wireless, Marek Szyprowski, Kalle Valo,
	David S. Miller, netdev, Amitkumar Karwar, Ganapathi Bhat,
	Xinming Hu, kernel-team, Paolo Abeni

On Fri, 2022-03-25 at 18:08 +0000, William McVicker wrote:
> 
> I'm able to reproduce this issue pretty easily with a Pixel 6 when I add
> support to allow vendor commands to request for the RTNL. 
> 

Hm, wait, which of the two issues?

> For this case, I just
> delay unlocking the RTNL until nl80211_vendor_cmds() at which point I check the
> flags to see if I should unlock before calling doit(). That allows me to run my
> tests again and hit this issue. I imagine that I could hit this issue without
> any changes if I re-work my vendor ops to not need the RTNL.

What are the vendor ops doing though?

If they're actually unregistering a netdev - which I believe you
mentioned earlier - then that's quite clearly going to cause an issue,
if you unlock RTNL while the wiphy mutex is still held.

If not, then I don't see right now how you'd be able to trigger any
issue here at all.

The original issue - that you rtnl_lock() yourself while the wiphy mutex
is held - can't happen anymore with your rework I guess.


johannes

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [BUG] deadlock in nl80211_vendor_cmd
  2022-03-25 18:08                 ` William McVicker
  2022-03-25 20:21                   ` Johannes Berg
@ 2022-03-25 20:36                   ` William McVicker
  2022-03-25 21:16                     ` Johannes Berg
  1 sibling, 1 reply; 19+ messages in thread
From: William McVicker @ 2022-03-25 20:36 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Jakub Kicinski, linux-wireless, Marek Szyprowski, Kalle Valo,
	David S. Miller, netdev, Amitkumar Karwar, Ganapathi Bhat,
	Xinming Hu, kernel-team, Paolo Abeni

On 03/25/2022, William McVicker wrote:
> On 03/25/2022, Johannes Berg wrote:
> > On Fri, 2022-03-25 at 09:49 -0700, Jakub Kicinski wrote:
> > > On Fri, 25 Mar 2022 13:04:23 +0100 Johannes Berg wrote:
> > > > So we can avoid the potential deadlock in cfg80211 in a few ways:
> > > > 
> > > >  1) export rtnl_lock_unregistering_all() or maybe a variant after
> > > >     refactoring the two versions, to allow cfg80211 to use it, that way
> > > >     netdev_run_todo() can never have a non-empty todo list
> > > > 
> > > >  2) export __rtnl_unlock() so cfg80211 can avoid running
> > > >     netdev_run_todo() in the unlock, personally I like this less because
> > > >     it might encourage random drivers to use it
> > > > 
> > > >  3) completely rework cfg80211's locking, adding a separate mutex for
> > > >     the wiphy list so we don't need to acquire the RTNL at all here
> > > >     (unless the ops need it, but there's no issue if we don't drop it),
> > > >     something like https://p.sipsolutions.net/27d08e1f5881a793.txt
> > > > 
> > > > 
> > > > I think I'm happy with 3) now (even if it took a couple of hours), so I
> > > > think we can go with it, just need to go through all the possibilities.
> > > 
> > > I like 3) as well. FWIW a few places (e.g. mlx5, devlink, I think I've
> > > seen more) had been converting to xarray for managing the "registered"
> > > objects. It may be worth looking into if you're re-doing things, anyway.
> > > 
> > 
> > That's not a bad idea, but I think I wouldn't want to backport that, so
> > separately :) I don't think that fundamentally changes the locking
> > properties though.
> > 
> > 
> > Couple of more questions I guess: First, are we assuming that the
> > cfg80211 code *is* actually broken, even if it looks like nothing can
> > cause the situation, due to the empty todo list?
> 
> I'm able to reproduce this issue pretty easily with a Pixel 6 when I add
> support to allow vendor commands to request for the RTNL. For this case, I just
> delay unlocking the RTNL until nl80211_vendor_cmds() at which point I check the
> flags to see if I should unlock before calling doit(). That allows me to run my
> tests again and hit this issue. I imagine that I could hit this issue without
> any changes if I re-work my vendor ops to not need the RTNL.
> 
> > 
> > Given that we have rtnl_lock_unregistering() (and also
> > rtnl_lock_unregistering_all()), it looks like we *do* in fact at least
> > not want to make an assumption that no user of __rtnl_unlock() can have
> > added a todo item.
> > 
> > I mean, there's technically yet *another* thing we could do - something
> > like this:
> > 
> > [this doesn't compile, need to suitably make net_todo_list non-static]
> > --- a/net/core/rtnetlink.c
> > +++ b/net/core/rtnetlink.c
> > @@ -95,6 +95,7 @@ void __rtnl_unlock(void)
> >  
> >         defer_kfree_skb_list = NULL;
> >  
> > +       WARN_ON(!list_empty(&net_todo_list));
> >         mutex_unlock(&rtnl_mutex);
> >  
> >         while (head) {
> > 
> > and actually that would allow us to get rid of rtnl_lock_unregistering()
> > and rtnl_lock_unregistering_all() simply because we'd actually guarantee
> > the invariant that when the RTNL is freshly locked, the list is empty
> > (by guaranteeing that it's always empty when it's unlocked, since it can
> > only be added to under RTNL).
> > 
> > With some suitable commentary, that might also be a reasonable thing?
> > __rtnl_unlock() is actually rather pretty rare, and not exported.
> > 
> > 
> > However, if you don't like that ...
> > 
> > I've been testing with this patch, to make lockdep complain:
> > 
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -9933,6 +9933,11 @@ void netdev_run_todo(void)
> >         if (!list_empty(&list))
> >                 rcu_barrier();
> >  
> > +#ifdef CONFIG_LOCKDEP
> > +       rtnl_lock();
> > +       __rtnl_unlock();
> > +#endif
> > +
> >         while (!list_empty(&list)) {
> >                 struct net_device *dev
> >                         = list_first_entry(&list, struct net_device, todo_list);
> > 
> > 
> > That causes lockdep to complain for cfg80211 even if the list *is* in
> > fact empty.
> > 
> > Would you be open to adding something like that? Perhaps if I don't just
> > do the easy rtnl_lock/unlock, but try to find the corresponding lockdep-
> > only things to do there, to cause lockdep to do things without really
> > locking? OTOH, the locking overhead of the RTNL we just unlocked is
> > probably minimal, vs. the actual work *lockdep* is doing to track all
> > this ...
> > 
> > Thanks,
> > johannes
> 
> Let me know if you'd like me to test any patches out.
> 
> Thanks,
> Will

Hi Johannes,

I found that my wlan driver is using the vendor commands to create/delete NAN
interfaces for this Android feature called Wi-Fi aware [1]. Basically, this
features allows users to discover other nearby devices and allows them to
connect directly with one another over a local network. To get my driver
working again, I first had to allow the kernel to let my driver request for the
RTNL lock for these NAN create/delete interface vendor commands. With that
I got the following code path:


Thread 1                         Thread 2
 nl80211_pre_doit():
   rtnl_lock()
   wiphy_lock()                   nl80211_pre_doit():
                                    rtnl_lock() // blocked by Thread 1
 nl80211_vendor_cmd():
   doit()
     cfg80211_unregister_netdevice()
   rtnl_unlock():
     netdev_run_todo():
       __rtnl_unlock()
                                    <got RTNL lock>
                                    wiphy_lock() // blocked by Thread 1
       rtnl_lock(); // DEADLOCK
 nl80211_post_doit():
   wiphy_unlock();


Since I'm unlocking the RTNL inside nl80211_vendor_cmd() after calling doit()
instead of waiting till post_doit(), I get into the situation you mentioned
where the net_todo_list is not empty when calling rtnl_unlock. So I decided to
drop the rtnl_unlock() in nl80211_vendor_cmd() and defer that until
nl80211_post_doit() after calling wiphy_unlock(). With this change, I haven't
been able to reproduce the deadlock. So it's possible that we aren't actually
able to hit this deadlock in nl80211_pre_doit() with the existing code since,
as you mentioned, one wouldn't be able to call unregister_netdevice() without
having the RTNL lock.

Sorry if I sent you down a rabbit hole with the first code path scenario.

Thanks,
Will

[1] https://developer.android.com/guide/topics/connectivity/wifi-aware

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [BUG] deadlock in nl80211_vendor_cmd
  2022-03-25 17:01               ` Johannes Berg
  2022-03-25 18:08                 ` William McVicker
@ 2022-03-25 20:40                 ` Jakub Kicinski
  2022-03-25 21:25                   ` Johannes Berg
  1 sibling, 1 reply; 19+ messages in thread
From: Jakub Kicinski @ 2022-03-25 20:40 UTC (permalink / raw)
  To: Johannes Berg
  Cc: William McVicker, linux-wireless, Marek Szyprowski, Kalle Valo,
	David S. Miller, netdev, Amitkumar Karwar, Ganapathi Bhat,
	Xinming Hu, kernel-team, Paolo Abeni

On Fri, 25 Mar 2022 18:01:30 +0100 Johannes Berg wrote:
> That's not a bad idea, but I think I wouldn't want to backport that, so
> separately :) I don't think that fundamentally changes the locking
> properties though.
> 
> 
> Couple of more questions I guess: First, are we assuming that the
> cfg80211 code *is* actually broken, even if it looks like nothing can
> cause the situation, due to the empty todo list?

Right.

> Given that we have rtnl_lock_unregistering() (and also
> rtnl_lock_unregistering_all()), it looks like we *do* in fact at least
> not want to make an assumption that no user of __rtnl_unlock() can have
> added a todo item.
> 
> I mean, there's technically yet *another* thing we could do - something
> like this:
> 
> [this doesn't compile, need to suitably make net_todo_list non-static]
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -95,6 +95,7 @@ void __rtnl_unlock(void)
>  
>         defer_kfree_skb_list = NULL;
>  
> +       WARN_ON(!list_empty(&net_todo_list));
>         mutex_unlock(&rtnl_mutex);
>  
>         while (head) {

Yeah, I think we could do that.

> and actually that would allow us to get rid of rtnl_lock_unregistering()
> and rtnl_lock_unregistering_all() simply because we'd actually guarantee
> the invariant that when the RTNL is freshly locked, the list is empty
> (by guaranteeing that it's always empty when it's unlocked, since it can
> only be added to under RTNL).

TBH I don't know what you mean by rtnl_lock_unregistering(), I don't
have that in my tree. rtnl_lock_unregistering_all() can't hurt the case
we're talking about AFAICT.

Eric removed some of the netns / loopback dependencies in net-next, 
make sure you pull!

> With some suitable commentary, that might also be a reasonable thing?
> __rtnl_unlock() is actually rather pretty rare, and not exported.

The main use for it seems to be re-locking before loading a module,
which TBH I have no idea why, is it just a cargo cult or a historical
thing :S  I don't see how letting netdevs leave before _loading_ 
a module makes any difference whatsoever.

> However, if you don't like that ...
> 
> I've been testing with this patch, to make lockdep complain:
> 
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -9933,6 +9933,11 @@ void netdev_run_todo(void)
>         if (!list_empty(&list))
>                 rcu_barrier();
>  
> +#ifdef CONFIG_LOCKDEP
> +       rtnl_lock();
> +       __rtnl_unlock();
> +#endif
> +
>         while (!list_empty(&list)) {
>                 struct net_device *dev
>                         = list_first_entry(&list, struct net_device, todo_list);
> 
> 
> That causes lockdep to complain for cfg80211 even if the list *is* in
> fact empty.
> 
> Would you be open to adding something like that? Perhaps if I don't just
> do the easy rtnl_lock/unlock, but try to find the corresponding lockdep-
> only things to do there, to cause lockdep to do things without really
> locking? OTOH, the locking overhead of the RTNL we just unlocked is
> probably minimal, vs. the actual work *lockdep* is doing to track all
> this ...

The WARN_ON() you suggested up front make perfect sense to me.
You can also take the definition of net_unlink_todo() out of
netdevice.h while at it because o_0

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [BUG] deadlock in nl80211_vendor_cmd
  2022-03-25 20:36                   ` William McVicker
@ 2022-03-25 21:16                     ` Johannes Berg
  2022-03-25 21:54                       ` Johannes Berg
                                         ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Johannes Berg @ 2022-03-25 21:16 UTC (permalink / raw)
  To: William McVicker
  Cc: Jakub Kicinski, linux-wireless, Marek Szyprowski, Kalle Valo,
	David S. Miller, netdev, Amitkumar Karwar, Ganapathi Bhat,
	Xinming Hu, kernel-team, Paolo Abeni

On Fri, 2022-03-25 at 20:36 +0000, William McVicker wrote:
> 
> I found that my wlan driver is using the vendor commands to create/delete NAN
> interfaces for this Android feature called Wi-Fi aware [1]. Basically, this
> features allows users to discover other nearby devices and allows them to
> connect directly with one another over a local network. 
> 

Wait, why is it doing that? We actually support a NAN interface type
upstream :) It's not really quite fully fleshed out, but it could be?
Probably should be?


> Thread 1                         Thread 2
>  nl80211_pre_doit():
>    rtnl_lock()
>    wiphy_lock()                   nl80211_pre_doit():
>                                     rtnl_lock() // blocked by Thread 1
>  nl80211_vendor_cmd():
>    doit()
>      cfg80211_unregister_netdevice()
>    rtnl_unlock():
>      netdev_run_todo():
>        __rtnl_unlock()
>                                     <got RTNL lock>
>                                     wiphy_lock() // blocked by Thread 1
>        rtnl_lock(); // DEADLOCK
>  nl80211_post_doit():
>    wiphy_unlock();


Right, this is what I had discussed in my other mails.

Basically, you're actually doing (some form of) unregister_netdevice()
before rtnl_unlock().

Clearly this isn't possible in cfg80211 itself.

However, I couldn't entirely discount the possibility that this is
possible:

Thread 1                   Thread 2
                            rtnl_lock()
                            unregister_netdevice()
                            __rtnl_unlock()
rtnl_lock()
wiphy_lock()
netdev_run_todo()
 __rtnl_unlock()
 // list not empty now    
 // because of thread 2     rtnl_lock()
 rtnl_lock()
                            wiphy_lock()

** DEADLOCK **


Given my other discussion with Jakub though, it seems that we can indeed
make sure that this cannot happen, and then this scenario is impossible
without the unregistration you're doing.

> Since I'm unlocking the RTNL inside nl80211_vendor_cmd() after calling doit()
> instead of waiting till post_doit(), I get into the situation you mentioned
> where the net_todo_list is not empty when calling rtnl_unlock. So I decided to
> drop the rtnl_unlock() in nl80211_vendor_cmd() and defer that until
> nl80211_post_doit() after calling wiphy_unlock(). With this change, I haven't
> been able to reproduce the deadlock. So it's possible that we aren't actually
> able to hit this deadlock in nl80211_pre_doit() with the existing code since,
> as you mentioned, one wouldn't be able to call unregister_netdevice() without
> having the RTNL lock.
> 

Right, this is why I said earlier that actually adding a flag for vendor
commands to get the RTNL would be more complex - you'd have to basically
open-code pre_doit() and post_doit() in there and check the sub-command
flag at the very beginning and very end.

johannes

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [BUG] deadlock in nl80211_vendor_cmd
  2022-03-25 20:40                 ` Jakub Kicinski
@ 2022-03-25 21:25                   ` Johannes Berg
  2022-03-25 21:48                     ` Jakub Kicinski
  0 siblings, 1 reply; 19+ messages in thread
From: Johannes Berg @ 2022-03-25 21:25 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: William McVicker, linux-wireless, Marek Szyprowski, Kalle Valo,
	David S. Miller, netdev, Amitkumar Karwar, Xinming Hu,
	kernel-team, Paolo Abeni, Eric Dumazet, Cong Wang, Cong Wang,
	Eric W. Biederman

On Fri, 2022-03-25 at 13:40 -0700, Jakub Kicinski wrote:
> On Fri, 25 Mar 2022 18:01:30 +0100 Johannes Berg wrote:
> > That's not a bad idea, but I think I wouldn't want to backport that, so
> > separately :) I don't think that fundamentally changes the locking
> > properties though.
> > 
> > 
> > Couple of more questions I guess: First, are we assuming that the
> > cfg80211 code *is* actually broken, even if it looks like nothing can
> > cause the situation, due to the empty todo list?
> 
> Right.

I guess that the below is basically saying "it's not really broken" :)

> > Given that we have rtnl_lock_unregistering() (and also
> > rtnl_lock_unregistering_all()), it looks like we *do* in fact at least
> > not want to make an assumption that no user of __rtnl_unlock() can have
> > added a todo item.
> > 
> > I mean, there's technically yet *another* thing we could do - something
> > like this:
> > 
> > [this doesn't compile, need to suitably make net_todo_list non-static]
> > --- a/net/core/rtnetlink.c
> > +++ b/net/core/rtnetlink.c
> > @@ -95,6 +95,7 @@ void __rtnl_unlock(void)
> >  
> >         defer_kfree_skb_list = NULL;
> >  
> > +       WARN_ON(!list_empty(&net_todo_list));
> >         mutex_unlock(&rtnl_mutex);
> >  
> >         while (head) {
> 
> Yeah, I think we could do that.

It seems that would be simpler. Even if I might eventually still want to
do the cfg80211 change, but it would give us some confidence that this
really cannot be happening anywhere.


> TBH I don't know what you mean by rtnl_lock_unregistering(), I don't
> have that in my tree. rtnl_lock_unregistering_all() can't hurt the case
> we're talking about AFAICT.
> 
> Eric removed some of the netns / loopback dependencies in net-next, 
> make sure you pull!

Sorry, yeah, I was looking at an older tree where I was testing on in
the simulation environment - this disappeared in commit 8a4fc54b07d7
("net: get rid of rtnl_lock_unregistering()").

> > With some suitable commentary, that might also be a reasonable thing?
> > __rtnl_unlock() is actually rather pretty rare, and not exported.
> 
> The main use for it seems to be re-locking before loading a module,
> which TBH I have no idea why, is it just a cargo cult or a historical
> thing :S  I don't see how letting netdevs leave before _loading_ 
> a module makes any difference whatsoever.

Indeed.


> The WARN_ON() you suggested up front make perfect sense to me.
> You can also take the definition of net_unlink_todo() out of
> netdevice.h while at it because o_0

Heh indeed, what?

But (and now I'll CC even more people...) if we can actually have an
invariant that while RTNL is unlocked the todo list is empty, then we
also don't need rtnl_lock_unregistering_all(), and can remove the
netdev_unregistering_wq, etc., no?

IOW, I'm not sure why we needed commit 50624c934db1 ("net: Delay
default_device_exit_batch until no devices are unregistering v2"), but I
also have little doubt that we did.

Ah, no. This isn't about locking in this case, it's literally about
ensuring that free_netdev() has been called in netdev_run_todo()?

Which we don't care about in cfg80211 - we just care about the list
being empty so there's no chance we'll reacquire the RTNL.

johannes

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [BUG] deadlock in nl80211_vendor_cmd
  2022-03-25 21:25                   ` Johannes Berg
@ 2022-03-25 21:48                     ` Jakub Kicinski
  2022-03-25 21:50                       ` Johannes Berg
  0 siblings, 1 reply; 19+ messages in thread
From: Jakub Kicinski @ 2022-03-25 21:48 UTC (permalink / raw)
  To: Johannes Berg
  Cc: William McVicker, linux-wireless, Marek Szyprowski, Kalle Valo,
	David S. Miller, netdev, Amitkumar Karwar, Xinming Hu,
	kernel-team, Paolo Abeni, Eric Dumazet, Cong Wang, Cong Wang,
	Eric W. Biederman

On Fri, 25 Mar 2022 22:25:05 +0100 Johannes Berg wrote:
> > > With some suitable commentary, that might also be a reasonable thing?
> > > __rtnl_unlock() is actually rather pretty rare, and not exported.  
> > 
> > The main use for it seems to be re-locking before loading a module,
> > which TBH I have no idea why, is it just a cargo cult or a historical
> > thing :S  I don't see how letting netdevs leave before _loading_ 
> > a module makes any difference whatsoever.  
> 
> Indeed.
> 
> > The WARN_ON() you suggested up front make perfect sense to me.
> > You can also take the definition of net_unlink_todo() out of
> > netdevice.h while at it because o_0  
> 
> Heh indeed, what?

To be clear - I just meant that it's declaring a static variable in 
a header, so I doubt that it'll do the right thing unless it's only
called from one compilation unit.

> But (and now I'll CC even more people...) if we can actually have an
> invariant that while RTNL is unlocked the todo list is empty, then we
> also don't need rtnl_lock_unregistering_all(), and can remove the
> netdev_unregistering_wq, etc., no?
> 
> IOW, I'm not sure why we needed commit 50624c934db1 ("net: Delay
> default_device_exit_batch until no devices are unregistering v2"), but I
> also have little doubt that we did.
> 
> Ah, no. This isn't about locking in this case, it's literally about
> ensuring that free_netdev() has been called in netdev_run_todo()?

Yup, multiple contexts sitting independently in netdev_run_todo() and
chewing on netdevs is slightly different. destructors of those netdevs
could be pointing at memory of a module being unloaded.

> Which we don't care about in cfg80211 - we just care about the list
> being empty so there's no chance we'll reacquire the RTNL.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [BUG] deadlock in nl80211_vendor_cmd
  2022-03-25 21:48                     ` Jakub Kicinski
@ 2022-03-25 21:50                       ` Johannes Berg
  0 siblings, 0 replies; 19+ messages in thread
From: Johannes Berg @ 2022-03-25 21:50 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: William McVicker, linux-wireless, Marek Szyprowski, Kalle Valo,
	David S. Miller, netdev, Amitkumar Karwar, Xinming Hu,
	kernel-team, Paolo Abeni, Eric Dumazet, Cong Wang, Cong Wang,
	Eric W. Biederman

On Fri, 2022-03-25 at 14:48 -0700, Jakub Kicinski wrote:
> > 
> > > The WARN_ON() you suggested up front make perfect sense to me.
> > > You can also take the definition of net_unlink_todo() out of
> > > netdevice.h while at it because o_0  
> > 
> > Heh indeed, what?
> 
> To be clear - I just meant that it's declaring a static variable in 
> a header, so I doubt that it'll do the right thing unless it's only
> called from one compilation unit.

Right, it's odd. I just made a patch (will send in a minute) moving the
entire block to dev.c, which is the only user of any of it.

> > Ah, no. This isn't about locking in this case, it's literally about
> > ensuring that free_netdev() has been called in netdev_run_todo()?
> 
> Yup, multiple contexts sitting independently in netdev_run_todo() and
> chewing on netdevs is slightly different. destructors of those netdevs
> could be pointing at memory of a module being unloaded.

Right, thanks.

johannes

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [BUG] deadlock in nl80211_vendor_cmd
  2022-03-25 21:16                     ` Johannes Berg
@ 2022-03-25 21:54                       ` Johannes Berg
  2022-03-25 22:18                       ` Jeff Johnson
  2022-03-25 23:57                       ` William McVicker
  2 siblings, 0 replies; 19+ messages in thread
From: Johannes Berg @ 2022-03-25 21:54 UTC (permalink / raw)
  To: William McVicker
  Cc: Jakub Kicinski, linux-wireless, Marek Szyprowski, Kalle Valo,
	David S. Miller, netdev, Amitkumar Karwar, Ganapathi Bhat,
	Xinming Hu, kernel-team, Paolo Abeni

On Fri, 2022-03-25 at 22:16 +0100, Johannes Berg wrote:
> 
> > Thread 1                         Thread 2
> >  nl80211_pre_doit():
> >    rtnl_lock()
> >    wiphy_lock()                   nl80211_pre_doit():
> >                                     rtnl_lock() // blocked by Thread 1
> >  nl80211_vendor_cmd():
> >    doit()
> >      cfg80211_unregister_netdevice()
> >    rtnl_unlock():
> >      netdev_run_todo():
> >        __rtnl_unlock()
> >                                     <got RTNL lock>
> >                                     wiphy_lock() // blocked by Thread 1
> >        rtnl_lock(); // DEADLOCK
> >  nl80211_post_doit():
> >    wiphy_unlock();
> 
> 
> Right, this is what I had discussed in my other mails.
> 
> Basically, you're actually doing (some form of) unregister_netdevice()
> before rtnl_unlock().
> 
> Clearly this isn't possible in cfg80211 itself.
> 
> However, I couldn't entirely discount the possibility that this is
> possible:
> 
> Thread 1                   Thread 2
>                             rtnl_lock()
>                             unregister_netdevice()
>                             __rtnl_unlock()
> rtnl_lock()
> wiphy_lock()
> netdev_run_todo()
>  __rtnl_unlock()
>  // list not empty now    
>  // because of thread 2     rtnl_lock()
>  rtnl_lock()
>                             wiphy_lock()
> 
> ** DEADLOCK **
> 
> 
> Given my other discussion with Jakub though, it seems that we can indeed
> make sure that this cannot happen, and then this scenario is impossible
> without the unregistration you're doing.
> 

I just sent a patch for this then, forgot to CC everyone:

https://lore.kernel.org/r/20220325225055.37e89a72f814.Ic73d206e217db20fd22dcec14fe5442ca732804b@changeid

But basically it changes nothing, just adds a WARN_ON with documentation
ensuring that the invariant never breaks, i.e. that Thread 2 can't
happen.

And maybe I should've written that with 3 Threads, but the setup of
unregister_netdevice()/__rtnl_unlock() could happen anywhere in the
system anyway.


johannes

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [BUG] deadlock in nl80211_vendor_cmd
  2022-03-25 21:16                     ` Johannes Berg
  2022-03-25 21:54                       ` Johannes Berg
@ 2022-03-25 22:18                       ` Jeff Johnson
  2022-03-25 23:57                       ` William McVicker
  2 siblings, 0 replies; 19+ messages in thread
From: Jeff Johnson @ 2022-03-25 22:18 UTC (permalink / raw)
  To: Johannes Berg, William McVicker
  Cc: Jakub Kicinski, linux-wireless, Marek Szyprowski, Kalle Valo,
	David S. Miller, netdev, Amitkumar Karwar, Ganapathi Bhat,
	Xinming Hu, kernel-team, Paolo Abeni

On 3/25/2022 2:16 PM, Johannes Berg wrote:
> On Fri, 2022-03-25 at 20:36 +0000, William McVicker wrote:
>>
>> I found that my wlan driver is using the vendor commands to create/delete NAN
>> interfaces for this Android feature called Wi-Fi aware [1]. Basically, this
>> features allows users to discover other nearby devices and allows them to
>> connect directly with one another over a local network.
>>
> 
> Wait, why is it doing that? We actually support a NAN interface type
> upstream :) It's not really quite fully fleshed out, but it could be?
> Probably should be?

And this is the issue with Android drivers. Android team proposes 
changes to the Wifi HAL and driver vendors have to implement those 
quickly to meet product deadlines. Some infrastructure changes we're 
able to get into the core kernel without having an in-tree driver that 
uses them (such as introducing NL80211_IFTYPE_NAN), but there have been 
instances of core kernel changes being rejected because there was not an 
in-tree user.

Yes, in your ideal world all of the Android wifi drivers would be 
in-tree. And in that ideal world every release cycle the Android team 
would advocate for core kernel changes needed to support the new 
features of the HAL. But past history has shown attempts to upstream new 
features has been delayed, perhaps in part due to the absence of an 
in-tree driver that utilizes those features, and the only way to meet 
product deadlines is to take the vendor command route.

And yes my out-of-tree driver is facing the exact same issue with NAN 
interface creation and deletion via vendor commands.

Previously you had suggested:
> Your easiest option might be to just patch NL80211_FLAG_NEED_RTNL into
> your kernel for vendor commands and call it a day? 

Would you consider taking that upstream given that there are very few 
in-tree users of vendor commands, and I fear Will and I aren't the only 
ones who'll face this issue?

Will, suggest you at least advocate for getting this into the 5.15 ACK.

/jeff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [BUG] deadlock in nl80211_vendor_cmd
  2022-03-25 21:16                     ` Johannes Berg
  2022-03-25 21:54                       ` Johannes Berg
  2022-03-25 22:18                       ` Jeff Johnson
@ 2022-03-25 23:57                       ` William McVicker
  2022-03-26  0:07                         ` Jakub Kicinski
  2 siblings, 1 reply; 19+ messages in thread
From: William McVicker @ 2022-03-25 23:57 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Jakub Kicinski, linux-wireless, Marek Szyprowski, Kalle Valo,
	David S. Miller, netdev, Amitkumar Karwar, Ganapathi Bhat,
	Xinming Hu, kernel-team, Paolo Abeni

On 03/25/2022, Johannes Berg wrote:
> On Fri, 2022-03-25 at 20:36 +0000, William McVicker wrote:
> > 
> > I found that my wlan driver is using the vendor commands to create/delete NAN
> > interfaces for this Android feature called Wi-Fi aware [1]. Basically, this
> > features allows users to discover other nearby devices and allows them to
> > connect directly with one another over a local network. 
> > 
> 
> Wait, why is it doing that? We actually support a NAN interface type
> upstream :) It's not really quite fully fleshed out, but it could be?
> Probably should be?
> 
> 
> > Thread 1                         Thread 2
> >  nl80211_pre_doit():
> >    rtnl_lock()
> >    wiphy_lock()                   nl80211_pre_doit():
> >                                     rtnl_lock() // blocked by Thread 1
> >  nl80211_vendor_cmd():
> >    doit()
> >      cfg80211_unregister_netdevice()
> >    rtnl_unlock():
> >      netdev_run_todo():
> >        __rtnl_unlock()
> >                                     <got RTNL lock>
> >                                     wiphy_lock() // blocked by Thread 1
> >        rtnl_lock(); // DEADLOCK
> >  nl80211_post_doit():
> >    wiphy_unlock();
> 
> 
> Right, this is what I had discussed in my other mails.
> 
> Basically, you're actually doing (some form of) unregister_netdevice()
> before rtnl_unlock().
> 
> Clearly this isn't possible in cfg80211 itself.
> 
> However, I couldn't entirely discount the possibility that this is
> possible:
> 
> Thread 1                   Thread 2
>                             rtnl_lock()
>                             unregister_netdevice()
>                             __rtnl_unlock()
> rtnl_lock()
> wiphy_lock()
> netdev_run_todo()
>  __rtnl_unlock()
>  // list not empty now    
>  // because of thread 2     rtnl_lock()
>  rtnl_lock()
>                             wiphy_lock()
> 
> ** DEADLOCK **
> 
> 
> Given my other discussion with Jakub though, it seems that we can indeed
> make sure that this cannot happen, and then this scenario is impossible
> without the unregistration you're doing.

Sounds good.

> 
> > Since I'm unlocking the RTNL inside nl80211_vendor_cmd() after calling doit()
> > instead of waiting till post_doit(), I get into the situation you mentioned
> > where the net_todo_list is not empty when calling rtnl_unlock. So I decided to
> > drop the rtnl_unlock() in nl80211_vendor_cmd() and defer that until
> > nl80211_post_doit() after calling wiphy_unlock(). With this change, I haven't
> > been able to reproduce the deadlock. So it's possible that we aren't actually
> > able to hit this deadlock in nl80211_pre_doit() with the existing code since,
> > as you mentioned, one wouldn't be able to call unregister_netdevice() without
> > having the RTNL lock.
> > 
> 
> Right, this is why I said earlier that actually adding a flag for vendor
> commands to get the RTNL would be more complex - you'd have to basically
> open-code pre_doit() and post_doit() in there and check the sub-command
> flag at the very beginning and very end.
> 
> johannes

Instead of open coding it, we could just pass the internal_flags via struct
genl_info to nl80211_vendor_cmds() and then handle the rtnl_unlock() there if
the vendor command doesn't request it. I included the patch below in case
there's any chance you would consider this for upstream. This would at least
add backwards compatibility to the vendor ops API so that existing drivers that
depend on the RTNL being held don't need to be fully refactored.

Thanks,
Will

[1] https://lore.kernel.org/all/487e4136-94dc-5a77-89c7-e416a05c3a35@quicinc.com/

---
 include/net/cfg80211.h  |  1 +
 include/net/genetlink.h |  1 +
 net/netlink/genetlink.c |  1 +
 net/wireless/nl80211.c  | 54 +++++++++++++++++++++++++++++------------
 4 files changed, 41 insertions(+), 16 deletions(-)

diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h
index 30d86032e8cb..01f8a2cc6d11 100644
--- a/include/net/cfg80211.h
+++ b/include/net/cfg80211.h
@@ -4706,6 +4706,7 @@ enum wiphy_vendor_command_flags {
 	WIPHY_VENDOR_CMD_NEED_WDEV = BIT(0),
 	WIPHY_VENDOR_CMD_NEED_NETDEV = BIT(1),
 	WIPHY_VENDOR_CMD_NEED_RUNNING = BIT(2),
+	WIPHY_VENDOR_CMD_NEED_RTNL = BIT(3),
 };
 
 /**
diff --git a/include/net/genetlink.h b/include/net/genetlink.h
index 7cb3fa8310ed..e92796366492 100644
--- a/include/net/genetlink.h
+++ b/include/net/genetlink.h
@@ -92,6 +92,7 @@ struct genl_info {
 	possible_net_t		_net;
 	void *			user_ptr[2];
 	struct netlink_ext_ack *extack;
+	u8			internal_flags;
 };
 
 static inline struct net *genl_info_net(struct genl_info *info)
diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c
index 1afca2a6c2ac..2db1c07c9f5a 100644
--- a/net/netlink/genetlink.c
+++ b/net/netlink/genetlink.c
@@ -719,6 +719,7 @@ static int genl_family_rcv_msg_doit(const struct genl_family *family,
 	info.userhdr = nlmsg_data(nlh) + GENL_HDRLEN;
 	info.attrs = attrbuf;
 	info.extack = extack;
+	info.internal_flags = ops->internal_flags;
 	genl_info_net_set(&info, net);
 	memset(&info.user_ptr, 0, sizeof(info.user_ptr));
 
diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index 686a69381731..561c3cd3a9a0 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -13991,6 +13991,19 @@ static int nl80211_vendor_check_policy(const struct wiphy_vendor_command *vcmd,
 	return nla_validate_nested(attr, vcmd->maxattr, vcmd->policy, extack);
 }
 
+#define NL80211_FLAG_NEED_WIPHY		0x01
+#define NL80211_FLAG_NEED_NETDEV	0x02
+#define NL80211_FLAG_NEED_RTNL		0x04
+#define NL80211_FLAG_CHECK_NETDEV_UP	0x08
+#define NL80211_FLAG_NEED_NETDEV_UP	(NL80211_FLAG_NEED_NETDEV |\
+					 NL80211_FLAG_CHECK_NETDEV_UP)
+#define NL80211_FLAG_NEED_WDEV		0x10
+/* If a netdev is associated, it must be UP, P2P must be started */
+#define NL80211_FLAG_NEED_WDEV_UP	(NL80211_FLAG_NEED_WDEV |\
+					 NL80211_FLAG_CHECK_NETDEV_UP)
+#define NL80211_FLAG_CLEAR_SKB		0x20
+#define NL80211_FLAG_NO_WIPHY_MTX	0x40
+
 static int nl80211_vendor_cmd(struct sk_buff *skb, struct genl_info *info)
 {
 	struct cfg80211_registered_device *rdev = info->user_ptr[0];
@@ -13999,6 +14012,12 @@ static int nl80211_vendor_cmd(struct sk_buff *skb, struct genl_info *info)
 					   info->attrs);
 	int i, err;
 	u32 vid, subcmd;
+	bool internal_rtnl_flag = info->internal_flags & NL80211_FLAG_NEED_RTNL;
+
+	/* In case of an error, we need to set the RTNL flag so that we unlock the
+	 * RTNL in post_doit().
+	 */
+	info->internal_flags = NL80211_FLAG_NEED_RTNL;
 
 	if (!rdev->wiphy.vendor_commands)
 		return -EOPNOTSUPP;
@@ -14058,6 +14077,12 @@ static int nl80211_vendor_cmd(struct sk_buff *skb, struct genl_info *info)
 				return err;
 		}
 
+		if (!internal_rtnl_flag && !(vcmd->flags & WIPHY_VENDOR_CMD_NEED_RTNL)) {
+			rtnl_unlock();
+			/* clear the rtnl flag so that we don't unlock in post_doit(). */
+			info->internal_flags &= ~NL80211_FLAG_NEED_RTNL;
+		}
+
 		rdev->cur_cmd_info = info;
 		err = vcmd->doit(&rdev->wiphy, wdev, data, len);
 		rdev->cur_cmd_info = NULL;
@@ -15165,19 +15190,6 @@ static int nl80211_set_fils_aad(struct sk_buff *skb,
 	return rdev_set_fils_aad(rdev, dev, &fils_aad);
 }
 
-#define NL80211_FLAG_NEED_WIPHY		0x01
-#define NL80211_FLAG_NEED_NETDEV	0x02
-#define NL80211_FLAG_NEED_RTNL		0x04
-#define NL80211_FLAG_CHECK_NETDEV_UP	0x08
-#define NL80211_FLAG_NEED_NETDEV_UP	(NL80211_FLAG_NEED_NETDEV |\
-					 NL80211_FLAG_CHECK_NETDEV_UP)
-#define NL80211_FLAG_NEED_WDEV		0x10
-/* If a netdev is associated, it must be UP, P2P must be started */
-#define NL80211_FLAG_NEED_WDEV_UP	(NL80211_FLAG_NEED_WDEV |\
-					 NL80211_FLAG_CHECK_NETDEV_UP)
-#define NL80211_FLAG_CLEAR_SKB		0x20
-#define NL80211_FLAG_NO_WIPHY_MTX	0x40
-
 static int nl80211_pre_doit(const struct genl_ops *ops, struct sk_buff *skb,
 			    struct genl_info *info)
 {
@@ -15231,8 +15243,14 @@ static int nl80211_pre_doit(const struct genl_ops *ops, struct sk_buff *skb,
 		/* we keep the mutex locked until post_doit */
 		__release(&rdev->wiphy.mtx);
 	}
-	if (!(ops->internal_flags & NL80211_FLAG_NEED_RTNL))
-		rtnl_unlock();
+
+	/* NL80211 vendor command doit() will handle the RTNL unlocking based on the
+	 * vendor command flags.
+	 */
+	if (ops->cmd != NL80211_CMD_VENDOR) {
+		if (!(ops->internal_flags & NL80211_FLAG_NEED_RTNL))
+			rtnl_unlock();
+	}
 
 	return 0;
 }
@@ -15259,7 +15277,11 @@ static void nl80211_post_doit(const struct genl_ops *ops, struct sk_buff *skb,
 		wiphy_unlock(&rdev->wiphy);
 	}
 
-	if (ops->internal_flags & NL80211_FLAG_NEED_RTNL)
+	/* If a vendor command requested for the RTNL, then it will set the
+	 * info->internal_flags to indicate that the RTNL needs to be released.
+	 */
+	if (ops->internal_flags & NL80211_FLAG_NEED_RTNL ||
+	    info->internal_flags & NL80211_FLAG_NEED_RTNL)
 		rtnl_unlock();
 
 	/* If needed, clear the netlink message payload from the SKB
-- 
2.35.1.1021.g381101b075-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [BUG] deadlock in nl80211_vendor_cmd
  2022-03-25 23:57                       ` William McVicker
@ 2022-03-26  0:07                         ` Jakub Kicinski
  2022-03-26  0:12                           ` William McVicker
  0 siblings, 1 reply; 19+ messages in thread
From: Jakub Kicinski @ 2022-03-26  0:07 UTC (permalink / raw)
  To: William McVicker
  Cc: Johannes Berg, linux-wireless, Marek Szyprowski, Kalle Valo,
	David S. Miller, netdev, Amitkumar Karwar, Ganapathi Bhat,
	Xinming Hu, kernel-team, Paolo Abeni

On Fri, 25 Mar 2022 23:57:33 +0000 William McVicker wrote:
> Instead of open coding it, we could just pass the internal_flags via struct
> genl_info to nl80211_vendor_cmds() and then handle the rtnl_unlock() there if
> the vendor command doesn't request it. I included the patch below in case
> there's any chance you would consider this for upstream. This would at least
> add backwards compatibility to the vendor ops API so that existing drivers that
> depend on the RTNL being held don't need to be fully refactored.

Sorry to step in, Johannes may be AFK already. There's no asterisk next
to the "we don't cater to out of tree code" rule, AFAIK.  We change
locking often, making a precedent like this would be ill-advised.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [BUG] deadlock in nl80211_vendor_cmd
  2022-03-26  0:07                         ` Jakub Kicinski
@ 2022-03-26  0:12                           ` William McVicker
  0 siblings, 0 replies; 19+ messages in thread
From: William McVicker @ 2022-03-26  0:12 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Johannes Berg, linux-wireless, Marek Szyprowski, Kalle Valo,
	David S. Miller, netdev, Amitkumar Karwar, Ganapathi Bhat,
	Xinming Hu, kernel-team, Paolo Abeni

On 03/25/2022, Jakub Kicinski wrote:
> On Fri, 25 Mar 2022 23:57:33 +0000 William McVicker wrote:
> > Instead of open coding it, we could just pass the internal_flags via struct
> > genl_info to nl80211_vendor_cmds() and then handle the rtnl_unlock() there if
> > the vendor command doesn't request it. I included the patch below in case
> > there's any chance you would consider this for upstream. This would at least
> > add backwards compatibility to the vendor ops API so that existing drivers that
> > depend on the RTNL being held don't need to be fully refactored.
> 
> Sorry to step in, Johannes may be AFK already. There's no asterisk next
> to the "we don't cater to out of tree code" rule, AFAIK.  We change
> locking often, making a precedent like this would be ill-advised.

Yeah I understand. I'll talk to Broadcom about this to see why they didn't use
the existing upstream NAN interface. This sounds like it's going to be
a problem for all the Android out-of-tree drivers.

Thanks for the help!

--Will

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-03-26  0:12 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <0000000000009e9b7105da6d1779@google.com>
     [not found] ` <CABYd82Z=YXmZPTQhf0K1M4nS2wk3dPBSqx91D8SoUd59AUzpHg@mail.gmail.com>
2022-03-21 17:00   ` [BUG] deadlock in nl80211_vendor_cmd William McVicker
     [not found] ` <99eda6d1dad3ff49435b74e539488091642b10a8.camel@sipsolutions.net>
     [not found]   ` <5d5cf050-7de0-7bad-2407-276970222635@quicinc.com>
     [not found]     ` <YjpGlRvcg72zNo8s@google.com>
     [not found]       ` <dc556455-51a2-06e8-8ec5-b807c2901b7e@quicinc.com>
2022-03-24 21:58         ` William McVicker
2022-03-25 12:04           ` Johannes Berg
2022-03-25 12:06             ` Johannes Berg
2022-03-25 16:49             ` Jakub Kicinski
2022-03-25 17:01               ` Johannes Berg
2022-03-25 18:08                 ` William McVicker
2022-03-25 20:21                   ` Johannes Berg
2022-03-25 20:36                   ` William McVicker
2022-03-25 21:16                     ` Johannes Berg
2022-03-25 21:54                       ` Johannes Berg
2022-03-25 22:18                       ` Jeff Johnson
2022-03-25 23:57                       ` William McVicker
2022-03-26  0:07                         ` Jakub Kicinski
2022-03-26  0:12                           ` William McVicker
2022-03-25 20:40                 ` Jakub Kicinski
2022-03-25 21:25                   ` Johannes Berg
2022-03-25 21:48                     ` Jakub Kicinski
2022-03-25 21:50                       ` Johannes Berg

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).