From: Miquel Raynal <miquel.raynal@bootlin.com>
To: Alexander Aring <aahringo@redhat.com>
Cc: Alexander Aring <alex.aring@gmail.com>,
	Stefan Schmidt <stefan@datenfreihafen.org>,
	linux-wpan - ML <linux-wpan@vger.kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Eric Dumazet <edumazet@google.com>,
	Network Development <netdev@vger.kernel.org>,
	David Girault <david.girault@qorvo.com>,
	Romuald Despres <romuald.despres@qorvo.com>,
	Frederic Blain <frederic.blain@qorvo.com>,
	Nicolas Schodet <nico@ni.fr.eu.org>,
	Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Subject: Re: [PATCH wpan-next 19/20] ieee802154: hwsim: Do not check the rtnl
Date: Fri, 26 Aug 2022 00:41:39 +0200
Message-ID: <20220826004139.7f04e375@xps-13>
In-Reply-To: <20220819190944.0718c7e1@xps-13>

Hi Alexander,

miquel.raynal@bootlin.com wrote on Fri, 19 Aug 2022 19:09:44 +0200:

> Hi Alexander,
> 
> aahringo@redhat.com wrote on Tue, 5 Jul 2022 21:23:21 -0400:
> 
> > Hi,
> > 
> > On Fri, Jul 1, 2022 at 10:37 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:  
> > >
> > > There is no need to ensure the rtnl is locked when changing a driver's
> > > channel. This cause issues when scanning and this is the only driver
> > > relying on it. Just drop this dependency because it does not seem
> > > legitimate.
> > >    
> > 
> > switching channels relies on changing pib attributes, pib attributes
> > are protected by rtnl. If you experience issues here then it's
> > probably because you do something wrong. All drivers assuming here
> > that rtnl lock is held.  
> 
> ---8<---
> > especially this change could end in invalid free. Maybe we can solve
> > this problem in a different way, what exactly is the problem by
> > helding rtnl lock?
> --->8---  
> 
> During a scan we need to change channels. So when the background job
> kicks-in, we first acquire scan_lock, then we check the internal
> parameters of the structure protected by this lock, like the next
> channel to use and the sdata pointer. A channel change must be
> performed, preceded by an rtnl_lock(). This will again trigger a
> possible circular lockdep dependency warning because the triggering path
> acquires the rtnl (as part of the netlink layer) before the scan lock.
> 
> One possible solution would be to do the following:
> scan_work() {
> 	acquire(scan_lock);
> 	// do some config
> 	release(scan_lock);
> 	rtnl_lock();
> 	perform_channel_change();
> 	rtnl_unlock();
> 	acquire(scan_lock);
> 	// reinit the scan struct ptr and the sdata ptr
> 	// do some more things
> 	release(scan_lock);
> }
> 
> It looks highly non-elegant IMHO. Otherwise I need to stop verifying in
> the drivers that the rtnl is taken. Any third option here?

I've tried two other solutions.

A/ Enforcing the dependency rtnl -> scan_lock

This means always acquiring the rtnl before scan_lock, which in terms
of code requires taking the rtnl in the scan worker. Of course,
enclosing only the drv_change_chan() call would mean releasing
scan_lock in the middle and re-taking it afterwards, which would defeat
the protection of the scan_req structure that the lock is supposed to
enforce. So I went for acquiring the rtnl at the top of the worker,
before acquiring scan_lock.

This does not work because we need to acquire the rtnl in the worker
(worker_lock -> rtnl), while at the same time there are places like
->slave_close which need to acquire the worker lock (during
flush_workqueue()), and this can only happen under rtnl (rtnl ->
worker_lock). Lockdep then complains about a possible circular
dependency.

B/ Avoiding the rtnl in scan operations and allowing the reverse
dependency, which is scan_lock -> rtnl

I've drafted this solution because I think scan operations do not
really need the rtnl. This idea was reinforced when I found this
wireless change: a05829a7222e ("cfg80211: avoid holding the RTNL when
calling the driver").

But unfortunately I get the same issue again with the ->close()
implementation, which needs to acquire the worker lock to flush. This
creates an rtnl -> worker_lock dependency which is incompatible with a
worker_lock -> scan_lock -> rtnl chain (this is typically what should
happen when changing the channel during a scan).

So I looked at reducing the scope of scan_lock, in order to avoid
taking it for too long and avoid the scan_lock -> rtnl or rtnl ->
scan_lock dependency in the worker, but I think in the end it is a
truly bad idea.

Finally, I decided I could use a separate workqueue for the MAC-related
commands, distinct from the one used for the data. We don't care about
flushing it because we _need_ the beacons/scan workers to be stopped,
which is handled in their dedicated helpers. Doing so removes an rtnl ->
worker_lock dependency, which makes it possible to acquire the rtnl
from the worker. I've mostly implemented it; I'll clean all this up and
send a v2 tomorrow.

Thanks,
Miquèl
