linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] net: qrtr: ns: Fix the incorrect usage of rcu_read_lock()
@ 2020-10-02 14:15 Manivannan Sadhasivam
  2020-10-02 15:28 ` Doug Anderson
  0 siblings, 1 reply; 3+ messages in thread
From: Manivannan Sadhasivam @ 2020-10-02 14:15 UTC (permalink / raw)
  To: davem, kuba
  Cc: bjorn.andersson, netdev, linux-kernel, dianders, elder,
	Manivannan Sadhasivam

The rcu_read_lock() is not supposed to lock the kernel_sendmsg() API
since it has the lock_sock() in qrtr_sendmsg() which will sleep. Hence,
fix it by excluding the locking for kernel_sendmsg().

Fixes: a7809ff90ce6 ("net: qrtr: ns: Protect radix_tree_deref_slot() using rcu read locks")
Reported-by: Doug Anderson <dianders@chromium.org>
Tested-by: Alex Elder <elder@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
---
 net/qrtr/ns.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/net/qrtr/ns.c b/net/qrtr/ns.c
index 934999b56d60..0515433de922 100644
--- a/net/qrtr/ns.c
+++ b/net/qrtr/ns.c
@@ -203,15 +203,17 @@ static int announce_servers(struct sockaddr_qrtr *sq)
 	/* Announce the list of servers registered in this node */
 	radix_tree_for_each_slot(slot, &node->servers, &iter, 0) {
 		srv = radix_tree_deref_slot(slot);
+		rcu_read_unlock();
 
 		ret = service_announce_new(sq, srv);
 		if (ret < 0) {
 			pr_err("failed to announce new service\n");
-			goto err_out;
+			return ret;
 		}
+
+		rcu_read_lock();
 	}
 
-err_out:
 	rcu_read_unlock();
 
 	return ret;
@@ -352,7 +354,9 @@ static int ctrl_cmd_bye(struct sockaddr_qrtr *from)
 	/* Advertise removal of this client to all servers of remote node */
 	radix_tree_for_each_slot(slot, &node->servers, &iter, 0) {
 		srv = radix_tree_deref_slot(slot);
+		rcu_read_unlock();
 		server_del(node, srv->port);
+		rcu_read_lock();
 	}
 	rcu_read_unlock();
 
@@ -368,6 +372,7 @@ static int ctrl_cmd_bye(struct sockaddr_qrtr *from)
 	rcu_read_lock();
 	radix_tree_for_each_slot(slot, &local_node->servers, &iter, 0) {
 		srv = radix_tree_deref_slot(slot);
+		rcu_read_unlock();
 
 		sq.sq_family = AF_QIPCRTR;
 		sq.sq_node = srv->node;
@@ -379,11 +384,11 @@ static int ctrl_cmd_bye(struct sockaddr_qrtr *from)
 		ret = kernel_sendmsg(qrtr_ns.sock, &msg, &iv, 1, sizeof(pkt));
 		if (ret < 0) {
 			pr_err("failed to send bye cmd\n");
-			goto err_out;
+			return ret;
 		}
+		rcu_read_lock();
 	}
 
-err_out:
 	rcu_read_unlock();
 
 	return ret;
@@ -447,6 +452,7 @@ static int ctrl_cmd_del_client(struct sockaddr_qrtr *from,
 	rcu_read_lock();
 	radix_tree_for_each_slot(slot, &local_node->servers, &iter, 0) {
 		srv = radix_tree_deref_slot(slot);
+		rcu_read_unlock();
 
 		sq.sq_family = AF_QIPCRTR;
 		sq.sq_node = srv->node;
@@ -458,11 +464,11 @@ static int ctrl_cmd_del_client(struct sockaddr_qrtr *from,
 		ret = kernel_sendmsg(qrtr_ns.sock, &msg, &iv, 1, sizeof(pkt));
 		if (ret < 0) {
 			pr_err("failed to send del client cmd\n");
-			goto err_out;
+			return ret;
 		}
+		rcu_read_lock();
 	}
 
-err_out:
 	rcu_read_unlock();
 
 	return ret;
@@ -580,7 +586,9 @@ static int ctrl_cmd_new_lookup(struct sockaddr_qrtr *from,
 			if (!server_match(srv, &filter))
 				continue;
 
+			rcu_read_unlock();
 			lookup_notify(from, srv, true);
+			rcu_read_lock();
 		}
 	}
 	rcu_read_unlock();
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] net: qrtr: ns: Fix the incorrect usage of rcu_read_lock()
  2020-10-02 14:15 [PATCH] net: qrtr: ns: Fix the incorrect usage of rcu_read_lock() Manivannan Sadhasivam
@ 2020-10-02 15:28 ` Doug Anderson
  2020-10-02 16:04   ` Manivannan Sadhasivam
  0 siblings, 1 reply; 3+ messages in thread
From: Doug Anderson @ 2020-10-02 15:28 UTC (permalink / raw)
  To: Manivannan Sadhasivam
  Cc: David Miller, Jakub Kicinski, Bjorn Andersson, netdev, LKML, Alex Elder

Hi,

On Fri, Oct 2, 2020 at 7:15 AM Manivannan Sadhasivam
<manivannan.sadhasivam@linaro.org> wrote:
>
> The rcu_read_lock() is not supposed to lock the kernel_sendmsg() API
> since it has the lock_sock() in qrtr_sendmsg() which will sleep. Hence,
> fix it by excluding the locking for kernel_sendmsg().
>
> Fixes: a7809ff90ce6 ("net: qrtr: ns: Protect radix_tree_deref_slot() using rcu read locks")
> Reported-by: Doug Anderson <dianders@chromium.org>
> Tested-by: Alex Elder <elder@linaro.org>
> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
> ---
>  net/qrtr/ns.c | 20 ++++++++++++++------
>  1 file changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/net/qrtr/ns.c b/net/qrtr/ns.c
> index 934999b56d60..0515433de922 100644
> --- a/net/qrtr/ns.c
> +++ b/net/qrtr/ns.c
> @@ -203,15 +203,17 @@ static int announce_servers(struct sockaddr_qrtr *sq)
>         /* Announce the list of servers registered in this node */
>         radix_tree_for_each_slot(slot, &node->servers, &iter, 0) {
>                 srv = radix_tree_deref_slot(slot);
> +               rcu_read_unlock();

My RCU-fu is mediocre at best and my radix-tree knowledge is
non-existent.  However:

=> Reading through radix_tree_deref_slot() it says that if you are
only holding the read lock that you need to be calling
radix_tree_deref_retry().  Why don't I see that here?

=> Without any real knowledge, it seems super sketchy to drop the lock
while iterating over the tree.  Somehow that feels unsafe.  Hrm, there
seems to be a function radix_tree_iter_resume() that might be exactly
what you want, but I'm not totally sure.  The only user I can see
in-tree (other than radix tree regression testing) is btrfs-tests.c
but it's using it together with radix_tree_deref_slot_protected().

In any case, my totally untested and totally knowedge-free proposal
would look something like this:

  rcu_read_lock();
  /* Announce the list of servers registered in this node */
  radix_tree_for_each_slot(slot, &node->servers, &iter, 0) {
    srv = radix_tree_deref_slot(slot);
    if (!srv)
      continue;
    if (radix_tree_deref_retry(srv)) {
      slot = radix_tree_iter_retry(&iter);
      continue;
    }
    slot = radix_tree_iter_resume(slot, &iter);
    rcu_read_unlock();

    ret = service_announce_new(sq, srv);
    if (ret < 0) {
      pr_err("failed to announce new service\n");
      return ret;
    }

    rcu_read_lock();
  }

  rcu_read_unlock();

What a beast!  Given that this doesn't seem to be what anyone else in
the kernel is doing exactly, it makes me suspect that there's a more
fundamental design issue here, though...

-Doug

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] net: qrtr: ns: Fix the incorrect usage of rcu_read_lock()
  2020-10-02 15:28 ` Doug Anderson
@ 2020-10-02 16:04   ` Manivannan Sadhasivam
  0 siblings, 0 replies; 3+ messages in thread
From: Manivannan Sadhasivam @ 2020-10-02 16:04 UTC (permalink / raw)
  To: Doug Anderson
  Cc: David Miller, Jakub Kicinski, Bjorn Andersson, netdev, LKML, Alex Elder

Hi Doug,

On Fri, Oct 02, 2020 at 08:28:51AM -0700, Doug Anderson wrote:
> Hi,
> 
> On Fri, Oct 2, 2020 at 7:15 AM Manivannan Sadhasivam
> <manivannan.sadhasivam@linaro.org> wrote:
> >
> > The rcu_read_lock() is not supposed to lock the kernel_sendmsg() API
> > since it has the lock_sock() in qrtr_sendmsg() which will sleep. Hence,
> > fix it by excluding the locking for kernel_sendmsg().
> >
> > Fixes: a7809ff90ce6 ("net: qrtr: ns: Protect radix_tree_deref_slot() using rcu read locks")
> > Reported-by: Doug Anderson <dianders@chromium.org>
> > Tested-by: Alex Elder <elder@linaro.org>
> > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
> > ---
> >  net/qrtr/ns.c | 20 ++++++++++++++------
> >  1 file changed, 14 insertions(+), 6 deletions(-)
> >
> > diff --git a/net/qrtr/ns.c b/net/qrtr/ns.c
> > index 934999b56d60..0515433de922 100644
> > --- a/net/qrtr/ns.c
> > +++ b/net/qrtr/ns.c
> > @@ -203,15 +203,17 @@ static int announce_servers(struct sockaddr_qrtr *sq)
> >         /* Announce the list of servers registered in this node */
> >         radix_tree_for_each_slot(slot, &node->servers, &iter, 0) {
> >                 srv = radix_tree_deref_slot(slot);
> > +               rcu_read_unlock();
> 
> My RCU-fu is mediocre at best and my radix-tree knowledge is
> non-existent.  However:
> 
> => Reading through radix_tree_deref_slot() it says that if you are
> only holding the read lock that you need to be calling
> radix_tree_deref_retry().  Why don't I see that here?
> 

Well, I drew inspiration from peer drivers and didn't look into the API
documentation properly, my bad :(

> => Without any real knowledge, it seems super sketchy to drop the lock
> while iterating over the tree.  Somehow that feels unsafe.  Hrm, there
> seems to be a function radix_tree_iter_resume() that might be exactly
> what you want, but I'm not totally sure.  The only user I can see
> in-tree (other than radix tree regression testing) is btrfs-tests.c
> but it's using it together with radix_tree_deref_slot_protected().
> 
> In any case, my totally untested and totally knowedge-free proposal
> would look something like this:
> 
>   rcu_read_lock();
>   /* Announce the list of servers registered in this node */
>   radix_tree_for_each_slot(slot, &node->servers, &iter, 0) {
>     srv = radix_tree_deref_slot(slot);
>     if (!srv)
>       continue;
>     if (radix_tree_deref_retry(srv)) {
>       slot = radix_tree_iter_retry(&iter);
>       continue;
>     }
>     slot = radix_tree_iter_resume(slot, &iter);
>     rcu_read_unlock();
> 
>     ret = service_announce_new(sq, srv);
>     if (ret < 0) {
>       pr_err("failed to announce new service\n");
>       return ret;
>     }
> 
>     rcu_read_lock();
>   }
> 
>   rcu_read_unlock();
> 
> What a beast!  Given that this doesn't seem to be what anyone else in
> the kernel is doing exactly, it makes me suspect that there's a more
> fundamental design issue here, though...
> 

That's how it is supposed to be. So I'm going to roll out next revision with
your suggestion for the rest of the deref_slot() calls also.

Thanks for your time looking into this.

Regards,
Mani

> -Doug

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-10-02 16:04 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-02 14:15 [PATCH] net: qrtr: ns: Fix the incorrect usage of rcu_read_lock() Manivannan Sadhasivam
2020-10-02 15:28 ` Doug Anderson
2020-10-02 16:04   ` Manivannan Sadhasivam

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).