From: 周多明 <duoming@zju.edu.cn>
To: "Dan Carpenter" <dan.carpenter@oracle.com>
Cc: linux-hams@vger.kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, kuba@kernel.org,
	davem@davemloft.net, ralf@linux-mips.org, jreuter@yaina.de,
	thomas@osterried.de
Subject: Re: Re: [PATCH net V4 1/2] ax25: Fix refcount leaks caused by ax25_cb_del()
Date: Tue, 15 Mar 2022 22:11:10 +0800 (GMT+08:00)
Message-ID: <15e4111b.5339.17f8deb1f24.Coremail.duoming@zju.edu.cn>
In-Reply-To: <20220315102657.GX3315@kadam>

Hello,

On Tue, 15 Mar 2022 13:26:57 +0300, Dan Carpenter wrote:
> I'm happy that this is simpler.  I'm not super happy about the
> if (sk->sk_wq) check.  That seems like a fragile side-effect condition
> instead of something deliberate.  But I don't know networking so maybe
> this is something which we can rely on.

The variable sk->sk_wq points to the wait queue of the sock. It is initialized to the
address of sock->wq through the following path:
sock_create()->__sock_create()->ax25_create()->sock_init_data()->RCU_INIT_POINTER(sk->sk_wq, &sock->wq).
Because __sock_create() allocates the socket with sock_alloc(), neither sock nor the address of
sock->wq is NULL.
What's more, sk->sk_wq is set to NULL only in sock_orphan().

Another solution:
We could also check sk->sk_socket. We set sk->sk_socket to sock through the following path:
sock_create()->__sock_create()->ax25_create()->sock_init_data()->sk_set_socket(sk, sock).
Because __sock_create() allocates the socket with sock_alloc(), neither sock nor sk->sk_socket
is NULL.
What's more, sk->sk_socket is set to NULL only in sock_orphan().

I will change the if (sk->sk_wq) check to an if (sk->sk_socket) check, because I think it is
easier to understand.
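
To make the lifetime of the pointer concrete, here is a minimal user-space
sketch. It is only a toy model, not kernel code: the toy_* types and functions
are hypothetical stand-ins for struct socket, struct sock, sock_init_data() and
sock_orphan(), and it assumes, as described above, that sock is never NULL on
the ax25_create() path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for struct socket and struct sock (not the kernel types). */
struct toy_socket { int unused; };
struct toy_sock   { struct toy_socket *sk_socket; };

/* Models sock_init_data() -> sk_set_socket(sk, sock). */
static void toy_sock_init_data(struct toy_socket *sock, struct toy_sock *sk)
{
	/* Assumption: sock_alloc() succeeded, so sock is not NULL here. */
	assert(sock != NULL);
	sk->sk_socket = sock;
}

/* Models only the part of sock_orphan() that clears the back pointer. */
static void toy_sock_orphan(struct toy_sock *sk)
{
	sk->sk_socket = NULL;
}

/* The proposed check: true while the sock is still attached to a socket. */
static bool toy_sock_is_attached(const struct toy_sock *sk)
{
	return sk->sk_socket != NULL;
}
```

In this model toy_sock_is_attached() is true from toy_sock_init_data() until
toy_sock_orphan() runs, which is the property the if (sk->sk_socket) check
would rely on.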

> When you sent the earlier patch then I asked if the devices in
> ax25_kill_by_device() were always bound and if we could just use a local
> variable instead of something tied to the ax25_dev struct.  I still
> wonder about that.  In other words, could we just do this?
> 
> diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
> index 6bd097180772..4af9d9a939c6 100644
> --- a/net/ax25/af_ax25.c
> +++ b/net/ax25/af_ax25.c
> @@ -78,6 +78,7 @@ static void ax25_kill_by_device(struct net_device *dev)
>  	ax25_dev *ax25_dev;
>  	ax25_cb *s;
>  	struct sock *sk;
> +	bool found = false;
>  
>  	if ((ax25_dev = ax25_dev_ax25dev(dev)) == NULL)
>  		return;
> @@ -86,6 +87,7 @@ static void ax25_kill_by_device(struct net_device *dev)
>  again:
>  	ax25_for_each(s, &ax25_list) {
>  		if (s->ax25_dev == ax25_dev) {
> +			found = true;
>  			sk = s->sk;
>  			if (!sk) {
>  				spin_unlock_bh(&ax25_list_lock);
> @@ -115,6 +117,11 @@ static void ax25_kill_by_device(struct net_device *dev)
>  		}
>  	}
>  	spin_unlock_bh(&ax25_list_lock);
> +
> +	if (!found) {
> +		dev_put_track(ax25_dev->dev, &ax25_dev->dev_tracker);
> +		ax25_dev_put(ax25_dev);
> +	}
>  }

If we only use ax25_dev_device_up() to bring the device up, without calling ax25_bind(),
the "found" flag could be false when we enter ax25_kill_by_device(), and a refcount
underflow will happen. So we would need two additional variables.
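
A rough user-space sketch of that underflow, under simplifying assumptions:
the toy_* names are hypothetical, the device is assumed to own exactly one
reference that is dropped when it goes down, and toy_kill_by_device() models
only the "if (!found)" tail of the diff above, not the whole function.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy reference counter that records an unmatched put instead of wrapping. */
struct toy_ref {
	int count;
	bool underflowed;
};

static void toy_put(struct toy_ref *r)
{
	assert(r->count >= 0);
	if (r->count == 0) {
		r->underflowed = true;	/* a put without a matching hold */
		return;
	}
	r->count--;
}

/* Models ax25_dev_device_up(): assumed to start with one reference. */
static void toy_device_up(struct toy_ref *r)
{
	r->count = 1;
	r->underflowed = false;
}

/* Models only the tail of the quoted diff: one put when nothing was found. */
static void toy_kill_by_device(struct toy_ref *r, bool found)
{
	if (!found)
		toy_put(r);
}

/* Models ax25_dev_device_down(): drops the reference taken at device up. */
static void toy_device_down(struct toy_ref *r)
{
	toy_put(r);
}
```

With no ax25_bind() in between, toy_device_up(), toy_kill_by_device(r, false)
and toy_device_down() perform two puts against a single hold, so the second
put is flagged as an underflow.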

If we use additional variables to fix the bug, I think there is a problem.
In the real world, the device can be detached only once. If the following
race condition happens, we can never deallocate ax25_dev and net_device,
because we cannot call ax25_kill_by_device() again.

       (Thread 1)                 |      (Thread 2)
    ax25_bind()                   |
                                  |  ax25_kill_by_device() //decrease refcounts
       (Thread 3)                 |
    ax25_bind()                   |
     ...                          |    ...
     ax25_dev_hold() //(1)        |  
     dev_hold_track() //(2)       |  
                                  |  ax25_dev_device_down()

With the patch "[PATCH net V4 1/2] ax25: Fix refcount leaks caused by ax25_cb_del()",
even after the device has been detached, we can still decrease the refcounts in
ax25_release(), which ensures that ax25_dev and net_device can be deallocated.
So I think "[PATCH net V4 1/2]" is better.
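
The interleaving above can be replayed sequentially in a small user-space
sketch. This is a toy model under stated assumptions, not the kernel code: the
toy_* names are hypothetical, a single counter stands in for both the ax25_dev
and net_device refcounts, and the race is flattened into a fixed order of calls.

```c
#include <assert.h>
#include <stddef.h>

/* One counter stands in for the ax25_dev and net_device refcounts. */
struct toy_dev  { int refcnt; };
struct toy_sock { struct toy_dev *dev; };	/* non-NULL while a ref is held */

/* Models the holds in ax25_bind(): marks (1) and (2) in the diagram. */
static void toy_bind(struct toy_sock *sk, struct toy_dev *dev)
{
	assert(sk->dev == NULL);
	sk->dev = dev;
	dev->refcnt++;
}

/* Drops the reference a socket took in toy_bind(), at most once. */
static void toy_put_sock_ref(struct toy_sock *sk)
{
	if (sk->dev != NULL) {
		sk->dev->refcnt--;
		sk->dev = NULL;
	}
}

/* Models ax25_kill_by_device(): runs once, sees only sockets bound so far. */
static void toy_kill_by_device(struct toy_dev *dev,
			       struct toy_sock **socks, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (socks[i]->dev == dev)
			toy_put_sock_ref(socks[i]);
}

/* Models ax25_dev_device_down(): drops the device's own reference. */
static void toy_device_down(struct toy_dev *dev)
{
	dev->refcnt--;
}

/* Models the put in ax25_release() described for [PATCH net V4 1/2]. */
static void toy_release(struct toy_sock *sk)
{
	toy_put_sock_ref(sk);
}
```

Starting from a count of 1, the sequence toy_bind(sk1), toy_kill_by_device(),
toy_bind(sk3), toy_device_down() leaves the count at 1, because the late bind
is never seen by the one-shot kill. Only a later toy_release(sk3) brings it to
0, which is the behaviour the release-based patch is meant to provide.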

Best wishes,
Duoming Zhou
