All of lore.kernel.org
 help / color / mirror / Atom feed
From: Leon Romanovsky <leon@kernel.org>
To: Nadav Markus <nmarkus@paloaltonetworks.com>
Cc: Or Cohen <orcohen@paloaltonetworks.com>,
	davem@davemloft.net, netdev@vger.kernel.org, kuba@kernel.org,
	Xiaoming Ni <nixiaoming@huawei.com>,
	matthieu.baerts@tessares.net, mkl@pengutronix.de
Subject: Re: [PATCH] net/nfc: fix use-after-free llcp_sock_bind/connect
Date: Wed, 5 May 2021 17:17:16 +0300	[thread overview]
Message-ID: <YJKo7IWyBk3PK2oS@unreal> (raw)
In-Reply-To: <CABV_C9PscjqNPTbK0JuNGsgCAX-xYg9=GG1KNyOh3hQU1TuzWQ@mail.gmail.com>

On Wed, May 05, 2021 at 04:36:05PM +0300, Nadav Markus wrote:
> On Wed, May 5, 2021 at 2:40 PM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Wed, May 05, 2021 at 12:35:48PM +0300, Nadav Markus wrote:
> > > On Wed, May 5, 2021 at 7:46 AM Leon Romanovsky <leon@kernel.org> wrote:
> > >
> > > > On Tue, May 04, 2021 at 07:01:01PM +0300, Or Cohen wrote:
> > > > > Hi, can you please elaborate?
> > > > >
> > > > > We don't understand why using kref_get_unless_zero will solve the
> > > > problem.
> > > >
> > > > Please don't reply in top-posting format.
> > > > ------
> > > >
> > > > The rationale behind _put()/_get() wrappers over kref is to allow
> > > > delayed release after all consumers are gone.
> > > >
> > > > In order to make it happen, the developer should ensure that consumers
> > > > don't have an access to the kref-ed struct. This is done with
> > > > kref_get_unless_zero().
> > > >
> > > > In your case, you simply increment some counter without checking if
> > > > nfc_llcp_local_get() actually succeeded.
> > > >
> > >
> > > Hi Leon - as far as we understand, the underlying issue is not
> incrementing
> > > the kref counter without checking if the function nfc_llcp_local_get
> > > succeeded or not. The function itself increments the reference count.
> > >
> > > The issue is that the nfc_llcp_local_put might be called twice on the
> > > llcp_sock->local field, however only one reference (the one that was
> gotten
> > > via nfc_llcp_local_get) is incremented. llcp_local_put will be called in
> > > two locations. The first one is just inside the bind function, if
> > > nfc_llcp_get_local_ssap fails. The second one is called
> unconditionally, at
> > > the socket destruction, at the function nfc_llcp_sock_free.
> > >
> > > Hence, our proposed solution is to prevent the second nfc_llcp_local_put
> > > from attempting to decrement the kref count, by setting local to NULL.
> This
> > > makes sense, as we immediately do so after decrementing the single ref
> > > count we took when calling nfc_llcp_local_get. Since we are under the
> sock
> > > lock, this also should be race safe, as no one should access the
> > > llcp_sock->local field without this lock's protection.
> > >
> > > >
> > > > For example, what protection do you have from races between
> > > > llcp_sock_bind(),
> > > > nfc_llcp_sock_free() and llcp_sock_connect()?
> > > >
> > >
> > > As we replied, the llcp_sock->local field is protected under the lock
> sock,
> > > as far as we understand.
> > >
> > > >
> > > > So in case you have some lock outside, it is unclear how
> use-after-free
> > > > is possible, because nfc_llcp_find_local() should return NULL.
> > > > In case, no lock exists, except reducing race window, you didn't fix
> > > > anything
> > > > and didn't sanitize lcp_sock too.
> > > >
> > >
> > > We don't quite get what race are we talking about here - our trigger
> > > program doesn't even utilize threads. All it has to do is to
> > > cause nfc_llcp_local_get to fail - this can be seen clearly in our
> original
> > > trigger program. To clarify, the two sockets that are created there
> point
> > > to the same nfc_llcp_local struct (via their local field). The
> destruction
> > > of the first socket causes the reference count of the pointed object to
> > > drop to zero (since the code increments the ref count of the object
> from 1
> > > to 2, but dercements it twice). The second socket later attempts to
> > > decrement the ref count of the same (already freed) nfc_llcp_local
> object,
> > > causing a kernel crash.
> >
> >
> > So at the end, we are talking about situation where _get()/_put() are
> > protected by the lock and local can't disappear. Can you please help me
> to find
> > this socket lock? Did I miss it in bind path?
> 
> The lock we are talking about is the lock_sock - lock_sock(sk). It appears
> near the start of the function.
> 
> >
> >
> > net/socket.c:
> >   int __sys_bind(int fd, struct sockaddr __user *umyaddr, int addrlen)
> >     sock->ops->bind(..)
> >      llcp_sock_bind(..)
> >
> > And if we put lock issue aside, all your change can be squeezed to the
> following:
> >
> > diff --git a/net/nfc/llcp_sock.c b/net/nfc/llcp_sock.c
> > index a3b46f888803..cc9ee634269d 100644
> > --- a/net/nfc/llcp_sock.c
> > +++ b/net/nfc/llcp_sock.c
> > @@ -99,7 +99,6 @@ static int llcp_sock_bind(struct socket *sock, struct
> sockaddr *addr, int alen)
> >         }
> >
> >         llcp_sock->dev = dev;
> > -       llcp_sock->local = nfc_llcp_local_get(local);
> >         llcp_sock->nfc_protocol = llcp_addr.nfc_protocol;
> >         llcp_sock->service_name_len = min_t(unsigned int,
> >                                             llcp_addr.service_name_len,
> > @@ -108,13 +107,11 @@ static int llcp_sock_bind(struct socket *sock,
> struct sockaddr *addr, int alen)
> >                                           llcp_sock->service_name_len,
> >                                           GFP_KERNEL);
> >         if (!llcp_sock->service_name) {
> > -               nfc_llcp_local_put(llcp_sock->local);
> >                 ret = -ENOMEM;
> >                 goto put_dev;
> >         }
> >         llcp_sock->ssap = nfc_llcp_get_sdp_ssap(local, llcp_sock);
> >         if (llcp_sock->ssap == LLCP_SAP_MAX) {
> > -               nfc_llcp_local_put(llcp_sock->local);
> >                 kfree(llcp_sock->service_name);
> >                 llcp_sock->service_name = NULL;
> >                 ret = -EADDRINUSE;
> > @@ -122,6 +119,7 @@ static int llcp_sock_bind(struct socket *sock, struct
> sockaddr *addr, int alen)
> >         }
> >
> >         llcp_sock->reserved_ssap = llcp_sock->ssap;
> > +       llcp_sock->local = nfc_llcp_local_get(local);
> >
> >         nfc_llcp_sock_link(&local->sockets, sk);
> >
> >
> > Thanks
> 
> While your suggested fix will work for the bind path (it implicitly makes
> sure that the 'local' field is always NULL, up until the point that
> everything is initialized properly), we have a problem in the 'connect'
> path.

It is a mess.

> 
> If we try to take the same approach inside the function llcp_sock_connect,
> there is a call to nfc_llcp_send_connect, which requires the local field to
> be set inside the socket. We thought about changing its interface to just
> accept the 'local' as an argument, but this function is exported in the .h
> file, and we are afraid of breaking external interfaces. Therefore, we
> think that we should take a consistent approach of explicitly setting the
> local field to NULL in all paths. Note that this explicit assignment should
> achieve the same result as your approach, of implicitly letting it stay
> NULL up until we can safely assign to it.
> 
> Please let us know WDYT.

It is in-kernel API with only one user, I would personally use that
opportunity and cleaned nfc_llcp_send_connect() from ridiculous checks.

This is always false:
  401         local = sock->local;
  402         if (local == NULL)

Always true:
  405         if (sock->service_name != NULL)

However the more I look on that code the more I come to the conclusion that it
is not worth to change it.

Thanks

  parent reply	other threads:[~2021-05-05 14:17 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-04  7:15 [PATCH] net/nfc: fix use-after-free llcp_sock_bind/connect Or Cohen
2021-05-04  8:12 ` Leon Romanovsky
2021-05-04 16:01   ` Or Cohen
2021-05-05  4:46     ` Leon Romanovsky
     [not found]       ` <CABV_C9OJ6v1deEknc+V3cJaT+CPjmzg6Wb06_Rsey3AXqOBNYg@mail.gmail.com>
2021-05-05 11:30         ` Nadav Markus
2021-05-05 11:39         ` Leon Romanovsky
     [not found]           ` <CABV_C9PscjqNPTbK0JuNGsgCAX-xYg9=GG1KNyOh3hQU1TuzWQ@mail.gmail.com>
2021-05-05 14:17             ` Leon Romanovsky [this message]
2021-05-05 14:31               ` Or Cohen
2021-05-04  7:16 Or Cohen
2021-05-04 19:10 ` patchwork-bot+netdevbpf
2021-05-05  4:50   ` Leon Romanovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YJKo7IWyBk3PK2oS@unreal \
    --to=leon@kernel.org \
    --cc=davem@davemloft.net \
    --cc=kuba@kernel.org \
    --cc=matthieu.baerts@tessares.net \
    --cc=mkl@pengutronix.de \
    --cc=netdev@vger.kernel.org \
    --cc=nixiaoming@huawei.com \
    --cc=nmarkus@paloaltonetworks.com \
    --cc=orcohen@paloaltonetworks.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.