netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
To: Xin Long <lucien.xin@gmail.com>
Cc: network dev <netdev@vger.kernel.org>,
	linux-sctp@vger.kernel.org, davem <davem@davemloft.net>,
	Neil Horman <nhorman@tuxdriver.com>
Subject: Re: [PATCH net] sctp: set newsk sk_socket before processing listening sk backlog
Date: Wed, 2 Oct 2019 14:48:54 -0300	[thread overview]
Message-ID: <20191002174854.GI3499@localhost.localdomain> (raw)
In-Reply-To: <20191002174127.GL3431@localhost.localdomain>

On Wed, Oct 02, 2019 at 02:41:27PM -0300, Marcelo Ricardo Leitner wrote:
> On Thu, Oct 03, 2019 at 01:26:46AM +0800, Xin Long wrote:
> > On Wed, Oct 2, 2019 at 8:55 PM Marcelo Ricardo Leitner
> > <marcelo.leitner@gmail.com> wrote:
> > >
> > > On Wed, Oct 02, 2019 at 04:23:52PM +0800, Xin Long wrote:
> > > > On Wed, Oct 2, 2019 at 9:04 AM Marcelo Ricardo Leitner
> > > > <marcelo.leitner@gmail.com> wrote:
> > > > >
> > > > > On Mon, Sep 30, 2019 at 09:10:18PM +0800, Xin Long wrote:
> > > > > > This patch is to fix a NULL-ptr deref crash in selinux_sctp_bind_connect:
> > > > > >
> > > > > >   [...] kasan: GPF could be caused by NULL-ptr deref or user memory access
> > > > > >   [...] RIP: 0010:selinux_sctp_bind_connect+0x16a/0x230
> > > > > >   [...] Call Trace:
> > > > > >   [...]  security_sctp_bind_connect+0x58/0x90
> > > > > >   [...]  sctp_process_asconf+0xa52/0xfd0 [sctp]
> > > > > >   [...]  sctp_sf_do_asconf+0x782/0x980 [sctp]
> > > > > >   [...]  sctp_do_sm+0x139/0x520 [sctp]
> > > > > >   [...]  sctp_assoc_bh_rcv+0x284/0x5c0 [sctp]
> > > > > >   [...]  sctp_backlog_rcv+0x45f/0x880 [sctp]
> > > > > >   [...]  __release_sock+0x120/0x370
> > > > > >   [...]  release_sock+0x4f/0x180
> > > > > >   [...]  sctp_accept+0x3f9/0x5a0 [sctp]
> > > > > >   [...]  inet_accept+0xe7/0x6f0
> > > > > >
> > > > > > It was caused by that the 'newsk' sk_socket was not set before going to
> > > > > > security sctp hook when doing accept() on a tcp-type socket:
> > > > > >
> > > > > >   inet_accept()->
> > > > > >     sctp_accept():
> > > > > >       lock_sock():
> > > > > >           lock listening 'sk'
> > > > > >                                           do_softirq():
> > > > > >                                             sctp_rcv():  <-- [1]
> > > > > >                                                 asconf chunk arrived and
> > > > > >                                                 enqueued in 'sk' backlog
> > > > > >       sctp_sock_migrate():
> > > > > >           set asoc's sk to 'newsk'
> > > > > >       release_sock():
> > > > > >           sctp_backlog_rcv():
> > > > > >             lock 'newsk'
> > > > > >             sctp_process_asconf()  <-- [2]
> > > > > >             unlock 'newsk'
> > > > > >     sock_graft():
> > > > > >         set sk_socket  <-- [3]
> > > > > >
> > > > > > As it shows, at [1] the asconf chunk would be put into the listening 'sk'
> > > > > > backlog, as accept() was holding its sock lock. Then at [2] asconf would
> > > > > > get processed with 'newsk' as asoc's sk had been set to 'newsk'. However,
> > > > > > 'newsk' sk_socket is not set until [3], while selinux_sctp_bind_connect()
> > > > > > would deref it, then kernel crashed.
> > > > >
> > > > > Note that sctp will migrate such incoming chunks from sk to newsk in
> > > > > sctp_rcv() if they arrived after the mass-migration performed at
> > > > > sctp_sock_migrate().
> > > > >
> > > > > That said, did you explore changing inet_accept() so that
> > > > > sk1->sk_prot->accept() would return sk2 still/already locked?
> > > > > That would be enough to block [2] from happening as then it would be
> > > > > queued on newsk backlog this time and avoid nearly duplicating
> > > > > inet_accept(). (too bad for this chunk, hit 2 backlogs..)
> > > > We don't have to bother inet_accept() for it. I had this one below,
> > > > and I was just thinking the locks order doesn't look nice. Do you
> > > > think this is more acceptable?
> > > >
> > > > @@ -4963,15 +4963,19 @@ static struct sock *sctp_accept(struct sock
> > > > *sk, int flags, int *err, bool kern)
> > > >          * asoc to the newsk.
> > > >          */
> > > >         error = sctp_sock_migrate(sk, newsk, asoc, SCTP_SOCKET_TCP);
> > > > -       if (error) {
> > > > -               sk_common_release(newsk);
> > > > -               newsk = NULL;
> > > > +       if (!error) {
> > > > +               lock_sock_nested(newsk, SINGLE_DEPTH_NESTING);
> > > > +               release_sock(sk);
> > >
> > > Interesting. It fixes the backlog processing, ok. Question:
> > >
> > > > +               release_sock(newsk);
> > >
> > > As newsk is hashed already and unlocked here to be locked again later
> > > on inet_accept(), it could receive a packet in between (thus before
> > > sock_graft() could have a chance to run), no?
> > 
> > You're right, it explains another call trace happened once in our testing.
> > 
> > The way to changing inet_accept() will also have to change all protocols'
> > .accept(). Given that this issue is only triggered in a very small moment,
> > can we just silently discard this asconf chunk if sk->sk_socket is NULL?
> > and let peer's T4-timer retransmit it.
> 
> No no. If the change doesn't hurt other protocols, we should try that
> first.  Otherwise this adds overhead to the network and we could get a
> bug report soon on "valid asconf being ignored".
> 
> If that doesn't pan out, maybe your initial suggestion is the way out.
> More custom code but keeps the expected behavior.
> 
> > 
> > @@ -3709,6 +3709,9 @@ enum sctp_disposition sctp_sf_do_asconf(struct net *net,
> >         struct sctp_addiphdr *hdr;
> >         __u32 serial;
> > 
> > +       if (asoc->base.sk->sk_socket)
> > +               return sctp_sf_pdiscard(net, ep, asoc, type, arg, commands);

What if we add this to sctp_backlog_rcv() instead?  As in, do not
process the backlog if so.
And force doing backlog on sctp_rcv() also.
As we are sure that there will be a subsequent lock/unlock and that it
will handle it, this could work.

  reply	other threads:[~2019-10-02 17:49 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-30 13:10 [PATCH net] sctp: set newsk sk_socket before processing listening sk backlog Xin Long
2019-10-01 22:30 ` David Miller
2019-10-02  1:03 ` Marcelo Ricardo Leitner
2019-10-02  8:23   ` Xin Long
2019-10-02 12:24     ` Neil Horman
2019-10-02 12:55     ` Marcelo Ricardo Leitner
2019-10-02 17:26       ` Xin Long
2019-10-02 17:28         ` Xin Long
2019-10-02 17:41         ` Marcelo Ricardo Leitner
2019-10-02 17:48           ` Marcelo Ricardo Leitner [this message]
2019-10-02 18:23           ` Xin Long

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191002174854.GI3499@localhost.localdomain \
    --to=marcelo.leitner@gmail.com \
    --cc=davem@davemloft.net \
    --cc=linux-sctp@vger.kernel.org \
    --cc=lucien.xin@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=nhorman@tuxdriver.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).