All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tom Herbert <tom@herbertland.com>
To: Craig Gallek <kraigatgoog@gmail.com>
Cc: Josef Bacik <jbacik@fb.com>,
	Hannes Frederic Sowa <hannes@stressinduktion.org>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>
Subject: Re: Soft lockup in inet_put_port on 4.6
Date: Tue, 13 Dec 2016 15:32:09 -0800	[thread overview]
Message-ID: <CALx6S369T_hvoTgyHbmeSiR2p3d68h+0tMKqMmcGYLrKiN3JMA@mail.gmail.com> (raw)
In-Reply-To: <CAEfhGiyFW1YHmmUYpFv+q4J4r=wSv3ZBZowG69BqkT3p-G4gNw@mail.gmail.com>

On Tue, Dec 13, 2016 at 3:03 PM, Craig Gallek <kraigatgoog@gmail.com> wrote:
> On Tue, Dec 13, 2016 at 3:51 PM, Tom Herbert <tom@herbertland.com> wrote:
>> I think there may be some suspicious code in inet_csk_get_port. At
>> tb_found there is:
>>
>>                 if (((tb->fastreuse > 0 && reuse) ||
>>                      (tb->fastreuseport > 0 &&
>>                       !rcu_access_pointer(sk->sk_reuseport_cb) &&
>>                       sk->sk_reuseport && uid_eq(tb->fastuid, uid))) &&
>>                     smallest_size == -1)
>>                         goto success;
>>                 if (inet_csk(sk)->icsk_af_ops->bind_conflict(sk, tb, true)) {
>>                         if ((reuse ||
>>                              (tb->fastreuseport > 0 &&
>>                               sk->sk_reuseport &&
>>                               !rcu_access_pointer(sk->sk_reuseport_cb) &&
>>                               uid_eq(tb->fastuid, uid))) &&
>>                             smallest_size != -1 && --attempts >= 0) {
>>                                 spin_unlock_bh(&head->lock);
>>                                 goto again;
>>                         }
>>                         goto fail_unlock;
>>                 }
>>
>> AFAICT there is redundancy in these two conditionals.  The same clause
>> is being checked in both: (tb->fastreuseport > 0 &&
>> !rcu_access_pointer(sk->sk_reuseport_cb) && sk->sk_reuseport &&
>> uid_eq(tb->fastuid, uid))) && smallest_size == -1. If this is true the
>> first conditional should be hit, goto done,  and the second will never
>> evaluate that part to true-- unless the sk is changed (do we need
>> READ_ONCE for sk->sk_reuseport_cb?).
> That's an interesting point... It looks like this function also
> changed in 4.6 from using a single local_bh_disable() at the beginning
> with several spin_lock(&head->lock) to exclusively
> spin_lock_bh(&head->lock) at each locking point.  Perhaps the full bh
> disable variant was preventing the timers in your stack trace from
> running interleaved with this function before?

Could be, although dropping the lock shouldn't be able to affect the
search state. TBH, I'm a little lost in reading function, the
SO_REUSEPORT handling is pretty complicated. For instance,
rcu_access_pointer(sk->sk_reuseport_cb) is checked three times in that
function and also in every call to inet_csk_bind_conflict. I wonder if
we can simply this under the assumption that SO_REUSEPORT is only
allowed if the port number (snum) is explicitly specified.

Tom

  reply	other threads:[~2016-12-13 23:40 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-06 23:06 Soft lockup in inet_put_port on 4.6 Tom Herbert
2016-12-08 21:03 ` Hannes Frederic Sowa
2016-12-08 21:36   ` Josef Bacik
2016-12-09  0:30     ` Eric Dumazet
2016-12-09  1:01       ` Josef Bacik
2016-12-10  1:59         ` Josef Bacik
2016-12-10  3:47           ` Eric Dumazet
2016-12-10  4:14             ` Eric Dumazet
2016-12-12 18:05               ` Josef Bacik
2016-12-12 18:44                 ` Hannes Frederic Sowa
2016-12-12 21:23                   ` Josef Bacik
2016-12-12 22:24                   ` Josef Bacik
2016-12-13 20:51                     ` Tom Herbert
2016-12-13 23:03                       ` Craig Gallek
2016-12-13 23:32                         ` Tom Herbert [this message]
2016-12-15 18:53                           ` Josef Bacik
2016-12-15 22:39                             ` Tom Herbert
2016-12-15 23:25                               ` Craig Gallek
2016-12-16  0:07                             ` Hannes Frederic Sowa
2016-12-16 14:54                               ` Josef Bacik
2016-12-16 15:21                                 ` Josef Bacik
2016-12-16 22:08                                   ` Josef Bacik
2016-12-16 22:18                                     ` Tom Herbert
2016-12-16 22:50                                       ` Josef Bacik
2016-12-17 11:08                                         ` Hannes Frederic Sowa
2016-12-17 13:26                                           ` Josef Bacik
2016-12-20  1:56                                             ` David Miller
2016-12-20  2:07                                               ` Tom Herbert
2016-12-20  2:41                                                 ` Eric Dumazet
2016-12-20  3:40                                                   ` Josef Bacik
2016-12-20  4:52                                                     ` Eric Dumazet
2016-12-20  4:59                                                       ` Josef Bacik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CALx6S369T_hvoTgyHbmeSiR2p3d68h+0tMKqMmcGYLrKiN3JMA@mail.gmail.com \
    --to=tom@herbertland.com \
    --cc=eric.dumazet@gmail.com \
    --cc=hannes@stressinduktion.org \
    --cc=jbacik@fb.com \
    --cc=kraigatgoog@gmail.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.