From: Eric Dumazet <eric.dumazet@gmail.com>
To: Craig Gallek <kraigatgoog@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>,
	"David S. Miller" <davem@davemloft.net>,
	netdev <netdev@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: net: hang in ip_finish_output
Date: Mon, 18 Jan 2016 18:20:24 -0800
Message-ID: <1453170024.1223.251.camel@edumazet-glaptop2.roam.corp.google.com>
In-Reply-To: <CAEfhGiwsgd7-10ggn1EP4zTg3_mpqphWUnCo_7dSkRf=-jtmXQ@mail.gmail.com>

On Mon, 2016-01-18 at 13:33 -0500, Craig Gallek wrote:

> Thanks Eric, I'm still scratching my head over this one.  Your patches
> make sense, but I don't think they solve this particular issue.  I was
> still able to trigger the soft lockup with them applied.
> 
> I thought it has something to do with relying on RCU to dereference
> the sk_reuseport_cb pointer from a soft interrupt.  As far as I can
> tell, though, the only difference between rcu_dereference and
> rcu_dereference_bh (and rcu_read_lock and rcu_read_lock_bh) is the
> lock analysis code that gets compiled in in debug mode (in which case
> we should almost certainly be using the bh versions of the rcu
> functions).  However, I can still trigger the soft lockup even when I
> completely remove the RCU functions and use the (racy) raw pointer.
> 
> Below is a stack with your patches applied and the RCU functions
> completely removed.  I'm able to trigger it using a bunch of parallel
> instances of Dmitry's test program running on separate CPU sockets (eg
> for i in `seq 100`; do taskset -c 10,40 /tmp/rcu_stall & done)

Same reason really.

Right after sk2 = socket(), setsockopt(sk2, ..., SO_REUSEPORT, on) and
bind(sk2, ...), but _before_ connect(sk2) is done, sk2 is added to the
soreuseport array. At that point its score is smaller than the score of
the first socket sk1 found in the hash table (I mean the regular UDP
hash table), if sk1 has already done its connect(), which adds +8 to
its score.

So the bug has nothing to do with rcu or rcu_bh; it is just an infinite
loop caused by the differing scores.


hash bucket [X] -> sk1 -> sk2 -> NULL

sk1 score = 14  (because it did a connect())
sk2 score = 6

I guess we should relax the test done after atomic_inc_not_zero_hint()
so that it only checks the base keys:
(net, ipv6_only_sock, inet->inet_rcv_saddr, inet->inet_num)
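
To make the loop concrete, here is a rough user-space toy model of the
situation described above. It is only a sketch and not the kernel code:
struct toy_sock, toy_score(), toy_base_keys_match() and toy_lookup() are
hypothetical names, the score weights are assumptions chosen only to
reproduce the 14 and 6 quoted above, and a restart cap is added so the
model terminates where the real lookup would spin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for a UDP socket; NOT the kernel's struct sock. */
struct toy_sock {
    const char *name;
    bool bound_addr;   /* bind() set a local address */
    bool connected;    /* connect() set a remote address and port */
};

/* Assumed weights, picked to give sk1 = 14 and sk2 = 6. */
int toy_score(const struct toy_sock *sk)
{
    int score = 2;                 /* base score for a matching socket */
    if (sk->bound_addr)
        score += 4;
    if (sk->connected)
        score += 8;                /* the "+8" a connect() brings */
    return score;
}

/* Relaxed recheck: look at the base keys only (here: the bound address). */
bool toy_base_keys_match(const struct toy_sock *sk)
{
    return sk->bound_addr;
}

/*
 * Walk the bucket, remember the best score, let the reuseport group
 * substitute one of its members, then recheck the substitute.
 * Returns the number of restarts, or -1 if max_restarts was reached.
 */
int toy_lookup(struct toy_sock **bucket, size_t n,
               struct toy_sock *reuseport_pick, bool relaxed,
               int max_restarts, struct toy_sock **out)
{
    assert(max_restarts > 0);
    *out = NULL;
    for (int restarts = 0; restarts < max_restarts; restarts++) {
        int badness = 0;
        struct toy_sock *result = NULL;

        for (size_t i = 0; i < n; i++) {
            int score = toy_score(bucket[i]);
            if (score > badness) {
                badness = score;
                result = reuseport_pick ? reuseport_pick : bucket[i];
                if (reuseport_pick)
                    break;         /* group member chosen, go recheck it */
            }
        }
        if (result == NULL)
            return restarts;       /* nothing matched */

        bool ok = relaxed ? toy_base_keys_match(result)
                          : toy_score(result) >= badness;
        if (ok) {
            *out = result;
            return restarts;
        }
        /* Recheck failed: start over. Nothing changed, so it fails again. */
    }
    return -1;
}

/* Runs both variants on: hash bucket [X] -> sk1 -> sk2 -> NULL */
void toy_demo(void)
{
    struct toy_sock sk1 = { "sk1", true, true };
    struct toy_sock sk2 = { "sk2", true, false };
    struct toy_sock *bucket[] = { &sk1, &sk2 };
    struct toy_sock *out;

    int strict = toy_lookup(bucket, 2, &sk2, false, 1000, &out);
    printf("strict recheck:  %d (-1 means the restart cap was hit)\n", strict);
    int relaxed = toy_lookup(bucket, 2, &sk2, true, 1000, &out);
    printf("relaxed recheck: %d restarts, picked %s\n",
           relaxed, out ? out->name : "none");
}
```

In this model the strict recheck compares sk2's score (6) against the
badness recorded from sk1 (14), so every pass restarts and only the cap
stops it. With the relaxed recheck the substitute is accepted on the
first pass, because only the base keys are compared.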
