* Fw: [Bug 194723] connect() to localhost stalls after 4.9 -> 4.10 upgrade
@ 2017-03-02 18:32 Stephen Hemminger
From: Stephen Hemminger @ 2017-03-02 18:32 UTC (permalink / raw)
  To: netdev



Begin forwarded message:

Date: Wed, 01 Mar 2017 12:04:45 +0000
From: bugzilla-daemon@bugzilla.kernel.org
To: stephen@networkplumber.org
Subject: [Bug 194723] connect() to localhost stalls after 4.9 -> 4.10 upgrade


https://bugzilla.kernel.org/show_bug.cgi?id=194723

--- Comment #2 from Lutz Vieweg (lvml@5t9.de) ---
Using tcpdump, I found that when connect() stalls, the initial SYN packet
appears on the "lo" interface and is re-sent multiple times, but no SYN-ACK
packet is ever returned.

Error case with linux-4.10:

> 12:57:25.685640 IP 127.0.0.1.44074 > 127.0.0.1.dnp-sec: Flags [S], seq
> 1952288470, win 43690, options [mss 65495,sackOK,TS val 1942998659 ecr
> 0,nop,wscale 7], length 0
> 12:57:26.728890 IP 127.0.0.1.44074 > 127.0.0.1.dnp-sec: Flags [S], seq
> 1952288470, win 43690, options [mss 65495,sackOK,TS val 1942999703 ecr
> 0,nop,wscale 7], length 0
> 12:57:28.776935 IP 127.0.0.1.44074 > 127.0.0.1.dnp-sec: Flags [S], seq
> 1952288470, win 43690, options [mss 65495,sackOK,TS val 1943001751 ecr
> 0,nop,wscale 7], length 0  
...
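
The spacing of the three SYNs in the error-case trace can be checked with a few lines of Python. This is only an arithmetic sketch over the capture times shown above; the function name is made up for illustration.

```python
from datetime import datetime

# Capture times of the three SYNs in the error-case trace above.
SYN_TIMES = ["12:57:25.685640", "12:57:26.728890", "12:57:28.776935"]


def retransmit_gaps(stamps):
    """Return the delay in seconds between consecutive capture timestamps."""
    times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


gaps = retransmit_gaps(SYN_TIMES)
for gap in gaps:
    print(f"{gap:.6f} s")
```

The gaps come out at roughly 1.04 s and 2.05 s. A delay that doubles from about one second is what one would expect from the SYN retransmission timer backing off, which fits the reading that the same SYN is being retried rather than new connections being attempted.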

Normal case:

> 13:01:43.037135 IP 127.0.0.1.44362 > 127.0.0.1.dnp-sec: Flags [S], seq
> 3181010757, win 43690, options [mss 65495,sackOK,TS val 3314900273 ecr
> 0,nop,wscale 7], length 0
> 13:01:43.037171 IP 127.0.0.1.dnp-sec > 127.0.0.1.44362: Flags [S.], seq
> 1934682061, ack 3181010758, win 43690, options [mss 65495,sackOK,TS val
> 2947413993 ecr 3314900273,nop,wscale 7], length 0
> 13:01:43.037196 IP 127.0.0.1.44362 > 127.0.0.1.dnp-sec: Flags [.], ack 1, win
> 342, options [nop,nop,TS val 3314900273 ecr 2947413993], length 0  


According to strace, the listening process does not even leave the select()
call it uses to wait for incoming connections to accept in the error case.
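
A rough way to probe for this kind of stall is to time connect() against a throwaway listener on 127.0.0.1. The sketch below is not the reporter's program; the function names, the attempt count and the timeout are assumptions chosen for illustration, and whether it triggers the stall on an affected system depends on that system's configuration.

```python
import socket
import time


def timed_connect(host, port, timeout=5.0):
    """Return the seconds taken by connect(), or None if it timed out."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except socket.timeout:
        return None


def probe_loopback(attempts=20, timeout=5.0):
    """Connect repeatedly to a temporary listener bound to 127.0.0.1."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))   # port 0: let the kernel pick a port
        server.listen(attempts)         # backlog holds the un-accepted connections
        port = server.getsockname()[1]
        return [timed_connect("127.0.0.1", port, timeout)
                for _ in range(attempts)]


if __name__ == "__main__":
    results = probe_loopback()
    stalled = sum(r is None for r in results)
    print(f"{stalled} of {len(results)} connects timed out")
```

The listener never calls accept(); the kernel completes the handshake on its own for connections that fit in the backlog, so a timeout here points at the handshake itself rather than at the listening process.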

-- 
You are receiving this mail because:
You are the assignee for the bug.


* Re: Fw: [Bug 194723] connect() to localhost stalls after 4.9 -> 4.10 upgrade
  2017-03-15 20:36 Stephen Hemminger
@ 2017-03-15 21:40 ` Eric Dumazet
From: Eric Dumazet @ 2017-03-15 21:40 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

On Wed, 2017-03-15 at 13:36 -0700, Stephen Hemminger wrote:
> 
> Begin forwarded message:
> 
> Date: Wed, 15 Mar 2017 19:41:59 +0000
> From: bugzilla-daemon@bugzilla.kernel.org
> To: stephen@networkplumber.org
> Subject: [Bug 194723] connect() to localhost stalls after 4.9 -> 4.10 upgrade
> 
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=194723
> 
> --- Comment #15 from Lutz Vieweg (lvml@5t9.de) ---
> At last, bisecting converged:
> 
> git bisect start
> # bad: [c470abd4fde40ea6a0846a2beab642a578c0b8cd] Linux 4.10
> git bisect bad c470abd4fde40ea6a0846a2beab642a578c0b8cd
> # good: [69973b830859bc6529a7a0468ba0d80ee5117826] Linux 4.9
> git bisect good 69973b830859bc6529a7a0468ba0d80ee5117826
> # bad: [f4000cd99750065d5177555c0a805c97174d1b9f] Merge tag 'arm64-upstream' of
> git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
> git bisect bad f4000cd99750065d5177555c0a805c97174d1b9f
> # bad: [7079efc9d3e7f1f7cdd34082ec58209026315057] Merge tag 'fbdev-4.10' of
> git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux
> git bisect bad 7079efc9d3e7f1f7cdd34082ec58209026315057
> # bad: [669bb4c58c3091cd54650e37c5f4e345dd12c564] Merge branch 'for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32
> git bisect bad 669bb4c58c3091cd54650e37c5f4e345dd12c564
> # good: [7a8bca043cf1bb0433aa43d008b6c4de6c07d6a2] Merge branch 'sfc-tso-v2'
> git bisect good 7a8bca043cf1bb0433aa43d008b6c4de6c07d6a2
> # bad: [4f4f907a6729ae9e132810711c3a05e48311a948] Merge branch 'mvneta-64bit'
> git bisect bad 4f4f907a6729ae9e132810711c3a05e48311a948
> # good: [33f8a0458b2ce4546b681c5fae04427e3077a543] Merge tag
> 'wireless-drivers-next-for-davem-2016-11-25' of
> git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
> git bisect good 33f8a0458b2ce4546b681c5fae04427e3077a543
> # good: [80439a1704e811697ee01fd09dd95dd10790bc93] qede: Remove 'num_tc'.
> git bisect good 80439a1704e811697ee01fd09dd95dd10790bc93
> # good: [5067b6020770ef7c8102f47079c9e577d175ef2c] net/mlx5e: Remove flow encap
> entry in the correct place
> git bisect good 5067b6020770ef7c8102f47079c9e577d175ef2c
> # bad: [7091d8c7055d7310339435ae3af2fb490a92524d] net/sched: cls_flower: Add
> offload support using egress Hardware device
> git bisect bad 7091d8c7055d7310339435ae3af2fb490a92524d
> # good: [b14945ac3efdf5217182a344b037f96d6b0afae1] net: atarilance: use %8ph
> for printing hex string
> git bisect good b14945ac3efdf5217182a344b037f96d6b0afae1
> # bad: [25429d7b7dca01dc4f17205de023a30ca09390d0] tcp: allow to turn tcp
> timestamp randomization off
> git bisect bad 25429d7b7dca01dc4f17205de023a30ca09390d0
> # good: [1d6cff4fca4366d0529dbce170e0f33cfe213790] qed: Add iSCSI out of order
> packet handling.
> git bisect good 1d6cff4fca4366d0529dbce170e0f33cfe213790
> # bad: [95a22caee396cef0bb2ca8fafdd82966a49367bb] tcp: randomize tcp timestamp
> offsets for each connection
> git bisect bad 95a22caee396cef0bb2ca8fafdd82966a49367bb
> 
> 
> So the culprit seems to be this change: 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=95a22caee396cef0bb2ca8fafdd82966a49367bb
> 
> "tcp: randomize tcp timestamp offsets for each connection
> jiffies based timestamps allow for easy inference of number of devices
> behind NAT translators and also makes tracking of hosts simpler.
> 
> commit ceaa1fef65a7c2e ("tcp: adding a per-socket timestamp offset")
> added the main infrastructure that is needed for per-connection ts
> randomization, in particular writing/reading the on-wire tcp header
> format takes the offset into account so rest of stack can use normal
> tcp_time_stamp (jiffies).
> 
> So only two items are left:
>  - add a tsoffset for request sockets
>  - extend the tcp isn generator to also return another 32bit number
>    in addition to the ISN.
> 
> Re-use of ISN generator also means timestamps are still monotonically
> increasing for same connection quadruple, i.e. PAWS will still work.
> 
> Includes fixes from Eric Dumazet.
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>
> Acked-by: Eric Dumazet <edumazet@google.com>
> Acked-by: Yuchung Cheng <ycheng@google.com>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> "
> 
> I will try to attract some attention from the above-mentioned people.
> 

It is finally time to get rid of the buggy tw_recycle, which some distros
apparently set to one.
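
To see whether a given machine has the option enabled, the value can be read from procfs. A minimal sketch, assuming the knob is exposed at the usual path for net.ipv4.tcp_tw_recycle; the helper name is hypothetical, and it returns None on kernels that do not expose the file.

```python
from pathlib import Path

# Usual procfs location of the net.ipv4.tcp_tw_recycle sysctl (assumed here).
TW_RECYCLE = "/proc/sys/net/ipv4/tcp_tw_recycle"


def read_sysctl(path):
    """Return the sysctl value as a stripped string, or None if unavailable."""
    try:
        return Path(path).read_text().strip()
    except (FileNotFoundError, PermissionError):
        return None


if __name__ == "__main__":
    value = read_sysctl(TW_RECYCLE)
    if value is None:
        print("tcp_tw_recycle is not exposed on this kernel")
    else:
        print(f"tcp_tw_recycle = {value}")
```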


* Fw: [Bug 194723] connect() to localhost stalls after 4.9 -> 4.10 upgrade
@ 2017-03-15 20:36 Stephen Hemminger
  2017-03-15 21:40 ` Eric Dumazet
From: Stephen Hemminger @ 2017-03-15 20:36 UTC (permalink / raw)
  To: netdev



Begin forwarded message:

Date: Wed, 15 Mar 2017 19:41:59 +0000
From: bugzilla-daemon@bugzilla.kernel.org
To: stephen@networkplumber.org
Subject: [Bug 194723] connect() to localhost stalls after 4.9 -> 4.10 upgrade


https://bugzilla.kernel.org/show_bug.cgi?id=194723

--- Comment #15 from Lutz Vieweg (lvml@5t9.de) ---
At last, bisecting converged:

git bisect start
# bad: [c470abd4fde40ea6a0846a2beab642a578c0b8cd] Linux 4.10
git bisect bad c470abd4fde40ea6a0846a2beab642a578c0b8cd
# good: [69973b830859bc6529a7a0468ba0d80ee5117826] Linux 4.9
git bisect good 69973b830859bc6529a7a0468ba0d80ee5117826
# bad: [f4000cd99750065d5177555c0a805c97174d1b9f] Merge tag 'arm64-upstream' of
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
git bisect bad f4000cd99750065d5177555c0a805c97174d1b9f
# bad: [7079efc9d3e7f1f7cdd34082ec58209026315057] Merge tag 'fbdev-4.10' of
git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux
git bisect bad 7079efc9d3e7f1f7cdd34082ec58209026315057
# bad: [669bb4c58c3091cd54650e37c5f4e345dd12c564] Merge branch 'for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32
git bisect bad 669bb4c58c3091cd54650e37c5f4e345dd12c564
# good: [7a8bca043cf1bb0433aa43d008b6c4de6c07d6a2] Merge branch 'sfc-tso-v2'
git bisect good 7a8bca043cf1bb0433aa43d008b6c4de6c07d6a2
# bad: [4f4f907a6729ae9e132810711c3a05e48311a948] Merge branch 'mvneta-64bit'
git bisect bad 4f4f907a6729ae9e132810711c3a05e48311a948
# good: [33f8a0458b2ce4546b681c5fae04427e3077a543] Merge tag
'wireless-drivers-next-for-davem-2016-11-25' of
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
git bisect good 33f8a0458b2ce4546b681c5fae04427e3077a543
# good: [80439a1704e811697ee01fd09dd95dd10790bc93] qede: Remove 'num_tc'.
git bisect good 80439a1704e811697ee01fd09dd95dd10790bc93
# good: [5067b6020770ef7c8102f47079c9e577d175ef2c] net/mlx5e: Remove flow encap
entry in the correct place
git bisect good 5067b6020770ef7c8102f47079c9e577d175ef2c
# bad: [7091d8c7055d7310339435ae3af2fb490a92524d] net/sched: cls_flower: Add
offload support using egress Hardware device
git bisect bad 7091d8c7055d7310339435ae3af2fb490a92524d
# good: [b14945ac3efdf5217182a344b037f96d6b0afae1] net: atarilance: use %8ph
for printing hex string
git bisect good b14945ac3efdf5217182a344b037f96d6b0afae1
# bad: [25429d7b7dca01dc4f17205de023a30ca09390d0] tcp: allow to turn tcp
timestamp randomization off
git bisect bad 25429d7b7dca01dc4f17205de023a30ca09390d0
# good: [1d6cff4fca4366d0529dbce170e0f33cfe213790] qed: Add iSCSI out of order
packet handling.
git bisect good 1d6cff4fca4366d0529dbce170e0f33cfe213790
# bad: [95a22caee396cef0bb2ca8fafdd82966a49367bb] tcp: randomize tcp timestamp
offsets for each connection
git bisect bad 95a22caee396cef0bb2ca8fafdd82966a49367bb
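
The log above records 13 good/bad verdicts after the two endpoints were marked. The sketch below shows why so few tests suffice, using a plain binary search over a linear list of commits. Real git history is a graph and git bisect handles merges, so this is a simplification; the function name and the toy history are assumptions for illustration.

```python
def bisect_first_bad(commits, is_bad):
    """Find the first bad commit in a linear history.

    commits[0] must be known good and commits[-1] known bad.
    Returns (first_bad_commit, number_of_tests_run).
    """
    good, bad = 0, len(commits) - 1
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1
        if is_bad(commits[mid]):
            bad = mid      # the regression is at mid or earlier
        else:
            good = mid     # the regression was introduced after mid
    return commits[bad], tests


# Toy history: 10000 commits numbered 0..9999, regression introduced at 7321.
history = list(range(10000))
culprit, tests = bisect_first_bad(history, lambda c: c >= 7321)
print(culprit, tests)
```

Each test halves the remaining range, so the number of kernel builds grows only with the logarithm of the number of commits between the good and bad endpoints.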


So the culprit seems to be this change: 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=95a22caee396cef0bb2ca8fafdd82966a49367bb

"tcp: randomize tcp timestamp offsets for each connection
jiffies based timestamps allow for easy inference of number of devices
behind NAT translators and also makes tracking of hosts simpler.

commit ceaa1fef65a7c2e ("tcp: adding a per-socket timestamp offset")
added the main infrastructure that is needed for per-connection ts
randomization, in particular writing/reading the on-wire tcp header
format takes the offset into account so rest of stack can use normal
tcp_time_stamp (jiffies).

So only two items are left:
 - add a tsoffset for request sockets
 - extend the tcp isn generator to also return another 32bit number
   in addition to the ISN.

Re-use of ISN generator also means timestamps are still monotonically
increasing for same connection quadruple, i.e. PAWS will still work.

Includes fixes from Eric Dumazet.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
"

I will try to attract some attention from the above-mentioned people.
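
The PAWS remark in the quoted commit message is about a single connection quadruple. A check that keys timestamps by peer address alone, which is how tw_recycle is commonly described, sees something different once every connection carries its own offset. The following is a simplified model, not kernel code; the class, the method and the timestamp values are hypothetical.

```python
def ts_before(a, b):
    """True if 32-bit timestamp a is older than b (wrap-around aware)."""
    return ((a - b) & 0xFFFFFFFF) > 0x7FFFFFFF


class PerHostTimestampCache:
    """Remembers the newest TCP timestamp seen from each peer address."""

    def __init__(self):
        self.last_seen = {}

    def accept_syn(self, src_ip, tsval):
        """Reject a SYN whose timestamp is older than the cached one."""
        last = self.last_seen.get(src_ip)
        if last is not None and ts_before(tsval, last):
            return False
        self.last_seen[src_ip] = tsval
        return True


# One shared clock per host: timestamps only move forward, both SYNs pass.
shared = PerHostTimestampCache()
print(shared.accept_syn("127.0.0.1", 5000))            # True
print(shared.accept_syn("127.0.0.1", 5001))            # True

# Per-connection offsets: the second connection may look "older".
randomized = PerHostTimestampCache()
print(randomized.accept_syn("127.0.0.1", 3_000_000_000))   # True
print(randomized.accept_syn("127.0.0.1", 1_000_000_000))   # False
```

In this model the second SYN is silently dropped even though nothing is wrong with it, which would look like the retransmitted, unanswered SYNs seen in the tcpdump output earlier in this thread.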

-- 
You are receiving this mail because:
You are the assignee for the bug.
