linux-kernel.vger.kernel.org archive mirror
* Re: tcp connection hangs on connect
       [not found] ` <20010829082405.A5966@cubit.at>
@ 2001-08-30  0:20   ` Val Henson
  2001-08-30  0:55   ` David S. Miller
  1 sibling, 0 replies; 12+ messages in thread
From: Val Henson @ 2001-08-30  0:20 UTC (permalink / raw)
  To: Alexey Kuznetsov, linux-kernel; +Cc: Philipp Reisner

Alexey Kuznetsov wrote:

> Hello!
>  
> > simply hangs after some minutes to an hour. The script runs on
> > a Linux-2.2.19 Box (we have also tested Linux-2.4.2)
>  
> This bug has been fixed in later 2.4s.
>  
> Corresponding fix to 2.2 is expected to be in 2.2.20 and it is available
> in Alan's 2.2.20-pre.
> 
> Alexey

This bug still exists in 2.4.10-pre2 (from the linuxppc_2_4_devel
tree).  The first TCP connection to the machine hangs after a few KB
of data.  Connections after that work fine.  A tcpdump shows strange
retransmit behavior before the hang but I haven't investigated it
further.

Philipp Reisner says it's fixed in 2.2.20-pre8 but I can't find any
equivalent fix in 2.4.8-2.4.10.

Have I found a new bug or did the patch not make it into 2.4 after
all?

-VAL


* Re: tcp connection hangs on connect
       [not found] ` <20010829082405.A5966@cubit.at>
  2001-08-30  0:20   ` tcp connection hangs on connect Val Henson
@ 2001-08-30  0:55   ` David S. Miller
  2001-08-30  1:53     ` Val Henson
  1 sibling, 1 reply; 12+ messages in thread
From: David S. Miller @ 2001-08-30  0:55 UTC (permalink / raw)
  To: val; +Cc: kuznet, linux-kernel, philipp.reisner

   From: Val Henson <val@nmt.edu>
   Date: Wed, 29 Aug 2001 18:20:40 -0600
   
   Have I found a new bug or did the patch not make it into 2.4 after
   all?
   
Without full and accurate tcpdump traces, your guess is as good as
ours.

Later,
David S. Miller
davem@redhat.com


* Re: tcp connection hangs on connect
  2001-08-30  0:55   ` David S. Miller
@ 2001-08-30  1:53     ` Val Henson
  2001-08-30 16:21       ` kuznet
  0 siblings, 1 reply; 12+ messages in thread
From: Val Henson @ 2001-08-30  1:53 UTC (permalink / raw)
  To: David S. Miller; +Cc: kuznet, linux-kernel

On Wed, Aug 29, 2001 at 05:55:37PM -0700, David S. Miller wrote:
>    
> Without full and accurate tcpdump traces, your guess is as good as
> ours.

:) I was hoping Alexey would respond with "Oh yeah, here's that patch,
Dave please accept it."

Full tcpdumps from both hosts are at the end of this email.
198.17.100.42 is a 2.2.19 machine trying to scp a file from
198.17.100.41, running 2.4.10-pre2.  Here's the interesting bit:

18:23:18.980000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 16338:17786(1448) ack 1473 win 8216 <nop,nop,timestamp 18540 456453> (DF) [tos 0x8]
18:23:19.100000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 17786 win 31856 <nop,nop,timestamp 456465 18540> (DF) [tos 0x8]
18:23:19.100000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 19234:20682(1448) ack 1473 win 8216 <nop,nop,timestamp 18551 456465> (DF) [tos 0x8]
18:23:19.100000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 17786 win 31856 <nop,nop,timestamp 456465 18540,nop,nop, sack 1 {19234:20682} > (DF) [tos 0x8]

Everything's fine till here, when 198.17.100.42 misses the 17786:19234
segment.  198.17.100.41 never retransmits this segment.

18:23:19.100000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: P 22130:23578(1448) ack 1473 win 8216 <nop,nop,timestamp 18551 456465> (DF) [tos 0x8]
18:23:19.100000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 17786 win 31856 <nop,nop,timestamp 456465 18540,nop,nop, sack 2 {22130:23578}{19234:20682} > (DF) [tos 0x8]
18:23:19.100000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 23578:25026(1448) ack 1473 win 8216 <nop,nop,timestamp 18551 456465> (DF) [tos 0x8]
18:23:19.100000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 17786 win 31856 <nop,nop,timestamp 456465 18540,nop,nop, sack 2 {22130:25026}{19234:20682} > (DF) [tos 0x8]

Long pause, eventually I kill the scp process.  Note the lack of a
retransmit of the 17786:19234 segment.
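
The hole can be read straight off the last ACK above: any range between
the cumulative ack and the highest SACKed byte that no SACK block covers
is still missing at the receiver.  Below is a minimal Python sketch of
that arithmetic.  The name sack_holes and the regular expressions are
illustrative assumptions, and it only understands the tcpdump text
format shown in this mail.

```python
import re

# "ack N" is the cumulative ACK; the \b keeps "sack" from matching.
ACK_RE = re.compile(r"\back (\d+)")
# Each SACK block is printed as {left:right}; the right edge is exclusive.
SACK_RE = re.compile(r"\{(\d+):(\d+)\}")


def sack_holes(line):
    """Return the [left, right) ranges the receiver still lacks, per one ACK line."""
    m = ACK_RE.search(line)
    if m is None:
        return []
    edge = int(m.group(1))  # everything below this has been received
    holes = []
    blocks = sorted((int(a), int(b)) for a, b in SACK_RE.findall(line))
    for left, right in blocks:
        if left > edge:
            holes.append((edge, left))  # gap before this SACK block
        edge = max(edge, right)
    return holes


# Last ACK before the stall, copied from the trace above.
LAST_ACK = (
    "18:23:19.100000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: "
    ". 1473:1473(0) ack 17786 win 31856 <nop,nop,timestamp 456465 "
    "18540,nop,nop, sack 2 {22130:25026}{19234:20682} > (DF) [tos 0x8]"
)

if __name__ == "__main__":
    for left, right in sack_holes(LAST_ACK):
        print("missing %d:%d (%d bytes)" % (left, right, right - left))
```

On that line it reports two holes, 17786:19234 and 20682:22130, each
one 1448-byte segment.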

To reproduce:

As the very first TCP connection after a reboot, try to scp a large
file from the 2.4.x machine.

-VAL

Full tcpdump from 198.17.100.42's point of view:

18:23:15.740000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: S 3061649587:3061649587(0) win 32120 <mss 1460,sackOK,timestamp 456129 0,nop,wscale 0> (DF)
18:23:15.740000 eth0 B arp who-has 198.17.100.42 tell 198.17.100.41
18:23:15.740000 eth0 > arp reply 198.17.100.42 (0:30:65:ba:84:e6) is-at 0:30:65:ba:84:e6 (0:80:f6:10:50:26)
18:23:15.740000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: S 2117491474:2117491474(0) ack 3061649588 win 5792 <mss 1460,sackOK,timestamp 18215 456129,nop,wscale 0> (DF)
18:23:15.740000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1:1(0) ack 1 win 32120 <nop,nop,timestamp 456129 18215> (DF)
18:23:15.760000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: P 1:26(25) ack 1 win 5792 <nop,nop,timestamp 18217 456129> (DF)
18:23:15.760000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1:1(0) ack 26 win 32120 <nop,nop,timestamp 456131 18217> (DF)
18:23:15.760000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: P 1:25(24) ack 26 win 32120 <nop,nop,timestamp 456131 18217> (DF)
18:23:15.760000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 26:26(0) ack 25 win 5792 <nop,nop,timestamp 18217 456131> (DF)
18:23:15.760000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: P 26:666(640) ack 25 win 5792 <nop,nop,timestamp 18217 456131> (DF)
18:23:15.760000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: P 25:657(632) ack 666 win 31856 <nop,nop,timestamp 456131 18217> (DF)
18:23:15.790000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 666:666(0) ack 657 win 6952 <nop,nop,timestamp 18221 456131> (DF)
18:23:15.790000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: P 657:673(16) ack 666 win 31856 <nop,nop,timestamp 456134 18221> (DF)
18:23:15.790000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 666:666(0) ack 673 win 6952 <nop,nop,timestamp 18221 456134> (DF)
18:23:15.810000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: P 666:946(280) ack 673 win 6952 <nop,nop,timestamp 18222 456134> (DF)
18:23:15.830000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 673:673(0) ack 946 win 31856 <nop,nop,timestamp 456138 18222> (DF)
18:23:15.910000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: P 673:945(272) ack 946 win 31856 <nop,nop,timestamp 456146 18222> (DF)
18:23:15.940000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 946:946(0) ack 945 win 8216 <nop,nop,timestamp 18236 456146> (DF)
18:23:16.080000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: P 946:1522(576) ack 945 win 8216 <nop,nop,timestamp 18249 456146> (DF)
18:23:16.100000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 945:945(0) ack 1522 win 31856 <nop,nop,timestamp 456165 18249> (DF)
18:23:16.100000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: P 1522:1538(16) ack 945 win 8216 <nop,nop,timestamp 18251 456165> (DF)
18:23:16.120000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 945:945(0) ack 1538 win 31856 <nop,nop,timestamp 456167 18251> (DF)
18:23:16.220000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: P 945:961(16) ack 1538 win 31856 <nop,nop,timestamp 456177 18251> (DF)
18:23:16.220000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 1538:1538(0) ack 961 win 8216 <nop,nop,timestamp 18263 456177> (DF)
18:23:16.220000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: P 961:1009(48) ack 1538 win 31856 <nop,nop,timestamp 456177 18263> (DF)
18:23:16.220000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 1538:1538(0) ack 1009 win 8216 <nop,nop,timestamp 18263 456177> (DF)
18:23:16.220000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: P 1538:1586(48) ack 1009 win 8216 <nop,nop,timestamp 18263 456177> (DF)
18:23:16.230000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: P 1009:1073(64) ack 1586 win 31856 <nop,nop,timestamp 456178 18263> (DF)
18:23:16.260000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: P 1586:1650(64) ack 1073 win 8216 <nop,nop,timestamp 18267 456178> (DF)
18:23:16.280000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1073:1073(0) ack 1650 win 31856 <nop,nop,timestamp 456183 18267> (DF)
18:23:18.320000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: P 1073:1233(160) ack 1650 win 31856 <nop,nop,timestamp 456387 18267> (DF)
18:23:18.320000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: P 1650:1682(32) ack 1233 win 8216 <nop,nop,timestamp 18474 456387> (DF)
18:23:18.330000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: P 1233:1297(64) ack 1682 win 31856 <nop,nop,timestamp 456388 18474> (DF)
18:23:18.330000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: P 1682:1730(48) ack 1297 win 8216 <nop,nop,timestamp 18474 456388> (DF)
18:23:18.330000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: P 1297:1377(80) ack 1730 win 31856 <nop,nop,timestamp 456388 18474> (DF) [tos 0x8] 
18:23:18.330000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: P 1730:1778(48) ack 1377 win 8216 <nop,nop,timestamp 18474 456388> (DF) [tos 0x8] 
18:23:18.330000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: P 1377:1425(48) ack 1778 win 31856 <nop,nop,timestamp 456388 18474> (DF) [tos 0x8] 
18:23:18.370000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 1778:1778(0) ack 1425 win 8216 <nop,nop,timestamp 18479 456388> (DF) [tos 0x8] 
18:23:18.410000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: P 1778:1858(80) ack 1425 win 8216 <nop,nop,timestamp 18483 456388> (DF) [tos 0x8] 
18:23:18.420000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: P 1425:1473(48) ack 1858 win 31856 <nop,nop,timestamp 456397 18483> (DF) [tos 0x8] 
18:23:18.420000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 1858:1858(0) ack 1473 win 8216 <nop,nop,timestamp 18483 456397> (DF) [tos 0x8] 
18:23:18.430000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 1858:3306(1448) ack 1473 win 8216 <nop,nop,timestamp 18484 456397> (DF) [tos 0x8] 
18:23:18.460000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 3306 win 31856 <nop,nop,timestamp 456401 18484> (DF) [tos 0x8] 
18:23:18.460000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 4754:6202(1448) ack 1473 win 8216 <nop,nop,timestamp 18487 456401> (DF) [tos 0x8] 
18:23:18.460000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 3306 win 31856 <nop,nop,timestamp 456401 18484,nop,nop, sack 1 {4754:6202} > (DF) [tos 0x8] 
18:23:18.460000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 7650:9098(1448) ack 1473 win 8216 <nop,nop,timestamp 18487 456401> (DF) [tos 0x8] 
18:23:18.460000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 3306 win 31856 <nop,nop,timestamp 456401 18484,nop,nop, sack 2 {7650:9098}{4754:6202} > (DF) [tos 0x8] 
18:23:18.460000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 3306:4754(1448) ack 1473 win 8216 <nop,nop,timestamp 18487 456401> (DF) [tos 0x8] 
18:23:18.460000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 6202 win 28960 <nop,nop,timestamp 456401 18487,nop,nop, sack 1 {7650:9098} > (DF) [tos 0x8] 
18:23:18.660000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 6202:7650(1448) ack 1473 win 8216 <nop,nop,timestamp 18508 456401> (DF) [tos 0x8] 
18:23:18.660000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 9098 win 30408 <nop,nop,timestamp 456421 18508> (DF) [tos 0x8] 
18:23:18.660000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 9098:10546(1448) ack 1473 win 8216 <nop,nop,timestamp 18508 456421> (DF) [tos 0x8] 
18:23:18.770000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 10546 win 31856 <nop,nop,timestamp 456432 18508> (DF) [tos 0x8] 
18:23:18.770000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 11994:13442(1448) ack 1473 win 8216 <nop,nop,timestamp 18518 456432> (DF) [tos 0x8] 
18:23:18.770000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 10546 win 31856 <nop,nop,timestamp 456432 18508,nop,nop, sack 1 {11994:13442} > (DF) [tos 0x8] 
18:23:18.770000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: P 14890:16338(1448) ack 1473 win 8216 <nop,nop,timestamp 18518 456432> (DF) [tos 0x8] 
18:23:18.770000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 10546 win 31856 <nop,nop,timestamp 456432 18508,nop,nop, sack 2 {14890:16338}{11994:13442} > (DF) [tos 0x8] 
18:23:18.770000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 10546:11994(1448) ack 1473 win 8216 <nop,nop,timestamp 18518 456432> (DF) [tos 0x8] 
18:23:18.770000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 13442 win 28960 <nop,nop,timestamp 456432 18518,nop,nop, sack 1 {14890:16338} > (DF) [tos 0x8] 
18:23:18.980000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 13442:14890(1448) ack 1473 win 8216 <nop,nop,timestamp 18540 456432> (DF) [tos 0x8] 
18:23:18.980000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 16338 win 30408 <nop,nop,timestamp 456453 18540> (DF) [tos 0x8] 
18:23:18.980000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 16338:17786(1448) ack 1473 win 8216 <nop,nop,timestamp 18540 456453> (DF) [tos 0x8] 
18:23:19.100000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 17786 win 31856 <nop,nop,timestamp 456465 18540> (DF) [tos 0x8] 
18:23:19.100000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 19234:20682(1448) ack 1473 win 8216 <nop,nop,timestamp 18551 456465> (DF) [tos 0x8] 
18:23:19.100000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 17786 win 31856 <nop,nop,timestamp 456465 18540,nop,nop, sack 1 {19234:20682} > (DF) [tos 0x8] 
18:23:19.100000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: P 22130:23578(1448) ack 1473 win 8216 <nop,nop,timestamp 18551 456465> (DF) [tos 0x8] 
18:23:19.100000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 17786 win 31856 <nop,nop,timestamp 456465 18540,nop,nop, sack 2 {22130:23578}{19234:20682} > (DF) [tos 0x8] 
18:23:19.100000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 23578:25026(1448) ack 1473 win 8216 <nop,nop,timestamp 18551 456465> (DF) [tos 0x8] 
18:23:19.100000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 17786 win 31856 <nop,nop,timestamp 456465 18540,nop,nop, sack 2 {22130:25026}{19234:20682} > (DF) [tos 0x8] 

<long pause, eventually I kill the scp process>

18:23:52.720000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: F 1473:1473(0) ack 17786 win 31856 <nop,nop,timestamp 459827 18540,nop,nop, sack 2 {22130:25026}{19234:20682} > (DF) [tos 0x8] 
18:23:52.720000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 25026:26474(1448) ack 1474 win 8216 <nop,nop,timestamp 21913 459827> (DF) [tos 0x8] 
18:23:52.720000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1474:1474(0) ack 17786 win 31856 <nop,nop,timestamp 459827 18540,nop,nop, sack 2 {22130:26474}{19234:20682} > (DF) [tos 0x8] 
18:23:52.720000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 26474:27922(1448) ack 1474 win 8216 <nop,nop,timestamp 21913 459827> (DF) [tos 0x8] 
18:23:52.720000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1474:1474(0) ack 17786 win 31856 <nop,nop,timestamp 459827 18540,nop,nop, sack 2 {22130:27922}{19234:20682} > (DF) [tos 0x8] 
18:23:52.720000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 27922:29370(1448) ack 1474 win 8216 <nop,nop,timestamp 21913 459827> (DF) [tos 0x8] 
18:23:52.720000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1474:1474(0) ack 17786 win 31856 <nop,nop,timestamp 459827 18540,nop,nop, sack 2 {22130:29370}{19234:20682} > (DF) [tos 0x8] 
18:23:52.720000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: P 29370:30818(1448) ack 1474 win 8216 <nop,nop,timestamp 21913 459827> (DF) [tos 0x8] 
18:23:52.720000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1474:1474(0) ack 17786 win 31856 <nop,nop,timestamp 459827 18540,nop,nop, sack 2 {22130:30818}{19234:20682} > (DF) [tos 0x8] 
18:23:52.720000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: F 30818:30818(0) ack 1474 win 8216 <nop,nop,timestamp 21913 459827> (DF) [tos 0x8] 
18:23:52.720000 eth0 > 198.17.100.42.1119 > 198.17.100.41.ssh: . 1474:1474(0) ack 17786 win 31856 <nop,nop,timestamp 459827 18540,nop,nop, sack 2 {22130:30819}{19234:20682} > (DF) [tos 0x8] 

Full tcpdump from 198.17.100.41's point of view:

12:21:17.157458 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: S 3061649587:3061649587(0) win 32120 <mss 1460,sackOK,timestamp 456129 0,nop,wscale 0> (DF)
12:21:17.158821 eth0 > arp who-has 198.17.100.42 tell 198.17.100.41 (0:80:f6:10:50:26)
12:21:17.158925 eth0 < arp reply 198.17.100.42 is-at 0:30:65:ba:84:e6 (0:80:f6:10:50:26)
12:21:17.158954 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: S 2117491474:2117491474(0) ack 3061649588 win 5792 <mss 1460,sackOK,timestamp 18215 456129,nop,wscale 0> (DF)
12:21:17.159068 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 1:1(0) ack 1 win 32120 <nop,nop,timestamp 456129 18215> (DF)
12:21:17.173245 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: P 1:26(25) ack 1 win 5792 <nop,nop,timestamp 18217 456129> (DF)
12:21:17.173375 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 1:1(0) ack 26 win 32120 <nop,nop,timestamp 456131 18217> (DF)
12:21:17.175290 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: P 1:25(24) ack 26 win 32120 <nop,nop,timestamp 456131 18217> (DF)
12:21:17.175349 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 26:26(0) ack 25 win 5792 <nop,nop,timestamp 18217 456131> (DF)
12:21:17.176001 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: P 26:666(640) ack 25 win 5792 <nop,nop,timestamp 18217 456131> (DF)
12:21:17.176745 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: P 25:657(632) ack 666 win 31856 <nop,nop,timestamp 456131 18217> (DF)
12:21:17.210026 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 666:666(0) ack 657 win 6952 <nop,nop,timestamp 18221 456131> (DF)
12:21:17.210147 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: P 657:673(16) ack 666 win 31856 <nop,nop,timestamp 456134 18221> (DF)
12:21:17.210206 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 666:666(0) ack 673 win 6952 <nop,nop,timestamp 18221 456134> (DF)
12:21:17.226137 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: P 666:946(280) ack 673 win 6952 <nop,nop,timestamp 18222 456134> (DF)
12:21:17.241676 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 673:673(0) ack 946 win 31856 <nop,nop,timestamp 456138 18222> (DF)
12:21:17.329702 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: P 673:945(272) ack 946 win 31856 <nop,nop,timestamp 456146 18222> (DF)
12:21:17.360030 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 946:946(0) ack 945 win 8216 <nop,nop,timestamp 18236 456146> (DF)
12:21:17.497569 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: P 946:1522(576) ack 945 win 8216 <nop,nop,timestamp 18249 456146> (DF)
12:21:17.511696 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 945:945(0) ack 1522 win 31856 <nop,nop,timestamp 456165 18249> (DF)
12:21:17.511767 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: P 1522:1538(16) ack 945 win 8216 <nop,nop,timestamp 18251 456165> (DF)
12:21:17.531683 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 945:945(0) ack 1538 win 31856 <nop,nop,timestamp 456167 18251> (DF)
12:21:17.635448 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: P 945:961(16) ack 1538 win 31856 <nop,nop,timestamp 456177 18251> (DF)
12:21:17.635498 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 1538:1538(0) ack 961 win 8216 <nop,nop,timestamp 18263 456177> (DF)
12:21:17.639100 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: P 961:1009(48) ack 1538 win 31856 <nop,nop,timestamp 456177 18263> (DF)
12:21:17.639160 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 1538:1538(0) ack 1009 win 8216 <nop,nop,timestamp 18263 456177> (DF)
12:21:17.639664 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: P 1538:1586(48) ack 1009 win 8216 <nop,nop,timestamp 18263 456177> (DF)
12:21:17.644091 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: P 1009:1073(64) ack 1586 win 31856 <nop,nop,timestamp 456178 18263> (DF)
12:21:17.675129 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: P 1586:1650(64) ack 1073 win 8216 <nop,nop,timestamp 18267 456178> (DF)
12:21:17.691691 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 1073:1073(0) ack 1650 win 31856 <nop,nop,timestamp 456183 18267> (DF)
12:21:19.731975 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: P 1073:1233(160) ack 1650 win 31856 <nop,nop,timestamp 456387 18267> (DF)
12:21:19.740436 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: P 1650:1682(32) ack 1233 win 8216 <nop,nop,timestamp 18474 456387> (DF)
12:21:19.742819 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: P 1233:1297(64) ack 1682 win 31856 <nop,nop,timestamp 456388 18474> (DF)
12:21:19.743408 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: P 1682:1730(48) ack 1297 win 8216 <nop,nop,timestamp 18474 456388> (DF)
12:21:19.746385 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: P 1297:1377(80) ack 1730 win 31856 <nop,nop,timestamp 456388 18474> (DF) [tos 0x8] 
12:21:19.749056 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: P 1730:1778(48) ack 1377 win 8216 <nop,nop,timestamp 18474 456388> (DF) [tos 0x8] 
12:21:19.751041 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: P 1377:1425(48) ack 1778 win 31856 <nop,nop,timestamp 456388 18474> (DF) [tos 0x8] 
12:21:19.790043 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 1778:1778(0) ack 1425 win 8216 <nop,nop,timestamp 18479 456388> (DF) [tos 0x8] 
12:21:19.831486 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: P 1778:1858(80) ack 1425 win 8216 <nop,nop,timestamp 18483 456388> (DF) [tos 0x8] 
12:21:19.834435 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: P 1425:1473(48) ack 1858 win 31856 <nop,nop,timestamp 456397 18483> (DF) [tos 0x8] 
12:21:19.834508 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 1858:1858(0) ack 1473 win 8216 <nop,nop,timestamp 18483 456397> (DF) [tos 0x8] 
12:21:19.844416 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 1858:3306(1448) ack 1473 win 8216 <nop,nop,timestamp 18484 456397> (DF) [tos 0x8] 
12:21:19.844449 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 3306:4754(1448) ack 1473 win 8216 <nop,nop,timestamp 18484 456397> (DF) [tos 0x8] 
12:21:19.871765 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 3306 win 31856 <nop,nop,timestamp 456401 18484> (DF) [tos 0x8] 
12:21:19.871838 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 4754:6202(1448) ack 1473 win 8216 <nop,nop,timestamp 18487 456401> (DF) [tos 0x8] 
12:21:19.871858 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 6202:7650(1448) ack 1473 win 8216 <nop,nop,timestamp 18487 456401> (DF) [tos 0x8] 
12:21:19.872119 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 3306 win 31856 <nop,nop,timestamp 456401 18484,nop,nop, sack 1 {4754:6202} > (DF) [tos 0x8] 
12:21:19.872189 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 7650:9098(1448) ack 1473 win 8216 <nop,nop,timestamp 18487 456401> (DF) [tos 0x8] 
12:21:19.872456 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 3306 win 31856 <nop,nop,timestamp 456401 18484,nop,nop, sack 2 {7650:9098}{4754:6202} > (DF) [tos 0x8] 
12:21:19.872543 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 3306:4754(1448) ack 1473 win 8216 <nop,nop,timestamp 18487 456401> (DF) [tos 0x8] 
12:21:19.872810 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 6202 win 28960 <nop,nop,timestamp 456401 18487,nop,nop, sack 1 {7650:9098} > (DF) [tos 0x8] 
12:21:20.080039 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 6202:7650(1448) ack 1473 win 8216 <nop,nop,timestamp 18508 456401> (DF) [tos 0x8] 
12:21:20.080331 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 9098 win 30408 <nop,nop,timestamp 456421 18508> (DF) [tos 0x8] 
12:21:20.080400 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 9098:10546(1448) ack 1473 win 8216 <nop,nop,timestamp 18508 456421> (DF) [tos 0x8] 
12:21:20.080420 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 10546:11994(1448) ack 1473 win 8216 <nop,nop,timestamp 18508 456421> (DF) [tos 0x8] 
12:21:20.181759 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 10546 win 31856 <nop,nop,timestamp 456432 18508> (DF) [tos 0x8] 
12:21:20.181840 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 11994:13442(1448) ack 1473 win 8216 <nop,nop,timestamp 18518 456432> (DF) [tos 0x8] 
12:21:20.181860 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 13442:14890(1448) ack 1473 win 8216 <nop,nop,timestamp 18518 456432> (DF) [tos 0x8] 
12:21:20.182111 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 10546 win 31856 <nop,nop,timestamp 456432 18508,nop,nop, sack 1 {11994:13442} > (DF) [tos 0x8] 
12:21:20.182177 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: P 14890:16338(1448) ack 1473 win 8216 <nop,nop,timestamp 18518 456432> (DF) [tos 0x8] 
12:21:20.182436 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 10546 win 31856 <nop,nop,timestamp 456432 18508,nop,nop, sack 2 {14890:16338}{11994:13442} > (DF) [tos 0x8] 
12:21:20.182524 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 10546:11994(1448) ack 1473 win 8216 <nop,nop,timestamp 18518 456432> (DF) [tos 0x8] 
12:21:20.182796 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 13442 win 28960 <nop,nop,timestamp 456432 18518,nop,nop, sack 1 {14890:16338} > (DF) [tos 0x8] 
12:21:20.400039 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 13442:14890(1448) ack 1473 win 8216 <nop,nop,timestamp 18540 456432> (DF) [tos 0x8] 
12:21:20.400326 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 16338 win 30408 <nop,nop,timestamp 456453 18540> (DF) [tos 0x8] 
12:21:20.400394 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 16338:17786(1448) ack 1473 win 8216 <nop,nop,timestamp 18540 456453> (DF) [tos 0x8] 
12:21:20.400414 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 17786:19234(1448) ack 1473 win 8216 <nop,nop,timestamp 18540 456453> (DF) [tos 0x8] 
12:21:20.511768 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 17786 win 31856 <nop,nop,timestamp 456465 18540> (DF) [tos 0x8] 
12:21:20.511847 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 19234:20682(1448) ack 1473 win 8216 <nop,nop,timestamp 18551 456465> (DF) [tos 0x8] 
12:21:20.511865 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 20682:22130(1448) ack 1473 win 8216 <nop,nop,timestamp 18551 456465> (DF) [tos 0x8] 
12:21:20.512118 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 17786 win 31856 <nop,nop,timestamp 456465 18540,nop,nop, sack 1 {19234:20682} > (DF) [tos 0x8] 
12:21:20.512184 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: P 22130:23578(1448) ack 1473 win 8216 <nop,nop,timestamp 18551 456465> (DF) [tos 0x8] 
12:21:20.512442 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 17786 win 31856 <nop,nop,timestamp 456465 18540,nop,nop, sack 2 {22130:23578}{19234:20682} > (DF) [tos 0x8] 
12:21:20.512512 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 23578:25026(1448) ack 1473 win 8216 <nop,nop,timestamp 18551 456465> (DF) [tos 0x8] 
12:21:20.512782 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 1473:1473(0) ack 17786 win 31856 <nop,nop,timestamp 456465 18540,nop,nop, sack 2 {22130:25026}{19234:20682} > (DF) [tos 0x8] 

<long pause, eventually I kill the scp process>

12:21:54.133626 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: F 1473:1473(0) ack 17786 win 31856 <nop,nop,timestamp 459827 18540,nop,nop, sack 2 {22130:25026}{19234:20682} > (DF) [tos 0x8] 
12:21:54.133727 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 25026:26474(1448) ack 1474 win 8216 <nop,nop,timestamp 21913 459827> (DF) [tos 0x8] 
12:21:54.134047 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 1474:1474(0) ack 17786 win 31856 <nop,nop,timestamp 459827 18540,nop,nop, sack 2 {22130:26474}{19234:20682} > (DF) [tos 0x8] 
12:21:54.134130 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 26474:27922(1448) ack 1474 win 8216 <nop,nop,timestamp 21913 459827> (DF) [tos 0x8] 
12:21:54.134392 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 1474:1474(0) ack 17786 win 31856 <nop,nop,timestamp 459827 18540,nop,nop, sack 2 {22130:27922}{19234:20682} > (DF) [tos 0x8] 
12:21:54.134480 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 27922:29370(1448) ack 1474 win 8216 <nop,nop,timestamp 21913 459827> (DF) [tos 0x8] 
12:21:54.134738 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 1474:1474(0) ack 17786 win 31856 <nop,nop,timestamp 459827 18540,nop,nop, sack 2 {22130:29370}{19234:20682} > (DF) [tos 0x8] 
12:21:54.134814 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: P 29370:30818(1448) ack 1474 win 8216 <nop,nop,timestamp 21913 459827> (DF) [tos 0x8] 
12:21:54.135102 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 1474:1474(0) ack 17786 win 31856 <nop,nop,timestamp 459827 18540,nop,nop, sack 2 {22130:30818}{19234:20682} > (DF) [tos 0x8] 
12:21:54.135956 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: F 30818:30818(0) ack 1474 win 8216 <nop,nop,timestamp 21913 459827> (DF) [tos 0x8] 
12:21:54.136058 eth0 < 198.17.100.42.1119 > 198.17.100.41.ssh: . 1474:1474(0) ack 17786 win 31856 <nop,nop,timestamp 459827 18540,nop,nop, sack 2 {22130:30819}{19234:20682} > (DF) [tos 0x8] 
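
Since both dumps cover the same connection, the segments that left
198.17.100.41 but never showed up at 198.17.100.42 fall out of a set
difference.  Here is a rough Python sketch of that comparison, assuming
the tcpdump text format above.  The helper data_segments is a made-up
name, and the sample lines are copies from the two dumps with the TCP
options cut off to keep them short.

```python
import re

# Data direction only: 198.17.100.41.ssh -> 198.17.100.42.1119.
# Groups: first seq, last seq (exclusive), payload length.
DATA_RE = re.compile(
    r"198\.17\.100\.41\.ssh > 198\.17\.100\.42\.1119: \S+ (\d+):(\d+)\((\d+)\)"
)


def data_segments(trace):
    """Collect the (start, end) ranges of all non-empty segments in a trace."""
    segs = set()
    for line in trace.splitlines():
        m = DATA_RE.search(line)
        if m and int(m.group(3)) > 0:
            segs.add((int(m.group(1)), int(m.group(2))))
    return segs


# Trimmed copies of lines from the sender's dump (options removed).
SENT = """\
12:21:20.400394 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 16338:17786(1448) ack 1473 win 8216
12:21:20.400414 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 17786:19234(1448) ack 1473 win 8216
12:21:20.511847 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 19234:20682(1448) ack 1473 win 8216
12:21:20.511865 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 20682:22130(1448) ack 1473 win 8216
12:21:20.512184 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: P 22130:23578(1448) ack 1473 win 8216
12:21:20.512512 eth0 > 198.17.100.41.ssh > 198.17.100.42.1119: . 23578:25026(1448) ack 1473 win 8216
"""

# Trimmed copies of the matching lines from the receiver's dump.
RECEIVED = """\
18:23:18.980000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 16338:17786(1448) ack 1473 win 8216
18:23:19.100000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 19234:20682(1448) ack 1473 win 8216
18:23:19.100000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: P 22130:23578(1448) ack 1473 win 8216
18:23:19.100000 eth0 < 198.17.100.41.ssh > 198.17.100.42.1119: . 23578:25026(1448) ack 1473 win 8216
"""

# Ranges seen leaving the sender but never seen arriving.
lost = sorted(data_segments(SENT) - data_segments(RECEIVED))

if __name__ == "__main__":
    for start, end in lost:
        print("sent but never received: %d:%d" % (start, end))
```

Because the comparison is by byte range, a segment that was lost once
and later retransmitted successfully does not count as lost.  Only
ranges that never arrive at all are reported.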


* Re: tcp connection hangs on connect
  2001-08-30  1:53     ` Val Henson
@ 2001-08-30 16:21       ` kuznet
  2001-08-30 18:10         ` Lost TCP retransmission timer (was Re: tcp connection hangs on connect) Val Henson
  2001-08-30 21:56         ` Lost TCP retransmission timer David S. Miller
  0 siblings, 2 replies; 12+ messages in thread
From: kuznet @ 2001-08-30 16:21 UTC (permalink / raw)
  To: Val Henson; +Cc: davem, linux-kernel

Hello!

> :) I was hoping Alexey would respond with "Oh yeah, here's that patch,

Your hopes were groundless.
Actually, you could change subject, this apparently has nothing
to do with your problem and this is misleading.

I have no idea what happens in your case, apparently, retransmission
timer is lost on sender, which is absolutely impossible. :-)
Well, send me a cat of /proc/net/tcp after the stall happens.

Alexey


* Lost TCP retransmission timer (was Re: tcp connection hangs on connect)
  2001-08-30 16:21       ` kuznet
@ 2001-08-30 18:10         ` Val Henson
  2001-08-30 21:56         ` Lost TCP retransmission timer David S. Miller
  1 sibling, 0 replies; 12+ messages in thread
From: Val Henson @ 2001-08-30 18:10 UTC (permalink / raw)
  To: kuznet; +Cc: davem, linux-kernel

On Thu, Aug 30, 2001 at 08:21:16PM +0400, kuznet@ms2.inr.ac.ru wrote:
> 
> Your hopes were groundless.
> Actually, you could change subject, this apparently has nothing
> to do with your problem and this is misleading.

You're right.  I thought the subject was "tcp connection hangs." :)

> I have no idea what happens in your case, apparently, retransmission
> timer is lost on sender, which is absolutely impossible. :-)
> Well, send me a cat of /proc/net/tcp after the stall happens.

At least one of the tcpdumps I took showed at least one successful
retransmission before the failure.  Here's /proc/net/tcp at 1-second
intervals, starting just before the connection starts and ending after
I kill it.  It looks to me like the timer is going off but the segment
isn't getting transmitted.
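
For reading the snapshots that follow, here is a small Python sketch
that decodes the hex fields of one row.  The helper names decode_addr
and decode_row are invented for illustration.  The timer codes follow
the kernel's proc_net_tcp notes as I understand them.  The address byte
order is an assumption: the hex is taken to read in dotted-quad order
(a big-endian host, which fits C6116429 being 198.17.100.41 here), and
on a little-endian host the four bytes would appear reversed.

```python
# Meaning of the "tr" (timer_active) field.
TIMERS = {
    0: "no timer pending",
    1: "retransmit timer",
    2: "another timer (e.g. delayed ack or keepalive)",
    3: "TIME_WAIT",
    4: "zero window probe timer",
}

# Only the two states that occur in these snapshots; the full table is longer.
STATES = {0x01: "ESTABLISHED", 0x0A: "LISTEN"}


def decode_addr(field, big_endian_host=True):
    """Turn 'C6116429:0016' into ('198.17.100.41', 22)."""
    ip_hex, port_hex = field.split(":")
    octets = bytes.fromhex(ip_hex)
    if not big_endian_host:
        octets = octets[::-1]  # assumption: little-endian hosts print reversed
    return ".".join(str(b) for b in octets), int(port_hex, 16)


def decode_row(row, big_endian_host=True):
    """Decode the leading columns of one /proc/net/tcp socket row."""
    f = row.split()
    tx_hex, rx_hex = f[4].split(":")
    timer_hex, when_hex = f[5].split(":")
    state = int(f[3], 16)
    timer = int(timer_hex, 16)
    return {
        "local": decode_addr(f[1], big_endian_host),
        "remote": decode_addr(f[2], big_endian_host),
        "state": STATES.get(state, hex(state)),
        "tx_queue": int(tx_hex, 16),
        "rx_queue": int(rx_hex, 16),
        "timer": TIMERS.get(timer, "unknown"),
        "when_jiffies": int(when_hex, 16),  # jiffies until the timer fires
        "retransmits": int(f[6], 16),
    }


# Last row for the stalled connection, copied from the snapshots below.
ROW = ("   1: C6116429:0016 C611642A:046B 01 000032E8:00000000 "
       "01:000000B5 00000003     0        0 1502 15 c05c0040 184 4 1 1 2")

if __name__ == "__main__":
    for key, value in decode_row(ROW).items():
        print("%-13s %s" % (key, value))
```

On that row it gives 198.17.100.41 port 22 talking to 198.17.100.42
port 1131, state ESTABLISHED, tx_queue 13032 bytes (nine 1448-byte
segments), retransmit timer pending with 181 jiffies to go, and a
retrnsmt count of 3.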

  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode                                                     
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 773 1 c05c0360 300 0 0 2 -1                               
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode                                                     
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 773 1 c05c0360 300 0 0 2 -1                               
   1: C6116429:0016 C611642A:046B 01 00000000:00000000 02:000AFC42 00000000     0        0 1502 3 c05c0040 21 4 5 2 -1                               
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode                                                     
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 773 1 c05c0360 300 0 0 2 -1                               
   1: C6116429:0016 C611642A:046B 01 00000000:00000000 02:000AFBDD 00000000     0        0 1502 2 c05c0040 21 4 5 2 -1                               
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode                                                     
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 773 1 c05c0360 300 0 0 2 -1                               
   1: C6116429:0016 C611642A:046B 01 00003890:00000000 01:00000011 00000000     0        0 1502 9 c05c0040 21 4 1 2 2                                
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode                                                     
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 773 1 c05c0360 300 0 0 2 -1                               
   1: C6116429:0016 C611642A:046B 01 000032E8:00000000 01:00000007 00000001     0        0 1502 15 c05c0040 46 4 1 1 2                               
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode                                                     
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 773 1 c05c0360 300 0 0 2 -1                               
   1: C6116429:0016 C611642A:046B 01 000032E8:00000000 01:000000B5 00000003     0        0 1502 15 c05c0040 184 4 1 1 2                              
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode                                                     
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 773 1 c05c0360 300 0 0 2 -1                               
   1: C6116429:0016 C611642A:046B 01 000032E8:00000000 01:0000004F 00000003     0        0 1502 15 c05c0040 184 4 1 1 2                              
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode                                                     
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 773 1 c05c0360 300 0 0 2 -1                               
   1: C6116429:0016 C611642A:046B 01 000032E8:00000000 01:00000159 00000004     0        0 1502 15 c05c0040 368 4 1 1 2                              
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode                                                     
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 773 1 c05c0360 300 0 0 2 -1                               
   1: C6116429:0016 C611642A:046B 01 000032E8:00000000 01:000000F3 00000004     0        0 1502 15 c05c0040 368 4 1 1 2                              
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode                                                     
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 773 1 c05c0360 300 0 0 2 -1                               
   1: C6116429:0016 C611642A:046B 01 000032E8:00000000 01:0000008D 00000004     0        0 1502 15 c05c0040 368 4 1 1 2                              
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode                                                     
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 773 1 c05c0360 300 0 0 2 -1                               
   1: C6116429:0016 C611642A:046B 01 000032E8:00000000 01:00000027 00000004     0        0 1502 15 c05c0040 368 4 1 1 2                              
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode                                                     
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 773 1 c05c0360 300 0 0 2 -1                               
   1: C6116429:0016 C611642A:046B 01 000032E8:00000000 01:000002A1 00000005     0        0 1502 15 c05c0040 736 4 1 1 2                              
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode                                                     
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 773 1 c05c0360 300 0 0 2 -1                               
   1: C6116429:0016 C611642A:046B 01 000032E8:00000000 01:0000023B 00000005     0        0 1502 15 c05c0040 736 4 1 1 2                              
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode                                                     
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 773 1 c05c0360 300 0 0 2 -1                               
   1: C6116429:0016 C611642A:046B 01 000032E8:00000000 01:000001D5 00000005     0        0 1502 15 c05c0040 736 4 1 1 2                              
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode                                                     
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 773 1 c05c0360 300 0 0 2 -1                               
   1: C6116429:0016 C611642A:046B 01 000032E8:00000000 01:0000016F 00000005     0        0 1502 15 c05c0040 736 4 1 1 2                              
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode                                                     
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 773 1 c05c0360 300 0 0 2 -1                               
   1: C6116429:0016 C611642A:046B 01 000032E8:00000000 01:00000109 00000005     0        0 1502 15 c05c0040 736 4 1 1 2                              
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode                                                     
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 773 1 c05c0360 300 0 0 2 -1                               
   1: C6116429:0016 C611642A:046B 01 000032E8:00000000 01:000000A3 00000005     0        0 1502 15 c05c0040 736 4 1 1 2                              
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode                                                     
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 773 1 c05c0360 300 0 0 2 -1                               
   1: C6116429:0016 C611642A:046B 01 000032E8:00000000 01:0000003D 00000005     0        0 1502 15 c05c0040 736 4 1 1 2                              
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode                                                     
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 7
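
The timer columns in the samples above can be decoded mechanically. The
following is a minimal sketch, not anything from the original report: it
assumes the usual whitespace-separated /proc/net/tcp layout, and it assumes
that a tr value of 01 denotes the retransmission timer and that tm->when is
a countdown in jiffies. The name decode is hypothetical. The rows are copied
from five consecutive samples of slot 1.

```python
# Five consecutive one-second samples of slot 1, copied from the dump above.
ROWS = [
    "1: C6116429:0016 C611642A:046B 01 000032E8:00000000 01:00000159 00000004     0        0 1502 15 c05c0040 368 4 1 1 2",
    "1: C6116429:0016 C611642A:046B 01 000032E8:00000000 01:000000F3 00000004     0        0 1502 15 c05c0040 368 4 1 1 2",
    "1: C6116429:0016 C611642A:046B 01 000032E8:00000000 01:0000008D 00000004     0        0 1502 15 c05c0040 368 4 1 1 2",
    "1: C6116429:0016 C611642A:046B 01 000032E8:00000000 01:00000027 00000004     0        0 1502 15 c05c0040 368 4 1 1 2",
    "1: C6116429:0016 C611642A:046B 01 000032E8:00000000 01:000002A1 00000005     0        0 1502 15 c05c0040 736 4 1 1 2",
]


def decode(row):
    """Pull the queue and timer fields out of one /proc/net/tcp row."""
    fields = row.split()
    tx_queue, rx_queue = (int(part, 16) for part in fields[4].split(":"))
    timer, when = (int(part, 16) for part in fields[5].split(":"))
    return {
        "local_port": int(fields[1].split(":")[1], 16),
        "tx_queue": tx_queue,         # bytes still queued for sending
        "rx_queue": rx_queue,
        "timer": timer,               # assumed: 1 = retransmission timer pending
        "when": when,                 # assumed: jiffies until the timer fires
        "retrnsmt": int(fields[6], 16),
    }


samples = [decode(row) for row in ROWS]
for sample in samples:
    print(sample)

# Difference in the countdown between successive samples.
countdown_steps = [a["when"] - b["when"] for a, b in zip(samples, samples[1:])]
print(countdown_steps)
```

Under those assumptions the countdown drops by about one second's worth of
ticks per sample, then restarts at a larger value while retrnsmt goes from 4
to 5 and tx_queue stays unchanged. That matches the description of a timer
that keeps firing without the queued data draining.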


* Re: Lost TCP retransmission timer
  2001-08-30 16:21       ` kuznet
  2001-08-30 18:10         ` Lost TCP retransmission timer (was Re: tcp connection hangs on connect) Val Henson
@ 2001-08-30 21:56         ` David S. Miller
  2001-08-30 22:11           ` Val Henson
  1 sibling, 1 reply; 12+ messages in thread
From: David S. Miller @ 2001-08-30 21:56 UTC (permalink / raw)
  To: val; +Cc: kuznet, linux-kernel

   From: Val Henson <val@nmt.edu>
   Date: Thu, 30 Aug 2001 12:10:25 -0600

   On Thu, Aug 30, 2001 at 08:21:16PM +0400, kuznet@ms2.inr.ac.ru wrote:
   > Your hopes were groundless.
   > Actually, you could change subject, this apparently has nothing
   > to do with your problem and this is misleading.
   
   You're right.  I thought the subject was "tcp connection hangs." :)

BTW, you mentioned that you are seeing this on PPC, do you have any
way to verify if the bug can be triggered on any other platform?

Later,
David S. Miller
davem@redhat.com


* Re: Lost TCP retransmission timer
  2001-08-30 21:56         ` Lost TCP retransmission timer David S. Miller
@ 2001-08-30 22:11           ` Val Henson
  2001-09-01 14:36             ` kuznet
  0 siblings, 1 reply; 12+ messages in thread
From: Val Henson @ 2001-08-30 22:11 UTC (permalink / raw)
  To: David S. Miller; +Cc: kuznet, linux-kernel

On Thu, Aug 30, 2001 at 02:56:09PM -0700, David S. Miller wrote:
> 
> BTW, you mentioned that you are seeing this on PPC, do you have any
> way to verify if the bug can be triggered on any other platform?

I'm currently away from my x86 workstation, but I can try it this
weekend.

The requirements for triggering this are:

2.4.6 or higher kernel
2.4.6 machine pushes lots of data on _first_ TCP connection after boot
Lots of packets from 2.4.6 machine are dropped (I'm using 10 Mb hub)

-VAL


* Re: Lost TCP retransmission timer
  2001-08-30 22:11           ` Val Henson
@ 2001-09-01 14:36             ` kuznet
  2001-09-04 21:07               ` Val Henson
  0 siblings, 1 reply; 12+ messages in thread
From: kuznet @ 2001-09-01 14:36 UTC (permalink / raw)
  To: Val Henson; +Cc: davem, linux-kernel

Hello!

> 2.4.6 machine pushes lots of data on _first_ TCP connection after boot

Lots? I see only about 24K of data transmitted in both your samples.

Actually, the problem is more or less clear from your /proc/net/tcp.
You use some funny device or netfilter plugin which leaks memory.
You can look at the 7th column of /proc/net/tcp to estimate the number of
leaked buffers. When it reaches ~15, the connection stalls. It seems to rise
monotonically, so it looks like all the buffers leak.
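
A minimal sketch of that check follows. Which field "7th column" refers to is
not certain from the message. The sketch assumes it is the value printed
right after the inode (index 10 after splitting on whitespace), because that
is the one that climbs to 15 in the posted samples. The name
field_after_inode is hypothetical. The rows are the first five samples of
slot 1 from the earlier dump.

```python
# First five samples of slot 1 from the posted /proc/net/tcp dump.
ROWS = [
    "1: C6116429:0016 C611642A:046B 01 00000000:00000000 02:000AFC42 00000000     0        0 1502 3 c05c0040 21 4 5 2 -1",
    "1: C6116429:0016 C611642A:046B 01 00000000:00000000 02:000AFBDD 00000000     0        0 1502 2 c05c0040 21 4 5 2 -1",
    "1: C6116429:0016 C611642A:046B 01 00003890:00000000 01:00000011 00000000     0        0 1502 9 c05c0040 21 4 1 2 2",
    "1: C6116429:0016 C611642A:046B 01 000032E8:00000000 01:00000007 00000001     0        0 1502 15 c05c0040 46 4 1 1 2",
    "1: C6116429:0016 C611642A:046B 01 000032E8:00000000 01:000000B5 00000003     0        0 1502 15 c05c0040 184 4 1 1 2",
]

INODE_INDEX = 9  # position of the inode after splitting on whitespace


def field_after_inode(row):
    """Return the decimal value printed immediately after the inode."""
    return int(row.split()[INODE_INDEX + 1])


def tx_queue(row):
    """Return the number of bytes still waiting in the send queue."""
    return int(row.split()[4].split(":")[0], 16)


counts = [field_after_inode(row) for row in ROWS]
queued = [tx_queue(row) for row in ROWS]

for count, pending in zip(counts, queued):
    print(f"count={count:2d} tx_queue={pending}")
```

In these samples the count settles at 15 at the same point where tx_queue
stops shrinking, which is the pattern described above.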

What is output device?

Alexey


* Re: Lost TCP retransmission timer
  2001-09-01 14:36             ` kuznet
@ 2001-09-04 21:07               ` Val Henson
  2001-09-05 15:23                 ` kuznet
  0 siblings, 1 reply; 12+ messages in thread
From: Val Henson @ 2001-09-04 21:07 UTC (permalink / raw)
  To: kuznet; +Cc: davem, linux-kernel

On Sat, Sep 01, 2001 at 06:36:35PM +0400, kuznet@ms2.inr.ac.ru wrote:
> 
> Lots? I see only about 24K of data transmitted in both your samples.

:) Okay, "lots" is relative.  If you only telnet in and exit right
away, you won't see it.

> Actually, the problem is more or less clear from your /proc/net/tcp.
> You use some funny device or netfilter plugin which leaks memory.
> You can look at the 7th column of /proc/net/tcp to estimate the number of
> leaked buffers. When it reaches ~15, the connection stalls. It seems to rise
> monotonically, so it looks like all the buffers leak.
> 
> What is output device?

ncr885e, which I believe I am the de facto maintainer of... Dan Cox
wrote it while he was working for Synergy Microsystems.  When he quit,
I was hired to replace him.  Anyone else using this driver?  I have a
patch for it that fixes some other things (currently only in the
LinuxPPC tree) but not this.

Thanks for the help, I think this is my problem to fix now.

-VAL


* Re: Lost TCP retransmission timer
  2001-09-04 21:07               ` Val Henson
@ 2001-09-05 15:23                 ` kuznet
  0 siblings, 0 replies; 12+ messages in thread
From: kuznet @ 2001-09-05 15:23 UTC (permalink / raw)
  To: Val Henson; +Cc: davem, linux-kernel

Hello!

> ncr885e,

At least the version of this driver in the standard kernel has no chance of working.
To be honest, I have never seen such a broken driver before. :-)

> patch for it that fixes some other things (currently only in the
> LinuxPPC tree) but not this.

Well, send me this patch at least. Probably I will be able to fix at least
major holes here.

Alexey


* Re: tcp connection hangs on connect
  2001-08-03  7:00 tcp connection hangs on connect Philipp Reisner
@ 2001-08-05  2:04 ` Alexey Kuznetsov
  0 siblings, 0 replies; 12+ messages in thread
From: Alexey Kuznetsov @ 2001-08-05  2:04 UTC (permalink / raw)
  To: philipp.reisner; +Cc: linux-kernel

Hello!

> simply hangs after some minutes to an hour. The script runs on
> a Linux-2.2.19 Box (we have also tested Linux-2.4.2)

This bug has been fixed in later 2.4s.

Corresponding fix to 2.2 is expected to be in 2.2.20 and it is available
in Alan's 2.2.20-pre.

Alexey


* tcp connection hangs on connect
@ 2001-08-03  7:00 Philipp Reisner
  2001-08-05  2:04 ` Alexey Kuznetsov
  0 siblings, 1 reply; 12+ messages in thread
From: Philipp Reisner @ 2001-08-03  7:00 UTC (permalink / raw)
  To: linux-kernel

Hi,

I have discovered here something that causes TCP connections to
hang if one of the initial packets is lost.

We have a script which runs scp every 10 seconds, and the script
simply hangs after some minutes to an hour. The script runs on
a Linux-2.2.19 Box (we have also tested Linux-2.4.2) and the ssh server
is running on some Windows box.

Here is the good case:

15:26:11.100413 192.168.53.4.4819 > 212.31.78.62.22: S 2626815412:2626815412(0) win 16060 <mss 1460,sackOK,timestamp 809461847[|tcp]> (DF)
15:26:11.119964 212.31.78.62.22 > 192.168.53.4.4819: S 3560917622:3560917622(0) ack 2626815413 win 17520 <mss 1460,nop,wscale 0,nop,nop,timestamp[|tcp]> (DF)
15:26:11.120011 192.168.53.4.4819 > 212.31.78.62.22: . ack 1 win 16060 <nop,nop,timestamp 809461848 0> (DF)
15:26:11.228046 212.31.78.62.22 > 192.168.53.4.4819: P 1:24(23) ack 1 win 17520 <nop,nop,timestamp 6062108 809461848> (DF)

Here is the hang:

12:01:24.753703 192.168.53.4.4442 > 212.31.78.62.22: S 2538486974:2538486974(0) win 16060 <mss 1460,sackOK,timestamp 808233194[|tcp]> (DF)
12:01:24.798610 212.31.78.62.22 > 192.168.53.4.4442: S 3871618076:3871618076(0) ack 2538486975 win 17520 <mss 1460,nop,wscale 0,nop,nop,timestamp[|tcp]> (DF)
12:01:24.798729 192.168.53.4.4442 > 212.31.78.62.22: . ack 1 win 16060 <nop,nop,timestamp 808233198 0> (DF)
12:01:28.048197 212.31.78.62.22 > 192.168.53.4.4442: S 3871618076:3871618076(0) ack 2538486975 win 17520 <mss 1460,nop,wscale 0,nop,nop,timestamp[|tcp]> (DF)
12:01:34.611132 212.31.78.62.22 > 192.168.53.4.4442: S 3871618076:3871618076(0) ack 2538486975 win 17520 <mss 1460,nop,wscale 0,nop,nop,timestamp[|tcp]> (DF)

192.168.53.4 is the Linux box.
212.31.78.62 is the Windows box.

It looks like the packet at 12:01:24.798729 never reaches the Windows
box. That is probably why the Windows box resends its SYN packet
(at 12:01:28.048197 and 12:01:34.611132).
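
The spacing of those resends can be worked out from the trace timestamps. The
following is a minimal sketch using only the three times at which the Windows
box sent the same SYN segment. The name gaps is hypothetical, and reading the
result as retransmission backoff is an interpretation, not something the
trace proves.

```python
from datetime import datetime

# Times at which 212.31.78.62 sent the identical SYN segment in the hang trace.
SYNACK_TIMES = ["12:01:24.798610", "12:01:28.048197", "12:01:34.611132"]


def gaps(stamps):
    """Return the intervals, in seconds, between consecutive timestamps."""
    parsed = [datetime.strptime(stamp, "%H:%M:%S.%f") for stamp in stamps]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(parsed, parsed[1:])]


intervals = gaps(SYNACK_TIMES)
for interval in intervals:
    print(f"{interval:.6f} s")

# Ratio between the second and the first interval.
print(f"ratio: {intervals[1] / intervals[0]:.2f}")
```

The second interval is about twice the first, which would be consistent with
an ordinary doubling retransmission timeout on the sending side.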

BTW, the Linux box is convinced that the connection is established
(confirmed with lsof), while the Windows box probably does not think
that there is a connection.

The question is, why is Linux not responding to the resent SYN packets?

PS: the process (scp/ssh) on the Linux side of the connection wants
    to read from the socket (confirmed with strace).

-Philipp




end of thread, other threads:[~2001-09-05 15:23 UTC | newest]

Thread overview: 12+ messages
     [not found] <20010828171705.F890@boardwalk>
     [not found] ` <20010829082405.A5966@cubit.at>
2001-08-30  0:20   ` tcp connection hangs on connect Val Henson
2001-08-30  0:55   ` David S. Miller
2001-08-30  1:53     ` Val Henson
2001-08-30 16:21       ` kuznet
2001-08-30 18:10         ` Lost TCP retransmission timer (was Re: tcp connection hangs on connect) Val Henson
2001-08-30 21:56         ` Lost TCP retransmission timer David S. Miller
2001-08-30 22:11           ` Val Henson
2001-09-01 14:36             ` kuznet
2001-09-04 21:07               ` Val Henson
2001-09-05 15:23                 ` kuznet
2001-08-03  7:00 tcp connection hangs on connect Philipp Reisner
2001-08-05  2:04 ` Alexey Kuznetsov
