linux-kernel.vger.kernel.org archive mirror
* Bug report: tcp staled when send-q != 0, timers == 0.
@ 2001-04-09 14:43 Eugene B. Berdnikov
  2001-04-10 17:38 ` kuznet
  0 siblings, 1 reply; 16+ messages in thread
From: Eugene B. Berdnikov @ 2001-04-09 14:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: kuznet

    Hi all.

 In brief: a stalled state of the TCP send queue was observed on 2.2.17,
 while the send-q counter and the connection window sizes were non-zero:

 % netstat -n -eot | grep 1018
 tcp        0  13064 194.190.166.31:22       194.190.161.106:1018
	    ESTABLISHED 0          11964      off (0.00/0/0)

 Host 194.190.166.31: 2.2.17 on PPro-180 (HP NetServer E40),
 compiled with CONFIG_M686=y and running sshd1 server.

 SYMPTOMS

 1. When data is sent to 194.190.166.31, it is ack'ed with a non-zero
    window size, but a zero-length data block is returned (whereas send-q != 0).
    Strace shows that the server gets the data via read(2) and replies
    with write(2) to the same descriptor. The send-q counter is
    incremented by the number of bytes written, but the timers remain zero.
    No data is returned to the network.

 2. When data is sent to the controlling pty, sshd also steps
    through read(2) and write(2), with the same result
    (send-q is incremented, timers do not change, no traffic).

 3. Keepalive is enabled in ssh client and server (on both sides).
    The curious thing is that once per keepalive interval (2h by default)
    host 194.190.166.31 sends exactly one packet of full ethernet
    MTU size:

    17:50:30.391347 > 194.190.166.31.ssh > 194.190.161.106.1018:
	 P 1:1449(1448) ack 1 win 32640
	 <nop,nop,timestamp 72137447 1733874370> (DF) [tos 0x10]
    17:50:31.102567 < 194.190.161.106.1018 > 194.190.166.31.ssh:
	 . 1:1(0) ack 1449 win 32120
	 <nop,nop,timestamp 1734601828 72137447> (DF) [tos 0x10]

    The send-q value is properly decremented on every such transmission.

  4. This connection was traced for a long time, and when the send-q counter
     reached zero (due to the "keepalive" exchange), it got out of its
     stale state. Now it behaves as a normal connection.
     I still keep it open for investigation (if any :).

 INFO

 Here is some supplementary information on the network configuration of
 this machine. I believe it is not directly related to the bug discussed,
 but include it here for completeness.

 The host has an Intel EtherExpress-100 ethernet card (standard driver from
 2.2.17) and a SkyMedia-200 DVB satellite receiver running the "sm200_lnx"
 driver from Telemann. The stalled connection passes over ethernet only.

 # lsmod
 Module                  Size  Used by
 sm200_lnx              18800   1 
 eepro100               16180   1  (autoclean)

 # ip -s l l
 1: lo: <LOOPBACK,UP> mtu 3924 qdisc noqueue 
     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
     RX: bytes  packets  errors  dropped overrun mcast   
     349139839  1395237  0       0       0       0      
     TX: bytes  packets  errors  dropped carrier collsns 
     349139839  1395237  0       0       0       0      
 2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
     link/ether 00:a0:c9:9e:c9:7d brd ff:ff:ff:ff:ff:ff
     RX: bytes  packets  errors  dropped overrun mcast   
     2338597780 7637867  0       0       0       0      
     TX: bytes  packets  errors  dropped carrier collsns 
     3016099313 7378871  0       0       0       293385 
 58: sm200: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
     link/ether 00:90:bc:01:1a:da brd ff:ff:ff:ff:ff:ff
     RX: bytes  packets  errors  dropped overrun mcast   
     603326460  896671   0       0       0       0      
     TX: bytes  packets  errors  dropped carrier collsns 
     0          0        0       0       0       0      
 
 # ip -s r l
 192.168.30.31 dev eth0  scope link 
 194.190.166.31 dev eth0  scope link  src 194.190.166.31 
 192.168.30.0/24 dev eth0  proto kernel  scope link  src 192.168.30.31 
 127.0.0.0/8 dev lo  scope link 
 default via 192.168.30.34 dev eth0  src 194.190.166.31 

 Here is the line from /proc/net/tcp for this connection when it was stale:

 128: 1FA6BEC2:0016 6AA1BEC2:03FA 01 00003394:00000000 00:00000000 00000000
     0        0 11964                               
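 For readers unfamiliar with this file, the interesting fields are the
 hex-coded queue counters. A minimal sketch (the sscanf format is my own,
 matching the 2.2-era /proc/net/tcp layout quoted above) decodes them:

 ```c
 #include <stdio.h>

 /* Decode the queue counters from a /proc/net/tcp entry (2.2 format).
    The line quoted above is hard-coded here for illustration. */
 int main(void)
 {
     const char *entry =
         "128: 1FA6BEC2:0016 6AA1BEC2:03FA 01 00003394:00000000 "
         "00:00000000 00000000 0 0 11964";
     unsigned laddr, lport, raddr, rport, state, txq, rxq;

     /* addr:port pairs, connection state, then tx_queue:rx_queue */
     if (sscanf(entry, "%*d: %X:%X %X:%X %X %X:%X",
                &laddr, &lport, &raddr, &rport, &state, &txq, &rxq) != 7)
         return 1;
     printf("state=%02X tx_queue=%u rx_queue=%u\n", state, txq, rxq);
     /* state 01 = TCP_ESTABLISHED; 0x3394 = 13204 bytes queued to send */
     return 0;
 }
 ```

 So state 01 is ESTABLISHED and tx_queue 0x3394 is the non-empty send queue,
 while the zero timer fields correspond to netstat's off (0.00/0/0).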

 That's all I consider interesting.

 I was told by ANK that it might be helpful to find the socket in memory
 and dump its contents while it was stalled. However, the moment was lost...

 If anybody tells me what additional information can be extracted for
 diagnostics, I'll try to get it. In any case, I plan to run something through
 this connection in the hope of reproducing this state.
-- 
 Eugene Berdnikov

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Bug report: tcp staled when send-q != 0, timers == 0.
  2001-04-09 14:43 Bug report: tcp staled when send-q != 0, timers == 0 Eugene B. Berdnikov
@ 2001-04-10 17:38 ` kuznet
  2001-04-10 21:19   ` Eugene B. Berdnikov
  0 siblings, 1 reply; 16+ messages in thread
From: kuznet @ 2001-04-10 17:38 UTC (permalink / raw)
  To: Eugene B. Berdnikov; +Cc: linux-kernel, Dave Miller

Hello!

>  In brief: a stale state of the tcp send queue was observed for 2.2.17
>  while send-q counter and connection window sizes are not zero: 

I think I have pinned this down. The patch is appended.


>  diagnostic, I'll try to get it. In any case, I plan to run something through
>  this connection in hope to reproduce this state again.

If my guess is right, you can easily put this socket into a funny state
just by catting a large file and kill -STOP'ing ssh. ssh will close its
window, but sshd will not send zero probes. Any socket with keepalives
enabled enters this state after the first keepalive is sent.
[ Note that this is not Butenko's problem; that one is still to be discovered. 8) ]

I think you will not be able to reproduce the full problem: the socket will
revive after the first received ACK. That is another bug, and its probability
is astronomically low.

Alexey


--- linux/net/ipv4/tcp_input.c.orig	Mon Apr  9 22:46:56 2001
+++ linux/net/ipv4/tcp_input.c	Tue Apr 10 21:23:33 2001
@@ -733,8 +733,6 @@
 	if (tp->retransmits) {
 		if (tp->packets_out == 0) {
 			tp->retransmits = 0;
-			tp->fackets_out = 0;
-			tp->retrans_out = 0;
 			tp->backoff = 0;
 			tcp_set_rto(tp);
 		} else {
@@ -781,8 +779,10 @@
 	if(sk->zapped)
 		return(1);	/* Dead, can't ack any more so why bother */
 
-	if (tp->pending == TIME_KEEPOPEN)
+	if (tp->pending == TIME_KEEPOPEN) {
 	  	tp->probes_out = 0;
+		tp->pending = 0;
+	}
 
 	tp->rcv_tstamp = tcp_time_stamp;
 
@@ -850,8 +850,6 @@
 		if (tp->retransmits) {
 			if (tp->packets_out == 0) {
 				tp->retransmits = 0;
-				tp->fackets_out = 0;
-				tp->retrans_out = 0;
 			}
 		} else {
 			/* We don't have a timestamp. Can only use
@@ -878,6 +876,8 @@
 			tcp_ack_packets_out(sk, tp);
 	} else {
 		tcp_clear_xmit_timer(sk, TIME_RETRANS);
+		tp->fackets_out = 0;
+		tp->retrans_out = 0;
 	}
 
 	flag &= (FLAG_DATA | FLAG_WIN_UPDATE);
--- linux/net/ipv4/tcp_output.c.orig	Mon Apr  9 22:47:06 2001
+++ linux/net/ipv4/tcp_output.c	Tue Apr 10 21:23:33 2001
@@ -546,6 +546,8 @@
 		 */
 		kfree_skb(next_skb);
 		sk->tp_pinfo.af_tcp.packets_out--;
+		if (sk->tp_pinfo.af_tcp.fackets_out)
+			sk->tp_pinfo.af_tcp.fackets_out--;
 	}
 }
 


* Re: Bug report: tcp staled when send-q != 0, timers == 0.
  2001-04-10 17:38 ` kuznet
@ 2001-04-10 21:19   ` Eugene B. Berdnikov
  2001-04-11 10:16     ` Eugene B. Berdnikov
  2001-04-11 16:35     ` kuznet
  0 siblings, 2 replies; 16+ messages in thread
From: Eugene B. Berdnikov @ 2001-04-10 21:19 UTC (permalink / raw)
  To: kuznet; +Cc: Eugene B. Berdnikov, linux-kernel, Dave Miller

  Hello.

On Tue, Apr 10, 2001 at 09:38:43PM +0400, kuznet@ms2.inr.ac.ru wrote:
> If my guess is right, you can easily put this socket to funny state
> just catting a large file and kill -STOP'ing ssh. ssh will close window,
> but sshd will not send zero probes.

 [1] I have checked your statement on 2 different machines running 2.2.17.
 No confirmation. But the situation is much funnier than it sounds. :)

 The thing is that one machine (which ran the ssh client in my bug report)
 does send ACKs when ssh is SIGSTOP'ed. The other one does not send ACKs,
 and what is much more curious, it does not send them even when the input
 buffer is full and the client IS NOT stopped! :))) Hence the connection
 dies due to a retransmission timeout on the server side.

 I did not believe my own eyes and tried this test several times, with
 ssh1 and openssh, copying the ssh configs, but the results were always the same.

 Both hosts are running 2.2.17 on K6 processors, compiled with egcs-1.1.2,
 with minor differences in the kernel configuration. If you really checked
 your statements before writing, you surely have a 2.2.17 which behaves in
 yet another way, which I can't reproduce. Isn't that funny? :)))

 I can send the configs (and even binary kernels with modules) for
 verification. If this is not entirely my mistake, we have a very sad
 situation, where the TCP core behaviour depends on secondary configuration
 options. I have no other idea of how it can be explained.

 [2] Your second statement is that sshd with keepalive enabled does not send
 zero probes when the input window is closed. Rest assured, in my case it does:

 01:04:05.025715 194.190.166.31.22 > 194.190.161.106.1006: . ack 1 win 32120 <nop,nop,timestamp 117938386 1780393243> (DF) [tos 0x10]
 01:04:05.025816 194.190.161.106.1006 > 194.190.166.31.22: . ack 17376 win 0 <nop,nop,timestamp 1780405324 117898941> (DF) [tos 0x10]
 01:06:05.953026 194.190.166.31.22 > 194.190.161.106.1006: . ack 1 win 32120 <nop,nop,timestamp 117950477 1780405324> (DF) [tos 0x10]
 01:06:05.953122 194.190.161.106.1006 > 194.190.166.31.22: . ack 17376 win 0 <nop,nop,timestamp 1780417417 117898941> (DF) [tos 0x10]

 BTW, I can definitely rule out the possibility that my ssh client was
 stopped when I encountered the reported bug.

> Any socket with keepalives enabled
> enters this state after the first keepalive is sent.

 I do not understand how a connection with a closed window can wait until
 the first keepalive - it must send zero probes instead.

> [ Note, that it is not Butenko's problem, it is still to be discovered. 8) ]
> 
> I think you will not able to reproduce full problem: socket will revive
> after the first received ACK. It is another bug and its probability is
> astronomically low.

 Hmm... I observed this bug on a host which never handles more than
 10 conn/sec and has a peak loadavg of ~0.15.
-- 
 Eugene Berdnikov


* Re: Bug report: tcp staled when send-q != 0, timers == 0.
  2001-04-10 21:19   ` Eugene B. Berdnikov
@ 2001-04-11 10:16     ` Eugene B. Berdnikov
  2001-04-11 16:56       ` kuznet
  2001-04-11 16:35     ` kuznet
  1 sibling, 1 reply; 16+ messages in thread
From: Eugene B. Berdnikov @ 2001-04-11 10:16 UTC (permalink / raw)
  To: kuznet; +Cc: linux-kernel, Dave Miller

  Hello.

 I'd like to add some comments to my previous message.

On Wed, Apr 11, 2001 at 01:19:01AM +0400, Eugene B. Berdnikov wrote:
>  The thing is that one machine (which run ssh client in my bug report)
>  do send ACKs when ssh is SIGSTOP'ed. The other one does not send ACKs,
>  but much more curious is that it does not send ACKs even when input
>  buffer is filled, and client IS NOT stopped! :))) Hence connection dies
>  due to retransmission timeout on the server side.
[...]
>  Both hosts are running 2.2.17 on K6 processors, compiled via egcs-1.1.2,
>  with minor differences in the kernel configuration.

 My observation of the "buggy" 2.2.17 was on a host connected via
 modem and running ppp-2.4.0b2 with MTU=256. Today I took this kernel and
 its modules and ran them on another machine with a Cel-450 and a 3c590B-TX
 ethernet card. It also exhibits the loss of ACKs. My study shows that
 it depends on the MTU and the keepalive flag:

   mtu 382 + keepalive yes -> loss
   mtu 382 + keepalive no  -> ok
   mtu 383 + any keepalive -> ok

 I tested several MTU values above and below 382, and it seems to me that
 382 is the boundary between normal and erroneous behaviour.

 Then I tested kernel 2.2.14-5.0 from the RedHat-6.2 distribution on
 the same machine (Cel-450 + 3c59x driver). It also shows the loss of ACKs,
 but with a different MTU boundary, independent of keepalive:

   mtu <= 420 + any keepalive  -> loss
   mtu >= 421 + any keepalive  -> ok

 Finally, I tried several MTUs on a third computer, running the "right"
 2.2.17, and could not find conditions under which any loss of ACKs could
 be detected.

 So, the conclusion is that this loss depends on the kernel version, the
 configuration options and the MTU of the interface. I suspect this is
 another bug, found accidentally in this discussion. :)

 I conclude with an illustrative dump. The commands were like:

 ifconfig ppp0 mtu 256
 ssh -o 'keepalive yes' 194.190.166.31 \
	'while true ; do cat /etc/passwd ; done' 2>&1 | less
 tcpdump -nl host 194.190.166.31

 [...]

 10:20:11.196983 > 172.16.42.57.1023 > 194.190.166.31.ssh: . 655:655(0) ack 30120 win 15708 <nop,nop,timestamp 8830012 121274899> (DF) [tos 0x10] 
 10:20:11.266845 < 194.190.166.31.ssh > 172.16.42.57.1023: P 30120:30324(204) ack 655 win 32616 <nop,nop,timestamp 121274900 8829912> (DF) [tos 0x10] 
 10:20:11.356837 < 194.190.166.31.ssh > 172.16.42.57.1023: P 30324:30528(204) ack 655 win 32616 <nop,nop,timestamp 121274900 8829912> (DF) [tos 0x10] 
 10:20:11.426832 < 194.190.166.31.ssh > 172.16.42.57.1023: P 30528:30732(204) ack 655 win 32616 <nop,nop,timestamp 121274902 8829919> (DF) [tos 0x10] 
 10:20:11.476844 < 194.190.166.31.ssh > 172.16.42.57.1023: P 30732:30936(204) ack 655 win 32616 <nop,nop,timestamp 121274902 8829919> (DF) [tos 0x10] 
 10:20:11.546843 < 194.190.166.31.ssh > 172.16.42.57.1023: P 30936:31140(204) ack 655 win 32616 <nop,nop,timestamp 121274962 8829928> (DF) [tos 0x10] 
 10:20:11.636840 < 194.190.166.31.ssh > 172.16.42.57.1023: P 31140:31344(204) ack 655 win 32616 <nop,nop,timestamp 121274962 8829928> (DF) [tos 0x10] 
 10:20:11.706843 < 194.190.166.31.ssh > 172.16.42.57.1023: P 31344:31548(204) ack 655 win 32616 <nop,nop,timestamp 121274963 8829935> (DF) [tos 0x10] 
 10:20:11.766854 < 194.190.166.31.ssh > 172.16.42.57.1023: P 31548:31752(204) ack 655 win 32616 <nop,nop,timestamp 121274963 8829935> (DF) [tos 0x10] 
 10:20:11.866832 < 194.190.166.31.ssh > 172.16.42.57.1023: P 31752:31956(204) ack 655 win 32616 <nop,nop,timestamp 121274964 8829939> (DF) [tos 0x10] 
 10:20:11.926839 < 194.190.166.31.ssh > 172.16.42.57.1023: P 31956:32160(204) ack 655 win 32616 <nop,nop,timestamp 121274964 8829939> (DF) [tos 0x10] 
 10:20:11.996837 < 194.190.166.31.ssh > 172.16.42.57.1023: P 32160:32364(204) ack 655 win 32616 <nop,nop,timestamp 121274984 8829949> (DF) [tos 0x10] 
 10:20:12.066835 < 194.190.166.31.ssh > 172.16.42.57.1023: P 32364:32568(204) ack 655 win 32616 <nop,nop,timestamp 121274984 8829949> (DF) [tos 0x10] 
 10:20:12.126850 < 194.190.166.31.ssh > 172.16.42.57.1023: P 32568:32772(204) ack 655 win 32616 <nop,nop,timestamp 121274985 8829956> (DF) [tos 0x10] 
 10:20:12.216832 < 194.190.166.31.ssh > 172.16.42.57.1023: P 32772:32976(204) ack 655 win 32616 <nop,nop,timestamp 121274985 8829956> (DF) [tos 0x10] 
 10:20:12.286854 < 194.190.166.31.ssh > 172.16.42.57.1023: P 32976:33180(204) ack 655 win 32616 <nop,nop,timestamp 121274986 8829962> (DF) [tos 0x10] 
 10:20:12.356846 < 194.190.166.31.ssh > 172.16.42.57.1023: P 33180:33384(204) ack 655 win 32616 <nop,nop,timestamp 121274986 8829962> (DF) [tos 0x10] 
 10:20:12.426838 < 194.190.166.31.ssh > 172.16.42.57.1023: P 33384:33588(204) ack 655 win 32616 <nop,nop,timestamp 121275007 8829969> (DF) [tos 0x10] 
 10:20:12.516835 < 194.190.166.31.ssh > 172.16.42.57.1023: P 33588:33792(204) ack 655 win 32616 <nop,nop,timestamp 121275007 8829969> (DF) [tos 0x10] 
 10:20:12.576830 < 194.190.166.31.ssh > 172.16.42.57.1023: P 33792:33996(204) ack 655 win 32616 <nop,nop,timestamp 121275008 8829976> (DF) [tos 0x10] 
 10:20:12.646843 < 194.190.166.31.ssh > 172.16.42.57.1023: P 33996:34200(204) ack 655 win 32616 <nop,nop,timestamp 121275008 8829976> (DF) [tos 0x10] 
 10:20:12.706842 < 194.190.166.31.ssh > 172.16.42.57.1023: P 34200:34404(204) ack 655 win 32616 <nop,nop,timestamp 121275009 8829982> (DF) [tos 0x10] 
 10:20:12.776850 < 194.190.166.31.ssh > 172.16.42.57.1023: P 34404:34608(204) ack 655 win 32616 <nop,nop,timestamp 121275009 8829982> (DF) [tos 0x10] 
 10:20:12.846842 < 194.190.166.31.ssh > 172.16.42.57.1023: P 34608:34812(204) ack 655 win 32616 <nop,nop,timestamp 121275029 8829990> (DF) [tos 0x10] 
 10:20:12.936834 < 194.190.166.31.ssh > 172.16.42.57.1023: P 34812:35016(204) ack 655 win 32616 <nop,nop,timestamp 121275030 8829999> (DF) [tos 0x10] 
 10:20:13.006840 < 194.190.166.31.ssh > 172.16.42.57.1023: P 35016:35220(204) ack 655 win 32616 <nop,nop,timestamp 121275031 8830006> (DF) [tos 0x10] 
 10:20:13.046850 < 194.190.166.31.ssh > 172.16.42.57.1023: P 35220:35424(204) ack 655 win 32616 <nop,nop,timestamp 121275032 8830012> (DF) [tos 0x10] 
 10:20:21.376855 < 194.190.166.31.ssh > 172.16.42.57.1023: P 30120:30324(204) ack 655 win 32616 <nop,nop,timestamp 121275972 8830012> (DF) [tos 0x10] 

 And from here 194.190.166.31 retransmits until timeout:

 10:20:40.146846 < 194.190.166.31.ssh > 172.16.42.57.1023: P 30120:30324(204) ack 655 win 32616 <nop,nop,timestamp 121277852 8830012> (DF) [tos 0x10] 
 10:21:17.746854 < 194.190.166.31.ssh > 172.16.42.57.1023: P 30120:30324(204) ack 655 win 32616 <nop,nop,timestamp 121281612 8830012> (DF) [tos 0x10] 
 10:22:32.956845 < 194.190.166.31.ssh > 172.16.42.57.1023: P 30120:30324(204) ack 655 win 32616 <nop,nop,timestamp 121289132 8830012> (DF) [tos 0x10] 
 10:24:32.966837 < 194.190.166.31.ssh > 172.16.42.57.1023: P 30120:30324(204) ack 655 win 32616 <nop,nop,timestamp 121301132 8830012> (DF) [tos 0x10] 
 10:26:32.986843 < 194.190.166.31.ssh > 172.16.42.57.1023: P 30120:30324(204) ack 655 win 32616 <nop,nop,timestamp 121313132 8830012> (DF) [tos 0x10] 
 10:28:32.966854 < 194.190.166.31.ssh > 172.16.42.57.1023: P 30120:30324(204) ack 655 win 32616 <nop,nop,timestamp 121325132 8830012> (DF) [tos 0x10] 
-- 
 Eugene Berdnikov


* Re: Bug report: tcp staled when send-q != 0, timers == 0.
  2001-04-10 21:19   ` Eugene B. Berdnikov
  2001-04-11 10:16     ` Eugene B. Berdnikov
@ 2001-04-11 16:35     ` kuznet
  2001-04-11 18:50       ` Eugene B. Berdnikov
  1 sibling, 1 reply; 16+ messages in thread
From: kuznet @ 2001-04-11 16:35 UTC (permalink / raw)
  To: Eugene B. Berdnikov; +Cc: berd, linux-kernel, davem

Hello!

> > If my guess is right, you can easily put this socket to funny state
> > just catting a large file and kill -STOP'ing ssh. ssh will close window,
> > but sshd will not send zero probes.
> 
>  [1] I have checked your statement on 2 different machines, running 2.2.17.
>  No confirmation. But this is much more funny than it simply sounds. :)

_That_ socket which was stuck must show this behaviour.

To get this on a new socket you should leave the session idle for >2 hours,
until the first keepalive. After that it will never probe under
any circumstances. The bug was that a keepalive corrupts the state of the
timer, and the probe0 timer is not started after that.


>  buffer is filled, and client IS NOT stopped! :))) Hence connection dies
>  due to retransmission timeout on the server side.

It is a known linuxism. If the ratio connection_mss/link_mtu is less than
~1/4, or the connection is flooded with tiny packets, then after the rcvbuf
is full linux enters a memory-paranoia mode, pretending that all the packets
are lost. Ugly, unpleasant, but luckily harmless under any normal
circumstances.

One way to work around it is to set rx_copybreak on the ethernet drivers
to 400-500.

The bug is really difficult. It is not cured even in the current 2.4
(only with the zerocopy patch).


>  I do not understand how connection with closed window can wait until
>  first keepalive - it must do zero probes instead.

If a socket has ever sent a keepalive, it will not be able to send zero
window probes afterwards.


>  Hmm... I observed this bug on the host, which never performs more
>  than 10 conn/sec and has peak loadvg ~ 0.15.

8)8)8) Probability is probability.

Alexey


* Re: Bug report: tcp staled when send-q != 0, timers == 0.
  2001-04-11 10:16     ` Eugene B. Berdnikov
@ 2001-04-11 16:56       ` kuznet
  2001-04-11 18:35         ` Eugene B. Berdnikov
  0 siblings, 1 reply; 16+ messages in thread
From: kuznet @ 2001-04-11 16:56 UTC (permalink / raw)
  To: Eugene B. Berdnikov; +Cc: linux-kernel, davem

Hello!

>  At last, I tried several MTUs on 3d computer, running "right" 2.2.17, and
>  could not find conditions, under which any loss of ACKs can be detected.

8)8)8)

ppp is also prone to the mss/mtu bug: it allocates too-large buffers
and never splits them. The difference between kernels looks funny, but
I think it is explained by the differences between their mss/mtu's.

Alexey

[ I will be absent since tomorrow for some time. ]


* Re: Bug report: tcp staled when send-q != 0, timers == 0.
  2001-04-11 16:56       ` kuznet
@ 2001-04-11 18:35         ` Eugene B. Berdnikov
  2001-04-11 19:04           ` kuznet
  0 siblings, 1 reply; 16+ messages in thread
From: Eugene B. Berdnikov @ 2001-04-11 18:35 UTC (permalink / raw)
  To: kuznet; +Cc: linux-kernel, davem

  Hello.

On Wed, Apr 11, 2001 at 08:56:41PM +0400, kuznet@ms2.inr.ac.ru wrote:
> ppp also inclined to the mss/mtu bug, it allocates too large buffers
> and never breaks them. The difference between kernels looks funny, but
> I think it finds explanation in differences between mss/mtu's.

 In my experiments linux simply sets mss=mtu-40 at the start of ethernet
 connections. I do not know why, but believe it's ok. How can the kernel
 version and configuration options affect the mss later?
-- 
 Eugene Berdnikov


* Re: Bug report: tcp staled when send-q != 0, timers == 0.
  2001-04-11 16:35     ` kuznet
@ 2001-04-11 18:50       ` Eugene B. Berdnikov
  2001-04-11 19:09         ` kuznet
  0 siblings, 1 reply; 16+ messages in thread
From: Eugene B. Berdnikov @ 2001-04-11 18:50 UTC (permalink / raw)
  To: kuznet; +Cc: linux-kernel, davem

On Wed, Apr 11, 2001 at 08:35:51PM +0400, kuznet@ms2.inr.ac.ru wrote:
> To get this on new socket you should leave session idle for >2hours
> until the first keeplaive. After this it will never probe under
> any curcumstances. The bug was that keepalive corrupts state of timer
> and probe0 timer is not started after this.

 Maybe. However, I do not understand whether you have any reasonable
 explanation of how I could get such a socket. Indeed, I had been dealing
 with an active connection: I traced a squid redirector at a peak time of
 users' web activity. Several lines of log per second. That's why I was
 surprised when this window became frozen.

 If your model does not cover such a situation, please keep it in mind. :)
-- 
 Eugene Berdnikov


* Re: Bug report: tcp staled when send-q != 0, timers == 0.
  2001-04-11 18:35         ` Eugene B. Berdnikov
@ 2001-04-11 19:04           ` kuznet
  2001-04-11 19:28             ` Eugene B. Berdnikov
  0 siblings, 1 reply; 16+ messages in thread
From: kuznet @ 2001-04-11 19:04 UTC (permalink / raw)
  To: Eugene B. Berdnikov; +Cc: linux-kernel, davem

Hello!

>  In my experiments linux simply sets mss=mtu-40 at the start of ethernet
>  connections. I do not know why, but belive it's ok. How the version of
>  kernel and configuration options can affect mss later?

You can figure this out yourself. In fact, you measured it.

With mss=1460 the problem does not exist.

The problem begins, for example, when the mss is smaller and the packet
still arrives on ethernet. It eats the same 1.5k of memory, but carries only
~mss bytes of tcp payload. See? We do not know this in advance, advertise a
large window, do not have enough rcvbuf to let it be filled, and can do
nothing but drop new packets.

ppp is more difficult. Actually, I do not know exactly how it works now.
At least, ppp in 2.4 trims the skb if it has too much unused space.

Alexey


* Re: Bug report: tcp staled when send-q != 0, timers == 0.
  2001-04-11 18:50       ` Eugene B. Berdnikov
@ 2001-04-11 19:09         ` kuznet
  2001-04-11 19:18           ` Eugene B. Berdnikov
  2001-04-13  8:54           ` Eugene B. Berdnikov
  0 siblings, 2 replies; 16+ messages in thread
From: kuznet @ 2001-04-11 19:09 UTC (permalink / raw)
  To: Eugene B. Berdnikov; +Cc: linux-kernel, davem

Hello!

>  If your model does not cover such situation, pls, take it in mind. :)

Taken.

Is the machine UP (uniprocessor)? The only other known dubious place is SMP-specific...

BTW, if that cursed socket is still alive, try the experiment of filling the
window on it. It must get stuck, or my theory is completely wrong.

Alexey


* Re: Bug report: tcp staled when send-q != 0, timers == 0.
  2001-04-11 19:09         ` kuznet
@ 2001-04-11 19:18           ` Eugene B. Berdnikov
  2001-04-13  8:54           ` Eugene B. Berdnikov
  1 sibling, 0 replies; 16+ messages in thread
From: Eugene B. Berdnikov @ 2001-04-11 19:18 UTC (permalink / raw)
  To: kuznet; +Cc: Eugene B. Berdnikov, linux-kernel, davem

  Hello.

On Wed, Apr 11, 2001 at 11:09:35PM +0400, kuznet@ms2.inr.ac.ru wrote:
> Is the machine UP? The only other known dubious place is smp specific...

 It is an HP NetServer E40 with a single PPro-180. SMP is turned off in .config.

> BTW if that cursed socket is still alive, try to make the experiment
> with filling window on it. It must stuck, or my theory is completely wrong.

 OK, I'll try tomorrow when I return to my workplace. Both peer hosts are
 on UPSes, so the possibility of losing this connection is low. :)
-- 
 Eugene Berdnikov


* Re: Bug report: tcp staled when send-q != 0, timers == 0.
  2001-04-11 19:04           ` kuznet
@ 2001-04-11 19:28             ` Eugene B. Berdnikov
  2001-04-11 19:37               ` kuznet
  0 siblings, 1 reply; 16+ messages in thread
From: Eugene B. Berdnikov @ 2001-04-11 19:28 UTC (permalink / raw)
  To: kuznet; +Cc: linux-kernel, davem

  Hello.

On Wed, Apr 11, 2001 at 11:04:04PM +0400, kuznet@ms2.inr.ac.ru wrote:
> >  In my experiments linux simply sets mss=mtu-40 at the start of ethernet
> >  connections. I do not know why, but belive it's ok. How the version of
> >  kernel and configuration options can affect mss later?
[...]
> The problem begins f.e. when mss is less and packet arrives on ethernet.
> It eats the same 1.5k of memory, but carries only ~mss bytes of tcp payload.
> See? We do not know this forward, advertise large window, have not enough
> rcvbuf to get it filled and cannot do anything but dropping new packets.

 However, I can't understand the dependency on the kernel version, etc...

 Let me dwell on this question again. In my experiments I found a
 dependency on the keepalive setting for a connection on 2.2.17:

   mtu 382 + keepalive yes -> loss
   mtu 382 + keepalive no  -> ok

 I made 2 tries for each setting. Does your model of the "mss/mtu bug" cover
 such a picture? If the answer is "yes", I am almost satisfied. :-)

 If this behaviour is not deterministic, but driven by probability, does
 it mean that I could get other results with a large number of tests?
-- 
 Eugene Berdnikov


* Re: Bug report: tcp staled when send-q != 0, timers == 0.
  2001-04-11 19:28             ` Eugene B. Berdnikov
@ 2001-04-11 19:37               ` kuznet
  0 siblings, 0 replies; 16+ messages in thread
From: kuznet @ 2001-04-11 19:37 UTC (permalink / raw)
  To: Eugene B. Berdnikov; +Cc: linux-kernel, davem

Hello!

>    mtu 382 + keepalive yes -> loss
>    mtu 382 + keepalive no  -> ok

Well, I ignored this because it looked like complete nonsense. Sorry. 8)

>  such a picture? If the answer is "yes", I am almost satisfied. :-)

No, the answer is a strict "no". Until the keepalive is triggered for the
first time, it cannot affect the connection in _any_ way.


... sorry, I have to run. Let's defer the further investigation.

Alexey


* Re: Bug report: tcp staled when send-q != 0, timers == 0.
  2001-04-11 19:09         ` kuznet
  2001-04-11 19:18           ` Eugene B. Berdnikov
@ 2001-04-13  8:54           ` Eugene B. Berdnikov
       [not found]             ` <200104181928.XAA04912@ms2.inr.ac.ru>
  1 sibling, 1 reply; 16+ messages in thread
From: Eugene B. Berdnikov @ 2001-04-13  8:54 UTC (permalink / raw)
  To: kuznet; +Cc: linux-kernel, davem

  Hello.

On Wed, Apr 11, 2001 at 11:09:35PM +0400, kuznet@ms2.inr.ac.ru wrote:
> BTW if that cursed socket is still alive, try to make the experiment
> with filling window on it. It must stuck, or my theory is completely wrong.

 Filling the socket by writing to the pty (controlled by sshd), I found
 a state which seems very similar to the one I reported:

 # netstat -n -eot | grep 1018
 tcp        0  37684 194.190.166.31:22       194.190.161.106:1018
    ESTABLISHED 0          11964      off (0.00/0/0)

 You see, the timers are zero and send-q is not. Zero probes were NOT
 observed, exactly as you predicted, but the keepalive behaviour is correct.

 However, this is _not_ a stalled state. When I resume ssh on 194.190.166.31,
 the buffer empties and the connection behaves normally. I did this experiment
 both waiting for keepalive packets from both sides and resuming ssh before
 the keepalives. In neither case did the connection become stale.

 So, my conclusion is that your statement about the zero-probe breakdown due
 to keepalives is right, and, I hope, your patch also does the right thing.
 However, this does not answer the question of how such a stale connection
 could arise, and the predicted mechanism for getting it "stuck" does not work.

 [I hope we will continue this discussion later.]
-- 
 Eugene Berdnikov


* Re: Bug report: tcp staled when send-q != 0, timers == 0.
       [not found]             ` <200104181928.XAA04912@ms2.inr.ac.ru>
@ 2001-04-21 15:45               ` Eugene B. Berdnikov
  2001-04-21 17:02                 ` kuznet
  0 siblings, 1 reply; 16+ messages in thread
From: Eugene B. Berdnikov @ 2001-04-21 15:45 UTC (permalink / raw)
  To: kuznet; +Cc: linux-kernel

  Hello.

On Wed, Apr 18, 2001 at 11:28:43PM +0400, kuznet@ms2.inr.ac.ru wrote:
> >  However, this is _not_ a staled state. When I resume ssh on 194.190.166.31,
> >  buffer gets empty and connection behaves as normal. I made this experiment
> >  waiting for keepalive packets from both sides, as well as resuming ssh
> >  before keepalives. In both cases connection did not become stale.
> 
> Yes, I have said that it is practically impossible to reproduce this.
> My guess was that it is due to inaccurate counting of sacks when path
> mtu discovery happens or when segments are fragmented due to SWS avoidance
> override.

 In my case, P-MTU discovery and fragmentation can be ruled out, but
 sacks are really frequent: my hosts are connected via a poor leased line.

> Actually, the most dubious place is your statement that this connection
> was not idle for 2 hours. It is _necessary_ condition
> for my scenario to work...

 I only wrote that it was active when it got stuck. It may have been idle
 before - I do not remember, but I have a habit of keeping connections open
 for weeks. :)

 As my experiments show, any connection that enters keepalive once loses
 its ability to send zero probes - forever.

> >  [I hope we will continue this discussion later.]
> 
> I am ready.

 OK. Let us return to the "mss/mtu bug". The most mystifying thing for
 me is the dependence of the MTU threshold on the kernel version, etc.

 I also wrote that it depends on the keepalive flag. It seems that was a
 mistake. My additional experiments show that there is no distinct MTU
 threshold: trying the same value many times, I observed loss of acks in
 some cases and not in others. So the MTU boundary is not strict. Well,
 let it be.

 But the question is: what is the minimum "reliable" MTU? There are lots of
 situations where data arrives rapidly in small packets (say, when monitoring
 logs). Is there a danger of losing such connections on a heavily loaded host?
-- 
 Eugene Berdnikov


* Re: Bug report: tcp staled when send-q != 0, timers == 0.
  2001-04-21 15:45               ` Eugene B. Berdnikov
@ 2001-04-21 17:02                 ` kuznet
  0 siblings, 0 replies; 16+ messages in thread
From: kuznet @ 2001-04-21 17:02 UTC (permalink / raw)
  To: Eugene B. Berdnikov; +Cc: linux-kernel

Hello!

>  Im my case P-MTU discovery

Sorry, I lied. Not pmtu discovery but exactly the opposite effect
is important here: the collapsing of small frames into larger ones.
Each such merge results in the loss of 1 "sack" in 2.2.

>  I only wrote that it was active when got stuck. It may be idle before -
>  I do not remember, but have a habit to keep connections for weeks. :)

Good. 8)

>  As my experiments show, any connection, entering keepalive once,
>  have lose its ability to send zero probes - forever.

Exactly.


>  OK. Let us return to the "mss/mtu bug". The most mystifying thing for
>  me is the dependance of the MTU threshold on the kernel version, etc.

Well, you can reinvestigate this to get more reliable results...

Actually, this problem is so difficult that the study would be purely
academic; there is no hope of fixing it in 2.2. It was partially
repaired during 2.3 and completely resolved only in 2.4.4.


>  But the question is what the minimum "reliable" MTU. There are lots of
>  situations when data comes rapidly in small packets (say, monitoring logs).
>  Is there a danger to lose such connections on a heavily loaded host?

There is no real danger. Bad things can happen only when the receiver does
not read data for a very long time; in that case the connection times out
without receiving any acks.

As for a minimum/maximum mtu... it does not exist. E.g. if the sender floods
1-byte frames in TCP_NODELAY mode and the receiver does not read them, 2.2
will fail regardless of the mtu. See? Even the 40 bytes of IP+TCP headers
(not counting additional overhead) guarantee that memory will be exhausted
an order of magnitude earlier than the receiver can close the window.

Alexey

