* Transfer stalls with NAT under 2.6.24.3
@ 2008-03-26  8:47 Sven Riedel
  2008-03-26  9:24 ` Patrick McHardy
  0 siblings, 1 reply; 116+ messages in thread
From: Sven Riedel @ 2008-03-26  8:47 UTC (permalink / raw)
  To: netfilter

Hi,
I've run into a strange problem where large file transfers start 
stalling over a NATed connection. Packet traces reveal that ACK packets 
are sometimes not being passed through to the inside (NATed) host, which 
results in a transfer stall until a tcp timeout occurs and the other 
side retransmits the ACK.

This only seems to happen if the conntrack table on the firewall already 
contains an entry for the same source and destination in TIME_WAIT 
state. If no conntrack entries exist for the same source and 
destination, the packets flow fine.

The problem seems to be alleviated by setting ip_conntrack_tcp_be_liberal 
to 1, but this seems to be only a workaround, not a real solution.

Scatter gather and tcp segment offloading have been disabled in the 
relevant NICs on the firewall during debugging, to make sure this isn't 
a hardware issue.

Is this issue known/is there a patch available or would further 
information be needed to help debug the problem?

Regards,
Sven

-- 
sven.riedel@securenet.de

SecureNet GmbH
Intranet & Internet Solutions
Frankfurter Ring 193a
D-80807 München
Tel: +49 89 32133-632
Fax: +49 89 32133-699
Zentrale: -600
www.securenet.de

Sitz der Gesellschaft: München
HRB München 118876
Geschäftsführer: Thomas Schreiber



* Re: Transfer stalls with NAT under 2.6.24.3
  2008-03-26  8:47 Transfer stalls with NAT under 2.6.24.3 Sven Riedel
@ 2008-03-26  9:24 ` Patrick McHardy
  2008-03-26 10:21   ` Sven Riedel
  0 siblings, 1 reply; 116+ messages in thread
From: Patrick McHardy @ 2008-03-26  9:24 UTC (permalink / raw)
  To: Sven Riedel; +Cc: netfilter, Netfilter Developer Mailing List

Sven Riedel wrote:
> Hi,
> I've run into a strange problem where large file transfers start 
> stalling over a NATed connection. Packet traces reveal that ACK 
> packets are sometimes not being passed through to the inside (NATed) 
> host, which results in a transfer stall until a tcp timeout occurs 
> and the other side retransmits the ACK.
>
> This only seems to happen if the conntrack table on the firewall 
> already contains an entry for the same source and destination in 
> TIME_WAIT state. If no conntrack entries exist for the same source and 
> destination, the packets flow fine.
>
> The problem seems to be alleviated by setting ip_conntrack_tcp_be_liberal 
> to 1, but this seems to be only a workaround, not a real solution.
>
> Scatter gather and tcp segment offloading have been disabled in the 
> relevant NICs on the firewall during debugging, to make sure this 
> isn't a hardware issue.
>
> Is this issue known/is there a patch available or would further 
> information be needed to help debug the problem?

2.6.24.3 includes a patch that was supposed to fix problems
with connections in TIME_WAIT state. Does 2.6.24.2 work better
for you?

Please enable conntrack logging for TCP by executing:

echo 6 >/proc/sys/net/netfilter/nf_conntrack_log_invalid

and check whether you get any messages in the ring buffer.
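
For reference, the value written to nf_conntrack_log_invalid is an IP protocol number: 6 selects TCP, and 255 logs invalid packets of any protocol. A minimal Python sketch of the same write follows. The helper name set_log_invalid and its path parameter are made up for illustration and are not part of any netfilter tooling. Writing to the real /proc file requires root.

```python
import socket

# Hypothetical helper (an assumption for illustration, not netfilter tooling):
# writes an IP protocol number to the nf_conntrack_log_invalid sysctl file.
LOG_INVALID = "/proc/sys/net/netfilter/nf_conntrack_log_invalid"
LOG_ALL_PROTOCOLS = 255  # documented value for "log packets of any protocol"


def set_log_invalid(proto=socket.IPPROTO_TCP, path=LOG_INVALID):
    """Write the protocol number to `path` and return the string written."""
    if not 0 <= proto <= 255:
        raise ValueError("protocol number must be in 0..255")
    value = "%d\n" % proto
    with open(path, "w") as f:
        f.write(value)
    return value
```

Called as root with the defaults, set_log_invalid() corresponds to the echo 6 command above; the resulting messages then show up in the kernel ring buffer (dmesg).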


* Re: Transfer stalls with NAT under 2.6.24.3
  2008-03-26  9:24 ` Patrick McHardy
@ 2008-03-26 10:21   ` Sven Riedel
  2008-03-26 15:47     ` Patrick McHardy
  0 siblings, 1 reply; 116+ messages in thread
From: Sven Riedel @ 2008-03-26 10:21 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netfilter, Netfilter Developer Mailing List

Patrick McHardy wrote:
> Sven Riedel wrote:
>> Hi,
>> I've run into a strange problem where large file transfers start 
>> stalling over a NATed connection. Packet traces reveal that ACK 
>> packets are sometimes not being passed through to the inside (NATed) 
>> host, which results in a transfer stall until a tcp timeout occurs 
>> and the other side retransmits the ACK.
>>
>> This only seems to happen if the conntrack table on the firewall 
>> already contains an entry for the same source and destination in 
>> TIME_WAIT state. If no conntrack entries exist for the same source and 
>> destination, the packets flow fine.
>>
>> The problem seems to be alleviated by setting ip_conntrack_tcp_be_liberal 
>> to 1, but this seems to be only a workaround, not a real solution.
>>
>> Scatter gather and tcp segment offloading have been disabled in the 
>> relevant NICs on the firewall during debugging, to make sure this 
>> isn't a hardware issue.
>>
>> Is this issue known/is there a patch available or would further 
>> information be needed to help debug the problem?
> 
> 2.6.24.3 includes a patch that was supposed to fix problems
> with connections in TIME_WAIT state. Does 2.6.24.2 work better
> for you?

The firewall system in question is currently in production. I _might_ be 
able to try the other kernel tomorrow morning. Once I am able to try it 
I'll let you know.

> 
> Please enable conntrack logging for TCP by executing:
> 
> echo 6 >/proc/sys/net/netfilter/nf_conntrack_log_invalid
> 
> and check whether you get any messages in the ring buffer.

Yep, lots ;)

In the following 100.100.100.100 is the external machine and 
200.200.200.200 is the NAT IP-Address on the firewall. A 5MB file was 
transferred via scp to 100.100.100.100 from the internal network.

The output during a "clean" run, with an empty conntrack table and no 
stalls:
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 
ID=42121
DF PROTO=TCP SPT=22 DPT=43021 SEQ=355720612 ACK=3828427355 WINDOW=47880
RES=0x00 ACK URGP=0 OPT (0101080A45585E351B138AA40101050AE50974FBE5097A53)
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 
ID=42122
DF PROTO=TCP SPT=22 DPT=43021 SEQ=355720612 ACK=3828427355 WINDOW=47880
RES=0x00 ACK URGP=0 OPT (0101080A45585E361B138AA40101050AE50974FBE5097FAB)
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 
ID=42123
DF PROTO=TCP SPT=22 DPT=43021 SEQ=355720612 ACK=3828427355 WINDOW=47880
RES=0x00 ACK URGP=0 OPT (0101080A45585E361B138AA40101050AE50974FBE5098503)
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 
ID=42124
DF PROTO=TCP SPT=22 DPT=43021 SEQ=355720612 ACK=3828427355 WINDOW=47880
RES=0x00 ACK URGP=0 OPT (0101080A45585E371B138AA40101050AE50974FBE5098A5B)
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 
ID=42125
DF PROTO=TCP SPT=22 DPT=43021 SEQ=355720612 ACK=3828427355 WINDOW=47880
RES=0x00 ACK URGP=0 OPT (0101080A45585E381B138AA40101050AE50974FBE5098FB3)
printk: 24 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 
ID=42248
DF PROTO=TCP SPT=22 DPT=43021 SEQ=355720852 ACK=3828837755 WINDOW=49248
RES=0x00 ACK URGP=0 OPT (0101080A45585F911B138E140101050AE50FB2C3E50FD2D3)
printk: 31 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 
ID=42465
DF PROTO=TCP SPT=22 DPT=43021 SEQ=355721284 ACK=3829614779 WINDOW=49248
RES=0x00 ACK URGP=0 OPT (0101080A455861861B1392E10101050AE51B935BE51BB8C3)
printk: 25 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 
ID=42718
DF PROTO=TCP SPT=22 DPT=43021 SEQ=355721716 ACK=3830353499 WINDOW=42408
RES=0x00 ACK URGP=0 OPT (0101080A455863DA1B1398B70101050AE526E3ABE526E903)
printk: 57 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 
ID=42976
DF PROTO=TCP SPT=22 DPT=43021 SEQ=355722052 ACK=3830954051 WINDOW=49248
RES=0x00 ACK URGP=0 OPT (0101080A455865791B139CBE0101050AE52FFD8BE530284B)
printk: 27 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=72 TOS=0x00 PREC=0x00 TTL=56 
ID=43306
DF PROTO=TCP SPT=22 DPT=43021 SEQ=355722580 ACK=3831787163 WINDOW=42408
RES=0x00 ACK URGP=0 OPT
(0101080A455867731B13A19501010512E53CCB53E53CD653E53CBE93E53CC3EB)
printk: 74 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 
ID=43789
DF PROTO=TCP SPT=22 DPT=43021 SEQ=355723252 ACK=3832978011 WINDOW=42408
RES=0x00 ACK URGP=0 OPT (0101080A45586A571B13A8CF0101050AE54EDEABE54EE403)

During a run with stalls:

nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=80 TOS=0x00 PREC=0x00 TTL=56 
ID=44105
DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160349927 ACK=596614326 WINDOW=49248
RES=0x00 ACK URGP=0 OPT
(0101080A4558793C1B13CE350101051A491E8751491E8CA9491E7B71491E81F9491E40A9491E5B61)

^^^^ Transfer stalled here for ~10 seconds.


printk: 22 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=72 TOS=0x00 PREC=0x00 TTL=56 
ID=44113
DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160349927 ACK=596632110 WINDOW=49248
RES=0x00 ACK URGP=0 OPT
(0101080A45587D301B13D81801010512491E8751491E8CA9491E7B71491E81F9)
printk: 12 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 
ID=44114
DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160349927 ACK=596635150 WINDOW=49248
RES=0x00 ACK URGP=0 OPT (0101080A455881B21B13E35A0101050A491E8751491E8CA9)
printk: 14 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 
ID=7320
DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160350311 ACK=597280038 WINDOW=27360
RES=0x00 ACK URGP=0 OPT (0101080A455883D31B13E8820101050A49286E71492873C9)
printk: 32 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 
ID=7451
DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160350503 ACK=597578342 WINDOW=49248
RES=0x00 ACK URGP=0 OPT (0101080A455885161B13EBD30101050A492CEBA9492CF659)
printk: 35 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 
ID=7786
DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160350983 ACK=598415558 WINDOW=49248
RES=0x00 ACK URGP=0 OPT (0101080A455887081B13F0890101050A4939B2094939E221)
printk: 54 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 
ID=8021
DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160351319 ACK=598980542 WINDOW=42408
RES=0x00 ACK URGP=0 OPT (0101080A455889151B13F5C00101050A4942510149425659)
printk: 43 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 
ID=8205
DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160351559 ACK=599403254 WINDOW=49248
RES=0x00 ACK URGP=0 OPT (0101080A45588B011B13FA9B0101050A4948C4394948C991)
printk: 40 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 
ID=8531
DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160352039 ACK=600218582 WINDOW=45144
RES=0x00 ACK URGP=0 OPT (0101080A45588D371B1400160101050A4955351949553A71)
printk: 49 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 
ID=8871
DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160352519 ACK=601058534 WINDOW=38304
RES=0x00 ACK URGP=0 OPT (0101080A45588F521B1405500101050A4962062949620B81)
printk: 45 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 
ID=8988
DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160352663 ACK=601307510 WINDOW=41040
RES=0x00 ACK URGP=0 OPT (0101080A4558910A1B1409AB0101050A4965D2B94965D811)


Regards,
Sven



* Re: Transfer stalls with NAT under 2.6.24.3
  2008-03-26 10:21   ` Sven Riedel
@ 2008-03-26 15:47     ` Patrick McHardy
  2008-03-26 18:45       ` Jozsef Kadlecsik
  0 siblings, 1 reply; 116+ messages in thread
From: Patrick McHardy @ 2008-03-26 15:47 UTC (permalink / raw)
  To: Sven Riedel; +Cc: netfilter, Netfilter Developer Mailing List

Sven Riedel wrote:
> Patrick McHardy wrote:
>> Sven Riedel wrote:
>>> Is this issue known/is there a patch available or would further 
>>> information be needed to help debug the problem?
>>
>> 2.6.24.3 includes a patch that was supposed to fix problems
>> with connections in TIME_WAIT state. Does 2.6.24.2 work better
>> for you?
> 
> The firewall system in question is currently in production. I _might_ be 
> able to try the other kernel tomorrow morning. Once I am able to try it 
> I'll let you know.
> 
>>
>> Please enable conntrack logging for TCP by executing:
>>
>> echo 6 >/proc/sys/net/netfilter/nf_conntrack_log_invalid
>>
>> and check whether you get any messages in the ring buffer.
> 
> Yep, lots ;)
> 
> In the following 100.100.100.100 is the external machine and 
> 200.200.200.200 is the NAT IP-Address on the firewall. A 5MB file was 
> transferred via scp to 100.100.100.100 from the internal network.
> 
> The output during a "clean" run, with an empty conntrack table and no 
> stalls:
> nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
> SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 
> ID=42121
> ...
> 
> During a run with stalls:
> 
> nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
> SRC=100.100.100.100 DST=200.200.200.200 LEN=80 TOS=0x00 PREC=0x00 TTL=56 
> ID=44105
> DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160349927 ACK=596614326 WINDOW=49248
> RES=0x00 ACK URGP=0 OPT
> (0101080A4558793C1B13CE350101051A491E8751491E8CA9491E7B71491E81F9491E40A9491E5B61) 
> 
> 
> ^^^^ Transfer stalled here for ~10 seconds.
> 
> 
> printk: 22 messages suppressed.
> nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
> SRC=100.100.100.100 DST=200.200.200.200 LEN=72 TOS=0x00 PREC=0x00 TTL=56 
> ID=44113

Thanks, can you send a binary tcpdump (... -w file) of a connection
that triggers these messages please?


* Re: Transfer stalls with NAT under 2.6.24.3
  2008-03-26 15:47     ` Patrick McHardy
@ 2008-03-26 18:45       ` Jozsef Kadlecsik
  2008-03-26 19:16         ` Krzysztof Oledzki
  2008-03-31  6:53         ` Sven Riedel
  0 siblings, 2 replies; 116+ messages in thread
From: Jozsef Kadlecsik @ 2008-03-26 18:45 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Sven Riedel, netfilter, Netfilter Developer Mailing List

On Wed, 26 Mar 2008, Patrick McHardy wrote:

> > During a run with stalls:
> > 
> > nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
> > SRC=100.100.100.100 DST=200.200.200.200 LEN=80 TOS=0x00 PREC=0x00 TTL=56
> > ID=44105
> > DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160349927 ACK=596614326 WINDOW=49248
> > RES=0x00 ACK URGP=0 OPT
> > (0101080A4558793C1B13CE350101051A491E8751491E8CA9491E7B71491E81F9491E40A9491E5B61) 
> > 
> Thanks, can you send a binary tcpdump (... -w file) of a connection
> that triggers these messages please?

Yes, a tcpdump of a full session which is stalled could help a lot.

But it almost looks like a SACK-related problem: isn't there a (new) 
device between the communicating parties which performs ISN randomization 
and fails to adjust SACK?

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
          H-1525 Budapest 114, POB. 49, Hungary


* Re: Transfer stalls with NAT under 2.6.24.3
  2008-03-26 18:45       ` Jozsef Kadlecsik
@ 2008-03-26 19:16         ` Krzysztof Oledzki
  2008-03-31  6:53         ` Sven Riedel
  1 sibling, 0 replies; 116+ messages in thread
From: Krzysztof Oledzki @ 2008-03-26 19:16 UTC (permalink / raw)
  To: Jozsef Kadlecsik
  Cc: Patrick McHardy, Sven Riedel, netfilter,
	Netfilter Developer Mailing List


On Wed, 26 Mar 2008, Jozsef Kadlecsik wrote:

> On Wed, 26 Mar 2008, Patrick McHardy wrote:
>
>>> During a run with stalls:
>>>
>>> nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
>>> SRC=100.100.100.100 DST=200.200.200.200 LEN=80 TOS=0x00 PREC=0x00 TTL=56
>>> ID=44105
>>> DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160349927 ACK=596614326 WINDOW=49248
>>> RES=0x00 ACK URGP=0 OPT
>>> (0101080A4558793C1B13CE350101051A491E8751491E8CA9491E7B71491E81F9491E40A9491E5B61)
>>>
>> Thanks, can you send a binary tcpdump (... -w file) of a connection
>> that triggers these messages please?
>
> Yes, a tcpdump of a full session which is stalled could help a lot.
>
> But it almost looks like a SACK-related problem: isn't there a (new)
> device between the communicating parties which performs ISN randomization
> and fails to adjust SACK?

Yep.

$ ./optparse 0101080A4558793C1B13CE350101051A491E8751491E8CA9491E7B71491E81F9491E40A9491E5B61
No-Operation
No-Operation
TSOPT - Time Stamp Option(8) tv=1163426108 er=454282805
No-Operation
No-Operation
SACK(24) 1226737489:1226738857(1368) 1226734449:1226736121(1672) 1226719401:1226726241(6840)

So the SACK blocks (12267?????) are nowhere near SEQ=4160349927 or
ACK=596614326, which is obviously wrong.
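
The decoding above can be reproduced with a short script. The following Python sketch is not the optparse tool used here (its source is not shown in this thread). The function name parse_tcp_options is an assumption for illustration. It understands only End-of-List, No-Operation, timestamp (kind 8) and SACK (kind 5), and returns any other option as raw bytes.

```python
import struct


def parse_tcp_options(hex_opts):
    """Decode the hex string from the OPT (...) field of a kernel log line."""
    data = bytes.fromhex(hex_opts)
    out = []
    i = 0
    while i < len(data):
        kind = data[i]
        if kind == 0:  # End of Option List
            out.append(("EOL", None))
            break
        if kind == 1:  # No-Operation: a single byte without a length field
            out.append(("NOP", None))
            i += 1
            continue
        if i + 1 >= len(data):
            break  # truncated option, stop parsing
        length = data[i + 1]  # length covers the kind and length bytes too
        if length < 2 or i + length > len(data):
            break  # malformed option, stop parsing
        body = data[i + 2:i + length]
        if kind == 8 and length == 10:  # timestamp: TSval, TSecr
            out.append(("TS", struct.unpack("!II", body)))
        elif kind == 5 and (length - 2) % 8 == 0:  # SACK: (left, right) pairs
            edges = struct.unpack("!%dI" % (len(body) // 4), body)
            out.append(("SACK", list(zip(edges[0::2], edges[1::2]))))
        else:
            out.append((kind, body))
        i += length
    return out


opts = parse_tcp_options(
    "0101080A4558793C1B13CE350101051A"
    "491E8751491E8CA9491E7B71491E81F9491E40A9491E5B61"
)
ack = 596614326  # ACK field of the logged packet
for name, value in opts:
    if name == "SACK":
        for left, right in value:
            # Block size, and distance of the left edge from the cumulative
            # ACK, computed modulo 2**32 as TCP sequence arithmetic requires.
            print(left, right, right - left, (left - ack) % 2**32)
```

A SACK block normally lies a little above the cumulative ACK, within the window. Here every left edge is roughly 630 million bytes ahead of it, which fits a device rewriting sequence numbers without rewriting the SACK option.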

Best regards,

 			Krzysztof Olędzki


* Re: Transfer stalls with NAT under 2.6.24.3
  2008-03-26 18:45       ` Jozsef Kadlecsik
  2008-03-26 19:16         ` Krzysztof Oledzki
@ 2008-03-31  6:53         ` Sven Riedel
  2008-07-04 14:54           ` TCP connection stalls under 2.6.24.7 Thomas Jarosch
  1 sibling, 1 reply; 116+ messages in thread
From: Sven Riedel @ 2008-03-31  6:53 UTC (permalink / raw)
  To: Jozsef Kadlecsik
  Cc: Patrick McHardy, netfilter, Netfilter Developer Mailing List

Hi,
we had a minor emergency here last week, so I wasn't able to test the 
old kernel. I'll try to get to that tomorrow.

Jozsef Kadlecsik wrote:
> On Wed, 26 Mar 2008, Patrick McHardy wrote:
> 
>>> During a run with stalls:
>>>
>>> nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
>>> SRC=100.100.100.100 DST=200.200.200.200 LEN=80 TOS=0x00 PREC=0x00 TTL=56
>>> ID=44105
>>> DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160349927 ACK=596614326 WINDOW=49248
>>> RES=0x00 ACK URGP=0 OPT
>>> (0101080A4558793C1B13CE350101051A491E8751491E8CA9491E7B71491E81F9491E40A9491E5B61)
>>>
>> Thanks, can you send a binary tcpdump (... -w file) of a connection
>> that triggers these messages please?
> 
> Yes, a tcpdump of a full session which is stalled could help a lot.
Ok, I'll send one along later today.

> But it almost looks like a SACK-related problem: isn't there a (new)
> device between the communicating parties which performs ISN randomization
> and fails to adjust SACK?

There are at least two devices between the communication partners: a DSL 
modem and a firewall on the remote end (outside of my control). Both 
devices were already in place and didn't cause any problems with the 
old iptables setup. The only things that changed on that communication 
path are the firewall hardware, the NIC on the firewall and the 
netfilter/iptables version used by the firewall.

Regards,
Sven





* TCP connection stalls under 2.6.24.7
  2008-03-31  6:53         ` Sven Riedel
@ 2008-07-04 14:54           ` Thomas Jarosch
  2008-07-04 20:58             ` Jozsef Kadlecsik
  0 siblings, 1 reply; 116+ messages in thread
From: Thomas Jarosch @ 2008-07-04 14:54 UTC (permalink / raw)
  To: Netfilter Developer Mailing List
  Cc: Patrick McHardy, Jozsef Kadlecsik, Sven Riedel

Hello everyone,

we upgraded from kernel 2.6.23.16 to 2.6.24.7 and are now seeing
stalling (smtp) TCP connections on two boxes. We still have the old kernel
on a "rescue" partition. If I boot it up, the connections work immediately.

The connections work fine if the transmitted data is smaller than ~220kb,
so you still can send small messages. I've sent a tcpdump to Patrick in 
private as it contained sensitive information. The picture is similar to 
Sven's issue reported back in March: Some ACK packets
are missing (as if the remote side never sent them).

I downgraded the box to 2.6.24 to make sure it was
not caused by any -stable patch. Same thing.

Did any default TCP settings change from 2.6.23.16 to 2.6.24?

I also tried to disable path MTU discovery, TCP window scaling and
lowered the MTU of the ppp0 interface to 1400 (DSL connection).
This had no visible effect.

@Sven: Were you able to test 2.6.24.2?

Patrick suggested enabling nf_conntrack_log_invalid.
I enabled it via "echo 255 > /proc/sys/net/netfilter/nf_conntrack_log_invalid"
but that change didn't print anything to syslog.

Any ideas?

Have a nice weekend,
Thomas


* Re: TCP connection stalls under 2.6.24.7
  2008-07-04 14:54           ` TCP connection stalls under 2.6.24.7 Thomas Jarosch
@ 2008-07-04 20:58             ` Jozsef Kadlecsik
  2008-07-04 21:04               ` Jozsef Kadlecsik
  2008-07-07  9:18               ` Thomas Jarosch
  0 siblings, 2 replies; 116+ messages in thread
From: Jozsef Kadlecsik @ 2008-07-04 20:58 UTC (permalink / raw)
  To: Thomas Jarosch
  Cc: Netfilter Developer Mailing List, Patrick McHardy, Sven Riedel

Hi,

On Fri, 4 Jul 2008, Thomas Jarosch wrote:

> we upgraded from kernel 2.6.23.16 to 2.6.24.7 and are now seeing
> stalling (smtp) TCP connections on two boxes. We still have the old kernel
> on a "rescue" partition. If I boot it up, the connections work immediately.
> 
> The connections work fine if the transmitted data is smaller than ~220kb,
> so you still can send small messages. I've sent a tcpdump to Patrick in 
> private as it contained sensitive information. The picture is similar to 
> Sven's issue reported back in March: Some ACK packets
> are missing (as if the remote side never sent them).
> 
> I downgraded the box to 2.6.24 to make sure it was
> not caused by any -stable patch. Same thing.
> 
> Did any default TCP settings change from 2.6.23.16 to 2.6.24?

A TCP reopening fix was added to 2.6.24, but as it says, the patch affects 
only TCP connection reopening.

> I also tried to disable path MTU discovery, TCP window scaling and
> lowered the MTU of the ppp0 interface to 1400 (DSL connection).
> This had no visible effect.

Have you got SACK enabled? If yes, try to disable it: TCP connection 
tracking has got some trouble with SACK support. :-(
 
> @Sven: Were you able to test 2.6.24.2?
> 
> Patrick suggested enabling nf_conntrack_log_invalid.
> I enabled it via "echo 255 > /proc/sys/net/netfilter/nf_conntrack_log_invalid"
> but that change didn't print anything to syslog.

You do have a netfilter logging module loaded, don't you? If yes and 
nf_conntrack_log_invalid produces no output, then I'd say it's not a 
netfilter-related problem.

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
          H-1525 Budapest 114, POB. 49, Hungary


* Re: TCP connection stalls under 2.6.24.7
  2008-07-04 20:58             ` Jozsef Kadlecsik
@ 2008-07-04 21:04               ` Jozsef Kadlecsik
  2008-07-07  9:18               ` Thomas Jarosch
  1 sibling, 0 replies; 116+ messages in thread
From: Jozsef Kadlecsik @ 2008-07-04 21:04 UTC (permalink / raw)
  To: Thomas Jarosch
  Cc: Netfilter Developer Mailing List, Patrick McHardy, Sven Riedel

On Fri, 4 Jul 2008, Jozsef Kadlecsik wrote:

> Have you got SACK enabled? If yes, try to disable it: TCP connection 
> tracking has got some trouble with SACK support. :-(

s/has/had/ as we are speaking of 2.6.24.

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
          H-1525 Budapest 114, POB. 49, Hungary


* Re: TCP connection stalls under 2.6.24.7
  2008-07-04 20:58             ` Jozsef Kadlecsik
  2008-07-04 21:04               ` Jozsef Kadlecsik
@ 2008-07-07  9:18               ` Thomas Jarosch
  2008-07-07 13:18                 ` Thomas Jarosch
  1 sibling, 1 reply; 116+ messages in thread
From: Thomas Jarosch @ 2008-07-07  9:18 UTC (permalink / raw)
  To: Netfilter Developer Mailing List
  Cc: Jozsef Kadlecsik, Patrick McHardy, Sven Riedel

Hello Jozsef,

On Friday, 4. July 2008 22:58:06 Jozsef Kadlecsik wrote:
> Have you got SACK enabled? If yes, try to disable it: TCP connection
> tracking has got some trouble with SACK support. :-(

Thanks for the suggestion. I disabled it but it made no difference.

> You do have a netfilter logging module loaded, don't you? If yes and
> nf_conntrack_log_invalid produces no output, then I'd say it's not a
> netfilter-related problem.

Yes, we log local REJECTs to aid debugging if something is blocked.

I'll upgrade to 2.6.25.10 and see if it helps,
there is a TCP connection timeout fix in there: 
http://kerneltrap.org/mailarchive/linux-kernel/2008/6/14/2122714

Thomas


* Re: TCP connection stalls under 2.6.24.7
  2008-07-07  9:18               ` Thomas Jarosch
@ 2008-07-07 13:18                 ` Thomas Jarosch
  2008-07-10 13:17                   ` Jozsef Kadlecsik
  0 siblings, 1 reply; 116+ messages in thread
From: Thomas Jarosch @ 2008-07-07 13:18 UTC (permalink / raw)
  To: netdev
  Cc: Jozsef Kadlecsik, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List

Hello everyone,

On Monday, 7. July 2008 11:18:32 you wrote:
> I'll upgrade to 2.6.25.10 and see if it helps,
> there is a TCP connection timeout fix in there:
> http://kerneltrap.org/mailarchive/linux-kernel/2008/6/14/2122714

After upgrading to 2.6.25.10, the TCP connection still stalls.

I temporarily disabled PMTU discovery, TCP window scaling, TCP SACK
and manually forced the MTU to 1400 with no noticeable effect.

I also added an "iptables -I INPUT -s IP.OF.MAIL.RELAY -j ACCEPT"
rule to make doubly sure it's not related to conntrack.

So here are the current results:
- 2.6.23.16: Working
- 2.6.24: Stalling connection
- 2.6.24.7: Stalling connection
- 2.6.25.10: Stalling connection

Attached is a tcpdump of a stalling connection with the
sensitive information replaced by "xxxxx", so please ignore the broken
checksums at the beginning. The dump was created using 2.6.24.7.

Jozsef Kadlecsik suggested this is not related to netfilter,
so I'm now asking for help on netdev.

Here's the text output from tcpdump:
-----------------------------------------------------------
13:40:14.140625 IP linux.53132 > mailserver.smtp: S 943411848:943411848(0) win 5808 <mss 1452,sackOK,timestamp 5386646 0,nop,wscale 2>
13:40:14.206523 IP mailserver.smtp > linux.53132: S 4213328541:4213328541(0) ack 943411849 win 65535 <mss 1400>
13:40:14.206548 IP linux.53132 > mailserver.smtp: . ack 1 win 5808
13:40:14.271316 IP mailserver.smtp > linux.53132: P 1:84(83) ack 1 win 65535
13:40:14.271336 IP linux.53132 > mailserver.smtp: . ack 84 win 5808
13:40:14.271395 IP linux.53132 > mailserver.smtp: P 1:26(25) ack 84 win 5808
13:40:14.341555 IP mailserver.smtp > linux.53132: P 84:257(173) ack 26 win 65535
13:40:14.341737 IP linux.53132 > mailserver.smtp: P 26:38(12) ack 257 win 6432
13:40:14.405342 IP mailserver.smtp > linux.53132: P 257:275(18) ack 38 win 65535
13:40:14.405419 IP linux.53132 > mailserver.smtp: P 38:68(30) ack 275 win 6432
13:40:14.471391 IP mailserver.smtp > linux.53132: P 275:293(18) ack 68 win 65535
13:40:14.471485 IP linux.53132 > mailserver.smtp: P 68:82(14) ack 293 win 6432
13:40:14.535423 IP mailserver.smtp > linux.53132: . ack 82 win 65535
13:40:14.539343 IP mailserver.smtp > linux.53132: P 293:324(31) ack 82 win 65535
13:40:14.539405 IP linux.53132 > mailserver.smtp: P 82:224(142) ack 324 win 6432
13:40:14.619489 IP mailserver.smtp > linux.53132: P 324:553(229) ack 224 win 65535
13:40:14.619633 IP linux.53132 > mailserver.smtp: . 224:1624(1400) ack 553 win 7504
13:40:14.619671 IP linux.53132 > mailserver.smtp: . 1624:3024(1400) ack 553 win 7504
13:40:14.746337 IP mailserver.smtp > linux.53132: . ack 1624 win 65535
13:40:14.746378 IP linux.53132 > mailserver.smtp: . 3024:4424(1400) ack 553 win 7504
13:40:14.746414 IP linux.53132 > mailserver.smtp: . 4424:5824(1400) ack 553 win 7504
13:40:14.863352 IP mailserver.smtp > linux.53132: . ack 4424 win 65535
13:40:14.863381 IP linux.53132 > mailserver.smtp: . 5824:7224(1400) ack 553 win 7504
13:40:14.863412 IP linux.53132 > mailserver.smtp: . 7224:8624(1400) ack 553 win 7504
13:40:14.888119 IP linux.53132 > mailserver.smtp: . 8624:10024(1400) ack 553 win 7504
13:40:14.955509 IP mailserver.smtp > linux.53132: . ack 5824 win 65535
13:40:14.955539 IP linux.53132 > mailserver.smtp: . 10024:11424(1400) ack 553 win 7504
13:40:14.955569 IP linux.53132 > mailserver.smtp: P 11424:12512(1088) ack 553 win 7504
13:40:15.048337 IP mailserver.smtp > linux.53132: . ack 8624 win 65535
13:40:15.048365 IP linux.53132 > mailserver.smtp: . 12512:13912(1400) ack 553 win 7504
13:40:15.048397 IP linux.53132 > mailserver.smtp: . 13912:15312(1400) ack 553 win 7504
13:40:15.073100 IP linux.53132 > mailserver.smtp: . 15312:16712(1400) ack 553 win 7504
13:40:15.165394 IP mailserver.smtp > linux.53132: . ack 10024 win 65535
13:40:15.165422 IP linux.53132 > mailserver.smtp: . 16712:18112(1400) ack 553 win 7504
13:40:15.165452 IP linux.53132 > mailserver.smtp: . 18112:19512(1400) ack 553 win 7504
13:40:15.271312 IP mailserver.smtp > linux.53132: . ack 13912 win 65535
13:40:15.271343 IP linux.53132 > mailserver.smtp: P 19512:20704(1192) ack 553 win 7504
13:40:15.271386 IP linux.53132 > mailserver.smtp: . 20704:22104(1400) ack 553 win 7504
13:40:15.296088 IP linux.53132 > mailserver.smtp: . 22104:23504(1400) ack 553 win 7504
13:40:15.320793 IP linux.53132 > mailserver.smtp: . 23504:24904(1400) ack 553 win 7504
13:40:15.375251 IP mailserver.smtp > linux.53132: . ack 15312 win 65535
13:40:15.375273 IP linux.53132 > mailserver.smtp: . 24904:26304(1400) ack 553 win 7504
13:40:15.375303 IP linux.53132 > mailserver.smtp: . 26304:27704(1400) ack 553 win 7504
13:40:15.447472 IP mailserver.smtp > linux.53132: . ack 18112 win 65535
13:40:15.447524 IP linux.53132 > mailserver.smtp: . 27704:29104(1400) ack 553 win 7504
13:40:15.447559 IP linux.53132 > mailserver.smtp: . 29104:30504(1400) ack 553 win 7504
13:40:15.472265 IP linux.53132 > mailserver.smtp: . 30504:31904(1400) ack 553 win 7504
13:40:15.585446 IP mailserver.smtp > linux.53132: . ack 20704 win 65535
13:40:15.585487 IP linux.53132 > mailserver.smtp: P 31904:32992(1088) ack 553 win 7504
13:40:15.585614 IP linux.53132 > mailserver.smtp: . 32992:34392(1400) ack 553 win 7504
13:40:15.610316 IP linux.53132 > mailserver.smtp: . 34392:35792(1400) ack 553 win 7504
13:40:15.677292 IP mailserver.smtp > linux.53132: . ack 23504 win 65535
13:40:15.677313 IP linux.53132 > mailserver.smtp: . 35792:37192(1400) ack 553 win 7504
13:40:15.677342 IP linux.53132 > mailserver.smtp: . 37192:38592(1400) ack 553 win 7504
13:40:15.702048 IP linux.53132 > mailserver.smtp: . 38592:39992(1400) ack 553 win 7504
13:40:15.796288 IP mailserver.smtp > linux.53132: . ack 24904 win 65535
13:40:15.796314 IP linux.53132 > mailserver.smtp: . 39992:41392(1400) ack 553 win 7504
13:40:15.796350 IP linux.53132 > mailserver.smtp: . 41392:42792(1400) ack 553 win 7504
13:40:15.856442 IP mailserver.smtp > linux.53132: . ack 27704 win 65535
13:40:15.856470 IP linux.53132 > mailserver.smtp: . 42792:44192(1400) ack 553 win 7504
13:40:15.856515 IP linux.53132 > mailserver.smtp: . 44192:45592(1400) ack 553 win 7504
13:40:15.881218 IP linux.53132 > mailserver.smtp: . 45592:46992(1400) ack 553 win 7504
13:40:15.977365 IP mailserver.smtp > linux.53132: . ack 30504 win 65535
13:40:15.977389 IP linux.53132 > mailserver.smtp: . 46992:48392(1400) ack 553 win 7504
13:40:16.001505 IP linux.53132 > mailserver.smtp: . 48392:49792(1400) ack 553 win 7504
13:40:16.001534 IP linux.53132 > mailserver.smtp: . 49792:51192(1400) ack 553 win 7504
13:40:16.141214 IP mailserver.smtp > linux.53132: . ack 34392 win 65535
13:40:16.141249 IP linux.53132 > mailserver.smtp: . 51192:52592(1400) ack 553 win 7504
13:40:16.141280 IP linux.53132 > mailserver.smtp: . 52592:53992(1400) ack 553 win 7504
13:40:16.165987 IP linux.53132 > mailserver.smtp: . 53992:55392(1400) ack 553 win 7504
13:40:16.190691 IP linux.53132 > mailserver.smtp: . 55392:56792(1400) ack 553 win 7504
13:40:16.215342 IP mailserver.smtp > linux.53132: . ack 35792 win 65535
13:40:16.215393 IP linux.53132 > mailserver.smtp: . 56792:58192(1400) ack 553 win 7504
13:40:16.240096 IP linux.53132 > mailserver.smtp: . 58192:59592(1400) ack 553 win 7504
13:40:16.329180 IP mailserver.smtp > linux.53132: . ack 38592 win 65535
13:40:16.329220 IP linux.53132 > mailserver.smtp: . 59592:60992(1400) ack 553 win 7504
13:40:16.329255 IP linux.53132 > mailserver.smtp: P 60992:61664(672) ack 553 win 7504
13:40:16.341471 IP linux.53132 > mailserver.smtp: . 61664:63064(1400) ack 553 win 7504
13:40:16.425284 IP mailserver.smtp > linux.53132: . ack 39992 win 65535
13:40:16.425322 IP linux.53132 > mailserver.smtp: . 63064:64464(1400) ack 553 win 7504
13:40:16.425357 IP linux.53132 > mailserver.smtp: . 64464:65864(1400) ack 553 win 7504
13:40:16.505348 IP mailserver.smtp > linux.53132: . ack 42792 win 65535
13:40:16.505387 IP linux.53132 > mailserver.smtp: . 65864:67264(1400) ack 553 win 7504
13:40:16.505420 IP linux.53132 > mailserver.smtp: . 67264:68664(1400) ack 553 win 7504
13:40:16.530126 IP linux.53132 > mailserver.smtp: . 68664:70064(1400) ack 553 win 7504
13:40:16.622359 IP mailserver.smtp > linux.53132: . ack 45592 win 65535
13:40:16.622387 IP linux.53132 > mailserver.smtp: . 70064:71464(1400) ack 553 win 7504
13:40:16.622417 IP linux.53132 > mailserver.smtp: . 71464:72864(1400) ack 553 win 7504
13:40:16.647124 IP linux.53132 > mailserver.smtp: . 72864:74264(1400) ack 553 win 7504
13:40:16.751201 IP mailserver.smtp > linux.53132: . ack 48392 win 65535
13:40:16.751228 IP linux.53132 > mailserver.smtp: . 74264:75664(1400) ack 553 win 7504
13:40:16.751259 IP linux.53132 > mailserver.smtp: . 75664:77064(1400) ack 553 win 7504
13:40:16.775965 IP linux.53132 > mailserver.smtp: . 77064:78464(1400) ack 553 win 7504
13:40:16.840381 IP mailserver.smtp > linux.53132: . ack 49792 win 65535
13:40:16.840419 IP linux.53132 > mailserver.smtp: . 78464:79864(1400) ack 553 win 7504
13:40:16.840450 IP linux.53132 > mailserver.smtp: . 79864:81264(1400) ack 553 win 7504
13:40:16.927375 IP mailserver.smtp > linux.53132: . ack 52592 win 65535
13:40:16.927401 IP linux.53132 > mailserver.smtp: . 81264:82664(1400) ack 553 win 7504
13:40:16.927433 IP linux.53132 > mailserver.smtp: . 82664:84064(1400) ack 553 win 7504
13:40:16.952139 IP linux.53132 > mailserver.smtp: . 84064:85464(1400) ack 553 win 7504
13:40:17.045338 IP mailserver.smtp > linux.53132: . ack 55392 win 65535
13:40:17.045374 IP linux.53132 > mailserver.smtp: . 85464:86864(1400) ack 553 win 7504
13:40:17.045406 IP linux.53132 > mailserver.smtp: . 86864:88264(1400) ack 553 win 7504
13:40:17.070113 IP linux.53132 > mailserver.smtp: . 88264:89664(1400) ack 553 win 7504
13:40:17.162120 IP mailserver.smtp > linux.53132: . ack 58192 win 65535
13:40:17.162148 IP linux.53132 > mailserver.smtp: . 89664:91064(1400) ack 553 win 7504
13:40:17.162179 IP linux.53132 > mailserver.smtp: . 91064:92464(1400) ack 553 win 7504
13:40:17.186886 IP linux.53132 > mailserver.smtp: . 92464:93864(1400) ack 553 win 7504
13:40:17.255239 IP mailserver.smtp > linux.53132: . ack 59592 win 65535
13:40:17.255268 IP linux.53132 > mailserver.smtp: . 93864:95264(1400) ack 553 win 7504
13:40:17.255298 IP linux.53132 > mailserver.smtp: . 95264:96664(1400) ack 553 win 7504
13:40:17.368334 IP mailserver.smtp > linux.53132: . ack 63064 win 65535
13:40:17.368390 IP linux.53132 > mailserver.smtp: . 96664:98064(1400) ack 553 win 7504
13:40:17.368423 IP linux.53132 > mailserver.smtp: P 98064:98528(464) ack 553 win 7504
13:40:17.377076 IP linux.53132 > mailserver.smtp: . 98528:99928(1400) ack 553 win 7504
13:40:17.401781 IP linux.53132 > mailserver.smtp: . 99928:101328(1400) ack 553 win 7504
13:40:17.465163 IP mailserver.smtp > linux.53132: . ack 64464 win 65535
13:40:17.465230 IP linux.53132 > mailserver.smtp: . 101328:102728(1400) ack 553 win 7504
13:40:17.465265 IP linux.53132 > mailserver.smtp: . 102728:104128(1400) ack 553 win 7504
13:40:17.544242 IP mailserver.smtp > linux.53132: . ack 67264 win 65535
13:40:17.544272 IP linux.53132 > mailserver.smtp: . 104128:105528(1400) ack 553 win 7504
13:40:17.544303 IP linux.53132 > mailserver.smtp: . 105528:106928(1400) ack 553 win 7504
13:40:17.569011 IP linux.53132 > mailserver.smtp: . 106928:108328(1400) ack 553 win 7504
13:40:17.661252 IP mailserver.smtp > linux.53132: . ack 70064 win 65535
13:40:17.661289 IP linux.53132 > mailserver.smtp: . 108328:109728(1400) ack 553 win 7504
13:40:17.661320 IP linux.53132 > mailserver.smtp: . 109728:111128(1400) ack 553 win 7504
13:40:17.686027 IP linux.53132 > mailserver.smtp: . 111128:112528(1400) ack 553 win 7504
13:40:17.792315 IP mailserver.smtp > linux.53132: . ack 72864 win 65535
13:40:17.792346 IP linux.53132 > mailserver.smtp: . 112528:113928(1400) ack 553 win 7504
13:40:17.792377 IP linux.53132 > mailserver.smtp: . 113928:115328(1400) ack 553 win 7504
13:40:17.817082 IP linux.53132 > mailserver.smtp: . 115328:116728(1400) ack 553 win 7504
13:40:17.875197 IP mailserver.smtp > linux.53132: . ack 74264 win 65535
13:40:17.875215 IP linux.53132 > mailserver.smtp: . 116728:118128(1400) ack 553 win 7504
13:40:17.899923 IP linux.53132 > mailserver.smtp: . 118128:119528(1400) ack 553 win 7504
13:40:17.980287 IP mailserver.smtp > linux.53132: . ack 77064 win 65535
13:40:17.980334 IP linux.53132 > mailserver.smtp: . 119528:120928(1400) ack 553 win 7504
13:40:17.980365 IP linux.53132 > mailserver.smtp: . 120928:122328(1400) ack 553 win 7504
13:40:18.005072 IP linux.53132 > mailserver.smtp: . 122328:123728(1400) ack 553 win 7504
13:40:18.085234 IP mailserver.smtp > linux.53132: . ack 78464 win 65535
13:40:18.085265 IP linux.53132 > mailserver.smtp: . 123728:125128(1400) ack 553 win 7504
13:40:18.085295 IP linux.53132 > mailserver.smtp: . 125128:126528(1400) ack 553 win 7504
13:40:18.156177 IP mailserver.smtp > linux.53132: . ack 81264 win 65535
13:40:18.156206 IP linux.53132 > mailserver.smtp: . 126528:127928(1400) ack 553 win 7504
13:40:18.156237 IP linux.53132 > mailserver.smtp: . 127928:129328(1400) ack 553 win 7504
13:40:18.180942 IP linux.53132 > mailserver.smtp: . 129328:130728(1400) ack 553 win 7504
13:40:18.274172 IP mailserver.smtp > linux.53132: . ack 84064 win 65535
13:40:18.274216 IP linux.53132 > mailserver.smtp: . 130728:132128(1400) ack 553 win 7504
13:40:18.274248 IP linux.53132 > mailserver.smtp: . 132128:133528(1400) ack 553 win 7504
13:40:18.298950 IP linux.53132 > mailserver.smtp: . 133528:134928(1400) ack 553 win 7504
13:40:18.390240 IP mailserver.smtp > linux.53132: . ack 86864 win 65535
13:40:18.390279 IP linux.53132 > mailserver.smtp: . 134928:136328(1400) ack 553 win 7504
13:40:18.390310 IP linux.53132 > mailserver.smtp: . 136328:137728(1400) ack 553 win 7504
13:40:18.415017 IP linux.53132 > mailserver.smtp: . 137728:139128(1400) ack 553 win 7504
13:40:18.495173 IP mailserver.smtp > linux.53132: . ack 88264 win 65535
13:40:18.495211 IP linux.53132 > mailserver.smtp: . 139128:140528(1400) ack 553 win 7504
13:40:18.495241 IP linux.53132 > mailserver.smtp: . 140528:141928(1400) ack 553 win 7504
13:40:18.684146 IP mailserver.smtp > linux.53132: . ack 93864 win 65535
13:40:18.684178 IP linux.53132 > mailserver.smtp: . 141928:143328(1400) ack 553 win 7504
13:40:18.684209 IP linux.53132 > mailserver.smtp: . 143328:144728(1400) ack 553 win 7504
13:40:18.708919 IP linux.53132 > mailserver.smtp: . 144728:146128(1400) ack 553 win 7504
13:40:18.733633 IP linux.53132 > mailserver.smtp: . 146128:147528(1400) ack 553 win 7504
13:40:18.758340 IP linux.53132 > mailserver.smtp: . 147528:148928(1400) ack 553 win 7504
13:40:18.801152 IP mailserver.smtp > linux.53132: . ack 96664 win 65535
13:40:18.801206 IP linux.53132 > mailserver.smtp: . 148928:150328(1400) ack 553 win 7504
13:40:18.801236 IP linux.53132 > mailserver.smtp: . 150328:151728(1400) ack 553 win 7504
13:40:18.825942 IP linux.53132 > mailserver.smtp: . 151728:153128(1400) ack 553 win 7504
13:40:18.916231 IP mailserver.smtp > linux.53132: . ack 98528 win 65535
13:40:18.916287 IP linux.53132 > mailserver.smtp: . 153128:154528(1400) ack 553 win 7504
13:40:18.916320 IP linux.53132 > mailserver.smtp: . 154528:155928(1400) ack 553 win 7504
13:40:18.941026 IP linux.53132 > mailserver.smtp: . 155928:157328(1400) ack 553 win 7504
13:40:19.000201 IP mailserver.smtp > linux.53132: . ack 101328 win 65535
13:40:19.000240 IP linux.53132 > mailserver.smtp: . 157328:158728(1400) ack 553 win 7504
13:40:19.000271 IP linux.53132 > mailserver.smtp: . 158728:160128(1400) ack 553 win 7504
13:40:19.024978 IP linux.53132 > mailserver.smtp: . 160128:161528(1400) ack 553 win 7504
13:40:19.118224 IP mailserver.smtp > linux.53132: . ack 104128 win 65535
13:40:19.118256 IP linux.53132 > mailserver.smtp: . 161528:162928(1400) ack 553 win 7504
13:40:19.118286 IP linux.53132 > mailserver.smtp: . 162928:164328(1400) ack 553 win 7504
13:40:19.142994 IP linux.53132 > mailserver.smtp: . 164328:165728(1400) ack 553 win 7504
13:40:19.235024 IP mailserver.smtp > linux.53132: . ack 106928 win 65535
13:40:19.235100 IP linux.53132 > mailserver.smtp: . 165728:167128(1400) ack 553 win 7504
13:40:19.235135 IP linux.53132 > mailserver.smtp: . 167128:168528(1400) ack 553 win 7504
13:40:19.256950 IP linux.53132 > mailserver.smtp: . 168528:169928(1400) ack 553 win 7504
13:40:19.327125 IP mailserver.smtp > linux.53132: . ack 108328 win 65535
13:40:19.327174 IP linux.53132 > mailserver.smtp: . 169928:171328(1400) ack 553 win 7504
13:40:19.327205 IP linux.53132 > mailserver.smtp: . 171328:172728(1400) ack 553 win 7504
13:40:19.411138 IP mailserver.smtp > linux.53132: . ack 111128 win 65535
13:40:19.411173 IP linux.53132 > mailserver.smtp: . 172728:174128(1400) ack 553 win 7504
13:40:19.411205 IP linux.53132 > mailserver.smtp: . 174128:175528(1400) ack 553 win 7504
13:40:19.435913 IP linux.53132 > mailserver.smtp: P 175528:176352(824) ack 553 win 7504
13:40:19.528188 IP mailserver.smtp > linux.53132: . ack 113928 win 65535
13:40:19.528224 IP linux.53132 > mailserver.smtp: . 176352:177752(1400) ack 553 win 7504
13:40:19.528258 IP linux.53132 > mailserver.smtp: . 177752:179152(1400) ack 553 win 7504
13:40:19.646160 IP mailserver.smtp > linux.53132: . ack 116728 win 65535
13:40:19.646199 IP linux.53132 > mailserver.smtp: . 179152:180552(1400) ack 553 win 7504
13:40:19.646233 IP linux.53132 > mailserver.smtp: . 180552:181952(1400) ack 553 win 7504
13:40:19.803080 IP mailserver.smtp > linux.53132: . ack 119528 win 65535
13:40:19.803106 IP linux.53132 > mailserver.smtp: . 181952:183352(1400) ack 553 win 7504
13:40:19.803139 IP linux.53132 > mailserver.smtp: . 183352:184752(1400) ack 553 win 7504
13:40:19.920136 IP mailserver.smtp > linux.53132: . ack 122328 win 65535
13:40:19.920185 IP linux.53132 > mailserver.smtp: . 184752:186152(1400) ack 553 win 7504
13:40:19.920218 IP linux.53132 > mailserver.smtp: . 186152:187552(1400) ack 553 win 7504
13:40:20.037145 IP mailserver.smtp > linux.53132: . ack 125128 win 65535
13:40:20.037176 IP linux.53132 > mailserver.smtp: . 187552:188952(1400) ack 553 win 7504
13:40:20.037209 IP linux.53132 > mailserver.smtp: . 188952:190352(1400) ack 553 win 7504
13:40:20.153935 IP mailserver.smtp > linux.53132: . ack 126528 win 65535
13:40:20.153966 IP linux.53132 > mailserver.smtp: . 190352:191752(1400) ack 553 win 7504
13:40:20.213044 IP mailserver.smtp > linux.53132: . ack 129328 win 65535
13:40:20.213063 IP linux.53132 > mailserver.smtp: . 191752:193152(1400) ack 553 win 7504
13:40:20.213093 IP linux.53132 > mailserver.smtp: . 193152:194552(1400) ack 553 win 7504
13:40:20.331045 IP mailserver.smtp > linux.53132: . ack 132128 win 65535
13:40:20.331106 IP linux.53132 > mailserver.smtp: . 194552:195952(1400) ack 553 win 7504
13:40:20.331141 IP linux.53132 > mailserver.smtp: . 195952:197352(1400) ack 553 win 7504
13:40:20.448086 IP mailserver.smtp > linux.53132: . ack 134928 win 65535
13:40:20.448153 IP linux.53132 > mailserver.smtp: . 197352:198752(1400) ack 553 win 7504
13:40:20.448188 IP linux.53132 > mailserver.smtp: . 198752:200152(1400) ack 553 win 7504
13:40:20.565142 IP mailserver.smtp > linux.53132: . ack 136328 win 65535
13:40:20.565178 IP linux.53132 > mailserver.smtp: . 200152:201552(1400) ack 553 win 7504
13:40:20.627890 IP mailserver.smtp > linux.53132: . ack 139128 win 65535
13:40:20.627916 IP linux.53132 > mailserver.smtp: . 201552:202952(1400) ack 553 win 7504
13:40:20.627949 IP linux.53132 > mailserver.smtp: . 202952:204352(1400) ack 553 win 7504
13:40:23.945532 IP linux.53132 > mailserver.smtp: . 139128:140528(1400) ack 553 win 7504
13:40:24.124744 IP mailserver.smtp > linux.53132: . ack 140528 win 65535
13:40:24.124779 IP linux.53132 > mailserver.smtp: . 204352:205752(1400) ack 553 win 7504
13:40:30.761559 IP linux.53132 > mailserver.smtp: . 140528:141928(1400) ack 553 win 7504
13:40:30.879206 IP mailserver.smtp > linux.53132: . ack 147528 win 65535
13:40:30.879244 IP linux.53132 > mailserver.smtp: . 205752:207152(1400) ack 553 win 7504
13:40:30.879274 IP linux.53132 > mailserver.smtp: . 207152:208552(1400) ack 553 win 7504
13:40:44.157537 IP linux.53132 > mailserver.smtp: . 147528:148928(1400) ack 553 win 7504
13:40:44.277506 IP mailserver.smtp > linux.53132: . ack 150328 win 65535
13:40:44.277546 IP linux.53132 > mailserver.smtp: . 208552:209952(1400) ack 553 win 7504
13:40:44.277579 IP linux.53132 > mailserver.smtp: . 209952:211352(1400) ack 553 win 7504
13:41:10.837536 IP linux.53132 > mailserver.smtp: . 150328:151728(1400) ack 553 win 7504
13:41:10.955575 IP mailserver.smtp > linux.53132: . ack 154528 win 65535
13:41:10.955610 IP linux.53132 > mailserver.smtp: . 211352:212752(1400) ack 553 win 7504
13:41:10.955642 IP linux.53132 > mailserver.smtp: . 212752:214152(1400) ack 553 win 7504
13:42:04.073557 IP linux.53132 > mailserver.smtp: . 154528:155928(1400) ack 553 win 7504
13:42:04.198891 IP mailserver.smtp > linux.53132: . ack 155928 win 65535
13:42:04.198938 IP linux.53132 > mailserver.smtp: . 214152:215552(1400) ack 553 win 7504
13:42:04.198970 IP linux.53132 > mailserver.smtp: . 215552:216952(1400) ack 553 win 7504
13:43:50.437541 IP linux.53132 > mailserver.smtp: . 155928:157328(1400) ack 553 win 7504
13:43:50.696615 IP mailserver.smtp > linux.53132: . ack 157328 win 65535
13:43:50.696641 IP linux.53132 > mailserver.smtp: . 216952:218352(1400) ack 553 win 7504
13:43:50.696671 IP linux.53132 > mailserver.smtp: . 218352:219752(1400) ack 553 win 7504
13:44:51.681540 IP mailserver.smtp > linux.41085: P 3630759848:3630759915(67) ack 1371960018 win 65535
13:44:51.681568 IP linux.41085 > mailserver.smtp: R 1371960018:1371960018(0) win 0
13:44:51.681583 IP mailserver.smtp > linux.41085: F 67:67(0) ack 1 win 65535
13:44:51.681594 IP linux.41085 > mailserver.smtp: R 1371960018:1371960018(0) win 0
-----------------------------------------------------------

It just looks like some ACKs never made it to the linux box.
Any idea how I can further troubleshoot the stalling connection?
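
One way to narrow this down is to let a script pick the retransmissions out of the text dump instead of reading it by eye. The following is only a minimal sketch: it assumes the older tcpdump text format shown above (`start:end(len)`), and the function name `find_retransmissions` and the regular expression are made up for illustration. A data segment is treated as a retransmission when it ends at or below the highest sequence number already sent.

```python
import re

# Matches data segments in the "start:end(len)" form of the text dump above.
SEG = re.compile(r"^(\S+) IP (\S+) > (\S+): \S+ (\d+):(\d+)\((\d+)\)")

def find_retransmissions(lines, sender):
    """Return (timestamp, start, end) for segments from `sender` that
    end at or below the highest sequence number already sent."""
    highest_end = 0
    hits = []
    for line in lines:
        m = SEG.match(line)
        if not m or not m.group(2).startswith(sender):
            continue
        if int(m.group(6)) == 0:
            continue  # skip SYN/FIN/RST lines, which carry no payload
        ts, start, end = m.group(1), int(m.group(4)), int(m.group(5))
        if end <= highest_end:
            hits.append((ts, start, end))
        highest_end = max(highest_end, end)
    return hits

# A few lines copied from the dump above, around the start of the stall.
SAMPLE = """\
13:40:20.627916 IP linux.53132 > mailserver.smtp: . 201552:202952(1400) ack 553 win 7504
13:40:20.627949 IP linux.53132 > mailserver.smtp: . 202952:204352(1400) ack 553 win 7504
13:40:23.945532 IP linux.53132 > mailserver.smtp: . 139128:140528(1400) ack 553 win 7504
13:40:24.124744 IP mailserver.smtp > linux.53132: . ack 140528 win 65535
13:40:24.124779 IP linux.53132 > mailserver.smtp: . 204352:205752(1400) ack 553 win 7504
13:40:30.761559 IP linux.53132 > mailserver.smtp: . 140528:141928(1400) ack 553 win 7504
""".splitlines()

retransmissions = find_retransmissions(SAMPLE, "linux.")
for ts, start, end in retransmissions:
    print(ts, start, end)
```

On these sample lines it flags the segments sent at 13:40:23.945532 and 13:40:30.761559. Running the same check on a capture taken at the mailserver side (if one can be obtained) would show whether the original segments or the ACKs were the ones that went missing.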

Please CC: comments, I'm only on netfilter-devel.

Thanks in advance,
Thomas

[-- Attachment #2: smtp.tcpdump.bz2 --]
[-- Type: application/x-bzip2, Size: 173068 bytes --]

* Re: TCP connection stalls under 2.6.24.7
  2008-07-07 13:18                 ` Thomas Jarosch
@ 2008-07-10 13:17                   ` Jozsef Kadlecsik
  2008-07-10 14:12                     ` Thomas Jarosch
  0 siblings, 1 reply; 116+ messages in thread
From: Jozsef Kadlecsik @ 2008-07-10 13:17 UTC (permalink / raw)
  To: Thomas Jarosch
  Cc: netdev, Patrick McHardy, Sven Riedel, Netfilter Developer Mailing List

On Mon, 7 Jul 2008, Thomas Jarosch wrote:

> On Monday, 7. July 2008 11:18:32 you wrote:
> > I'll upgrade to 2.6.25.10 and see if it helps,
> > there is a TCP connection timeout fix in there:
> > http://kerneltrap.org/mailarchive/linux-kernel/2008/6/14/2122714
> 
> After upgrading to 2.6.25.10, the TCP connection still stalls.
> 
> I temporarily disabled PMTU discovery, TCP window scaling, TCP SACK
> and manually forced the MTU to 1400 with no noticeable effect.
> 
> I also added an "iptables -I INPUT -s IP.OF.MAIL.RELAY -j ACCEPT"
> to make sure it's not related to conntrack on the double.
> 
> So here are the current results:
> - 2.6.23.16: Working
> - 2.6.24: Stalling connection
> - 2.6.24.7: Stalling connection
> - 2.6.25.10: Stalling connection
> 
> Attached is a tcpdump of a stalling connection with the
> sensitive information replaced by "xxxxx", so please ignore the broken
> checksums at the beginning. The dump was created using 2.6.24.7.
> 
> Jozsef Kadlecsik suggested this is not related to netfilter,
> so I'm now asking for help on netdev.
> 
> Here's the text output from tcpdump:
> -----------------------------------------------------------
> 13:40:14.140625 IP linux.53132 > mailserver.smtp: S 943411848:943411848(0) win 5808 <mss 1452,sackOK,timestamp 5386646 0,nop,wscale 2>
> 13:40:14.206523 IP mailserver.smtp > linux.53132: S 4213328541:4213328541(0) ack 943411849 win 65535 <mss 1400>
[...]
> 13:42:04.198938 IP linux.53132 > mailserver.smtp: . 214152:215552(1400) ack 553 win 7504
> 13:42:04.198970 IP linux.53132 > mailserver.smtp: . 215552:216952(1400) ack 553 win 7504
> 13:43:50.437541 IP linux.53132 > mailserver.smtp: . 155928:157328(1400) ack 553 win 7504
> 13:43:50.696615 IP mailserver.smtp > linux.53132: . ack 157328 win 65535
> 13:43:50.696641 IP linux.53132 > mailserver.smtp: . 216952:218352(1400) ack 553 win 7504
> 13:43:50.696671 IP linux.53132 > mailserver.smtp: . 218352:219752(1400) ack 553 win 7504

It looks as if the SMTP server is receiving the packets slowly and is simply 
lagging behind the client. There are no more packets to/from port 53132 in 
the tcpdump.

> 13:44:51.681540 IP mailserver.smtp > linux.41085: P 3630759848:3630759915(67) ack 1371960018 win 65535
> 13:44:51.681568 IP linux.41085 > mailserver.smtp: R 1371960018:1371960018(0) win 0
> 13:44:51.681583 IP mailserver.smtp > linux.41085: F 67:67(0) ack 1 win 65535
> 13:44:51.681594 IP linux.41085 > mailserver.smtp: R 1371960018:1371960018(0) win 0

But the first packet above from the server looks just wrong in the 
context: the port of the client "changed". This is why the client sends 
the RST packet back as there's no such TCP connection there.
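
The RST in the trace can be checked against the reset rule in RFC 793: when a segment carrying an ACK arrives for a connection that does not exist, the reset takes its sequence number from that ACK field. The sketch below is a simplified model of that rule, not kernel code. The `Segment` and `Reset` types and the port-only lookup are assumptions made for illustration (a real stack matches the full address/port 4-tuple).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Segment:
    dst_port: int
    seq: int
    length: int
    ack: Optional[int]  # None when the ACK flag is not set

@dataclass(frozen=True)
class Reset:
    seq: int
    ack: Optional[int]

def reply_for(segment, local_ports_in_use):
    """Simplified RFC 793 handling of a segment for a non-existent connection."""
    if segment.dst_port in local_ports_in_use:
        return None  # belongs to a known connection: no reset
    if segment.ack is not None:
        # Reset takes its sequence number from the incoming ACK field.
        return Reset(seq=segment.ack, ack=None)
    # No ACK field: sequence number zero, ACK covers the incoming segment.
    return Reset(seq=0, ack=segment.seq + segment.length)

# The 13:44:51.681540 packet from the mailserver, addressed to port 41085,
# while the only connection known to the client uses port 53132.
stray = Segment(dst_port=41085, seq=3630759848, length=67, ack=1371960018)
reply = reply_for(stray, local_ports_in_use={53132})
print(reply)
```

The resulting reset carries sequence number 1371960018, the same value as in the two RST lines quoted above.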

Makes no sense at all...

[Wild guessing: broken virtualized SMTP server migration?]

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
          H-1525 Budapest 114, POB. 49, Hungary

* Re: TCP connection stalls under 2.6.24.7
  2008-07-10 13:17                   ` Jozsef Kadlecsik
@ 2008-07-10 14:12                     ` Thomas Jarosch
  2008-07-10 21:21                       ` Jozsef Kadlecsik
  0 siblings, 1 reply; 116+ messages in thread
From: Thomas Jarosch @ 2008-07-10 14:12 UTC (permalink / raw)
  To: Jozsef Kadlecsik
  Cc: netdev, Patrick McHardy, Sven Riedel, Netfilter Developer Mailing List

On Thursday, 10. July 2008 15:17:53 you wrote:
> It looks as if the SMTP server is receiving the packets slowly and is simply
> lagging behind the client. There are no more packets to/from port 53132 in
> the tcpdump.

Thanks for looking into this, Jozsef. If you take a look at the timing 
information, the connection had already been running for ~270 seconds without 
any real data transfer. The mailserver then aborts the SMTP connection with the 
error msg: "421 mailbackup.webpage.t-com.de Lost connection to [217.85.147.6]"
Only after that does the port "change" to the wrong one.

The intervals between the retransmissions seem to grow
rapidly after the first retransmission.
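
To put numbers on this, the gaps between the six retransmitted segments in the 2.6.24.7 trace can be computed directly. This is a small sketch; the timestamps are copied from the trace, while the helper name `gaps_seconds` is made up for illustration.

```python
from datetime import datetime

# Timestamps of the retransmitted segments in the 2.6.24.7 trace.
RETRANSMIT_TIMES = [
    "13:40:23.945532",
    "13:40:30.761559",
    "13:40:44.157537",
    "13:41:10.837536",
    "13:42:04.073557",
    "13:43:50.437541",
]

def gaps_seconds(stamps):
    """Seconds between consecutive HH:MM:SS.ffffff timestamps."""
    parsed = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    return [(b - a).total_seconds() for a, b in zip(parsed, parsed[1:])]

gaps = gaps_seconds(RETRANSMIT_TIMES)
ratios = [later / earlier for earlier, later in zip(gaps, gaps[1:])]

for gap in gaps:
    print(f"{gap:8.3f} s")
print([round(r, 2) for r in ratios])
```

The gaps come out at roughly 6.8, 13.4, 26.7, 53.2 and 106.4 seconds, so each one is about twice the previous. That pattern is what exponential backoff of the TCP retransmission timer would be expected to produce. It describes how the sender reacts to the loss and says nothing yet about why the segments or ACKs go missing in the first place.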

The linux box is connected via a mostly idle 2 Mbit/s SDSL line, and the
mailserver is located at the provider. So in theory this shouldn't be slow at all.
That 2.6.23.17 works without trouble supports this.

As noted before, small mails below ~220 kB always seem to get through.
Is there any feature in TCP that could trigger such behaviour?
This smells like some queue getting full. I'll double-check
that there is no traffic shaping in place.
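
One quantity in the trace that can be checked against the "queue getting full" hunch is the amount of unacknowledged data compared with the window the mailserver advertises (65535 bytes; its SYN-ACK carries no window scaling option). The sketch below replays the last lines before the first retransmission. The regular expressions and the function name `in_flight` are assumptions made for illustration, and the result only describes the sender's state at that moment, not the root cause.

```python
import re

# Data segments from the client and pure ACKs from the mailserver,
# in the text format of the dumps in this thread.
DATA = re.compile(r"IP linux\.\d+ > mailserver\.smtp: \S+ \d+:(\d+)\(\d+\)")
ACK = re.compile(r"IP mailserver\.smtp > linux\.\d+: \S+ ack (\d+) win (\d+)")

def in_flight(lines):
    """Return (unacknowledged bytes, last advertised window)."""
    highest_sent = highest_acked = window = 0
    for line in lines:
        m = DATA.search(line)
        if m:
            highest_sent = max(highest_sent, int(m.group(1)))
            continue
        m = ACK.search(line)
        if m:
            highest_acked = max(highest_acked, int(m.group(1)))
            window = int(m.group(2))
    return highest_sent - highest_acked, window

# Last lines of the 2.6.24.7 trace before the first retransmission.
SAMPLE = """\
13:40:20.565142 IP mailserver.smtp > linux.53132: . ack 136328 win 65535
13:40:20.565178 IP linux.53132 > mailserver.smtp: . 200152:201552(1400) ack 553 win 7504
13:40:20.627890 IP mailserver.smtp > linux.53132: . ack 139128 win 65535
13:40:20.627916 IP linux.53132 > mailserver.smtp: . 201552:202952(1400) ack 553 win 7504
13:40:20.627949 IP linux.53132 > mailserver.smtp: . 202952:204352(1400) ack 553 win 7504
""".splitlines()

outstanding, window = in_flight(SAMPLE)
print(outstanding, window, window - outstanding)
```

At that point 65224 bytes are outstanding against a 65535-byte window, which leaves less room than one further 1400-byte segment. So the sender appears to be limited by the receiver's window right before the stall, with the ACKs lagging further and further behind the data.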

> > 13:44:51.681540 IP mailserver.smtp > linux.41085: P
> ...
>
> But the first packet above from the server looks just wrong in the
> context: the port of the client "changed". This is why the client sends
> the RST packet back as there's no such TCP connection there.
>
> Makes no sense at all...
>
> [Wild guessing: broken virtualized SMTP server migration?]

Oh, that really looks strange. Maybe the error handling
of the server/load balancer/whatever is broken.

With the Fritz!Box router in between, things worked fine for a day, but now we
just had another mail stuck in the queue. So it seems to mitigate the problem
a bit, but does not solve it.

Thomas

* Re: TCP connection stalls under 2.6.24.7
  2008-07-10 14:12                     ` Thomas Jarosch
@ 2008-07-10 21:21                       ` Jozsef Kadlecsik
  2008-07-11 14:33                         ` Thomas Jarosch
  0 siblings, 1 reply; 116+ messages in thread
From: Jozsef Kadlecsik @ 2008-07-10 21:21 UTC (permalink / raw)
  To: Thomas Jarosch
  Cc: netdev, Patrick McHardy, Sven Riedel, Netfilter Developer Mailing List

On Thu, 10 Jul 2008, Thomas Jarosch wrote:

> On Thursday, 10. July 2008 15:17:53 you wrote:
> > It looks as if the SMTP server is receiving the packets slowly and is simply
> > lagging behind the client. There are no more packets to/from port 53132 in
> > the tcpdump.
> 
> Thanks for looking into this, Jozsef. If you take a look at the timing 
> information, the connection had already been running for ~270 seconds without 
> any real data transfer. The mailserver then aborts the SMTP connection with the 
> error msg: "421 mailbackup.webpage.t-com.de Lost connection to [217.85.147.6]"
> Only after that does the port "change" to the wrong one.
>
> The intervals between the retransmissions seem to grow
> rapidly after the first retransmission.
> 
> The linux box is connected via a mostly idle 2 Mbit/s SDSL line, and the
> mailserver is located at the provider. So in theory this shouldn't be slow at all.
> That 2.6.23.17 works without trouble supports this.

You did not mention which network driver you are using. Aren't there some 
changes in the driver code between 2.6.23.17 and 2.6.24 which could cause 
such stalls?

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
          H-1525 Budapest 114, POB. 49, Hungary

* Re: TCP connection stalls under 2.6.24.7
  2008-07-10 21:21                       ` Jozsef Kadlecsik
@ 2008-07-11 14:33                         ` Thomas Jarosch
  2008-07-15 11:47                           ` Thomas Jarosch
  0 siblings, 1 reply; 116+ messages in thread
From: Thomas Jarosch @ 2008-07-11 14:33 UTC (permalink / raw)
  To: Jozsef Kadlecsik
  Cc: netdev, Patrick McHardy, Sven Riedel, Netfilter Developer Mailing List

[-- Attachment #1: Type: text/plain, Size: 18655 bytes --]

On Thursday, 10. July 2008 23:21:37 Jozsef Kadlecsik wrote:
> You did not mention which network driver you are using. Aren't there some
> changes in the driver code between 2.6.23.17 and 2.6.24 which could cause
> such stalls?

It's an 8139too and the "same" hardware works fine at other sites.
I tried the nmap trick mentioned by Dâniel Fraga, with no noticeable difference.

Here's another tcpdump created today, maybe it shows some new/different
information. This time I didn't capture the complete packets to keep it small:

15:43:30.580952 IP linux.52292 > mailserver.smtp: S 1353475948:1353475948(0) win 5808 <mss 1452,sackOK,timestamp 87884244[|tcp]>
15:43:30.646396 IP mailserver.smtp > linux.52292: S 3868230700:3868230700(0) ack 1353475949 win 65535 <mss 1448>
15:43:30.646421 IP linux.52292 > mailserver.smtp: . ack 1 win 5808
15:43:30.711442 IP mailserver.smtp > linux.52292: P 1:79(78) ack 1 win 65535
15:43:30.711457 IP linux.52292 > mailserver.smtp: . ack 79 win 5808
15:43:30.711525 IP linux.52292 > mailserver.smtp: P 1:26(25) ack 79 win 5808
15:43:30.781440 IP mailserver.smtp > linux.52292: P 79:246(167) ack 26 win 65535
15:43:30.781565 IP linux.52292 > mailserver.smtp: P 26:38(12) ack 246 win 6432
15:43:30.844482 IP mailserver.smtp > linux.52292: P 246:264(18) ack 38 win 65535
15:43:30.844551 IP linux.52292 > mailserver.smtp: P 38:68(30) ack 264 win 6432
15:43:30.908539 IP mailserver.smtp > linux.52292: P 264:282(18) ack 68 win 65535
15:43:30.908580 IP linux.52292 > mailserver.smtp: P 68:82(14) ack 282 win 6432
15:43:30.978507 IP mailserver.smtp > linux.52292: P 282:313(31) ack 82 win 65535
15:43:30.978647 IP linux.52292 > mailserver.smtp: P 82:224(142) ack 313 win 6432
15:43:31.056371 IP mailserver.smtp > linux.52292: P 313:542(229) ack 224 win 65535
15:43:31.056476 IP linux.52292 > mailserver.smtp: . 224:1672(1448) ack 542 win 7504
15:43:31.056510 IP linux.52292 > mailserver.smtp: . 1672:3120(1448) ack 542 win 7504
15:43:31.056541 IP linux.52292 > mailserver.smtp: P 3120:4320(1200) ack 542 win 7504
15:43:31.239389 IP mailserver.smtp > linux.52292: . ack 3120 win 65535
15:43:31.239425 IP linux.52292 > mailserver.smtp: . 4320:5768(1448) ack 542 win 7504
15:43:31.239461 IP linux.52292 > mailserver.smtp: . 5768:7216(1448) ack 542 win 7504
15:43:31.420453 IP mailserver.smtp > linux.52292: . ack 7216 win 65535
15:43:31.420481 IP linux.52292 > mailserver.smtp: . 7216:8664(1448) ack 542 win 7504
15:43:31.420513 IP linux.52292 > mailserver.smtp: . 8664:10112(1448) ack 542 win 7504
15:43:31.420542 IP linux.52292 > mailserver.smtp: . 10112:11560(1448) ack 542 win 7504
15:43:31.601274 IP mailserver.smtp > linux.52292: . ack 10112 win 65535
15:43:31.601300 IP linux.52292 > mailserver.smtp: . 11560:13008(1448) ack 542 win 7504
15:43:31.601331 IP linux.52292 > mailserver.smtp: . 13008:14456(1448) ack 542 win 7504
15:43:31.730384 IP mailserver.smtp > linux.52292: . ack 13008 win 65535
15:43:31.730423 IP linux.52292 > mailserver.smtp: . 14456:15904(1448) ack 542 win 7504
15:43:31.730457 IP linux.52292 > mailserver.smtp: . 15904:17352(1448) ack 542 win 7504
15:43:31.867331 IP mailserver.smtp > linux.52292: . ack 15904 win 65535
15:43:31.867348 IP linux.52292 > mailserver.smtp: . 17352:18800(1448) ack 542 win 7504
15:43:31.867378 IP linux.52292 > mailserver.smtp: . 18800:20248(1448) ack 542 win 7504
15:43:31.867407 IP linux.52292 > mailserver.smtp: P 20248:20704(456) ack 542 win 7504
15:43:32.012416 IP mailserver.smtp > linux.52292: . ack 18800 win 65535
15:43:32.012449 IP linux.52292 > mailserver.smtp: . 20704:22152(1448) ack 542 win 7504
15:43:32.012482 IP linux.52292 > mailserver.smtp: . 22152:23600(1448) ack 542 win 7504
15:43:32.079207 IP mailserver.smtp > linux.52292: . ack 20248 win 65535
15:43:32.079251 IP linux.52292 > mailserver.smtp: . 23600:25048(1448) ack 542 win 7504
15:43:32.224314 IP mailserver.smtp > linux.52292: . ack 23600 win 65535
15:43:32.224348 IP linux.52292 > mailserver.smtp: . 25048:26496(1448) ack 542 win 7504
15:43:32.224382 IP linux.52292 > mailserver.smtp: . 26496:27944(1448) ack 542 win 7504
15:43:32.224411 IP linux.52292 > mailserver.smtp: P 27944:28896(952) ack 542 win 7504
15:43:32.289328 IP mailserver.smtp > linux.52292: . ack 25048 win 65535
15:43:32.289348 IP linux.52292 > mailserver.smtp: . 28896:30344(1448) ack 542 win 7504
15:43:32.415220 IP mailserver.smtp > linux.52292: . ack 27944 win 65535
15:43:32.415242 IP linux.52292 > mailserver.smtp: . 30344:31792(1448) ack 542 win 7504
15:43:32.415284 IP linux.52292 > mailserver.smtp: . 31792:33240(1448) ack 542 win 7504
15:43:32.499213 IP mailserver.smtp > linux.52292: . ack 28896 win 65535
15:43:32.499234 IP linux.52292 > mailserver.smtp: . 33240:34688(1448) ack 542 win 7504
15:43:32.590355 IP mailserver.smtp > linux.52292: . ack 31792 win 65535
15:43:32.590376 IP linux.52292 > mailserver.smtp: . 34688:36136(1448) ack 542 win 7504
15:43:32.590408 IP linux.52292 > mailserver.smtp: P 36136:37088(952) ack 542 win 7504
15:43:32.708128 IP mailserver.smtp > linux.52292: . ack 33240 win 65535
15:43:32.708151 IP linux.52292 > mailserver.smtp: . 37088:38536(1448) ack 542 win 7504
15:43:32.789176 IP mailserver.smtp > linux.52292: . ack 36136 win 65535
15:43:32.789201 IP linux.52292 > mailserver.smtp: . 38536:39984(1448) ack 542 win 7504
15:43:32.789245 IP linux.52292 > mailserver.smtp: . 39984:41432(1448) ack 542 win 7504
15:43:32.915336 IP mailserver.smtp > linux.52292: . ack 38536 win 65535
15:43:32.915364 IP linux.52292 > mailserver.smtp: . 41432:42880(1448) ack 542 win 7504
15:43:32.915395 IP linux.52292 > mailserver.smtp: . 42880:44328(1448) ack 542 win 7504
15:43:33.024198 IP mailserver.smtp > linux.52292: . ack 41432 win 65535
15:43:33.024237 IP linux.52292 > mailserver.smtp: . 44328:45776(1448) ack 542 win 7504
15:43:33.024271 IP linux.52292 > mailserver.smtp: . 45776:47224(1448) ack 542 win 7504
15:43:33.024300 IP linux.52292 > mailserver.smtp: . 47224:48672(1448) ack 542 win 7504
15:43:33.119325 IP mailserver.smtp > linux.52292: . ack 42880 win 65535
15:43:33.119367 IP linux.52292 > mailserver.smtp: P 48672:49376(704) ack 542 win 7504
15:43:33.226202 IP mailserver.smtp > linux.52292: . ack 45776 win 65535
15:43:33.226225 IP linux.52292 > mailserver.smtp: . 49376:50824(1448) ack 542 win 7504
15:43:33.226260 IP linux.52292 > mailserver.smtp: . 50824:52272(1448) ack 542 win 7504
15:43:33.440286 IP mailserver.smtp > linux.52292: . ack 47224 win 65535
15:43:33.440320 IP linux.52292 > mailserver.smtp: . 52272:53720(1448) ack 542 win 7504
15:43:33.440403 IP mailserver.smtp > linux.52292: . ack 50824 win 65535
15:43:33.440420 IP linux.52292 > mailserver.smtp: . 53720:55168(1448) ack 542 win 7504
15:43:33.440453 IP linux.52292 > mailserver.smtp: . 55168:56616(1448) ack 542 win 7504
15:43:33.440477 IP linux.52292 > mailserver.smtp: P 56616:57568(952) ack 542 win 7504
15:43:33.539092 IP mailserver.smtp > linux.52292: . ack 52272 win 65535
15:43:33.539132 IP linux.52292 > mailserver.smtp: . 57568:59016(1448) ack 542 win 7504
15:43:33.621109 IP mailserver.smtp > linux.52292: . ack 55168 win 65535
15:43:33.621153 IP linux.52292 > mailserver.smtp: . 59016:60464(1448) ack 542 win 7504
15:43:33.621201 IP linux.52292 > mailserver.smtp: . 60464:61912(1448) ack 542 win 7504
15:43:33.749209 IP mailserver.smtp > linux.52292: . ack 57568 win 65535
15:43:33.749232 IP linux.52292 > mailserver.smtp: . 61912:63360(1448) ack 542 win 7504
15:43:33.749262 IP linux.52292 > mailserver.smtp: . 63360:64808(1448) ack 542 win 7504
15:43:33.865258 IP mailserver.smtp > linux.52292: . ack 60464 win 65535
15:43:33.865301 IP linux.52292 > mailserver.smtp: . 64808:66256(1448) ack 542 win 7504
15:43:33.865336 IP linux.52292 > mailserver.smtp: . 66256:67704(1448) ack 542 win 7504
15:43:33.959118 IP mailserver.smtp > linux.52292: . ack 61912 win 65535
15:43:33.959158 IP linux.52292 > mailserver.smtp: . 67704:69152(1448) ack 542 win 7504
15:43:34.608440 IP linux.52292 > mailserver.smtp: . 61912:63360(1448) ack 542 win 7504
15:43:34.729935 IP mailserver.smtp > linux.52292: . ack 64808 win 65535
15:43:34.729964 IP linux.52292 > mailserver.smtp: . 69152:70600(1448) ack 542 win 7504

...

15:43:45.324086 IP linux.52292 > mailserver.smtp: . 349832:351280(1448) ack 542 win 7504
15:43:45.324117 IP linux.52292 > mailserver.smtp: . 351280:352728(1448) ack 542 win 7504
15:43:45.324146 IP linux.52292 > mailserver.smtp: . 352728:354176(1448) ack 542 win 7504
15:43:45.445990 IP mailserver.smtp > linux.52292: . ack 303824 win 65535
15:43:45.446010 IP linux.52292 > mailserver.smtp: . 354176:355624(1448) ack 542 win 7504
15:43:45.446060 IP linux.52292 > mailserver.smtp: . 355624:357072(1448) ack 542 win 7504
15:43:45.569904 IP mailserver.smtp > linux.52292: . ack 306720 win 65535
15:43:45.569928 IP linux.52292 > mailserver.smtp: . 357072:358520(1448) ack 542 win 7504
15:43:45.569960 IP linux.52292 > mailserver.smtp: . 358520:359968(1448) ack 542 win 7504
15:43:45.569989 IP linux.52292 > mailserver.smtp: . 359968:361416(1448) ack 542 win 7504
15:43:45.667941 IP mailserver.smtp > linux.52292: . ack 308168 win 65535
15:43:45.667964 IP linux.52292 > mailserver.smtp: . 361416:362864(1448) ack 542 win 7504
15:43:45.752939 IP mailserver.smtp > linux.52292: . ack 311064 win 65535
15:43:45.752958 IP linux.52292 > mailserver.smtp: . 362864:364312(1448) ack 542 win 7504
15:43:45.752989 IP linux.52292 > mailserver.smtp: . 364312:365760(1448) ack 542 win 7504
15:43:45.753018 IP linux.52292 > mailserver.smtp: . 365760:367208(1448) ack 542 win 7504
15:43:45.872905 IP mailserver.smtp > linux.52292: . ack 313960 win 65535
15:43:45.872930 IP linux.52292 > mailserver.smtp: . 367208:368656(1448) ack 542 win 7504
15:43:45.872962 IP linux.52292 > mailserver.smtp: . 368656:370104(1448) ack 542 win 7504
15:43:45.993879 IP mailserver.smtp > linux.52292: . ack 316856 win 65535
15:43:45.993913 IP linux.52292 > mailserver.smtp: . 370104:371552(1448) ack 542 win 7504
15:43:45.993946 IP linux.52292 > mailserver.smtp: . 371552:373000(1448) ack 542 win 7504
15:43:45.993975 IP linux.52292 > mailserver.smtp: . 373000:374448(1448) ack 542 win 7504
15:43:46.088019 IP mailserver.smtp > linux.52292: . ack 318304 win 65535
15:43:46.088081 IP linux.52292 > mailserver.smtp: . 374448:375896(1448) ack 542 win 7504
15:43:46.176935 IP mailserver.smtp > linux.52292: . ack 321200 win 65535
15:43:46.176972 IP linux.52292 > mailserver.smtp: . 375896:377344(1448) ack 542 win 7504
15:43:46.177007 IP linux.52292 > mailserver.smtp: . 377344:378792(1448) ack 542 win 7504
15:43:46.177035 IP linux.52292 > mailserver.smtp: . 378792:380240(1448) ack 542 win 7504
15:43:46.289744 IP mailserver.smtp > linux.52292: . ack 322648 win 65535
15:43:46.289766 IP linux.52292 > mailserver.smtp: . 380240:381688(1448) ack 542 win 7504
15:43:46.357708 IP mailserver.smtp > linux.52292: . ack 325544 win 65535
15:43:46.357725 IP linux.52292 > mailserver.smtp: . 381688:383136(1448) ack 542 win 7504
15:43:46.357756 IP linux.52292 > mailserver.smtp: . 383136:384584(1448) ack 542 win 7504
15:43:46.478932 IP mailserver.smtp > linux.52292: . ack 328440 win 65535
15:43:46.478953 IP linux.52292 > mailserver.smtp: . 384584:386032(1448) ack 542 win 7504
15:43:46.478983 IP linux.52292 > mailserver.smtp: . 386032:387480(1448) ack 542 win 7504
15:43:46.479011 IP linux.52292 > mailserver.smtp: . 387480:388928(1448) ack 542 win 7504
15:43:46.599901 IP mailserver.smtp > linux.52292: . ack 331336 win 65535
15:43:46.599925 IP linux.52292 > mailserver.smtp: . 388928:390376(1448) ack 542 win 7504
15:43:46.599957 IP linux.52292 > mailserver.smtp: . 390376:391824(1448) ack 542 win 7504
15:43:46.706829 IP mailserver.smtp > linux.52292: . ack 332784 win 65535
15:43:46.706849 IP linux.52292 > mailserver.smtp: . 391824:393272(1448) ack 542 win 7504
15:43:46.781694 IP mailserver.smtp > linux.52292: . ack 335680 win 65535
15:43:46.781710 IP linux.52292 > mailserver.smtp: . 393272:394720(1448) ack 542 win 7504
15:43:46.781740 IP linux.52292 > mailserver.smtp: . 394720:396168(1448) ack 542 win 7504
15:43:46.781768 IP linux.52292 > mailserver.smtp: . 396168:397616(1448) ack 542 win 7504
15:43:46.901696 IP mailserver.smtp > linux.52292: . ack 338576 win 65535
15:43:46.901723 IP linux.52292 > mailserver.smtp: . 397616:399064(1448) ack 542 win 7504
15:43:46.901756 IP linux.52292 > mailserver.smtp: . 399064:400512(1448) ack 542 win 7504
15:43:47.023889 IP mailserver.smtp > linux.52292: . ack 341472 win 65535
15:43:47.023914 IP linux.52292 > mailserver.smtp: . 400512:401960(1448) ack 542 win 7504
15:43:47.023946 IP linux.52292 > mailserver.smtp: . 401960:403408(1448) ack 542 win 7504
15:43:47.126865 IP mailserver.smtp > linux.52292: . ack 342920 win 65535
15:43:47.126899 IP linux.52292 > mailserver.smtp: . 403408:404856(1448) ack 542 win 7504
15:43:47.126933 IP linux.52292 > mailserver.smtp: . 404856:406304(1448) ack 542 win 7504
15:43:47.205693 IP mailserver.smtp > linux.52292: . ack 345816 win 65535
15:43:47.205726 IP linux.52292 > mailserver.smtp: . 406304:407752(1448) ack 542 win 7504
15:43:47.205760 IP linux.52292 > mailserver.smtp: . 407752:409200(1448) ack 542 win 7504
15:43:47.337721 IP mailserver.smtp > linux.52292: . ack 348384 win 65535
15:43:47.337745 IP linux.52292 > mailserver.smtp: . 409200:410648(1448) ack 542 win 7504
15:43:47.337778 IP linux.52292 > mailserver.smtp: . 410648:412096(1448) ack 542 win 7504
15:43:47.434787 IP mailserver.smtp > linux.52292: . ack 351280 win 65535
15:43:47.434808 IP linux.52292 > mailserver.smtp: . 412096:413544(1448) ack 542 win 7504
15:43:47.434840 IP linux.52292 > mailserver.smtp: . 413544:414992(1448) ack 542 win 7504
15:43:47.434868 IP linux.52292 > mailserver.smtp: . 414992:416440(1448) ack 542 win 7504
15:43:47.547611 IP mailserver.smtp > linux.52292: . ack 352728 win 65535
15:43:47.547635 IP linux.52292 > mailserver.smtp: . 416440:417888(1448) ack 542 win 7504
15:43:47.615600 IP mailserver.smtp > linux.52292: . ack 355624 win 65535
15:43:47.615618 IP linux.52292 > mailserver.smtp: . 417888:419336(1448) ack 542 win 7504
15:43:47.615650 IP linux.52292 > mailserver.smtp: . 419336:420784(1448) ack 542 win 7504
15:43:47.743710 IP mailserver.smtp > linux.52292: . ack 358520 win 65535
15:43:47.743727 IP linux.52292 > mailserver.smtp: . 420784:422232(1448) ack 542 win 7504
15:43:47.743759 IP linux.52292 > mailserver.smtp: . 422232:423680(1448) ack 542 win 7504
15:43:47.870591 IP mailserver.smtp > linux.52292: . ack 361416 win 65535
15:43:47.870613 IP linux.52292 > mailserver.smtp: . 423680:425128(1448) ack 542 win 7504
15:43:47.870644 IP linux.52292 > mailserver.smtp: . 425128:426576(1448) ack 542 win 7504
15:43:47.967659 IP mailserver.smtp > linux.52292: . ack 362864 win 65535
15:43:47.967695 IP linux.52292 > mailserver.smtp: . 426576:428024(1448) ack 542 win 7504
15:43:48.054611 IP mailserver.smtp > linux.52292: . ack 365760 win 65535
15:43:48.054643 IP linux.52292 > mailserver.smtp: . 428024:429472(1448) ack 542 win 7504
15:43:48.054677 IP linux.52292 > mailserver.smtp: . 429472:430920(1448) ack 542 win 7504
15:43:48.177804 IP mailserver.smtp > linux.52292: . ack 368656 win 65535
15:43:48.177845 IP linux.52292 > mailserver.smtp: . 430920:432368(1448) ack 542 win 7504
15:43:48.177879 IP linux.52292 > mailserver.smtp: . 432368:433816(1448) ack 542 win 7504
15:43:48.296768 IP mailserver.smtp > linux.52292: . ack 371552 win 65535
15:43:48.296790 IP linux.52292 > mailserver.smtp: . 433816:435264(1448) ack 542 win 7504
15:43:48.296820 IP linux.52292 > mailserver.smtp: . 435264:436712(1448) ack 542 win 7504
15:43:48.386685 IP mailserver.smtp > linux.52292: . ack 373000 win 65535
15:43:48.386702 IP linux.52292 > mailserver.smtp: . 436712:438160(1448) ack 542 win 7504
15:43:48.478585 IP mailserver.smtp > linux.52292: . ack 375896 win 65535
15:43:48.478605 IP linux.52292 > mailserver.smtp: . 438160:439608(1448) ack 542 win 7504
15:43:48.478636 IP linux.52292 > mailserver.smtp: . 439608:441056(1448) ack 542 win 7504
15:43:48.597559 IP mailserver.smtp > linux.52292: . ack 377344 win 65535
15:43:48.597581 IP linux.52292 > mailserver.smtp: . 441056:442504(1448) ack 542 win 7504
15:43:48.806718 IP mailserver.smtp > linux.52292: . ack 378792 win 65535
15:43:48.806740 IP linux.52292 > mailserver.smtp: . 442504:443952(1448) ack 542 win 7504
15:43:52.032437 IP linux.52292 > mailserver.smtp: . 378792:380240(1448) ack 542 win 7504
15:43:52.153192 IP mailserver.smtp > linux.52292: . ack 383136 win 65535
15:43:52.153228 IP linux.52292 > mailserver.smtp: . 443952:445400(1448) ack 542 win 7504
15:43:52.153262 IP linux.52292 > mailserver.smtp: . 445400:446848(1448) ack 542 win 7504
15:43:52.153362 IP mailserver.smtp > linux.52292: . ack 387480 win 65535
15:43:52.153381 IP linux.52292 > mailserver.smtp: . 446848:448296(1448) ack 542 win 7504
15:43:59.048438 IP linux.52292 > mailserver.smtp: . 387480:388928(1448) ack 542 win 7504
15:43:59.170481 IP mailserver.smtp > linux.52292: . ack 390376 win 65535
15:43:59.170516 IP linux.52292 > mailserver.smtp: . 448296:449744(1448) ack 542 win 7504
15:43:59.170551 IP linux.52292 > mailserver.smtp: . 449744:451192(1448) ack 542 win 7504
15:43:59.172438 IP mailserver.smtp > linux.52292: . ack 394720 win 65535
15:43:59.172457 IP linux.52292 > mailserver.smtp: . 451192:452640(1448) ack 542 win 7504
15:43:59.295693 IP mailserver.smtp > linux.52292: . ack 396168 win 65535
15:44:22.548439 IP linux.52292 > mailserver.smtp: . 396168:397616(1448) ack 542 win 7504
15:44:22.669245 IP mailserver.smtp > linux.52292: . ack 399064 win 65535
15:44:22.669266 IP linux.52292 > mailserver.smtp: . 452640:454088(1448) ack 542 win 7504
15:44:22.669299 IP linux.52292 > mailserver.smtp: . 454088:455536(1448) ack 542 win 7504
15:44:22.669399 IP mailserver.smtp > linux.52292: . ack 404856 win 65535
15:44:22.669427 IP linux.52292 > mailserver.smtp: . 455536:456984(1448) ack 542 win 7504
15:45:15.916435 IP linux.52292 > mailserver.smtp: . 404856:406304(1448) ack 542 win 7504
15:45:16.127748 IP mailserver.smtp > linux.52292: . ack 406304 win 65535
15:45:16.127773 IP linux.52292 > mailserver.smtp: . 456984:458432(1448) ack 542 win 7504
15:45:16.127804 IP linux.52292 > mailserver.smtp: . 458432:459880(1448) ack 542 win 7504

...

The full packet flow can be found in the attached dump. Another mail seems
to have been transmitted while the first was stalling, but the issue is the
same: the timing information shows no active transfer for over 60s.
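
The pauses in a trace like the one above can be located mechanically. A
minimal sketch, assuming tcpdump's default text output with HH:MM:SS.ffffff
timestamps; the name idle_gaps and the 1-second threshold are illustrative
choices, not part of any tool mentioned in this thread. The sample lines are
copied from the trace above.

```python
import re
from datetime import datetime

# Leading "HH:MM:SS.ffffff" timestamp of a tcpdump text line.
TIMESTAMP = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{6}) ")

def idle_gaps(lines, threshold=1.0):
    """Return (seconds, next_line) for each pause longer than threshold.

    Any line without a timestamp (such as a "..." elision) resets the
    comparison, so cut-out parts of a trace are not reported as stalls.
    Hypothetical helper; a wrap past midnight is not handled.
    """
    gaps = []
    previous = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            previous = None
            continue
        current = datetime.strptime(match.group(1), "%H:%M:%S.%f")
        if previous is not None:
            delta = (current - previous).total_seconds()
            if delta > threshold:
                gaps.append((delta, line))
        previous = current
    return gaps

trace = """\
15:43:48.806740 IP linux.52292 > mailserver.smtp: . 442504:443952(1448) ack 542 win 7504
15:43:52.032437 IP linux.52292 > mailserver.smtp: . 378792:380240(1448) ack 542 win 7504
...
15:44:22.669427 IP linux.52292 > mailserver.smtp: . 455536:456984(1448) ack 542 win 7504
15:45:15.916435 IP linux.52292 > mailserver.smtp: . 404856:406304(1448) ack 542 win 7504
""".splitlines()

for seconds, line in idle_gaps(trace):
    print(f"{seconds:7.3f}s of silence before: {line}")
```

On these lines the sketch reports pauses of roughly 3.2 s and 53.2 s, each
ending in the retransmission of an old segment.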

Thomas

[-- Attachment #2: smtp.tcpdump.bz2 --]
[-- Type: application/x-bzip2, Size: 10663 bytes --]

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-11 14:33                         ` Thomas Jarosch
@ 2008-07-15 11:47                           ` Thomas Jarosch
  2008-07-15 16:10                             ` Thomas Jarosch
  0 siblings, 1 reply; 116+ messages in thread
From: Thomas Jarosch @ 2008-07-15 11:47 UTC (permalink / raw)
  To: Jozsef Kadlecsik
  Cc: netdev, Patrick McHardy, Sven Riedel, Netfilter Developer Mailing List

On Friday, 11. July 2008 16:33:41 Thomas Jarosch wrote:
> On Thursday, 10. July 2008 23:21:37 Jozsef Kadlecsik wrote:
> > You did not mention the type of your driver. Aren't there some changes in
> > the driver code between 2.6.23.17 and 2.6.24 which could cause such
> > stalls?
>
> It's an 8139too and the "same" hardware works fine at other sites.
> I tried the nmap trick mentioned by Dâniel Fraga with no noticeable
> difference.

I swapped the NIC with a "via-rhine" based card which is installed
in the remote box, but without much success.

Luckily I'm able to reproduce the problem locally using an ADSL line from the 
same provider, so I'll now bisect the kernel from 2.6.23.17 to 2.6.24.

Thomas
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-15 11:47                           ` Thomas Jarosch
@ 2008-07-15 16:10                             ` Thomas Jarosch
  2008-07-15 18:30                               ` Dâniel Fraga
  2008-07-15 20:17                               ` Ilpo Järvinen
  0 siblings, 2 replies; 116+ messages in thread
From: Thomas Jarosch @ 2008-07-15 16:10 UTC (permalink / raw)
  To: Jozsef Kadlecsik
  Cc: netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List, Dâniel Fraga

> Luckily I'm able to reproduce the problem locally using an ADSL line from
> the same provider, so I'll now bisect the kernel from 2.6.23.17 to 2.6.24.

After bisecting for hours, I only had ten revisions left to test.
There was this commit that caught my eye:

------------------------------
commit c96fd3d461fa495400df24be3b3b66f0e0b152f9
Author: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Date:   Thu Sep 20 11:36:37 2007 -0700

    [TCP]: Enable SACK enhanced FRTO (RFC4138) by default
------------------------------

This change sets the value of "tcp_frto" to 2 by default.
If I reset it to zero, the connection works immediately.
@Dâniel Fraga: Does disabling tcp_frto work for you, too?
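
For reference, the setting can be inspected and changed through procfs. A
minimal sketch, assuming a Linux host; /proc/sys/net/ipv4/tcp_frto is the
usual location of this sysctl, while read_frto, write_frto and the
FRTO_MODES table are hypothetical names. Values 0 and 2 are described in
this thread; the meaning given for 1 (basic F-RTO) is recalled from the
kernel's ip-sysctl documentation for kernels of this era and should be
double-checked.

```python
from pathlib import Path

SYSCTL = "/proc/sys/net/ipv4/tcp_frto"

# 0 and 2 are taken from the thread; 1 is an assumption (see the lead-in).
FRTO_MODES = {0: "disabled", 1: "basic F-RTO", 2: "SACK-enhanced F-RTO"}

def read_frto(path=SYSCTL):
    """Return (value, description) for the current tcp_frto setting."""
    value = int(Path(path).read_text().strip())
    return value, FRTO_MODES.get(value, "unknown")

def write_frto(value, path=SYSCTL):
    """Set tcp_frto; writing to the real sysctl needs root privileges."""
    Path(path).write_text(f"{value}\n")

try:
    print("tcp_frto = %d (%s)" % read_frto())
except (OSError, ValueError):
    print("tcp_frto is not available on this system")
```

Calling write_frto(0) as root corresponds to the reset to zero described
above; a value written this way does not survive a reboot.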

Disabling tcp_sack makes no difference. To summarize the situation,
I had two different cases of stalling TCP connections, both connecting
to busy SMTP relay servers which probably drop some packets here and there.

I can easily reproduce the problem, so how do we go from here?

Cheers,
Thomas

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-15 16:10                             ` Thomas Jarosch
@ 2008-07-15 18:30                               ` Dâniel Fraga
  2008-07-31  4:47                                 ` Dâniel Fraga
  2008-07-15 20:17                               ` Ilpo Järvinen
  1 sibling, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-07-15 18:30 UTC (permalink / raw)
  To: Thomas Jarosch
  Cc: Jozsef Kadlecsik, netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List

On Tue, 15 Jul 2008 18:10:42 +0200
Thomas Jarosch <thomas.jarosch@intra2net.com> wrote:

> Disabling tcp_sack makes no difference. To summarize the situation,
> I had two different cases of stalling TCP connections, both connecting
> to busy SMTP relay servers which probably drop some packets here and there.
> 
> I can easily reproduce the problem, so how do we go from here?

	I'm using kernel 2.6.26 and the problem is gone. No stalled
connections anymore. The problem was with the 2.6.25 kernel only.


-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-15 16:10                             ` Thomas Jarosch
  2008-07-15 18:30                               ` Dâniel Fraga
@ 2008-07-15 20:17                               ` Ilpo Järvinen
  2008-07-16  8:07                                 ` Thomas Jarosch
  2008-07-16  9:03                                 ` Thomas Jarosch
  1 sibling, 2 replies; 116+ messages in thread
From: Ilpo Järvinen @ 2008-07-15 20:17 UTC (permalink / raw)
  To: Thomas Jarosch
  Cc: Jozsef Kadlecsik, netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List, Dâniel Fraga

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1491 bytes --]

On Tue, 15 Jul 2008, Thomas Jarosch wrote:

> > Luckily I'm able to reproduce the problem locally using an ADSL line from
> > the same provider, so I'll now bisect the kernel from 2.6.23.17 to 2.6.24.
> 
> After bisecting for hours, I only had ten revisions left to test.
> There was this commit that caught my eye:
> 
> ------------------------------
> commit c96fd3d461fa495400df24be3b3b66f0e0b152f9
> Author: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
> Date:   Thu Sep 20 11:36:37 2007 -0700
> 
>     [TCP]: Enable SACK enhanced FRTO (RFC4138) by default
> ------------------------------
> 
> This change sets the value of "tcp_frto" to 2 by default.
> If I reset it to zero, the connection works immediately.
> @Dâniel Fraga: Does disabling tcp_frto work for you, too?
> 
> Disabling tcp_sack makes no difference. To summarize the situation,
> I had two different cases of stalling TCP connections, both connecting
> to busy SMTP relay servers which probably drop some packets here and there.
> 
> I can easily reproduce the problem, so how do we go from here?

FRTO in 2.6.24.y is broken. I recently fixed a couple of things in FRTO;
late 2.6.25.y or 2.6.26 should be used to have all the fixes. If you can
reproduce with either one, please tcpdump it (I just returned after a couple
of weeks away, so I'm slowly catching up on what has happened in the
meantime). ...I guess somebody had dumped at least 2.6.24.y, but that's not
interesting due to known (and fixed) bugs with FRTO.


-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-15 20:17                               ` Ilpo Järvinen
@ 2008-07-16  8:07                                 ` Thomas Jarosch
  2008-07-16  9:03                                 ` Thomas Jarosch
  1 sibling, 0 replies; 116+ messages in thread
From: Thomas Jarosch @ 2008-07-16  8:07 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Jozsef Kadlecsik, netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List, Dâniel Fraga

Terve Ilpo,

On Tuesday, 15. July 2008 22:17:47 Ilpo Järvinen wrote:
> > I can easily reproduce the problem, so how do we go from here?
>
> FRTO in 2.6.24.y is broken, I recently fixed a couple of things in FRTO,
> late 2.6.25.y or 2.6.26 should be used to have all the fixes. If you can
> reproduce with either one, please tcpdump it (I just returned, was a couple
> of weeks away, so I'm slowly catching up what has happened in between here).
> ...I guess somebody had dumped at least 2.6.24.y but that's not
> interesting due to known (and fixed) bugs with FRTO.

I tried 2.6.25.10 without luck. I have a git "master" tree from yesterday
which is also stalling for some seconds around ~220kb and then recovering.
The connection completely stalls at around ~1.3mb. I'll send you a tcpdump
in private soon as it's going to be rather big for the mailing list.

Thomas

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-15 20:17                               ` Ilpo Järvinen
  2008-07-16  8:07                                 ` Thomas Jarosch
@ 2008-07-16  9:03                                 ` Thomas Jarosch
  2008-07-17 13:55                                   ` Ilpo Järvinen
  1 sibling, 1 reply; 116+ messages in thread
From: Thomas Jarosch @ 2008-07-16  9:03 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Jozsef Kadlecsik, netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List, Dâniel Fraga

On Tuesday, 15. July 2008 22:17:47 Ilpo Järvinen wrote:
> FRTO in 2.6.24.y is broken, I recently fixed a couple of things in FRTO,
> late 2.6.25.y or 2.6.26 should be used to have all the fixes. If you can
> reproduce with either one, please tcpdump it

As the dumps are really big, I uploaded them to a temporary space.
Included are two tcpdumps of stalling connections using git "master".
The first one stalls around ~1.3mb, the second one around ~4mb.

Get it from here:
http://www.intra2net.com/de/download/tcpdump/tcp_frto_tcpdumps.tar.bz2

There is another box in front of my test system doing NAT
which is running 2.6.24.7. I've tested with and without tcp_frto
on that box to make sure it's not FRTO related.

I've also included a tcpdump with FRTO disabled, so you can see
the connection is actually working. Just by looking at the packet flow
while tracing, the connection looks much smoother without FRTO
and doesn't stall for seconds here and there.

Cheers,
Thomas

-- 
Address (better: trap) for people I really don't want to get mail from:
jessica.hope@cactusamerica.com

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-16  9:03                                 ` Thomas Jarosch
@ 2008-07-17 13:55                                   ` Ilpo Järvinen
  2008-07-17 15:15                                     ` Thomas Jarosch
  0 siblings, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-07-17 13:55 UTC (permalink / raw)
  To: Thomas Jarosch
  Cc: Jozsef Kadlecsik, Netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List, Dâniel Fraga,
	David Miller

[-- Attachment #1: Type: TEXT/PLAIN, Size: 5125 bytes --]

On Wed, 16 Jul 2008, Thomas Jarosch wrote:

> On Tuesday, 15. July 2008 22:17:47 Ilpo Järvinen wrote:
> > FRTO in 2.6.24.y is broken, I recently fixed a couple of things in FRTO,
> > late 2.6.25.y or 2.6.26 should be used to have all the fixes. If you can
> > reproduce with either one, please tcpdump it
> 
> As the dumps are really big, I uploaded them to a temporary space.
> Included are two tcpdumps of stalling connections using git "master".
> The first one stalls around ~1.3mb, the second one around ~4mb.
> 
> Get it from here:
> http://www.intra2net.com/de/download/tcpdump/tcp_frto_tcpdumps.tar.bz2

Thanks for the dumps, the picture is pretty clear now... Also, I read this
thread fully today; your note in the initial mail is correct and relevant:
"The picture is similar to Sven's issue reported back in March: Some ACK
packets are missing (as if the remote side never sent them)."

> There is another box in front of my test system doing NAT
> which is running 2.6.24.7. I've tested with and without tcp_frto
> on that box to make sure it's not FRTO related.

Did you accidentally add "not" here? :-)

> I've also included a tcpdump with FRTO disabled, so you can see
> the connection is actually working. Just by looking at the packet flow
> while tracing, the connection looks much smoother without FRTO
> and doesn't stall for seconds here and there.

Yes, but let me explain why it happens...

 "A TCP receiver SHOULD send an immediate duplicate ACK when an out-
  of-order segment arrives." [RFC2581]

FRTO is partially built on the assumption that the receiver does the right
thing (tm), i.e., sends duplicate ACKs. But in this case the server has for
some reason chosen to ignore this SHOULD in the standards, which stands
for this:

"3. SHOULD   This word, or the adjective "RECOMMENDED", mean that there
   may exist valid reasons in particular circumstances to ignore a
   particular item, but the full implications must be understood and
   carefully weighed before choosing a different course." [RFC2119]

It could be that the duplicate ACKs are missing due to a bug,
misconfiguration or a broken middlebox at the provider. This is somewhat
similar to the case we worked around recently with the network printers
that accept data only in order and just dupACK the rest. ...I actually
predicted this dupACK-less receiver problem back then (not sure if I
mentioned it in a mail though), but it seemed like a small-box problem
rather than a big-box problem such as a mail server. It hardly seems
reasonable to interpret "in particular circumstances" as never sending
dupACKs (which have other benefits too).
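
Whether a capture contains any duplicate ACKs at all can be checked
directly. A minimal sketch, assuming the tcpdump text format used in the
traces in this thread and treating a pure ACK that repeats the previous
acknowledgment number as a duplicate (a simplification; the stricter
definition in RFC 5681 also requires an unchanged advertised window).
count_dupacks is a hypothetical helper, and the sample lines are copied
from the trace posted earlier in the thread.

```python
import re

# Pure ACK line: flags ".", no sequence range,
# e.g. "IP mailserver.smtp > linux.52292: . ack 61912 win 65535"
PURE_ACK = re.compile(r"IP (\S+) > \S+: \. ack (\d+) win \d+\s*$")

def count_dupacks(lines, receiver="mailserver.smtp"):
    """Count pure ACKs from receiver that repeat the previous ACK number."""
    duplicates = 0
    last_ack = None
    for line in lines:
        match = PURE_ACK.search(line)
        if not match or match.group(1) != receiver:
            continue  # data segment, other direction, or not a packet line
        ack = int(match.group(2))
        if ack == last_ack:
            duplicates += 1
        last_ack = ack
    return duplicates

excerpt = """\
15:43:33.959118 IP mailserver.smtp > linux.52292: . ack 61912 win 65535
15:43:33.959158 IP linux.52292 > mailserver.smtp: . 67704:69152(1448) ack 542 win 7504
15:43:34.608440 IP linux.52292 > mailserver.smtp: . 61912:63360(1448) ack 542 win 7504
15:43:34.729935 IP mailserver.smtp > linux.52292: . ack 64808 win 65535
""".splitlines()

print(count_dupacks(excerpt))
```

No ACK number repeats in the excerpt, even though the jump from 61912 to
64808 right after the retransmission suggests that a later segment had
already arrived out of order.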

Because those duplicate ACKs never arrive for the new data segments FRTO
is sending, FRTO never falls back to conventional recovery; instead the RTO
expires again for a different segment and the FRTO algorithm is retried
with the same result. So TCP is basically in an RTO loop, making slow
progress. If there is no external timeout, the situation eventually
recovers when all data is ACKed by a big cumulative ACK, or earlier when a
temporary loss of dupACKs ends (as it should, at worst).
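
This loop leaves a recognizable signature in a sender-side capture: after
each long pause, the segment starting exactly at the last cumulative ACK is
sent again, followed by new data rather than further retransmissions. A
minimal sketch that flags such head retransmissions, assuming the tcpdump
text format of the traces in this thread; head_retransmits is a hypothetical
helper and the sample lines are copied from the trace posted earlier in the
thread.

```python
import re

DATA = re.compile(r"IP (\S+) > \S+: \S+ (\d+):(\d+)\(\d+\) ack")
PURE_ACK = re.compile(r"IP (\S+) > \S+: \. ack (\d+) win")

def head_retransmits(lines, sender="linux.52292"):
    """Return start sequence numbers of segments that were sent before
    and are resent exactly at the last cumulative ACK seen."""
    highest_sent = 0
    last_ack = None
    hits = []
    for line in lines:
        data = DATA.search(line)
        if data and data.group(1) == sender:
            start, end = int(data.group(2)), int(data.group(3))
            if start < highest_sent and start == last_ack:
                hits.append(start)
            highest_sent = max(highest_sent, end)
            continue
        ack = PURE_ACK.search(line)
        if ack and ack.group(1) != sender:
            last_ack = int(ack.group(2))
    return hits

excerpt = """\
15:43:48.806718 IP mailserver.smtp > linux.52292: . ack 378792 win 65535
15:43:48.806740 IP linux.52292 > mailserver.smtp: . 442504:443952(1448) ack 542 win 7504
15:43:52.032437 IP linux.52292 > mailserver.smtp: . 378792:380240(1448) ack 542 win 7504
15:43:52.153192 IP mailserver.smtp > linux.52292: . ack 383136 win 65535
15:43:52.153228 IP linux.52292 > mailserver.smtp: . 443952:445400(1448) ack 542 win 7504
15:43:52.153262 IP linux.52292 > mailserver.smtp: . 445400:446848(1448) ack 542 win 7504
15:43:52.153362 IP mailserver.smtp > linux.52292: . ack 387480 win 65535
15:43:52.153381 IP linux.52292 > mailserver.smtp: . 446848:448296(1448) ack 542 win 7504
15:43:59.048438 IP linux.52292 > mailserver.smtp: . 387480:388928(1448) ack 542 win 7504
""".splitlines()

print(head_retransmits(excerpt))
```

Both flagged segments begin at the acknowledgment number received just
before the pause, and the first is followed by previously unsent data.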

It would be quite interesting to know more details about the mail server
and why the duplicate ACKs are not generated or never reach the sender,
but I guess the details are out of reach?

One option would be to disable reentry to FRTO when some progress was
made... Please try the patch below... It has some undesirable properties
in microbenchmarks but adds robustness. It's not clear to me how often the
reentry would help in real-life scenarios, but I'd assume that most RTOs
that occur for a later segment are not spurious anyway, even when the
first was.


-- 
 i.

--

[PATCH] tcp FRTO: workaround dupACK-less receivers

FRTO relies on dupACKs arriving in order to fall back into
conventional recovery. Some receivers, for unknown reasons,
choose not to send duplicate ACKs at all, which seems quite
unreasonable because RFC2581 is using SHOULD for out-of-order
(ofo) segment duplicate ACKs. ...A more likely cause might be
some broken middlebox which blocks dupACKs. If no duplicate ACKs
arrive, TCP goes into an RTO loop due to FRTO, because only new
data is getting sent after the retransmission of the head segment
(and its partial ACK). The situation continues until a big
cumulative ACK covers all outstanding data.

This impacts FRTO accuracy as we lose the ability to detect more
than one spurious segment per window with NewReno. The performance
impact might not be visible unless one sets up a microbenchmark... :-)

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
---
 net/ipv4/tcp_input.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index d6ea970..3f7cce9 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1714,6 +1714,10 @@ int tcp_use_frto(struct sock *sk)
 	if (tcp_is_sackfrto(tp))
 		return 1;
 
+	/* dupACK-less receiver workaround */
+	if (tp->frto_counter > 1)
+		return 0;
+
 	/* Avoid expensive walking of rexmit queue if possible */
 	if (tp->retrans_out > 1)
 		return 0;
-- 
1.5.2.2
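
The ordering of the checks matters for which connections the workaround
reaches. A minimal sketch that mirrors only the lines visible in the hunk
above, written in Python purely for illustration; the rest of
tcp_use_frto() lies outside the diff context and is represented by a
placeholder result. The function name frto_decision and the string results
are hypothetical.

```python
def frto_decision(is_sackfrto, frto_counter, retrans_out):
    """Mirror of the checks shown in the hunk, in the same order."""
    if is_sackfrto:
        return "use-frto"   # returns before the workaround is consulted
    if frto_counter > 1:
        return "no-frto"    # dupACK-less receiver workaround (added lines)
    if retrans_out > 1:
        return "no-frto"    # avoid walking the rexmit queue
    return "continue"       # remaining checks are outside the hunk

for args in [(True, 2, 0), (False, 2, 0), (False, 0, 3), (False, 0, 0)]:
    print(args, "->", frto_decision(*args))
```

As far as the hunk shows, a connection for which tcp_is_sackfrto() is true
returns before the new check is reached.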

^ permalink raw reply related	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-17 13:55                                   ` Ilpo Järvinen
@ 2008-07-17 15:15                                     ` Thomas Jarosch
  2008-07-17 15:53                                       ` Ilpo Järvinen
  0 siblings, 1 reply; 116+ messages in thread
From: Thomas Jarosch @ 2008-07-17 15:15 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Jozsef Kadlecsik, Netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List, Dâniel Fraga,
	David Miller

On Thursday, 17. July 2008 15:55:25 Ilpo Järvinen wrote:
> It would be quite interesting to know more details about the mail server and
> why the duplicate ACKs are not generated or don't ever reach the sender
> but I guess the details are out of reach?

It will be quite difficult to get more details as it's the SMTP relay server
of Germany's biggest ISP. There's a comment about them
in Patrick's blog from 2008-06-23 if you are curious ;-)

We see the same issue with an MX server from "United Internet".
Normally they are pretty accurate about standards (they run GMX),
so I guess this must be a problem of a router in between.

This is also supported by the fact that 935 of our boxes already updated
to kernel 2.6.24.7, yet the problem occurred only at three sites and I guess
there are more people out there using that SMTP relay server.

Could you somehow "probe" the servers to see if they normally
send duplicate ACKs by faking/forcing a retransmission?
Though I guess this would involve writing some TCP "test" code.

> One option would be to disable reentry to FRTO when some progress was
> made... Please try with the patch below...

Thanks for the patch. It seemed to help a bit. Here are two more traces:
http://www.intra2net.com/de/download/tcpdump/tcp_frto_with_patch.tar.bz2

The first connection somehow made it after 400 seconds,
the second one stalled and timed out :-(
Hope the dumps are useful to you.

Cheers,
Thomas

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-17 15:15                                     ` Thomas Jarosch
@ 2008-07-17 15:53                                       ` Ilpo Järvinen
  2008-07-18  9:14                                         ` Thomas Jarosch
  0 siblings, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-07-17 15:53 UTC (permalink / raw)
  To: Thomas Jarosch
  Cc: Jozsef Kadlecsik, Netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List, Dâniel Fraga,
	David Miller

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3969 bytes --]

On Thu, 17 Jul 2008, Thomas Jarosch wrote:

> On Thursday, 17. July 2008 15:55:25 Ilpo Järvinen wrote:
> > It would be quite interesting to know more details about the mail server and
> > why the duplicate ACKs are not generated or don't ever reach the sender
> > but I guess the details are out of reach?
> 
> It will be quite difficult to get more details as it's the SMTP relay server
> of Germany's biggest ISP. There's a comment about them
> in Patrick's blog from 2008-06-23 if you are curious ;-)

...I thought so, unless one has some connections they're not that
willing, ever :-).

> We see the same issue with a MX server from "United Internet".
> Normally they are pretty accurate about standards (they run GMX),
> so I guess this must be a problem of a router in between.

I'd vote for a middlebox, e.g., some kind of TCP proxy, split-TCP
brokenness, a misconfigured firewall or such (or perhaps it's just
because of some misguided person who has been taught that duplicate
ACKs are a serious threat :-))...

> Could you somehow "probe" the servers to see if they normally
> send duplicate ACKs by faking/forcing a retransmission?
> Though I guess this would involve writing some TCP "test" code.

Yes, it wouldn't even be that hard to do with hping3. I might actually 
try to come up with something (but not now).

> > One option would be to disable reentry to FRTO when some progress was
> > made... Please try with the patch below...
> 
> Thanks for the patch. It seemed to help a bit. Here are two more traces:
> http://www.intra2net.com/de/download/tcpdump/tcp_frto_with_patch.tar.bz2
> 
> The first connection somehow made it after 400 seconds,
> the second one stalled and timed out :-(
> Hope the dumps are useful to you.

Ah, I just forgot that the situation might persist... Try with this 
one instead...

-- 
 i.

[PATCH] tcp FRTO: workaround dupACK-less receivers

FRTO assumes that dupACKs arrive so that it can fall back into
conventional recovery. Some receivers, for unknown reasons,
care not to send duplicate ACKs at all, which seems quite
unreasonable because RFC2581 is using SHOULD for ofo segment
duplicate ACKs. ...A more likely cause might be some broken
middlebox which blocks dupACKs. If no duplicate ACKs arrive,
TCP goes into RTO-loop due to FRTO, because only new data is
getting sent after the retransmission of the head segment
(and its partial ACK). The situation continues until a big
cumulative ACK covers all outstanding data (or until somebody
gives up).

The new approach prevents reentry to FRTO when a previous FRTO
recovery is underway. This alone was found to be an inadequate
solution because the situation may persist with some receivers even
after the first fallback has occurred. Thus cover anything in CA_Loss
state too.

This impacts FRTO accuracy as we lose ability to detect more than
one spurious segment per window with NewReno. Performance impact
in real world is hard to estimate because it's hard to know how
often second RTO would be spurious in practice, however, the
worst case behavior will still be as without FRTO so it just
reduces the benefits of FRTO.

This issue was reported by Thomas Jarosch and probably a number
of other people (though there was another case, a real bug with
similar symptoms, that was fixed in 2.6.25.7).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Reported-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
---
 net/ipv4/tcp_input.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index d6ea970..764c084 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1714,6 +1714,10 @@ int tcp_use_frto(struct sock *sk)
 	if (tcp_is_sackfrto(tp))
 		return 1;
 
+	/* dupACK-less receiver workaround */
+	if (tp->frto_counter > 1 || icsk->icsk_ca_state == TCP_CA_Loss)
+		return 0;
+
 	/* Avoid expensive walking of rexmit queue if possible */
 	if (tp->retrans_out > 1)
 		return 0;
-- 
1.5.2.2

^ permalink raw reply related	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-17 15:53                                       ` Ilpo Järvinen
@ 2008-07-18  9:14                                         ` Thomas Jarosch
  2008-07-18 13:55                                           ` Ilpo Järvinen
  0 siblings, 1 reply; 116+ messages in thread
From: Thomas Jarosch @ 2008-07-18  9:14 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Jozsef Kadlecsik, Netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List, Dâniel Fraga,
	David Miller

Moi Ilpo,

On Thursday, 17. July 2008 17:53:01 Ilpo Järvinen wrote:
> > > One option would be to disable reentry to FRTO when some progress was
> > > made... Please try with the patch below...
>
> Ah, I just forgot that the situation might persist... Try with this
> one instead...

Good news everyone: Two connections made it to the finish line.

The bad part: One transfer took four minutes, the other sixteen minutes.
A colleague commented it's still much faster than carrying the message
by plane ;-) A session without FRTO takes around 84 seconds.

I've added debug printks() to every return path in tcp_use_frto(),
so you can see what's going on. They look like this:

Jul 18 10:20:40 intratest131 kernel: [  957.318006] tcp_use_frto: ENTER: frto_counter: 0, icsk->icsk_ca_state: 0
Jul 18 10:20:40 intratest131 kernel: [  957.318011] tcp_use_frto: DEFAULT RETURN 1;
Jul 18 10:21:08 intratest131 kernel: [  984.446006] tcp_use_frto: ENTER: frto_counter: 3, icsk->icsk_ca_state: 0
Jul 18 10:21:08 intratest131 kernel: [  984.446011] tcp_use_frto: RETURN in "tp->frto_counter > 1 || icsk->icsk_ca_state == TCP_CA_Loss"
Jul 18 10:21:14 intratest131 kernel: [  991.058006] tcp_use_frto: ENTER: frto_counter: 0, icsk->icsk_ca_state: 0
Jul 18 10:21:14 intratest131 kernel: [  991.058011] tcp_use_frto: DEFAULT RETURN 1;
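The log entries above can be replayed against a small userspace model of the check that the patch adds to tcp_use_frto(). This is only a sketch: `struct frto_view` and `workaround_blocks_frto()` are hypothetical names, the enum mirrors the kernel's congestion-avoidance state numbering (TCP_CA_Loss is 4), and everything else tcp_use_frto() does is left out.

```c
#include <stdbool.h>

/* Mirrors the numbering of the kernel's congestion-avoidance states. */
enum ca_state {
	CA_OPEN = 0,
	CA_DISORDER = 1,
	CA_CWR = 2,
	CA_RECOVERY = 3,
	CA_LOSS = 4
};

/* Hypothetical stand-in for the two fields the added check reads. */
struct frto_view {
	unsigned int frto_counter;	/* tp->frto_counter */
	enum ca_state ca_state;		/* icsk->icsk_ca_state */
};

/* True when the added check would take its early "return 0" path. */
bool workaround_blocks_frto(const struct frto_view *v)
{
	return v->frto_counter > 1 || v->ca_state == CA_LOSS;
}
```

With frto_counter 3 and state 0 (the 10:21:08 entry) the model reports that FRTO is refused; with frto_counter 0 and state 0 (the other two entries) it does not, which matches the "DEFAULT RETURN 1" lines.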

Here are two new dumps and the corresponding debug traces:
http://www.intra2net.com/de/download/tcpdump/tcp_frto_second_patch.tar.bz2

Enjoy,
Thomas

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-18  9:14                                         ` Thomas Jarosch
@ 2008-07-18 13:55                                           ` Ilpo Järvinen
  2008-07-18 14:02                                             ` Thomas Jarosch
  0 siblings, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-07-18 13:55 UTC (permalink / raw)
  To: Thomas Jarosch
  Cc: Jozsef Kadlecsik, Netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List, Dâniel Fraga,
	David Miller

On Fri, 18 Jul 2008, Thomas Jarosch wrote:

> On Thursday, 17. July 2008 17:53:01 Ilpo Järvinen wrote:
> > > > One option would be to disable reentry to FRTO when some progress was
> > > > made... Please try with the patch below...
> >
> > Ah, I just forgot that the situation might persist... Try with this
> > one instead...
> 
> Good news everyone: Two connections made it to the finish line.
> 
> The bad part: One transfer took four minutes, the other sixteen minutes.
> A colleague commented it's still much faster than carrying the message
> by plane ;-) A session without FRTO takes around 84 seconds.

...I guess if you would limit ssthresh to some small value you might beat 
that value even without FRTO.

> I've added debug printks() to every return path in tcp_use_frto(),
> so you can see what's going on. They look like this:
> 
> Jul 18 10:20:40 intratest131 kernel: [  957.318006] tcp_use_frto: ENTER: frto_counter: 0, icsk->icsk_ca_state: 0
> Jul 18 10:20:40 intratest131 kernel: [  957.318011] tcp_use_frto: DEFAULT RETURN 1;
> Jul 18 10:21:08 intratest131 kernel: [  984.446006] tcp_use_frto: ENTER: frto_counter: 3, icsk->icsk_ca_state: 0
> Jul 18 10:21:08 intratest131 kernel: [  984.446011] tcp_use_frto: RETURN in "tp->frto_counter > 1 || icsk->icsk_ca_state == TCP_CA_Loss"
> Jul 18 10:21:14 intratest131 kernel: [  991.058006] tcp_use_frto: ENTER: frto_counter: 0, icsk->icsk_ca_state: 0
> Jul 18 10:21:14 intratest131 kernel: [  991.058011] tcp_use_frto: DEFAULT RETURN 1;
> 
> Here are two new dumps and the corresponding debug traces:
> http://www.intra2net.com/de/download/tcpdump/tcp_frto_second_patch.tar.bz2

It seems that with FRTO the retransmission timeout grows much higher which 
causes longer delays when things continue by RTO, this might be plainly 
due to the fact that some timeouts seem indeed spurious, and with FRTO we 
can take RTT measures out of such. I'll keep digging deeper... The 
receiver is definitely doing something crazy as well, e.g.:

6.1.131.56060: . ack 1995587 win 65535
152.31.131.25: . 1998387:1999787(1400) ack 562 win 7504 (DF)
152.31.131.25: . 1999787:2001187(1400) ack 562 win 7504 (DF)
152.31.131.25: . 2001187:2002587(1400) ack 562 win 7504 (DF)
6.1.131.56060: . ack 1995587 win 8192 (DF)
6.1.131.56060: . ack 1996987 win 8192 (DF)
6.1.131.56060: . ack 1996987 win 8192 (DF)
6.1.131.56060: . ack 1996987 win 8192 (DF)

...The receiver shrunk the window here (it's not the only example) :-), 
though on the bright side, those are duplicate ACKs... :-D
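A window shrink like the one in this trace can be checked mechanically by comparing the right edge of the advertised window (ack + win) between consecutive ACKs. The sketch below assumes the `win` values printed by tcpdump are unscaled byte counts; the function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Right edge of the advertised window: the first sequence number the
 * receiver says it will NOT accept. Wraps modulo 2^32 like TCP seqnos. */
uint32_t window_right_edge(uint32_t ack, uint32_t win)
{
	return ack + win;
}

/* True if the newer ACK moves the right edge backwards, i.e. the
 * receiver revoked window it had already offered. */
bool window_shrunk(uint32_t prev_ack, uint32_t prev_win,
		   uint32_t ack, uint32_t win)
{
	uint32_t prev_edge = window_right_edge(prev_ack, prev_win);
	uint32_t edge = window_right_edge(ack, win);

	/* Signed difference gives a wraparound-safe comparison. */
	return (int32_t)(edge - prev_edge) < 0;
}
```

For the trace above, "ack 1995587 win 65535" followed by "ack 1995587 win 8192" moves the right edge from 2061122 back to 2003779, so window_shrunk() returns true for that pair.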

Btw, on which kernel you ran these things (I hope it wasn't 2.6.24.7, 
which has FRTO related bugs anyway that the patches I've sent now won't 
fix)? 

-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-18 13:55                                           ` Ilpo Järvinen
@ 2008-07-18 14:02                                             ` Thomas Jarosch
  2008-07-19  7:35                                               ` Ilpo Järvinen
  2008-07-25 10:00                                               ` Ilpo Järvinen
  0 siblings, 2 replies; 116+ messages in thread
From: Thomas Jarosch @ 2008-07-18 14:02 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Jozsef Kadlecsik, Netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List, Dâniel Fraga,
	David Miller

On Friday, 18. July 2008 15:55:22 Ilpo Järvinen wrote:
> Btw, on which kernel you ran these things (I hope it wasn't 2.6.24.7,
> which has FRTO related bugs anyway that the patches I've sent now won't
> fix)?

It's the git "master" tree from two days ago, so it should be 2.6.27-pre.
Like I wrote before, there's another box doing NAT in front of it running 
2.6.24.7. FRTO is disabled on that box. Hope that helps a bit.

Thomas

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-18 14:02                                             ` Thomas Jarosch
@ 2008-07-19  7:35                                               ` Ilpo Järvinen
  2008-07-25 10:00                                               ` Ilpo Järvinen
  1 sibling, 0 replies; 116+ messages in thread
From: Ilpo Järvinen @ 2008-07-19  7:35 UTC (permalink / raw)
  To: Thomas Jarosch
  Cc: Jozsef Kadlecsik, Netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List, Dâniel Fraga,
	David Miller

On Fri, 18 Jul 2008, Thomas Jarosch wrote:

> On Friday, 18. July 2008 15:55:22 Ilpo Järvinen wrote:
> > Btw, on which kernel you ran these things (I hope it wasn't 2.6.24.7,
> > which has FRTO related bugs anyway that the patches I've sent now won't
> > fix)?
> 
> It's the git "master" tree from two days ago, so it should be 2.6.27-pre.
> Like I wrote before, there's another box doing NAT in front of it running 
> 2.6.24.7. FRTO is disabled on that box. Hope that helps a bit.

Hmm, those were spurious RTOs indeed or a sign of perverted TCP "proxy" 
(or whatever they call them), longest delay spike I've found so far is 
this:

11:27:28.454827 172.16.1.131.56060 > 80.152.31.131.25: . 3989187:3990587(1400)
...
11:28:00.188835 80.152.31.131.25 > 172.16.1.131.56060: . ack 3990587 win 65535

That's 32 seconds? :-D What should TCP do with that :-) ...disregard that 
measurement because some other TCP variant would not be able to use the 
same measurement due to ambiguity problem(?), I don't think so... It seems
that non-FRTO TCP just misses those signs and acts _too_ aggressively ;-),
which is well known to happen when spurious RTO occurs.
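The "32 seconds" figure is simply the difference between the two tcpdump timestamps. A minimal sketch of that arithmetic, with a hypothetical helper that turns an HH:MM:SS.uuuuuu timestamp into microseconds since midnight (both timestamps are assumed to fall on the same day):

```c
#include <stdint.h>

/* Convert a tcpdump wall-clock timestamp to microseconds since midnight. */
int64_t stamp_to_usec(int hour, int min, int sec, int usec)
{
	return ((int64_t)hour * 3600 + (int64_t)min * 60 + sec) * 1000000
	       + usec;
}

/* Delay between sending a segment and seeing the ACK that covers it. */
int64_t delay_usec(int64_t sent_usec, int64_t acked_usec)
{
	return acked_usec - sent_usec;
}
```

Feeding in 11:27:28.454827 and 11:28:00.188835 gives 31734008 microseconds, i.e. roughly 31.7 seconds between the data segment and the ACK for it.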

...Also, those duplicate ACKs I pointed out earlier are a sign of 
unnecessary retransmissions (they occur both with and without FRTO).
I actually doubt you have any real losses there, I'll probably next 
calculate RTTs based on that assumption in the non-FRTO dump too...

-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-18 14:02                                             ` Thomas Jarosch
  2008-07-19  7:35                                               ` Ilpo Järvinen
@ 2008-07-25 10:00                                               ` Ilpo Järvinen
  2008-07-25 13:00                                                 ` Thomas Jarosch
  1 sibling, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-07-25 10:00 UTC (permalink / raw)
  To: Thomas Jarosch
  Cc: Jozsef Kadlecsik, Netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List, Dâniel Fraga,
	David Miller

On Fri, 18 Jul 2008, Thomas Jarosch wrote:

> On Friday, 18. July 2008 15:55:22 Ilpo Järvinen wrote:
> > Btw, on which kernel you ran these things (I hope it wasn't 2.6.24.7,
> > which has FRTO related bugs anyway that the patches I've sent now won't
> > fix)?
> 
> It's the git "master" tree from two days ago, so it should be 2.6.27-pre.
> Like I wrote before, there's another box doing NAT in front of it running 
> 2.6.24.7. FRTO is disabled on that box. Hope that helps a bit.

Ok.

I looked more into it, there indeed is a large number of spurious RTOs 
with extremely large round-trip times, though I suspect they occur due to 
some broken hw/cfg or whatever rather than due to a real wire+queueing 
delays, and that some external event is required to get things going again 
with it/them... but that's purely speculation since we don't know about 
the isp's stuff... :-)

Here are some example time-seqno graphs; the second one includes the first 
one in the lower left corner:

http://www.cs.helsinki.fi/u/ijjarvin/tcp/bigrto1.jpg
http://www.cs.helsinki.fi/u/ijjarvin/tcp/bigrto2.jpg

Larger boxes - data packets
Smaller boxes - ACKs (& receiver's advertized window)
...both are connected with lines in time order for easier tracking

RTOs occur where the data transfer line falls down. If more than one 
cumulative (advancing) ACK follows the retransmission while the FRTO 
sending pattern is in use (i.e., when two new data segments follow the 
retransmission), it basically means that the original data segments made 
it through, and in the extreme cases they were sent much earlier!!! The 
longest round-trips are around 50 seconds in there. These increasing RTT 
measurements cause tp->rttvar to grow exponentially with each spurious 
RTO, which is very good for avoiding spurious RTOs in the future but 
obviously breaks down if future progress also depends on actually 
triggering those RTOs.
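To see why a single huge RTT sample inflates the timeout so much, here is a sketch of the textbook RTO estimator from RFC 6298 (alpha = 1/8, beta = 1/4). It is not the kernel's exact fixed-point code; the struct and function names are illustrative, and the clock-granularity term and the minimum-RTO clamp are left out.

```c
/* Textbook RTT estimator state, all values in seconds. */
struct rtt_est {
	double srtt;	/* smoothed RTT */
	double rttvar;	/* RTT variation */
};

/* First RTT measurement: SRTT = R, RTTVAR = R / 2. */
void rtt_init(struct rtt_est *e, double r)
{
	e->srtt = r;
	e->rttvar = r / 2.0;
}

/* Subsequent measurement: update RTTVAR first, then SRTT. */
void rtt_sample(struct rtt_est *e, double r)
{
	double err = e->srtt > r ? e->srtt - r : r - e->srtt;

	e->rttvar = 0.75 * e->rttvar + 0.25 * err;
	e->srtt = 0.875 * e->srtt + 0.125 * r;
}

/* RTO = SRTT + 4 * RTTVAR (granularity and minimum clamp omitted). */
double rtt_rto(const struct rtt_est *e)
{
	return e->srtt + 4.0 * e->rttvar;
}
```

Starting from a 100 ms RTT, one 50 second sample takes this estimator's RTO from 0.3 seconds to about 56 seconds, so every further stall that can only be broken by an RTO costs on the order of a minute.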

...I bet we could measure any desired value for RTT with those servers... 
except there's the application level timeout on the way... :-)

Could you try if the patch below helps any...

-- 
 i.


[PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround

Hmm, it wasn't non-dup ACKing receiver, there were dupACKs when
an unnecessary retransmission was made (though those ACKs revoke
a part of the advertized window, which is strange enough in
itself :-)).

2nd try:

This is probably due to some broken middlebox but that's purely
speculation since the details of the not named ISP's (you can
find some hint in Patrick's blog though ;-)) equipment are not
available to us.

It seems that we will have to consciously attempt to violate
packet conservation principle and do a spammy go-back-n in case
there's a middlebox using split TCPish approach by waiting an
arrival of TCP layer retransmission and then doing an in-order
delivery (basically violates end-to-end semantics of a TCP
connection). I.e., the proxy intentionally reorders segment by
_any_ amount (well, there's some upper limit based on the
advertized window I guess), it's ridiculously fragile approach...

Such middleboxes basically mean two things: First, any measured
RTT value when a loss occurred is entirely bogus, yet all
indication of the existence of that loss is hidden intentionally,
so the correct operation basically depends on ambiguity problem
and the inability to measure RTTs during it. Secondly, a timely
feedback from network is non-existing, ie., no fast recovery &
friends... This goodbye for RFC2581 clearly signifies that such
way of behavior is contradicting some very fundamental
assumptions a standard TCP is allowed to make about the network,
would the RFC2581 stuff work, also FRTO would work. ...Finally
I see something which resembles something as pre-historic as TCP
Tahoe (in the real world) :-).

FRTO assumes reordering is relatively rare thing, but this
middlebox has decided to _always_ reorder the key segments FRTO
depends on... Thus FRTO makes "wrong" decision and declares the
RTO spurious, which is not in fact wrong at all because the
receiver probably received the segments in that order (or at
least its TCP layer did) and clearly indicates it by the
cumulative ACK pattern. A cumulative ACK for a not retransmitted
range basically means that one of those segments just arrived,
in this case it's after ridiculous RTT, even 50 seconds were
measured in practice!! As a result, tp->rttvar flies to outer
space when exponentially increasing RTTs get sampled. But this
increase is much desired, in general, to avoid future RTOs would
the real RTT really grow that fast.

The workaround prevents reentry to FRTO when a previous FRTO
recovery occurred within the last window (though multiple RTOs
for a single segment are still allowed to go into FRTO each
time). This workaround impacts FRTO accuracy as we lose ability
to detect more than one spurious segment per window. We just
consciously violate packet conservation principle by
retransmitting unnecessarily to make rest of the high RTT
readings ambiguous and that's it... :-) Though even go-back-N
as fallback this won't guarantee anything if we're just unlucky
because RTTs we measure can still grow if losses occur too
frequently so that period in between is not enough to lower
RTT estimation :-). In contrast, non-FRTO TCP can always happily
ignore high RTT readings because of the ambiguity problem, ie.,
by violating packet conservation principle by design :-).

I'm not that sure if this is worthwhile modification to the
kernel due to the reasons that are explained above.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Reported-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
---
 net/ipv4/tcp_input.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 1f5e604..2a7528c 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1721,6 +1721,13 @@ int tcp_use_frto(struct sock *sk)
 	if (tcp_is_sackfrto(tp))
 		return 1;
 
+	/* in-order-only "TCP proxy" fragility workaround, spam by go-back-n,
+	 * ie., consciously attempt to violate packet conservation principle
+	 * to cover every loss in the outstanding window on a single RTT
+	 */
+	if (!tp->frto_counter && tp->frto_highmark)
+		return 0;
+
 	/* Avoid expensive walking of rexmit queue if possible */
 	if (tp->retrans_out > 1)
 		return 0;
-- 
1.5.2.2

^ permalink raw reply related	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-25 10:00                                               ` Ilpo Järvinen
@ 2008-07-25 13:00                                                 ` Thomas Jarosch
  2008-07-25 14:06                                                   ` Ilpo Järvinen
  0 siblings, 1 reply; 116+ messages in thread
From: Thomas Jarosch @ 2008-07-25 13:00 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Jozsef Kadlecsik, Netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List, Dâniel Fraga,
	David Miller

Ilpo,

On Friday, 25. July 2008 12:00:29 Ilpo Järvinen wrote:
> [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround

The latest patch works quite well. I accidentally had your
previous patch applied, too, which gave even better results.
Though I don't know enough about the gory details of FRTO
to tell whether this effectively disables it...

Here are two fresh tcpdumps, one with the last patch only
and one which also includes your previous patch:
http://www.intra2net.com/de/download/tcpdump/tcp_frto_highmark_patch.tar.bz2

> ...
> This is probably due to some broken middlebox but that's purely
> speculation since the details of the not named ISP's (you can
> find some hint in Patrick's blog though ;-)) equipment are not
> available to us.

LOL, this reminds me about the post on kernel.org from 2007-03-01:
"Kudos ... to Hewlett Packard for building a machine that can take the beating 
of an unnamed shipping company and keep on ticking". 

Just think of "unnamed" while looking at the images of the broken server ;-)

Have a nice weekend,
Thomas

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-25 13:00                                                 ` Thomas Jarosch
@ 2008-07-25 14:06                                                   ` Ilpo Järvinen
  2008-07-25 15:34                                                     ` Thomas Jarosch
  0 siblings, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-07-25 14:06 UTC (permalink / raw)
  To: Thomas Jarosch
  Cc: Jozsef Kadlecsik, Netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List, Dâniel Fraga,
	David Miller

On Fri, 25 Jul 2008, Thomas Jarosch wrote:

> On Friday, 25. July 2008 12:00:29 Ilpo Järvinen wrote:
> > [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
> 
> The latest patch works quite well. I accidentally had your
> previous patch applied, too, which gave even better results.
> Though I don't know enough about the gory details of FRTO
> to tell whether this effectively disables it...

Indeed, it seems that with the earlier patch (or at least part of it)
one can achieve even better performance, though limiting the sending 
window would probably be the most efficient way to communicate through 
the middlebox and avoid the capacity waste that goes on the whole time 
because of it.

This patch alone could occasionally leave TCP hanging until a new RTO 
occurs when it has already gotten the first ACK after RTO (but the second 
is not coming until we kick the middlebox again by retransmitting the 
missing segment). But other than that, it worked as expected and solved 
many of the situations...

I guess the patch below would be enough in itself to create the desired 
effect (though "desired" is hardly a negative enough word to describe a 
workaround of this kind). Currently the workaround is only for SACKless 
TCP, though I guess there could be some "engineers" around who could 
without a doubt design a system which allows negotiating SACK, yet, doing 
all delivery in-order... :-) I think SACKless is enough though this same 
problem could occur with SACK too but that's not as likely as without 
SACK.

Funny, the violation of packet conservation principle leads to another 
queue overflow (as often expected) in more than half of the cases and 
therefore another RTO is needed... :-)

There are new things in the logs too (I didn't study all the details of 
the earlier ones so I might have missed them there), probably signs of 
link-layer retransmissions... and that "notch" in the advertized window is 
hilarious... :-)

Some statistics; unnecessary retransmissions (%, n), packets, filename:

0.0000   0 3026 stalling2
0.0000   0  698 stalling1
2.2693 137 6037 smtp_slooow
3.4316 221 6440 smtp_sixteen_minutes
4.3833 284 6479 smtp_worked_but_stalling_here_and_there
4.8030  50 1041 smtp_stalled
5.2868 340 6431 smtp_highmark_and_TCP_CA_Loss
6.0382 392 6492 smtp_highmark_only
6.8752 435 6327 working_no_frto

Ie., in the worst case 6.8% of your link's capacity was wasted during the 
transfer due to inefficiency caused by that middlebox, not counting the 
under-utilization that occurs both because of a small window or a wait for 
RTOs, not bad result at all... :-D
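The percentage column in the table above is just unnecessary retransmissions divided by total packets. A minimal sketch of that calculation (the function name is made up for illustration):

```c
/* Share of packets that were unnecessary retransmissions, in percent.
 * Returns 0 for an empty trace to avoid dividing by zero. */
double unnecessary_rexmit_pct(unsigned int rexmits, unsigned int packets)
{
	if (packets == 0)
		return 0.0;
	return 100.0 * (double)rexmits / (double)packets;
}
```

For example, 137 of 6037 packets (smtp_slooow) works out to about 2.27%, and 435 of 6327 (working_no_frto) to about 6.88%, in line with the table.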

Try the patch below (alone) which should be close to the behavior of the 
both patches put together.

-- 
 i.

[PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround

Hmm, it wasn't non-dup ACKing receiver, there were dupACKs when
an unnecessary retransmission was made (though those ACKs revoke
a part of the advertized window, which is strange enough in
itself :-)).

2nd try:

This is probably due to some broken middlebox but that's purely
speculation since the details of the not named ISP's (you can
find some hint in Patrick's blog though ;-)) equipment are not
available to us.

It seems that we will have to consciously attempt to violate
packet conservation principle and do a spammy go-back-n in case
there's a middlebox using split TCPish approach by waiting an
arrival of TCP layer retransmission and then doing an in-order
delivery (basically violates end-to-end semantics of a TCP
connection). I.e., the proxy intentionally reorders segment by
_any_ amount (well, there's some upper limit based on the
advertized window I guess), it's ridiculously fragile approach...

Such middleboxes basically mean two things: First, any measured
RTT value when a loss occurred is entirely bogus, yet all
indication of the existence of that loss is hidden intentionally,
so the correct operation basically depends on ambiguity problem
and the inability to measure RTTs during it. Secondly, a timely
feedback from network is non-existing, ie., no fast recovery &
friends... This goodbye for RFC2581 clearly signifies that such
way of behavior is contradicting some very fundamental
assumptions a standard TCP is allowed to make about the network,
would the RFC2581 stuff work, also FRTO would work. ...Finally
I see something which resembles something as pre-historic as TCP
Tahoe (in the real world) :-).

FRTO assumes reordering is relatively rare thing, but this
middlebox has decided to _always_ reorder the key segments FRTO
depends on... Thus FRTO makes "wrong" decision and declares the
RTO spurious, which is not in fact wrong at all because the
receiver probably received the segments in that order (or at
least its TCP layer did) and clearly indicates it by the
cumulative ACK pattern. A cumulative ACK for a not retransmitted
range basically means that one of those segments just arrived,
in this case it's after ridiculous RTT, even 50 seconds were
measured in practice!! As a result, tp->rttvar flies to outer
space when exponentially increasing RTTs get sampled. But this
increase is much desired, in general, to avoid future RTOs would
the real RTT really grow that fast.

The workaround prevents reentry to FRTO when a previous FRTO
recovery occurred within the last window (though multiple RTOs
for a single segment are still allowed to go into FRTO each
time). This workaround impacts FRTO accuracy as we lose ability
to detect more than one spurious segment per window. We just
consciously violate packet conservation principle by
retransmitting unnecessarily to make rest of the high RTT
readings ambiguous and that's it... :-) Though even go-back-N
as fallback this won't guarantee anything if we're just unlucky
because RTTs we measure can still grow if losses occur too
frequently so that period in between is not enough to lower
RTT estimation :-). In contrast, non-FRTO TCP can always happily
ignore high RTT readings because of the ambiguity problem, ie.,
by violating packet conservation principle by design :-).

I'm not that sure if this is worthwhile modification to the
kernel due to the reasons that are explained above.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Reported-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
---
 net/ipv4/tcp_input.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 75efd24..314bd55 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1721,6 +1721,13 @@ int tcp_use_frto(struct sock *sk)
 	if (tcp_is_sackfrto(tp))
 		return 1;
 
+	/* in-order-only "TCP proxy" fragility workaround, spam by go-back-n,
+	 * ie., consciously attempt to violate packet conservation principle
+	 * to cover every loss in the outstanding window on a single RTT
+	 */
+	if (tp->frto_counter != 1 && tp->frto_highmark)
+		return 0;
+
 	/* Avoid expensive walking of rexmit queue if possible */
 	if (tp->retrans_out > 1)
 		return 0;
-- 
1.5.2.2
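For comparison, the three guard conditions posted in this thread can be put side by side in a userspace sketch. The struct and function names are hypothetical; only the boolean expressions are taken from the patches, and the rest of tcp_use_frto() is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the fields the three checks read. */
struct frto_state {
	unsigned int frto_counter;	/* tp->frto_counter */
	uint32_t frto_highmark;		/* tp->frto_highmark */
	bool in_ca_loss;		/* icsk_ca_state == TCP_CA_Loss */
};

/* 2008-07-17 patch: dupACK-less receiver workaround. */
bool skip_frto_v1(const struct frto_state *s)
{
	return s->frto_counter > 1 || s->in_ca_loss;
}

/* 2008-07-25 10:00 patch: refuse FRTO only when no FRTO is in progress. */
bool skip_frto_v2(const struct frto_state *s)
{
	return !s->frto_counter && s->frto_highmark;
}

/* 2008-07-25 14:06 patch: also refuse when frto_counter is above 1. */
bool skip_frto_v3(const struct frto_state *s)
{
	return s->frto_counter != 1 && s->frto_highmark;
}
```

The last variant refuses FRTO in every case the second one does, plus the case where frto_counter is already above 1 with a nonzero highmark, which is part of what the first patch covered. That fits the description of it as close to both patches put together.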

^ permalink raw reply related	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-25 14:06                                                   ` Ilpo Järvinen
@ 2008-07-25 15:34                                                     ` Thomas Jarosch
  2008-07-31  7:39                                                       ` Thomas Jarosch
  0 siblings, 1 reply; 116+ messages in thread
From: Thomas Jarosch @ 2008-07-25 15:34 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Jozsef Kadlecsik, Netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List, Dâniel Fraga,
	David Miller

Ilpo,

On Friday, 25. July 2008 16:06:04 Ilpo Järvinen wrote:
> I guess the patch below would be enough in itself to create the desired
> effect (though "desired" is hardly a negative enough word to describe a
> workaround of this kind). 

Yeah, the result feels good enough for me. Here's the latest tcpdump
before I run out of good filenames for the dumps:
http://www.intra2net.com/de/download/tcpdump/tcp_frto_combined_patch.tar.bz2

> Ie., in the worst case 6.8% of your link's capacity was wasted during the
> transfer due to inefficiency caused by that middlebox, not counting the
> under-utilization that occurs both because of a small window or a wait for
> RTOs, not bad result at all... :-D

IIRC our outbound box does traffic shaping, so some percents are to be 
accounted to packets being dropped to slow down the connection a bit 
if they come (in) too fast.

Anyway, we now just have to flip a coin if this gets into the kernel
or not :-) I really hope this could save someone from doing
the same debug session all over again...

Thanks for the hard work you put into debugging this.

Thomas

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-15 18:30                               ` Dâniel Fraga
@ 2008-07-31  4:47                                 ` Dâniel Fraga
  2008-07-31  7:39                                   ` Ilpo Järvinen
  0 siblings, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-07-31  4:47 UTC (permalink / raw)
  To: netfilter-devel; +Cc: netdev

On Tue, 15 Jul 2008 15:30:45 -0300
Dâniel Fraga <fragabr@gmail.com> wrote:

> 	I'm using kernel 2.6.26 and the problem has gone. No stalled
> connections anymore. The problem was with 2.6.25 kernel only.
	
	Sorry, I was wrong. In 2.6.26 the problem remains.

	Every day my connection stalls to my NNTP server using 2.6.26
too. Before I just used a "nmap -sS server" and the connection would
come back, but now it doesn't work.

-- 
Linux 2.6.26: Rotary Wombat
http://u-br.net



^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-25 15:34                                                     ` Thomas Jarosch
@ 2008-07-31  7:39                                                       ` Thomas Jarosch
  2008-07-31 12:44                                                         ` Dâniel Fraga
  0 siblings, 1 reply; 116+ messages in thread
From: Thomas Jarosch @ 2008-07-31  7:39 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: Ilpo Järvinen, Netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List, Jozsef Kadlecsik, David Miller

Hi Dâniel,

On Thursday, 31. July 2008 06:47:38 you wrote:
> On Tue, 15 Jul 2008 15:30:45 -0300
>
> Dâniel Fraga <fragabr@gmail.com> wrote:
> > 	I'm using kernel 2.6.26 and the problem has gone. No stalled
> > connections anymore. The problem was with 2.6.25 kernel only.
>
> 	Sorry, I was wrong. In 2.6.26 the problem remains.
>
> 	Every day my connection stalls to my NNTP server using 2.6.26
> too. Before I just used a "nmap -sS server" and the connection would
> come back, but now it doesn't work.

Ok. Please try the latest patch Ilpo CC:ed you. Here's a link to the post:
http://marc.info/?l=linux-netdev&m=121699478406378&w=2

Where is the NNTP server located? At your provider?

Thomas

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-31  4:47                                 ` Dâniel Fraga
@ 2008-07-31  7:39                                   ` Ilpo Järvinen
  2008-08-02 12:24                                     ` Dâniel Fraga
  0 siblings, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-07-31  7:39 UTC (permalink / raw)
  To: Dâniel Fraga; +Cc: netdev, netfilter-devel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 822 bytes --]

On Thu, 31 Jul 2008, Dâniel Fraga wrote:

> On Tue, 15 Jul 2008 15:30:45 -0300
> Dâniel Fraga <fragabr@gmail.com> wrote:
> 
> > 	I'm using kernel 2.6.26 and the problem has gone. No stalled
> > connections anymore. The problem was with 2.6.25 kernel only.
> 	
> 	Sorry, I was wrong. In 2.6.26 the problem remains.
> 
> 	Every day my connection stalls to my NNTP server using 2.6.26
> too. Before I just used a "nmap -sS server" and the connection would
> come back, but now it doesn't work.

Tcpdumping it would help some... :-) Can you try the suggested patch to see
if it changes anything (though if I had a tcpdump showing the problem, I could
probably tell right away if the patch would help or not, and also come up
with something else if necessary :-)):

  http://marc.info/?l=linux-netdev&m=121699478406378&w=2



-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-31  7:39                                                       ` Thomas Jarosch
@ 2008-07-31 12:44                                                         ` Dâniel Fraga
  2008-07-31 13:47                                                           ` Thomas Jarosch
  0 siblings, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-07-31 12:44 UTC (permalink / raw)
  To: Thomas Jarosch
  Cc: Ilpo Järvinen, Netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List, Jozsef Kadlecsik, David Miller

On Thu, 31 Jul 2008 09:39:40 +0200
Thomas Jarosch <thomas.jarosch@intra2net.com> wrote:

> Ok. Please try the latest patch Ilpo CC:ed you. Here's a link to the post:
> http://marc.info/?l=linux-netdev&m=121699478406378&w=2

	Before I try could this issue be related to some of these
kernel parameters?

echo 1 > /proc/sys/net/ipv4/icmp_echo_ignore_broadcasts

echo 0 > /proc/sys/net/ipv4/conf/all/accept_source_route 

echo 0 > /proc/sys/net/ipv4/conf/all/accept_redirects 

echo 1 > /proc/sys/net/ipv4/icmp_ignore_bogus_error_responses 

	I ask because I decided to comment out these lines (on my
NATted desktop and on the server) and so far I haven't had the problem
again. But I'll keep testing all day and if the problem comes back
I'll try the patch, ok?

> Where is the NNTP server located? At your provider?

	It's my nntp server:

nntp://news.abusar.org

	You can post test messages on the group "u-br.teste".

	But there's an issue. My connection stalled mainly when I
ran some application with sudo (for example fetchnews etc). Then I'd do
an nmap -sS and the connection would come back alive. Sometimes an
nmap on my desktop (local machine) was necessary and sometimes one
on the server (news.abusar.org).

-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-31 12:44                                                         ` Dâniel Fraga
@ 2008-07-31 13:47                                                           ` Thomas Jarosch
  2008-07-31 14:11                                                             ` Dâniel Fraga
  2008-08-06 18:53                                                             ` Dâniel Fraga
  0 siblings, 2 replies; 116+ messages in thread
From: Thomas Jarosch @ 2008-07-31 13:47 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: Ilpo Järvinen, Netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List, Jozsef Kadlecsik, David Miller

On Thursday, 31. July 2008 14:44:36 Dâniel Fraga wrote:
> > Ok. Please try the latest patch Ilpo CC:ed you. Here's a link to the
> > post: http://marc.info/?l=linux-netdev&m=121699478406378&w=2
>
> 	Before I try could this issue be related to some of these
> kernel parameters?

If your problem is really FRTO related (that's what the patch is for),
you could try to disable FRTO temporarily:

echo 0 > /proc/sys/net/ipv4/tcp_frto

> > Where is the NNTP server located? At your provider?
>
> 	It's my nntp server:
>
> nntp://news.abusar.org
>
> 	You can post test messages on the group "u-br.teste".

Nice, so Ilpo can "test" (=bombard) it with big messages ;-)

Thomas

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-31 13:47                                                           ` Thomas Jarosch
@ 2008-07-31 14:11                                                             ` Dâniel Fraga
  2008-08-06 18:53                                                             ` Dâniel Fraga
  1 sibling, 0 replies; 116+ messages in thread
From: Dâniel Fraga @ 2008-07-31 14:11 UTC (permalink / raw)
  To: Thomas Jarosch
  Cc: Ilpo Järvinen, Netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List, Jozsef Kadlecsik, David Miller

On Thu, 31 Jul 2008 15:47:55 +0200
Thomas Jarosch <thomas.jarosch@intra2net.com> wrote:

> echo 0 > /proc/sys/net/ipv4/tcp_frto
	
	Ok, I'm testing here. If I reach any conclusions I'll report back.

> Nice, so Ilpo can "test" (=bombard) it with big messages ;-)

	ehehe no problem. As long as you post on u-br.teste, you can do
whatever tests you want ;) 

	But I'm not completely sure my problem is tcp_frto related,
since sometimes it just happened when I "sudo" some program... I'll
keep investigating it. Thanks.


-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-31  7:39                                   ` Ilpo Järvinen
@ 2008-08-02 12:24                                     ` Dâniel Fraga
  0 siblings, 0 replies; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-02 12:24 UTC (permalink / raw)
  To: Ilpo Järvinen; +Cc: netdev, netfilter-devel

On Thu, 31 Jul 2008 10:39:56 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> Tcpdumping it would help some... :-) Can you try the suggested patch to see
> if it changes anything (though if I had a tcpdump showing the problem, I could
> probably tell right away if the patch would help or not, and also come up
> with something else if necessary :-)):
> 
>   http://marc.info/?l=linux-netdev&m=121699478406378&w=2

	Hi, I'm using the patch and I can confirm it solved my problem.
No more stalled connections :).

	Is there any chance this patch can be merged in the kernel?
Thank you.

-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-07-31 13:47                                                           ` Thomas Jarosch
  2008-07-31 14:11                                                             ` Dâniel Fraga
@ 2008-08-06 18:53                                                             ` Dâniel Fraga
  2008-08-07  6:54                                                               ` Ilpo Järvinen
  2008-08-07 11:33                                                               ` [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround Ilpo Järvinen
  1 sibling, 2 replies; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-06 18:53 UTC (permalink / raw)
  To: Thomas Jarosch
  Cc: Ilpo Järvinen, Netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List, Jozsef Kadlecsik, David Miller

On Thu, 31 Jul 2008 15:47:55 +0200
Thomas Jarosch <thomas.jarosch@intra2net.com> wrote:

> If your problem is really FRTO related (that's what the patch is for),
> you could try to disable FRTO temporarily:

	Hi, the patch helped, but what's the conclusion? Is the problem
"solved"? Will this patch be merged in the next kernel? This thread
seems to be forgotten.

	Thank you.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-08-06 18:53                                                             ` Dâniel Fraga
@ 2008-08-07  6:54                                                               ` Ilpo Järvinen
  2008-08-07 11:50                                                                 ` Denys Fedoryshchenko
  2008-08-07 11:33                                                               ` [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround Ilpo Järvinen
  1 sibling, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-07  6:54 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: Thomas Jarosch, Netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List, Jozsef Kadlecsik, David Miller

[-- Attachment #1: Type: TEXT/PLAIN, Size: 712 bytes --]

On Wed, 6 Aug 2008, Dâniel Fraga wrote:

> On Thu, 31 Jul 2008 15:47:55 +0200
> Thomas Jarosch <thomas.jarosch@intra2net.com> wrote:
> 
> > If your problem is really FRTO related (that's what the patch is for),
> > you could try to disable FRTO temporarily:
> 
> 	Hi, the patch helped, but what's the conclusion? Is the problem
> "solved"? Will this patch be merged in the next kernel? This thread
> seems to be forgotten.

I was yesterday preparing the patch description by adding some more 
thoughts to it (as if there weren't enough already) but didn't yet send it 
with new cover (to sort of notify davem).

I give no guarantees about the _next_ kernel, but some 2.6.26.y and 2.6.27
are a more likely bet.

-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-06 18:53                                                             ` Dâniel Fraga
  2008-08-07  6:54                                                               ` Ilpo Järvinen
@ 2008-08-07 11:33                                                               ` Ilpo Järvinen
  2008-08-08  4:42                                                                 ` Bill Fink
  1 sibling, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-07 11:33 UTC (permalink / raw)
  To: Dâniel Fraga, Thomas Jarosch, David Miller
  Cc: Netdev, Patrick McHardy, Sven Riedel,
	Netfilter Developer Mailing List, Jozsef Kadlecsik

[-- Attachment #1: Type: TEXT/PLAIN, Size: 7137 bytes --]

On Wed, 6 Aug 2008, Dâniel Fraga wrote:

> On Thu, 31 Jul 2008 15:47:55 +0200
> Thomas Jarosch <thomas.jarosch@intra2net.com> wrote:
> 
> > If your problem is really FRTO related (that's what the patch is for),
> > you could try to disable FRTO temporarily:
> 
> 	Hi, the patch helped, but what's the conclusion? Is the problem
> "solved"? Will this patch be merged in the next kernel? This thread
> seems to be forgotten.

...Dave, I think we should probably put this FRTO work-around to net-2.6 
and -stable to remain somewhat robust (it's currently worked around only 
for newreno anyway). ...But I leave the final decision up to you.


-- 
 i.

[PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround

Hmm, it wasn't non-dup ACKing receiver, there were dupACKs when
an unnecessary retransmission was made (though those ACKs revoke
a part of the advertized window, which is strange enough in
itself :-)).

2nd try:

This is probably due to some broken middlebox but that's purely
speculation since the details of the not named ISP's (you can
find some hint in Patrick's blog though ;-)) equipment are not
available to us.

It seems that we will have to consciously attempt to violate the
packet conservation principle and do a spammy go-back-n in case
there's a middlebox using a split-TCPish approach that waits for the
arrival of a TCP layer retransmission and then does in-order
delivery (which basically violates the end-to-end semantics of a TCP
connection). I.e., the proxy intentionally reorders segments by
_any_ amount (well, there's some upper limit based on the
advertized window I guess); it's a ridiculously fragile approach...

Such middleboxes basically mean two things: First, any measured
RTT value when a loss occurred is entirely bogus, yet all
indication of the existence of that loss is hidden intentionally,
so the correct operation basically depends on the ambiguity problem
and the inability to measure RTTs during it. Secondly, timely
feedback from the network is non-existent, i.e., no fast recovery &
friends... This goodbye to RFC2581 clearly signifies that such
behavior contradicts some very fundamental
assumptions a standard TCP is allowed to make about the network;
if the RFC2581 stuff worked, FRTO would work too. ...Finally
I see something which resembles something as pre-historic as TCP
Tahoe (I mean in the real world) :-).

FRTO assumes reordering is a relatively rare thing, but this
middlebox has decided to _always_ reorder the key segments FRTO
depends on... Thus FRTO makes the "wrong" decision and declares the
RTO spurious, which is not in fact wrong at all because the
receiver probably received the segments in that order (or at
least its TCP layer did) and clearly indicates it by the
cumulative ACK pattern. A cumulative ACK for a not retransmitted
range basically means that one of those segments just arrived
when an ACK got sent; in this case it's after a ridiculous RTT,
even 50 seconds were measured in practice!! As a result,
tp->rttvar flies to outer space when exponentially increasing
RTTs get sampled. But this increase is much desired, in general,
to avoid future RTOs should the real RTT really grow that fast.
It just leads to a disaster here because the RTT measurements
are sender driven.
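
To make the rttvar remark concrete, here is a minimal sketch, assuming
the textbook RFC 6298 estimator rather than the kernel's own RTT
estimator (which differs in detail). The names rtt_est, rtt_sample and
rtt_rto are hypothetical and exist only for this illustration.

```c
/* Textbook RFC 6298 RTT estimator, in seconds. Illustrative only:
 * rtt_est, rtt_sample and rtt_rto are hypothetical names, not kernel
 * symbols, and the kernel's estimator differs in detail. */
struct rtt_est {
	double srtt;     /* smoothed RTT */
	double rttvar;   /* RTT variation */
	int have_sample; /* 0 until the first measurement arrives */
};

/* Feed one RTT measurement r (seconds) into the estimator. */
void rtt_sample(struct rtt_est *e, double r)
{
	double err;

	if (!e->have_sample) {
		/* First measurement: SRTT = R, RTTVAR = R/2 */
		e->srtt = r;
		e->rttvar = r / 2.0;
		e->have_sample = 1;
		return;
	}
	err = e->srtt > r ? e->srtt - r : r - e->srtt;
	e->rttvar = 0.75 * e->rttvar + 0.25 * err; /* beta = 1/4 */
	e->srtt = 0.875 * e->srtt + 0.125 * r;     /* alpha = 1/8 */
}

/* RTO = SRTT + 4 * RTTVAR, floored at 1 second as RFC 6298 suggests.
 * Clock granularity is ignored in this sketch. */
double rtt_rto(const struct rtt_est *e)
{
	double rto = e->srtt + 4.0 * e->rttvar;

	return rto < 1.0 ? 1.0 : rto;
}
```

Under these assumptions, a connection that has only seen a 100 ms RTT
sits at the 1 s floor, and a single unambiguous 50 s sample lifts the
RTO to roughly 56 s, which is why each hidden loss makes the next
stall longer.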

The workaround prevents reentry to FRTO when a previous FRTO
recovery occurred within the last window (though multiple RTOs
for a single segment are still allowed to go into FRTO each
time). This workaround impacts FRTO accuracy as we lose the ability
to detect more than one spurious segment per window. We just
consciously violate the packet conservation principle by
retransmitting unnecessarily to make the rest of the high RTT
readings ambiguous and that's it... :-) Though even with go-back-N
as a fallback this won't guarantee anything if we're just unlucky,
because the RTTs we measure can still grow if losses occur so
frequently that the period in between is not enough to lower the RTT
estimate :-). In contrast, non-FRTO TCP can always happily
ignore high RTT readings because of the ambiguity problem, i.e.,
by violating the packet conservation principle by design :-).

I currently implemented the workaround for newreno only, though
SACK TCP could be subject to a similar middlebox, but let's hope that
there won't be that many middleboxes that allow negotiating
SACK through them while forcing SACK blocks to extinction.

I find this workaround quite controversial. It seems that without
FRTO (at all), an amusing 6.8% of the transmitted segments were
unnecessarily retransmitted, which does cause buffer overflow that
often leads to another RTO (in ~50% of cases), which is sort of
expected when the packet conservation principle gets violated like
here. With FRTO, even if its final decision (i.e., RTO=spurious)
here is probably "flawed" because of the carefully selected
reordering, _all_ unnecessary retransmissions are avoided (those
duplicate ACKs that indicated old segment arrivals vanished) and
with the default response the congestion window gets shrunk anyway
so it's not more aggressive than what non-FRTO TCP would be. Sadly
enough the RTT times will grow, making the FRTO approach unbearable
without some changes. Still, that kind of middlebox does no good
for any TCP flow and should be fixed.

A better workaround would have to consider two things to keep
performance on a semi-acceptable level: prevent exponential RTT
back-off while avoiding over-aggressive cwnd calculation. The latter
seems easy to deal with because either the RTO is a genuine spurious
RTO within the original window or there's this crazy middlebox which
only received the retransmission while the original got lost; both
events fall to the same RTT where cwnd was already reduced and
therefore it is possible to show that there's no further need for
congestion window reduction. But the RTT back-off prevention would
be more controversial because, as said before, it is a desirable
property in case of a genuine spurious RTO. However, it might be
possible to argue that this situation where two spurious RTOs hit
the same window won't occur that often in practice (for different
segments, we already adjusted the RTO value anyway on the first of
them). ...I leave that for future consideration.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Reported-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Tested-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Tested-by: Dâniel Fraga <fragabr@gmail.com>
---
 net/ipv4/tcp_input.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 67ccce2..e137578 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1721,6 +1721,13 @@ int tcp_use_frto(struct sock *sk)
 	if (tcp_is_sackfrto(tp))
 		return 1;
 
+	/* in-order-only "TCP proxy" fragility workaround, spam by go-back-n,
+	 * ie., consciously attempt to violate packet conservation principle
+	 * to cover every loss in the outstanding window on a single RTT
+	 */
+	if (tp->frto_counter != 1 && tp->frto_highmark)
+		return 0;
+
 	/* Avoid expensive walking of rexmit queue if possible */
 	if (tp->retrans_out > 1)
 		return 0;
-- 
1.5.2.2
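
The guard added by the hunk above can be exercised outside the kernel.
What follows is a minimal sketch, assuming a hypothetical frto_model
struct that stands in for the few tcp_sock fields the visible lines
read; model_use_frto is likewise a made-up name. Everything the real
tcp_use_frto() does before and after the lines shown in the diff is
left out, and the final "return 1" is an assumption made only so the
model is complete.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the tcp_sock fields read by the
 * lines visible in the hunk. Not a kernel structure. */
struct frto_model {
	bool sack_frto;             /* stands in for tcp_is_sackfrto(tp) */
	unsigned int frto_counter;  /* mirrors tp->frto_counter */
	unsigned int frto_highmark; /* mirrors tp->frto_highmark */
	unsigned int retrans_out;   /* mirrors tp->retrans_out */
};

/* Returns 1 if F-RTO may be tried on this RTO, 0 to fall back to
 * conventional (go-back-n style) recovery. */
int model_use_frto(const struct frto_model *tp)
{
	if (tp->sack_frto)
		return 1;

	/* The workaround: a highmark left over from an earlier F-RTO,
	 * with the counter anywhere but 1, means F-RTO is refused. */
	if (tp->frto_counter != 1 && tp->frto_highmark)
		return 0;

	/* Same early exit as in the context lines of the hunk. */
	if (tp->retrans_out > 1)
		return 0;

	/* Assumption: the real function goes on to inspect the
	 * retransmit queue; this model simply says yes. */
	return 1;
}
```

With frto_highmark set and frto_counter at 0, the model declines F-RTO.
With frto_counter at 1 it still allows it, which appears to correspond
to the "multiple RTOs for a single segment" exception in the
description. The SACK variant is untouched, matching the remark that
only newreno is worked around.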

^ permalink raw reply related	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-08-07  6:54                                                               ` Ilpo Järvinen
@ 2008-08-07 11:50                                                                 ` Denys Fedoryshchenko
  2008-08-07 12:11                                                                   ` Thomas Jarosch
  2008-08-07 12:14                                                                   ` Ilpo Järvinen
  0 siblings, 2 replies; 116+ messages in thread
From: Denys Fedoryshchenko @ 2008-08-07 11:50 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Dâniel Fraga, Thomas Jarosch, Netdev, Patrick McHardy,
	Sven Riedel, Netfilter Developer Mailing List, Jozsef Kadlecsik,
	David Miller

On Thursday 07 August 2008, Ilpo Järvinen wrote:
> On Wed, 6 Aug 2008, Dâniel Fraga wrote:
> > On Thu, 31 Jul 2008 15:47:55 +0200
> >
> > Thomas Jarosch <thomas.jarosch@intra2net.com> wrote:
> > > If your problem is really FRTO related (that's what the patch is for),
> > > you could try to disable FRTO temporarily:
> >
> > 	Hi, the patch helped, but what's the conclusion? Is the problem
> > "solved"? Will this patch be merged in the next kernel? This thread
> > seems to be forgotten.
>
> I was yesterday preparing the patch description by adding some more
> thoughts to it (as if there weren't enough already) but didn't yet send it
> with new cover (to sort of notify davem).
>
> I give no guarantees about the _next_ kernel, but some 2.6.26.y and 2.6.27
> are a more likely bet.

By the way, I also had a problem with FRTO on local connections, and it was
trivial to reproduce. But because it involves a proprietary (though I have
the sources) userspace application and a specific way of using it, I didn't
report it to the mailing list. Once the patch is ready, please add me to
the CC and I will test it too.

For me, disabling FRTO solves the problem. With FRTO enabled, connections
"stall" when large chunks of data are transferred over loopback. How it all
works is complicated, but I can try to explain everything if required.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-08-07 11:50                                                                 ` Denys Fedoryshchenko
@ 2008-08-07 12:11                                                                   ` Thomas Jarosch
  2008-08-07 12:14                                                                   ` Ilpo Järvinen
  1 sibling, 0 replies; 116+ messages in thread
From: Thomas Jarosch @ 2008-08-07 12:11 UTC (permalink / raw)
  To: Denys Fedoryshchenko
  Cc: Ilpo Järvinen, Dâniel Fraga, Netdev, Patrick McHardy,
	Sven Riedel, Netfilter Developer Mailing List, Jozsef Kadlecsik,
	David Miller

On Thursday, 7. August 2008 13:50:42 Denys Fedoryshchenko wrote:
> By the way, I also had a problem with FRTO on local connections, and it was
> trivial to reproduce. But because it involves a proprietary (though I have
> the sources) userspace application and a specific way of using it, I didn't
> report it to the mailing list. Once the patch is ready, please add me to
> the CC and I will test it too.
>
> For me, disabling FRTO solves the problem. With FRTO enabled, connections
> "stall" when large chunks of data are transferred over loopback. How it all
> works is complicated, but I can try to explain everything if required.

What kernel version are you using?

IMHO this could only happen on the loopback interface
if you are a) using an "old" kernel version like 2.6.24/2.6.25
or b) there might be a bug hidden somewhere else.

Ilpo has sent the "final" patch to linux-netdev
some minutes ago, give it a try.

Thomas

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-08-07 11:50                                                                 ` Denys Fedoryshchenko
  2008-08-07 12:11                                                                   ` Thomas Jarosch
@ 2008-08-07 12:14                                                                   ` Ilpo Järvinen
  2008-08-07 12:23                                                                     ` Denys Fedoryshchenko
  1 sibling, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-07 12:14 UTC (permalink / raw)
  To: Denys Fedoryshchenko
  Cc: Dâniel Fraga, Thomas Jarosch, Netdev, Patrick McHardy,
	Sven Riedel, Netfilter Developer Mailing List, Jozsef Kadlecsik,
	David Miller

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2530 bytes --]

On Thu, 7 Aug 2008, Denys Fedoryshchenko wrote:

> On Thursday 07 August 2008, Ilpo Järvinen wrote:
> > On Wed, 6 Aug 2008, Dâniel Fraga wrote:
> > > On Thu, 31 Jul 2008 15:47:55 +0200
> > >
> > > Thomas Jarosch <thomas.jarosch@intra2net.com> wrote:
> > > > If your problem is really FRTO related (that's what the patch is for),
> > > > you could try to disable FRTO temporarily:
> > >
> > > 	Hi, the patch helped, but what's the conclusion? Is the problem
> > > "solved"? Will this patch be merged in the next kernel? This thread
> > > seems to be forgotten.
> >
> > I was yesterday preparing the patch description by adding some more
> > thoughts to it (as if there weren't enough already) but didn't yet send it
> > with new cover (to sort of notify davem).
> >
> > I give no guarantees about the _next_ kernel, but some 2.6.26.y and 2.6.27
> > are a more likely bet.
> 
> By the way, I also had a problem with FRTO on local connections, and it was
> trivial to reproduce. But because it involves a proprietary (though I have
> the sources) userspace application and a specific way of using it, I didn't
> report it to the mailing list.

I could still have looked into it :-). I can mostly decide anything TCP
congestion control related based solely on a tcpdump, and I can even read
tcpdump -n -r logfile output if you want to fully hide any payloads (as
long as the lines are not split into a mess in an email :-)), though then
plotting them is not as easy for me (I could hack my tool someday
to handle that as well).

But if a pre-2.6.25.7/2.6.26 kernel is involved, then it's obsolete
and requires an upgrade or the relevant fixes from 2.6.25.7.

> Once the patch is ready, please add me to the CC and I will test it too.

I already sent it, though vger was in some sort of distress, so it might 
take some time to arrive...

> For me, disabling FRTO solves the problem. With FRTO enabled, connections
> "stall" when large chunks of data are transferred over loopback. How it all
> works is complicated, but I can try to explain everything if required.

Could you just tcpdump over (at least) one stall? ...That would be useful
even if you find the patch I sent working, because it's always possible
that something has been overlooked in the FRTO spec or so, and I would like
to understand the problem rather than just use a workaround which was
intended to fix (possibly) another problem...

If there's something I cannot figure out from the dump, I'll then consult 
you about the userspace details.


-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-08-07 12:14                                                                   ` Ilpo Järvinen
@ 2008-08-07 12:23                                                                     ` Denys Fedoryshchenko
  2008-08-08  9:56                                                                       ` Ilpo Järvinen
  0 siblings, 1 reply; 116+ messages in thread
From: Denys Fedoryshchenko @ 2008-08-07 12:23 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Dâniel Fraga, Thomas Jarosch, Netdev, Patrick McHardy,
	Sven Riedel, Netfilter Developer Mailing List, Jozsef Kadlecsik,
	David Miller

On Thursday 07 August 2008, Ilpo Järvinen wrote:
>
> I could still have looked into it :-). I can mostly decide anything TCP
> congestion control related based solely on a tcpdump, and I can even read
> tcpdump -n -r logfile output if you want to fully hide any payloads (as
> long as the lines are not split into a mess in an email :-)), though then
> plotting them is not as easy for me (I could hack my tool someday
> to handle that as well).

I will try my best to reproduce it and report (on the latest stable kernel,
of course). On pre and git kernels it is a bit more difficult, but I will
try that as well.


^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-07 11:33                                                               ` [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround Ilpo Järvinen
@ 2008-08-08  4:42                                                                 ` Bill Fink
  2008-08-08 10:32                                                                   ` Ilpo Järvinen
  2008-08-11 21:41                                                                   ` David Miller
  0 siblings, 2 replies; 116+ messages in thread
From: Bill Fink @ 2008-08-08  4:42 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Dâniel Fraga,
	Thomas Jarosch, David Miller, Netdev, Patrick McHardy,
	Sven Riedel, Netfilter Developer Mailing List, Jozsef Kadlecsik

On Thu, 7 Aug 2008, Ilpo Järvinen wrote:

> On Wed, 6 Aug 2008, Dâniel Fraga wrote:
> 
> > On Thu, 31 Jul 2008 15:47:55 +0200
> > Thomas Jarosch <thomas.jarosch@intra2net.com> wrote:
> > 
> > > If your problem is really FRTO related (that's what the patch is for),
> > > you could try to disable FRTO temporarily:
> > 
> > 	Hi, the patch helped, but what's the conclusion? Is the problem
> > "solved"? Will this patch be merged in the next kernel? This thread
> > seems to be forgotten.
> 
> ...Dave, I think we should probably put this FRTO work-around to net-2.6 
> and -stable to remain somewhat robust (it's currently worked around only 
> for newreno anyway). ...But I leave the final decision up to you.

Since you suspect the problem is being caused by a broken middlebox,
would it perhaps be a better approach to add a per-route option to
allow disabling of FRTO for the given destination.  This would be
similar to Stephen Hemminger's fix for broken middleboxes that don't
handle window scaling properly.  It seems this would be better than
modifying FRTO behavior for everyone else that is being compliant.

A question then arises is if the bogus scenario has a TCP signature
that could be used to print a warning message for the unsuspecting
user so they could then take necessary corrective action.

						-Bill

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-08-07 12:23                                                                     ` Denys Fedoryshchenko
@ 2008-08-08  9:56                                                                       ` Ilpo Järvinen
  2008-08-08 10:32                                                                         ` Denys Fedoryshchenko
  0 siblings, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-08  9:56 UTC (permalink / raw)
  To: Denys Fedoryshchenko
  Cc: Dâniel Fraga, Thomas Jarosch, Netdev, Patrick McHardy,
	Sven Riedel, Netfilter Developer Mailing List, Jozsef Kadlecsik,
	David Miller

[-- Attachment #1: Type: TEXT/PLAIN, Size: 769 bytes --]

On Thu, 7 Aug 2008, Denys Fedoryshchenko wrote:

> On Thursday 07 August 2008, Ilpo Järvinen wrote:
> >
> > I could still have looked into it :-). I can mostly decide anything TCP
> > congestion control related based solely on a tcpdump, and I can even read
> > tcpdump -n -r logfile output if you want to fully hide any payloads (as
> > long as the lines are not split into a mess in an email :-)), though then
> > plotting them is not as easy for me (I could hack my tool someday
> > to handle that as well).
>
> I will try my best to reproduce it and report (on the latest stable kernel,
> of course). On pre and git kernels it is a bit more difficult, but I will
> try that as well.

I thought it was "trivial to reproduce"... :-) Perhaps it was then just 
related to pre-2.6.25.7 kernels?

-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-08  4:42                                                                 ` Bill Fink
@ 2008-08-08 10:32                                                                   ` Ilpo Järvinen
  2008-08-11 21:44                                                                     ` David Miller
  2008-08-11 21:41                                                                   ` David Miller
  1 sibling, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-08 10:32 UTC (permalink / raw)
  To: Bill Fink
  Cc: Dâniel Fraga,
	Thomas Jarosch, David Miller, Netdev, Patrick McHardy,
	Sven Riedel, Netfilter Developer Mailing List, Jozsef Kadlecsik

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3706 bytes --]

On Fri, 8 Aug 2008, Bill Fink wrote:

> On Thu, 7 Aug 2008, Ilpo Järvinen wrote:
> 
> > On Wed, 6 Aug 2008, Dâniel Fraga wrote:
> > 
> > > On Thu, 31 Jul 2008 15:47:55 +0200
> > > Thomas Jarosch <thomas.jarosch@intra2net.com> wrote:
> > > 
> > > > If your problem is really FRTO related (that's what the patch is for),
> > > > you could try to disable FRTO temporarily:
> > > 
> > > 	Hi, the patch helped, but what's the conclusion? Is the problem
> > > "solved"? Will this patch be merged in the next kernel? This thread
> > > seems to be forgotten.
> > 
> > ...Dave, I think we should probably put this FRTO work-around to net-2.6 
> > and -stable to remain somewhat robust (it's currently worked around only 
> > for newreno anyway). ...But I leave the final decision up to you.
> 
> Since you suspect the problem is being caused by a broken middlebox,

It seems very likely. Any split-TCPish approach that tries to hide some 
losses that would happen on access links could cause this, though it's 
very stupid to put such a box there when there's a physical wire rather than 
wireless. And even with wireless the given configuration is not going to 
help but make things worse :-), the box is plain stupid as is (I guess 
it's deployed because some marketing guy has convinced some clueless 
whoever that they need the box :-)).

In theory it could be at the receiver below the TCP layer too, but it's 
quite unlikely that an smtp server would run on such a stack. And even then 
it's a kind of middlebox, as TCP works end-to-end (not end host to end host)
while the rest remains a black box to it, even if something is performed 
on the very same host below the TCP layer.

An even less likely possibility is that the TCP receiver itself does this, 
and that wouldn't explain the pacing of ACKs at all. ...It would be at least 
a kind of twisting of the specs, if not out-of-spec somewhere.

> would it perhaps be a better approach to add a per-route option to
> allow disabling of FRTO for the given destination.  This would be
> similar to Stephen Hemminger's fix for broken middleboxes that don't
> handle window scaling properly.  It seems this would be better than
> modifying FRTO behavior for everyone else that is being compliant.

Sure, but that requires some thought still. I'll try after the weekend so
that I can think about it a bit more, because there are plenty of states where
we can end up after the detection of the first RTO as spurious.

It might even be interesting to run CA_Recovery on RTOs when we detect 
this kind of middlebox, because RTOs basically happen because there's a lack 
of duplicate ACKs, and then we could efficiently use partial ACKs to send 
just the lost segments rather than everything, which is causing problems 
after the recovery has finished because we sent at too high a rate while 
recovering. Then fall back to CA_Loss if an RTO is triggered again in 
CA_Recovery. But I'm not sure if it's worth the effort though.

> A question then arises is if the bogus scenario has a TCP signature
> that could be used to print a warning message for the unsuspecting
> user so they could then take necessary corrective action.

Probably yes, but I need to add some state. I could probably also make it 
switch per flow to a more robust approach on demand when enough evidence 
has been gathered. ...I think I'll add a 1-bit history counter per flow so 
that it's possible to print the warning and switch when there's a third RTO 
in a single window (while the first two were found spurious). IMHO it's 
unlikely enough that there will be three latency spikes (each longer than the 
previous) within a single window to make the decision; I wouldn't trust 
two enough because hand-overs can take time and have non-trivial effects.
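
A rough sketch of that per-flow bookkeeping, written in Python rather than
kernel C purely for illustration. The FlowState class, its method names and
the way a "window" is delimited are assumptions made for this example and
are not taken from any actual patch. It uses a plain integer instead of the
1-bit counter mentioned above and ignores sequence-number wraparound:

```python
from dataclasses import dataclass


@dataclass
class FlowState:
    """Hypothetical per-flow state; not a real kernel structure."""
    window_end: int = 0         # snd_nxt when the first RTO of the current window fired
    spurious_rtos: int = 0      # RTOs in this window later judged spurious
    conservative: bool = False  # True once the flow switched to the robust approach

    def on_ack(self, snd_una: int) -> None:
        # Everything that was outstanding at the first RTO is now acked,
        # so the window is over and the history is forgotten.
        if self.spurious_rtos and snd_una >= self.window_end:
            self.spurious_rtos = 0

    def on_spurious_detected(self) -> None:
        # Called when the RTO that just fired is found to be spurious.
        self.spurious_rtos += 1

    def on_rto(self, snd_nxt: int) -> bool:
        """Return True if this RTO makes the flow warn and switch."""
        if self.conservative:
            return False
        if self.spurious_rtos >= 2:
            # Third RTO in one window after two spurious ones.
            self.conservative = True
            return True
        if self.spurious_rtos == 0:
            # First RTO of a (possibly new) window: remember where it ends.
            self.window_end = snd_nxt
        return False
```

With this shape, two spurious RTOs alone never trigger the switch; only a
third RTO inside the same window does, which matches the caution about
hand-overs above.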

-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: TCP connection stalls under 2.6.24.7
  2008-08-08  9:56                                                                       ` Ilpo Järvinen
@ 2008-08-08 10:32                                                                         ` Denys Fedoryshchenko
  0 siblings, 0 replies; 116+ messages in thread
From: Denys Fedoryshchenko @ 2008-08-08 10:32 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Dâniel Fraga, Thomas Jarosch, Netdev, Patrick McHardy,
	Sven Riedel, Netfilter Developer Mailing List, Jozsef Kadlecsik,
	David Miller

On Friday 08 August 2008, Ilpo Järvinen wrote:

> I thought it was "trivial to reproduce"... :-) Perhaps it was then just
> related to pre-2.6.25.7 kernels?
Trivial to reproduce, but these are production systems handling 600-700 req/s, 
so I cannot take much risk with them, and they run a semi-embedded distro from 
USB flash.
I am just trying to use them now; if I get failures, I will buy a PC and 
build something on my desk.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-08  4:42                                                                 ` Bill Fink
  2008-08-08 10:32                                                                   ` Ilpo Järvinen
@ 2008-08-11 21:41                                                                   ` David Miller
  1 sibling, 0 replies; 116+ messages in thread
From: David Miller @ 2008-08-11 21:41 UTC (permalink / raw)
  To: billfink
  Cc: ilpo.jarvinen, fragabr, thomas.jarosch, netdev, kaber, sr,
	netfilter-devel, kadlec

From: Bill Fink <billfink@mindspring.com>
Date: Fri, 8 Aug 2008 00:42:31 -0400

> Since you suspect the problem is being caused by a broken middlebox,
> would it perhaps be a better approach to add a per-route option to
> allow disabling of FRTO for the given destination.  This would be
> similar to Stephen Hemminger's fix for broken middleboxes that don't
> handle window scaling properly.  It seems this would be better than
> modifying FRTO behavior for everyone else that is being compliant.

This is the kind of direction I'm leaning towards as well.

The behavior of these middleboxes borders on unbelievable.  And there
comes a point where catering to these various busted boxes stops
making sense.  At some point we have to say "sorry, someone has to get
that box fixed."

You can't reorder packets like that, on purpose, and not expect some
new, yet reasonable, TCP algorithm to fall flat on its face.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-08 10:32                                                                   ` Ilpo Järvinen
@ 2008-08-11 21:44                                                                     ` David Miller
  2008-08-12  7:46                                                                       ` Thomas Jarosch
  0 siblings, 1 reply; 116+ messages in thread
From: David Miller @ 2008-08-11 21:44 UTC (permalink / raw)
  To: ilpo.jarvinen
  Cc: billfink, fragabr, thomas.jarosch, netdev, kaber, sr,
	netfilter-devel, kadlec

From: "Ilpo_Järvinen" <ilpo.jarvinen@helsinki.fi>
Date: Fri, 8 Aug 2008 13:32:14 +0300 (EEST)

> On Fri, 8 Aug 2008, Bill Fink wrote:
> > A question then arises is if the bogus scenario has a TCP signature
> > that could be used to print a warning message for the unsuspecting
> > user so they could then take necessary corrective action.
> 
> Probably yes, but I need to add some state. I could probably also make it 
> to switch per flow to more robust approach on-demand when enough evidence 
> is gathered. ...I think I'll add 1-bit history counter per flow so that 
> it's possible to do print the warning and switch when there's third RTO in 
> a single window (while two first were found spurious). IMHO it's unlikely 
> enough that there will be three latency spikes (each longer than the 
> previous) within a single window to make the decision, I wouldn't trust 
> two enough because hand-overs can take time and have non-trivial effects.

Trying to come up with a signature for this bogus stuff is both time
consuming and carries a risk of false positives.  And I really question
whether this thing is worth it.

The sane thing to do in this case is to declare the box inoperative
and that it needs to be fixed to avoid this behavior.

Any reasonable congestion control scheme is going to run into problems
trying to react to the packet patterns this thing creates.  It is
therefore not really limited to FRTO so it really shouldn't be treated
like an FRTO problem even though it shows up more pronounced when
FRTO is enabled.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-11 21:44                                                                     ` David Miller
@ 2008-08-12  7:46                                                                       ` Thomas Jarosch
  2008-08-12  8:18                                                                         ` David Miller
  2008-08-22 21:18                                                                         ` Ilpo Järvinen
  0 siblings, 2 replies; 116+ messages in thread
From: Thomas Jarosch @ 2008-08-12  7:46 UTC (permalink / raw)
  To: David Miller
  Cc: ilpo.jarvinen, billfink, fragabr, netdev, kaber, sr,
	netfilter-devel, kadlec

On Monday, 11. August 2008 23:44:21 David Miller wrote:
> Trying to come up with a signature for this bogus stuff is both time
> consuming and having a risk of false positives.  And I really question
> whether this thing is worth it.
>
> The sane thing to do in this case is to declare the box inoperative
> and that it needs to be fixed to avoid this behavior.
>
> Any reasonable congestion control scheme is going to run into problems
> trying to react to the packet patterns this thing creates.  It is
> therefore not really limited to FRTO so it really shouldn't be treated
> like an FRTO problem even though it shows up more pronounced when
> FRTO is enabled.

David, I agree with you, though I'm not sure about the end user experience:

The kernel is an early adopter of FRTO and will be bitten by bugs of other
TCP implementations like we've experienced. I guess most affected users
just see stalled or slow connections and won't have the time or knowledge
to debug this. A proper warning could help them and the kernel
developers to get this issue solved as quickly as possible.

We called the hotline of the ISP several times and they always claimed
sending big mails with Outlook/Windows works, so it must be linux's fault.
That view of things is totally biased, but it's something I want to make sure
people can't get away with easily :-)

So, if it's possible to detect broken middleware boxes without
spending too much time on it, that would really be nice.

Thomas

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-12  7:46                                                                       ` Thomas Jarosch
@ 2008-08-12  8:18                                                                         ` David Miller
  2008-08-12 17:43                                                                           ` Dâniel Fraga
  2008-08-13  8:00                                                                           ` Thomas Jarosch
  2008-08-22 21:18                                                                         ` Ilpo Järvinen
  1 sibling, 2 replies; 116+ messages in thread
From: David Miller @ 2008-08-12  8:18 UTC (permalink / raw)
  To: thomas.jarosch
  Cc: ilpo.jarvinen, billfink, fragabr, netdev, kaber, sr,
	netfilter-devel, kadlec

From: Thomas Jarosch <thomas.jarosch@intra2net.com>
Date: Tue, 12 Aug 2008 09:46:17 +0200

> David, I agree with you, though I'm not sure about the end user experience:

We had the same situation with ECN and window scaling, and my proposal
is the same as how we handled those situations involving broken
middleware boxes.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-12  8:18                                                                         ` David Miller
@ 2008-08-12 17:43                                                                           ` Dâniel Fraga
  2008-08-12 17:52                                                                             ` Ilpo Järvinen
  2008-08-13  8:00                                                                           ` Thomas Jarosch
  1 sibling, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-12 17:43 UTC (permalink / raw)
  To: David Miller
  Cc: thomas.jarosch, ilpo.jarvinen, billfink, netdev, kaber, sr,
	netfilter-devel, kadlec

On Tue, 12 Aug 2008 01:18:22 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> We had the same situation with ECN and window scaling, and my proposal
> is the same as how we handled those situations involving broken
> middleware boxes.

	Sorry for my ignorance (I'm just a user), but if the problem
is not with Linux, why did this problem appear only with the 2.6.25 kernel?
I mean, with 2.6.24 and before I never had stalled connections. Just a
coincidence? Or has something changed in 2.6.25 which caused this?

	Thank you!


-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-12 17:43                                                                           ` Dâniel Fraga
@ 2008-08-12 17:52                                                                             ` Ilpo Järvinen
  2008-08-13 17:53                                                                               ` Dâniel Fraga
  0 siblings, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-12 17:52 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	sr, netfilter-devel, kadlec

[-- Attachment #1: Type: TEXT/PLAIN, Size: 967 bytes --]

On Tue, 12 Aug 2008, Dâniel Fraga wrote:

> On Tue, 12 Aug 2008 01:18:22 -0700 (PDT)
> David Miller <davem@davemloft.net> wrote:
> 
> > We had the same situation with ECN and window scaling, and my proposal
> > is the same as how we handled those situations involving broken
> > middleware boxes.
> 
> 	Sorry for my ignorance (I'm just an user), but if the problem
> is not with Linux, why this problem appeared just on 2.6.25 kernel? I
> mean, with 2.6.24 and before I never had stalled connections. Just a
> coincidence? Or something has changed in 2.6.25 which caused this?

I still propose that you tcpdump it, then I can tell you (I know
enough about Thomas' case but yours has a large number of
unknowns)... :-) I don't know why 2.6.24 didn't suffer from the
problem as FRTO was enabled already in it. The command you need
to create dump.log file:

# tcpdump -w dump.log -i <iface> host <peerip>

...you need root rights (or sudo) to do the capturing.


-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-12  8:18                                                                         ` David Miller
  2008-08-12 17:43                                                                           ` Dâniel Fraga
@ 2008-08-13  8:00                                                                           ` Thomas Jarosch
  1 sibling, 0 replies; 116+ messages in thread
From: Thomas Jarosch @ 2008-08-13  8:00 UTC (permalink / raw)
  To: David Miller
  Cc: ilpo.jarvinen, billfink, fragabr, netdev, kaber, sr,
	netfilter-devel, kadlec

On Tuesday, 12. August 2008 10:18:22 David Miller wrote:
> From: Thomas Jarosch <thomas.jarosch@intra2net.com>
> Date: Tue, 12 Aug 2008 09:46:17 +0200
>
> > David, I agree with you, though I'm not sure about the end user
> > experience:
>
> We had the same situation with ECN and window scaling, and my proposal
> is the same as how we handled those situations involving broken
> middleware boxes.

Yes, that is true. IMHO there's a slight difference with FRTO trouble
compared to ECN/window scaling issues:

ECN trouble -> No access at all
Broken window scaling -> Large transfers don't work
MTU issues -> No access at all / large transfers don't work

FRTO problems -> Hard to spot as they only happen when packet loss occurs.
Though I guess Ilpo knows best if there's an "easy" way to detect this or not.

Thomas

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-12 17:52                                                                             ` Ilpo Järvinen
@ 2008-08-13 17:53                                                                               ` Dâniel Fraga
  2008-08-13 18:34                                                                                 ` Ilpo Järvinen
  0 siblings, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-13 17:53 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	sr, netfilter-devel, kadlec

On Tue, 12 Aug 2008 20:52:37 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> I still propose that you tcpdump it, then I can tell you (I know
> enough about Thomas' case but yours has a large number of
> unknowns)... :-) I don't know why 2.6.24 didn't suffer from the
> problem as FRTO was enabled already in it. The command you need
> to create dump.log file:
> 
> # tcpdump -w dump.log -i <iface> host <peerip>
> 
> ...you need root rights (or sudo) to do the capturing.

	Ok, but the problem is that the bug doesn't happen
frequently... yesterday I waited for it to happen and nothing
happened :). I'll keep watching it... if I can get the dump, I'll send it.
Thanks.


-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-13 17:53                                                                               ` Dâniel Fraga
@ 2008-08-13 18:34                                                                                 ` Ilpo Järvinen
  2008-08-15  4:34                                                                                   ` Dâniel Fraga
  0 siblings, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-13 18:34 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	sr, netfilter-devel, kadlec

[-- Attachment #1: Type: TEXT/PLAIN, Size: 987 bytes --]

On Wed, 13 Aug 2008, Dâniel Fraga wrote:

> On Tue, 12 Aug 2008 20:52:37 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > I still propose that you tcpdump it, then I can tell you (I know
> > enough about Thomas' case but yours has a large number of
> > unknowns)... :-) I don't know why 2.6.24 didn't suffer from the
> > problem as FRTO was enabled already in it. The command you need
> > to create dump.log file:
> > 
> > # tcpdump -w dump.log -i <iface> host <peerip>
> > 
> > ...you need root rights (or sudo) to do the capturing.
> 
> 	Ok, but the problem is that the bug doesn't happen
> frequently... yesterday I waited for it to happen and nothing
> happened :). I'll keep watching it... if I can get the dump, I send it.

Ok, thanks for your efforts... These are often hard to reproduce because
some not very likely pattern needs to happen, and such things are often not
easily controllable (if it is at all possible to influence the 
likelihoods).

-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-13 18:34                                                                                 ` Ilpo Järvinen
@ 2008-08-15  4:34                                                                                   ` Dâniel Fraga
  2008-08-15  7:06                                                                                     ` Ilpo Järvinen
  0 siblings, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-15  4:34 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	sr, netfilter-devel, kadlec

On Wed, 13 Aug 2008 21:34:10 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> Ok, thanks for your efforts... These are often hard to reproduce because
> some not that likely pattern needs to happen and such things is often not
> easily controllable (if it is at all possible to influence the 
> likelyhoods).

	Hi Ilpo, I don't know if the dumps are correct, but I took them while
the connection was stalled. The problem is, when I dumped "eth0", the
connection suddenly came back alive again... so I don't know whether they
are useful or not:

	For tun1 interface (which I use for my vpn):

http://www.abusar.org/dump-tun1.log

	local loopback interface:

http://www.abusar.org/dump-lo.log

	eth0, if it matters:

http://www.abusar.org/dump-eth0.log

	Thanks.



-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-15  4:34                                                                                   ` Dâniel Fraga
@ 2008-08-15  7:06                                                                                     ` Ilpo Järvinen
  2008-08-15 21:35                                                                                       ` Dâniel Fraga
  2008-08-15 21:59                                                                                       ` Dâniel Fraga
  0 siblings, 2 replies; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-15  7:06 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	sr, netfilter-devel, kadlec

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1756 bytes --]

On Fri, 15 Aug 2008, Dâniel Fraga wrote:

> On Wed, 13 Aug 2008 21:34:10 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > Ok, thanks for your efforts... These are often hard to reproduce because
> > some not that likely pattern needs to happen and such things is often not
> > easily controllable (if it is at all possible to influence the 
> > likelyhoods).
> 
> 	Hi Ilpo, I don't know if the dumps are correct, but I did when
> the connection was stalled.

It would be better to have tcpdump running from at least a bit earlier (2-3 
windows back is long enough for me), but obviously that might not be a 
possible option because it occurs so rarely. ...It should be possible to have 
tcpdump restarted once in a while to avoid one huge log if you'd just 
keep running tcpdump from the beginning.
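
One way to get that effect, assuming a tcpdump build that supports the -C
(file size, in millions of bytes) and -W (number of files) options, is to
let tcpdump itself cycle through a fixed ring of files instead of restarting
it by hand. A small Python helper that only builds the command line; the
function name, file name and sizes are illustrative, not a recommendation:

```python
import shlex


def rotating_tcpdump_argv(iface, peer_ip, basename="dump.pcap",
                          file_mb=50, file_count=10):
    """Build a tcpdump command that keeps a bounded ring of capture files.

    -C starts a new file once the current one exceeds file_mb million bytes;
    -W limits the ring to file_count files, overwriting the oldest.
    """
    return [
        "tcpdump",
        "-i", iface,
        "-w", basename,
        "-C", str(file_mb),
        "-W", str(file_count),
        "host", peer_ip,
    ]


if __name__ == "__main__":
    # Print the command for inspection; running it still needs root or sudo.
    print(shlex.join(rotating_tcpdump_argv("tun1", "10.195.195.2")))
```

The capture then never grows beyond roughly file_mb * file_count million
bytes, while the last few windows before a stall are still on disk.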

> The problem is, when I dumped "eth0", the connection suddenly come back 
> alive again...

These situations (or at least some of those I debugged with other people) are 
such that they may indeed resolve themselves, though I'm also interested in 
why the slow part occurred.

> so, I don't know if it's useless or not:

What do you mean by "come back alive"...? ...In the eth0 log I found this 
connection 189.38.18.122.995 > 192.168.0.2.35477, the ip matches 
abusar's. But I'm not sure if the connection in the tunnel is the 
interesting one, since it's going to/from port 119 but the ip addresses 
(10.195.195.2 and 10.195.195.1) don't tell me anything. I guess you 
know their meaning (ie., if 10.195.195.2 is the one with which the 
connection stalls)? ...You're probably right that this wasn't a very useful 
log, the longest "stall" I find is only 1.111328 seconds long (and it 
might be due to some processing that is done by 10.195.195.2).

-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-15  7:06                                                                                     ` Ilpo Järvinen
@ 2008-08-15 21:35                                                                                       ` Dâniel Fraga
  2008-08-15 22:06                                                                                         ` Ilpo Järvinen
  2008-08-15 21:59                                                                                       ` Dâniel Fraga
  1 sibling, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-15 21:35 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	sr, netfilter-devel, kadlec

On Fri, 15 Aug 2008 10:06:39 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> I would be better to have tcpdump running at least a bit back (2-3 windows 
> back is long enough for me), but obviously that might not be possible 
> option because it occurs so rarely. ...It should be possible to have 
> tcpdump restarted once in a while to avoid a one huge log if you'd just 
> keep running tcpdump from beginning.

	Ok.

> What do you mean by "come back alive"...? ...In eth0 log I found this 

	I mean, it isn't stalled anymore. When it stalls, fetchnews
stops and stays stalled forever. When it comes back alive, it resumes
(but it will only do that if I do something to restore the connection).

> connection 189.38.18.122.995 > 192.168.0.2.35477, the ip matches with 
> abusar's. But I'm not sure if the connection in the tunnel is the 
> interesting one, since it's going to/from port 119 but the ip addresses 
> (10.195.195.2 and 10.195.195.1) don't tell anything to me, I guess you 
> know their meaning (ie., if 10.195.195.2 is the one with which the 
> connection stalls)? ...You're probably right that this wasn't very useful 
> log, the longest "stall" I find is only 1.111328 seconds long (and it 
> might be due to some processing that is made by 10.195.195.2).

	Ok:

10.195.195.1 is my local VPN IP (tun1)

10.195.195.2 is the remote VPN IP (on the server)

192.168.0.2 is my local IP (eth0)

189.38.18.122 is the server's IP

	Should I use tcpdump on the server too or is it sufficient to
use on my client machine?

	Thank you very much again.

-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-15  7:06                                                                                     ` Ilpo Järvinen
  2008-08-15 21:35                                                                                       ` Dâniel Fraga
@ 2008-08-15 21:59                                                                                       ` Dâniel Fraga
  1 sibling, 0 replies; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-15 21:59 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	sr, netfilter-devel, kadlec

On Fri, 15 Aug 2008 10:06:39 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> I would be better to have tcpdump running at least a bit back (2-3 windows 
> back is long enough for me), but obviously that might not be possible 
> option because it occurs so rarely. ...It should be possible to have 
> tcpdump restarted once in a while to avoid a one huge log if you'd just 
> keep running tcpdump from beginning.

	Ok.

> What do you mean by "come back alive"...? ...In eth0 log I found this 
> connection 189.38.18.122.995 > 192.168.0.2.35477, the ip matches with 
> abusar's. But I'm not sure if the connection in the tunnel is the 
> interesting one, since it's going to/from port 119 but the ip addresses 
> (10.195.195.2 and 10.195.195.1) don't tell anything to me, I guess you 
> know their meaning (ie., if 10.195.195.2 is the one with which the 
> connection stalls)? ...You're probably right that this wasn't very useful 
> log, the longest "stall" I find is only 1.111328 seconds long (and it 
> might be due to some processing that is made by 10.195.195.2).

	By "come back alive" I mean when the connection isn't stalled
anymore.

189.38.18.122 -> server

10.195.195.1 -> my local VPN ip (tun1)

10.195.195.2 -> remote VPN ip (on the server)

192.168.0.2 -> my local ip (eth0)

	Should I run tcpdump on the server too, or is it sufficient to
dump just on my client machine?

	Thank you very much again.

-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-15 21:35                                                                                       ` Dâniel Fraga
@ 2008-08-15 22:06                                                                                         ` Ilpo Järvinen
  2008-08-15 23:57                                                                                           ` Dâniel Fraga
  2008-08-16  2:15                                                                                           ` Dâniel Fraga
  0 siblings, 2 replies; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-15 22:06 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	sr, netfilter-devel, kadlec

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2865 bytes --]

On Fri, 15 Aug 2008, Dâniel Fraga wrote:

> On Fri, 15 Aug 2008 10:06:39 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > I would be better to have tcpdump running at least a bit back (2-3 windows 
> > back is long enough for me), but obviously that might not be possible 
> > option because it occurs so rarely. ...It should be possible to have 
> > tcpdump restarted once in a while to avoid a one huge log if you'd just 
> > keep running tcpdump from beginning.
> 
> 	Ok.
> 
> > What do you mean by "come back alive"...? ...In eth0 log I found this 
> 
> 	I mean, it isn't stalled anymore. When it stalls, fetchnews
> stops and stay stalled forever. When it come back alive, it resumes
> (but it will only do that if I do something to restore the connection).

Ok. I hope it will still reproduce with tcpdump running... Btw, doing cat 
/proc/net/tcp during the stall wouldn't be a bad idea (in addition to 
tcpdumping it). Also please let the tcpdumps run long enough if the stall 
persists, something like 15mins doesn't hurt because there are large 
timer values possibly involved.
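
For reading such a /proc/net/tcp snapshot afterwards, a small Python decoder
can help. This is only a sketch: it assumes the IPv4 layout of that file on
a little-endian host (addresses as hex in host byte order, ports in hex),
and the sample line is made up, reusing the VPN addresses from this thread
with a hypothetical local port:

```python
import socket
import struct

TCP_STATES = {
    0x01: "ESTABLISHED", 0x02: "SYN_SENT", 0x03: "SYN_RECV",
    0x04: "FIN_WAIT1", 0x05: "FIN_WAIT2", 0x06: "TIME_WAIT",
    0x07: "CLOSE", 0x08: "CLOSE_WAIT", 0x09: "LAST_ACK",
    0x0A: "LISTEN", 0x0B: "CLOSING",
}


def decode_addr(field):
    """Turn 'HHHHHHHH:PPPP' into (dotted_ip, port), assuming little-endian."""
    ip_hex, port_hex = field.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def parse_proc_net_tcp_line(line):
    """Decode the leading columns of one data line of /proc/net/tcp."""
    parts = line.split()
    tx_queue, rx_queue = (int(x, 16) for x in parts[4].split(":"))
    return {
        "local": decode_addr(parts[1]),
        "remote": decode_addr(parts[2]),
        "state": TCP_STATES.get(int(parts[3], 16), "UNKNOWN"),
        "tx_queue": tx_queue,   # bytes queued for sending
        "rx_queue": rx_queue,   # bytes waiting to be read
        "retrnsmt": int(parts[6], 16),
    }


# Made-up example line, not taken from a real capture.
SAMPLE = ("   1: 01C3C30A:9C40 02C3C30A:0077 01 00001F40:00000000 "
          "01:00000014 00000003  1000        0 54321")

if __name__ == "__main__":
    print(parse_proc_net_tcp_line(SAMPLE))
```

A non-zero tx_queue together with a growing retrnsmt value on the stalled
connection would be the interesting part to compare against the tcpdump.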

You might have mentioned it but I would like you to confirm which kernel 
version the server is running (at least 2.6.25.7 or 2.6.26 is new enough 
to have all bug fixes)?

> > connection 189.38.18.122.995 > 192.168.0.2.35477, the ip matches with 
> > abusar's. But I'm not sure if the connection in the tunnel is the 
> > interesting one, since it's going to/from port 119 but the ip addresses 
> > (10.195.195.2 and 10.195.195.1) don't tell anything to me, I guess you 
> > know their meaning (ie., if 10.195.195.2 is the one with which the 
> > connection stalls)? ...You're probably right that this wasn't very useful 
> > log, the longest "stall" I find is only 1.111328 seconds long (and it 
> > might be due to some processing that is made by 10.195.195.2).
> 
> 	Ok:
> 
> 10.195.195.1 is my local VPN IP (tun1)
> 
> 10.195.195.2 is the remote VPN IP (on the server)

I sort of assumed so, thanks for the confirmation.

> 192.168.0.2 is my local IP (eth0)
> 
> 189.38.18.122 is the server's IP
> 
> 	Should I use tcpdump on the server too or is it sufficient to
> use on my client machine?

It definitely wouldn't hurt (though I usually can figure out what happens 
at the other end) and I guess it's quite easy for you to arrange.

In case the server carries traffic other than your test traffic, it's 
probably polite to filter aggressively enough there that not much 
unrelated traffic gets captured (tcpdump ... host <ip> and host <clientip> 
and port <portnum>, or so). I guess the ip address pair should be the vpn 
endpoints, since the nntp traffic seems to go through the tunnel, and the 
portnum is 119. If unsure, you can verify with sudo netstat -p which tcp 
connections are associated to fetchnews, if that's not immediately obvious.
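
As a rough sketch of the filter described above: the snippet below only 
assembles the tcpdump command line, using the -i and -w options and an 
explicit "and" between the filter primitives. The helper name and the 
output file name are made up for illustration; the addresses are the vpn 
endpoints mentioned in this thread.

```python
import shlex


def build_tcpdump_argv(iface, host_a, host_b, port, outfile):
    """Return an argv list for a capture limited to one host pair and port.

    Hypothetical helper: it only builds the command line, it does not
    run tcpdump.
    """
    # The primitives must be joined with an explicit "and"; leaving it
    # out between "host ..." and "port ..." is a filter syntax error.
    pcap_filter = ["host", host_a, "and", "host", host_b,
                   "and", "port", str(port)]
    return ["tcpdump", "-i", iface, "-w", outfile] + pcap_filter


if __name__ == "__main__":
    # tun1 and the 10.195.195.x addresses come from this thread;
    # "nntp-stall.pcap" is a hypothetical file name.
    argv = build_tcpdump_argv("tun1", "10.195.195.1", "10.195.195.2",
                              119, "nntp-stall.pcap")
    print("sudo " + shlex.join(argv))
```

On the server the interface name will likely differ, so treat "tun1" as a 
placeholder there.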


-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-15 22:06                                                                                         ` Ilpo Järvinen
@ 2008-08-15 23:57                                                                                           ` Dâniel Fraga
  2008-08-16  2:15                                                                                           ` Dâniel Fraga
  1 sibling, 0 replies; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-15 23:57 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	sr, netfilter-devel, kadlec

On Sat, 16 Aug 2008 01:06:55 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> You might have mentioned it but I would like you to confirm which kernel 
> version the server is running (at least 2.6.25.7 or 2.6.26 is new enough 
> to have all bug fixes)?

	Yes, 2.6.26. 

	Thank you very much for your excellent explanation. You helped
a lot. I'll try to do my "homework" and as soon as I have the data,
I'll get back to you.


-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-15 22:06                                                                                         ` Ilpo Järvinen
  2008-08-15 23:57                                                                                           ` Dâniel Fraga
@ 2008-08-16  2:15                                                                                           ` Dâniel Fraga
  2008-08-16  7:10                                                                                             ` Ilpo Järvinen
  1 sibling, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-16  2:15 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	sr, netfilter-devel, kadlec

On Sat, 16 Aug 2008 01:06:55 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> Ok. I hope it will still reproduce with tcpdump running... Btw, doing cat 
> /proc/net/tcp during the stall wouldn't be a bad idea (in addition to 
> tcpdumping it). Also please let the tcpdumps run long enough if the stall 
> persists, something like 15mins doesn't hurt because there are large 
> timer values possibly involved.

	Hi, I did the following:

fraga@tux ~/src$ cat /proc/net/tcp 
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:0DA5 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 2912 1 ffff81007ea28000 299 0 0 2 -1
   1: 00000000:23AA 00000000:0000 0A 00000000:00000000 00:00000000 00000000   501        0 6586 1 ffff8100614e1e00 299 0 0 2 -1
   2: 00000000:1F4A 00000000:0000 0A 00000000:00000000 00:00000000 00000000   501        0 6164 1 ffff8100614e2400 299 0 0 2 -1
   3: 00000000:0CEA 00000000:0000 0A 00000000:00000000 00:00000000 00000000 12347        0 3205 1 ffff81007ea29800 299 0 0 2 -1
   4: 00000000:008B 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 3191 1 ffff81007e921800 299 0 0 2 -1
   5: 00000000:1770 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 3454 1 ffff81007ea29e00 299 0 0 2 -1
   6: 00000000:0015 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 2860 1 ffff81007e920600 299 0 0 2 -1
   7: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 2507 1 ffff81007e920000 299 0 0 2 -1
   8: 00000000:0077 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 2861 1 ffff81007e920c00 299 0 0 2 -1
   9: 00000000:0019 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 3029 1 ffff81007ea29200 299 0 0 2 -1
  10: 00000000:01BD 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 3190 1 ffff81007e921200 299 0 0 2 -1
  11: 0200A8C0:D8B5 4009BCCD:1446 01 00000000:00000000 00:00000000 00000000   501        0 6187 1 ffff8100614e4200 36 3 20 4 3
  12: 0200A8C0:C77C D4E133C9:C5CF 01 00000000:00000000 00:00000000 00000000   501        0 7593 1 ffff810049008c00 39 3 24 4 -1
  13: 0200A8C0:DD6A 21250440:0747 01 00000000:00000000 00:00000000 00000000   501        0 17613 1 ffff81007e927200 71 3 0 4 -1
  14: 0200A8C0:9D7D B9B5A342:13BA 01 00000000:00000000 00:00000000 00000000   501        0 6183 1 ffff8100614e2a00 49 3 0 4 2
  15: 0200A8C0:807C 7A1226BD:03E3 01 0000007C:00000000 01:00000089 00000003   501        0 24919 2 ffff81007ea2b600 183 0 0 2 2
  16: 0200A8C0:8C07 7DA355D1:1467 01 00000000:00000000 00:00000000 00000000   501        0 6186 1 ffff8100614e3c00 41 3 26 4 -1
  17: 0100007F:0077 0100007F:BED7 01 00000000:00000000 00:00000000 00000000     0        0 21883 1 ffff8100614e1800 21 3 1 6 -1
  18: 0200A8C0:852D 7A1226BD:0016 01 00000000:00000000 02:0000E189 00000000   501        0 5975 2 ffff81007ea2ce00 23 3 10 2 2
  19: 0100007F:B4C6 0100007F:0DA5 01 00000000:00000000 00:00000000 00000000   501        0 3815 1 ffff81007ea2a400 20 3 18 5 -1
  20: 0200A8C0:EBFC B81B2ECF:0747 01 00000000:00000000 00:00000000 00000000   501        0 11878 1 ffff81004900c800 74 3 0 4 3
  21: 0100007F:0DA5 0100007F:B4C5 01 00000000:00000000 00:00000000 00000000     0        0 2930 1 ffff81007ea28c00 20 3 31 3 -1
  22: 0100007F:BED7 0100007F:0077 01 00000000:00000000 00:00000000 00000000   501        0 21881 1 ffff8100614e3000 20 3 0 5 -1
  23: 0200A8C0:839B 141B2ECF:0747 01 00000000:00000000 00:00000000 00000000   501        0 6331 1 ffff8100614e0c00 83 3 0 4 3
  24: 0100007F:0DA5 0100007F:B4C6 01 00000000:00000000 00:00000000 00000000     0        0 3816 1 ffff81007ea2aa00 21 3 23 5 -1
  25: 0100007F:B4C5 0100007F:0DA5 01 00000000:00000000 00:00000000 00000000 65534        0 2929 1 ffff81007ea28600 20 3 30 3 -1
  26: 0200A8C0:DBE6 63C155D1:0050 01 00000000:00000000 00:00000000 00000000   501        0 23988 1 ffff81007e922400 21 3 8 4 -1
  27: 0200A8C0:8CDA 26E43641:0747 01 00000000:00000000 00:00000000 00000000   501        0 21604 1 ffff8100614e6000 60 3 0 4 -1
  28: 0200A8C0:C2F5 8D1A2ECF:0747 01 00000000:00000000 00:00000000 00000000   501        0 6328 1 ffff81007e924200 110 3 0 4 3
  29: 0200A8C0:8806 2D6C2ECF:0747 01 00000000:00000000 00:00000000 00000000   501        0 17566 1 ffff81004900ce00 46 3 30 4 -1
  30: 0200A8C0:96FF 27F9F03F:0050 01 00000000:00000000 00:00000000 00000000   501        0 23953 1 ffff81007ea2c800 55 3 8 3 -1
  31: 0200A8C0:8078 7A1226BD:03E3 04 0000007D:00000000 01:000009BD 00000006     0        0 0 2 ffff81007ea2c200 6718 3 6 2 2
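
For reading the dump above, here is a minimal sketch that decodes the 
hex address, port, state and queue columns of one /proc/net/tcp entry. 
It assumes the dump was taken on a little-endian machine (so the address 
bytes appear reversed) and that each entry has been joined onto a single 
line; the function names are made up for illustration.

```python
import socket
import struct

# TCP state codes as printed in the "st" column.
TCP_STATES = {
    0x01: "ESTABLISHED", 0x02: "SYN_SENT", 0x03: "SYN_RECV",
    0x04: "FIN_WAIT1", 0x05: "FIN_WAIT2", 0x06: "TIME_WAIT",
    0x07: "CLOSE", 0x08: "CLOSE_WAIT", 0x09: "LAST_ACK",
    0x0A: "LISTEN", 0x0B: "CLOSING",
}


def decode_endpoint(field):
    """Turn e.g. '0200A8C0:8078' into ('192.168.0.2', 32888)."""
    addr_hex, port_hex = field.split(":")
    # Assumption: little-endian host, so the IPv4 address bytes are
    # stored reversed in the hex string.
    addr = socket.inet_ntoa(struct.pack("<I", int(addr_hex, 16)))
    return addr, int(port_hex, 16)


def parse_tcp_entry(line):
    """Parse one joined /proc/net/tcp entry into a small dict."""
    f = line.split()
    tx_queue, rx_queue = (int(x, 16) for x in f[4].split(":"))
    return {
        "local": decode_endpoint(f[1]),
        "remote": decode_endpoint(f[2]),
        "state": TCP_STATES.get(int(f[3], 16), "UNKNOWN"),
        "tx_queue": tx_queue,
        "rx_queue": rx_queue,
        "retrnsmt": int(f[6], 16),
    }


if __name__ == "__main__":
    entry = ("31: 0200A8C0:8078 7A1226BD:03E3 04 0000007D:00000000 "
             "01:000009BD 00000006 0 0 0 2 ffff81007ea2c200 6718 3 6 2 2")
    print(parse_tcp_entry(entry))
```

Applied to entry 31, this gives 192.168.0.2:32888 talking to 
189.38.18.122:995 in FIN_WAIT1, with 125 bytes in the send queue and a 
retransmit count of 6.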

	And I can't use host and port at the same time with tcpdump (or
I did something wrong), so I used the following (I need to update my
tcpdump but can't find the manpage... I tried to download a newer
version but the link from the site seems broken):

sudo tcpdump -w dump-mail.log -i eth0 port 995

	to capture mail traffic that was stuck (usually it only happens
with mail or nntp, interesting no?). All the other services (http,
ssh, ftp) always work fine.

http://www.abusar.org/dump-mail.log

	But the file is small. I don't know if it will help.

	If not, no problem, just tell me and I'll try harder next time.
Thanks.

-- 
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-16  2:15                                                                                           ` Dâniel Fraga
@ 2008-08-16  7:10                                                                                             ` Ilpo Järvinen
  2008-08-16 19:18                                                                                               ` Ilpo Järvinen
  0 siblings, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-16  7:10 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	sr, netfilter-devel, kadlec

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1630 bytes --]

On Fri, 15 Aug 2008, Dâniel Fraga wrote:

> On Sat, 16 Aug 2008 01:06:55 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > Ok. I hope it will still reproduce with tcpdump running... Btw, doing cat 
> > /proc/net/tcp during the stall wouldn't be a bad idea (in addition to 
> > tcpdumping it). Also please let the tcpdumps run long enough if the stall 
> > persists, something like 15mins doesn't hurt because there are large 
> > timer values possibly involved.
> 
> 	Hi, I did the following:
> 
> fraga@tux ~/src$ cat /proc/net/tcp 

...snip...

> 	And I can't use host and port at the same time with tcpdump (or
> I did something wrong), so I used the following (I need to update my
> tcpdump but can't find the manpage... I tried to download a newer
> version but the link from the site seems broken):
> 
> sudo tcpdump -w dump-mail.log -i eth0 port 995

Hmm, sudo /usr/sbin/tcpdump -i eth1 host 192.168.1.1 and port 22 works for 
me; perhaps you forgot the and-operator in between them? Anyway, what you 
used seems quite fine.

> 	to capture mail traffic that was stuck (usually it only happens
> with mail or nntp, interesting no?). All the other services (http,
> ssh, ftp) always work fine.
> 
> http://www.abusar.org/dump-mail.log
> 
> 	But the file is small. I don't know if it will help.
> 
> 	If not, no problem, just tell me and I'll try harder next time.
> Thanks.

This seems to be a valid sample, thanks. I'll return once I have figured 
something out (it might be that our state machine is somehow broken since 
there's traffic in both ways (rexmitted), yet neither party seems to be 
very willing to make progress).

-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-16  7:10                                                                                             ` Ilpo Järvinen
@ 2008-08-16 19:18                                                                                               ` Ilpo Järvinen
  2008-08-17  0:36                                                                                                 ` Dâniel Fraga
  0 siblings, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-16 19:18 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	sr, netfilter-devel, kadlec

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1354 bytes --]

On Sat, 16 Aug 2008, Ilpo Järvinen wrote:

> On Fri, 15 Aug 2008, Dâniel Fraga wrote:
> 
> > with mail or nntp, interesting no?). All the other services (http,
> > ssh, ftp) always work fine.
> >
> > 	But the file is small. I don't know if it will help.
> > 
> > 	If not, no problem, just tell me and I'll try harder next time.
> 
> This seems to be a valid sample, thanks. I'll return once I have figured 
> something out (it might be that our state machine is somehow broken since 
> there's traffic in both ways (rexmitted), yet neither party seems to be 
> very willing to make progress).

Some thoughts, nothing very earth-shattering yet...

It seems that the server (port 995) never leaves SYN-RECV state, since it 
keeps retransmitting SYNACKs. Meanwhile the other end (the client) is doing 
its best to ACK them (correctly). It also tries to send some data, which 
never gets through, and retransmissions are attempted for it (those 
packets also contain an ACK seqno that should be enough to end the 
SYN-RECV, but for some reason that never happens). Eventually the 
connection is RSTed.

I'll look through 2.6.24..25 history once I have some time to see if 
there are some clues about the cause. I'm also having trouble figuring 
out why the frto patch you tested would solve this issue (unless there 
are two issues in the picture).

-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-16 19:18                                                                                               ` Ilpo Järvinen
@ 2008-08-17  0:36                                                                                                 ` Dâniel Fraga
  2008-08-19 10:38                                                                                                   ` Ilpo Järvinen
  0 siblings, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-17  0:36 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	sr, netfilter-devel, kadlec

On Sat, 16 Aug 2008 22:18:50 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> I'll look through 2.6.24..25 history once I have some time to see if 
> there are some clues about the cause. I'm also having trouble figuring 
> out why the frto patch you tested would solve this issue (unless there 
> are two issues in the picture).

	Ok, surely some patch between .24 and .25 caused this. Or it's
some bug that only "appeared" in .25 :)

	In fact, the frto patch helped, but did not prevent the problem.
I mean, it seems that with the frto patch, the problem happens less
frequently. And if I disable frto, the problem doesn't occur either.

	But, maybe, we could be talking about another bug, completely
unrelated to frto... I don't know. I'm just guessing ;). Anyway, we're
talking about stalled connections ;)

	What I know is:

1) what you wrote is right: 2.6.24 is fine, 2.6.25 and 2.6.26 are not

2) nmap -sS <server> seems to reset the connection (it's my workaround
until now ;). Maybe the ping probe helps in some way? I don't know.

	I want to help you as much as I can. So, ask anything you need.

	Thanks!


-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-17  0:36                                                                                                 ` Dâniel Fraga
@ 2008-08-19 10:38                                                                                                   ` Ilpo Järvinen
  2008-08-20  0:34                                                                                                     ` Dâniel Fraga
  0 siblings, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-19 10:38 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	sr, netfilter-devel, kadlec

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2334 bytes --]

On Sat, 16 Aug 2008, Dâniel Fraga wrote:

> On Sat, 16 Aug 2008 22:18:50 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > I'll look through 2.6.24..25 history once I have some time to see if 
> > there are some clues about the cause. I'm also having trouble figuring 
> > out why the frto patch you tested would solve this issue (unless there 
> > are two issues in the picture).
> 
> 	Ok, surely some patch between .24 and .25 caused this. Or it's
> some bug that only "appeared" in .25 :)
> 
> 	In fact, the frto patch helped, but did not prevent the problem.
> I mean, it seems that with the frto patch, the problem happens less
> frequently. And if I disable frto, the problem doesn't occur either.
> 
> 	But, maybe, we could be talking about another bug, completely
> unrelated to frto... I don't know. I'm just guessing ;). Anyway, we're
> talking about stalled connections ;)
>
> 	What I know is:
> 
> 1) what you wrote is right: 2.6.24 is fine, 2.6.25 and 2.6.26 are not
> 
> 2) nmap -sS <server> seems to reset the connection (it's my workaround
> until now ;). Maybe the ping probe helps in some way? I don't know.

Perhaps, though it's not at all clear how it could do that...

> 	I want to help you as much as I can. So, ask anything you need.

I went through the TCP related and inet_connection_sock related changes; 
there was nothing obvious that I could notice in there...

Do you have net namespaces enabled CONFIG_NET_NS in .config?

Any netfilter (iptables) rules on server which could cause those packets 
to not reach TCP layer?

MIBs might give some clue why those segments didn't get accepted. Most 
interesting ones are PAWSEstab, TCPAbortOnSyn and InErrs. One can use 
/bin/cut to read those from the one-line files if one wants to (however,
I attached a script which transposes them to get them somewhat 
human-readable). Also having the /proc/net/tcp output from the server 
while stalling would be (have been) useful to reveal state info (but I 
should have remembered to ask you to run it on both of them :-)). 

Also, I wonder what that [|tcp] hides, e.g., "<nop,nop,timestamp 
15980976 70381399,nop,nop,[|tcp]>" in tcpdump (and that was for an ACK 
which doesn't make too much sense to me there). It occurs because 
snaplen which was given for tcpdump is small enough to make TCP header 
partial.


-- 
 i.

[-- Attachment #2: Type: APPLICATION/X-SH, Size: 793 bytes --]

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-19 10:38                                                                                                   ` Ilpo Järvinen
@ 2008-08-20  0:34                                                                                                     ` Dâniel Fraga
  2008-08-20  7:57                                                                                                       ` Ilpo Järvinen
  2008-08-20 12:37                                                                                                       ` Ilpo Järvinen
  0 siblings, 2 replies; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-20  0:34 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	sr, netfilter-devel, kadlec

On Tue, 19 Aug 2008 13:38:35 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> Perhaps, though it's not at all clear how it could do that...

	I was thinking here of some specific configuration I use.
For example, I always used the wonder shaper htb script:

http://lartc.org/howto/lartc.cookbook.ultimate-tc.html#AEN2241

	Could HTB mess with frto or cause this problem? Would it be
useful to disable completely HTB and use just the default scheduler?

> Do you have net namespaces enabled CONFIG_NET_NS in .config?

	I couldn't find this specific option:

fraga@tux /usr/src/linux$ grep CONFIG_NET_NS .config
fraga@tux /usr/src/linux$ 

	But I have those:

fraga@tux /usr/src/linux$ grep CONFIG_NET_ .config
# CONFIG_NET_KEY is not set
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE is not set
CONFIG_NET_SCHED=y
# CONFIG_NET_SCH_CBQ is not set
CONFIG_NET_SCH_HTB=m
# CONFIG_NET_SCH_HFSC is not set
CONFIG_NET_SCH_PRIO=m
CONFIG_NET_SCH_RED=m
CONFIG_NET_SCH_SFQ=m
# CONFIG_NET_SCH_TEQL is not set
CONFIG_NET_SCH_TBF=m
CONFIG_NET_SCH_GRED=m
CONFIG_NET_SCH_DSMARK=m
# CONFIG_NET_SCH_NETEM is not set
CONFIG_NET_SCH_INGRESS=m
CONFIG_NET_CLS=y
# CONFIG_NET_CLS_BASIC is not set
CONFIG_NET_CLS_TCINDEX=m
CONFIG_NET_CLS_ROUTE4=m
CONFIG_NET_CLS_ROUTE=y
CONFIG_NET_CLS_FW=m
CONFIG_NET_CLS_U32=m
CONFIG_NET_CLS_RSVP=m
# CONFIG_NET_CLS_RSVP6 is not set
# CONFIG_NET_CLS_FLOW is not set
# CONFIG_NET_EMATCH is not set
CONFIG_NET_CLS_ACT=y
CONFIG_NET_ACT_POLICE=y
# CONFIG_NET_ACT_GACT is not set
# CONFIG_NET_ACT_MIRRED is not set
# CONFIG_NET_ACT_IPT is not set
# CONFIG_NET_ACT_NAT is not set
# CONFIG_NET_ACT_PEDIT is not set
# CONFIG_NET_ACT_SIMP is not set
# CONFIG_NET_CLS_IND is not set
CONFIG_NET_SCH_FIFO=y
# CONFIG_NET_PKTGEN is not set
# CONFIG_NET_9P is not set
# CONFIG_NET_SB1000 is not set
CONFIG_NET_ETHERNET=y
# CONFIG_NET_VENDOR_3COM is not set
# CONFIG_NET_TULIP is not set
CONFIG_NET_PCI=y
# CONFIG_NET_POCKET is not set
# CONFIG_NET_FC is not set
# CONFIG_NET_POLL_CONTROLLER is not set

	And that:

fraga@tux /usr/src/linux$ grep NAMESPACE .config
CONFIG_NAMESPACES=y

	but this one, I think, isn't related to what you asked me.

> Any netfilter (iptables) rules on server which could cause those packets 
> to not reach TCP layer?

	Here are the complete rules:

# Generated by iptables-save v1.3.8 on Tue Aug 19 21:28:12 2008
*filter
:INPUT DROP [627:34387]
:FORWARD DROP [0:0]
:OUTPUT ACCEPT [58771289:83128359870]
:DROP_INPUT - [0:0]
:FLDR - [0:0]
:LDR - [0:0]
-A INPUT -i lo -j ACCEPT 
-A INPUT -j DROP_INPUT 
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT 
-A INPUT -p tcp -m multiport --dports 80,21,25,53,119,443,873,993,995
-A INPUT -s 192.168.102.1 -p tcp -m tcp --dport 3493 -j ACCEPT 
-A INPUT -p tcp -m tcp --dport 22 -j ACCEPT 
-A INPUT -p udp -m udp --dport 53 -j ACCEPT 
-A INPUT -p tcp -m tcp --dport 113 -j REJECT --reject-with tcp-reset 
-A INPUT -p udp -m udp --dport 1194:1196 -j ACCEPT 
-A INPUT -p icmp -m icmp --icmp-type 8 -j ACCEPT 
-A INPUT -j LDR 
-A FORWARD -j FLDR 
-A DROP_INPUT -s 216.201.112.111 -m comment --comment "deborahsafe Spam" -j DROP 
-A DROP_INPUT -s 200.49.247.241 -p tcp -m tcp --dport 22 -j DROP 
-A DROP_INPUT -s 189.70.204.3 -p tcp -m tcp --dport 21 -j DROP 
-A DROP_INPUT -s 189.70.204.3 -p tcp -m tcp --dport 21 -j DROP 
-A DROP_INPUT -s 189.70.204.3 -p tcp -m tcp --dport 21 -j DROP 
-A FLDR -j LOG --log-prefix "DROP [FORWARD]: " --log-level 6 --log-ip-options 
-A FLDR -j DROP 
-A LDR -j LOG --log-prefix "DROP [INPUT]: " --log-level 6 --log-ip-options 
-A LDR -j DROP 
COMMIT
# Completed on Tue Aug 19 21:28:13 2008

	As you can see, it's a pretty simple set of rules, nothing exotic here.

> MIBs might give some clue why those segments didn't get accepted. Most 
> interesting ones are PAWSEstab, TCPAbortOnSyn and InErrs. One can use 
> /bin/cut to read those from the one-line files if one wants to (however,
> I attached a script which transposes them to get them somewhat 
> human-readable). Also having the /proc/net/tcp output from the server 
> while stalling would be (have been) useful to reveal state info (but I 
> should have remembered to ask you to run it on both of them :-)). 

	Ok ;) No problem, when I get the problem, I'll provide you the requested 
information.

> Also, I wonder what that [|tcp] hides, e.g., "<nop,nop,timestamp 
> 15980976 70381399,nop,nop,[|tcp]>" in tcpdump (and that was for an ACK 
> which doesn't make too much sense to me there). It occurs because 
> snaplen which was given for tcpdump is small enough to make TCP header 
> partial.

	Hmmm, I don't know. This is complex to me, but I'll apply your script.

	Thank you!

-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-20  0:34                                                                                                     ` Dâniel Fraga
@ 2008-08-20  7:57                                                                                                       ` Ilpo Järvinen
  2008-08-20 12:37                                                                                                       ` Ilpo Järvinen
  1 sibling, 0 replies; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-20  7:57 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	sr, netfilter-devel, kadlec

[-- Attachment #1: Type: TEXT/PLAIN, Size: 5537 bytes --]

On Tue, 19 Aug 2008, Dâniel Fraga wrote:

> On Tue, 19 Aug 2008 13:38:35 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > Perhaps, though it's not at all clear how it could do that...
> 
> 	I was thinking here of some specific configuration I use.
> For example, I always used the wonder shaper htb script:
> 
> http://lartc.org/howto/lartc.cookbook.ultimate-tc.html#AEN2241
> 
> 	Could HTB mess with frto or cause this problem? Would it be
> useful to disable completely HTB and use just the default scheduler?
> 
> > Do you have net namespaces enabled CONFIG_NET_NS in .config?
> 
> 	I couldn't find this specific option:
> 
> fraga@tux /usr/src/linux$ grep CONFIG_NET_NS .config
> fraga@tux /usr/src/linux$ 
> 
> 	But I have those:
> 
> fraga@tux /usr/src/linux$ grep CONFIG_NET_ .config
> # CONFIG_NET_KEY is not set
> # CONFIG_NET_IPIP is not set
> # CONFIG_NET_IPGRE is not set
> CONFIG_NET_SCHED=y
> # CONFIG_NET_SCH_CBQ is not set
> CONFIG_NET_SCH_HTB=m
> # CONFIG_NET_SCH_HFSC is not set
> CONFIG_NET_SCH_PRIO=m
> CONFIG_NET_SCH_RED=m
> CONFIG_NET_SCH_SFQ=m
> # CONFIG_NET_SCH_TEQL is not set
> CONFIG_NET_SCH_TBF=m
> CONFIG_NET_SCH_GRED=m
> CONFIG_NET_SCH_DSMARK=m
> # CONFIG_NET_SCH_NETEM is not set
> CONFIG_NET_SCH_INGRESS=m
> CONFIG_NET_CLS=y
> # CONFIG_NET_CLS_BASIC is not set
> CONFIG_NET_CLS_TCINDEX=m
> CONFIG_NET_CLS_ROUTE4=m
> CONFIG_NET_CLS_ROUTE=y
> CONFIG_NET_CLS_FW=m
> CONFIG_NET_CLS_U32=m
> CONFIG_NET_CLS_RSVP=m
> # CONFIG_NET_CLS_RSVP6 is not set
> # CONFIG_NET_CLS_FLOW is not set
> # CONFIG_NET_EMATCH is not set
> CONFIG_NET_CLS_ACT=y
> CONFIG_NET_ACT_POLICE=y
> # CONFIG_NET_ACT_GACT is not set
> # CONFIG_NET_ACT_MIRRED is not set
> # CONFIG_NET_ACT_IPT is not set
> # CONFIG_NET_ACT_NAT is not set
> # CONFIG_NET_ACT_PEDIT is not set
> # CONFIG_NET_ACT_SIMP is not set
> # CONFIG_NET_CLS_IND is not set
> CONFIG_NET_SCH_FIFO=y
> # CONFIG_NET_PKTGEN is not set
> # CONFIG_NET_9P is not set
> # CONFIG_NET_SB1000 is not set
> CONFIG_NET_ETHERNET=y
> # CONFIG_NET_VENDOR_3COM is not set
> # CONFIG_NET_TULIP is not set
> CONFIG_NET_PCI=y
> # CONFIG_NET_POCKET is not set
> # CONFIG_NET_FC is not set
> # CONFIG_NET_POLL_CONTROLLER is not set
> 
> 	And that:
> 
> fraga@tux /usr/src/linux$ grep NAMESPACE .config
> CONFIG_NAMESPACES=y
> 
> 	but this one, I think, isn't related to what you asked me.
> 
> > Any netfilter (iptables) rules on server which could cause those packets 
> > to not reach TCP layer?
> 
> 	Here are the complete rules:
> 
> # Generated by iptables-save v1.3.8 on Tue Aug 19 21:28:12 2008
> *filter
> :INPUT DROP [627:34387]
> :FORWARD DROP [0:0]
> :OUTPUT ACCEPT [58771289:83128359870]
> :DROP_INPUT - [0:0]
> :FLDR - [0:0]
> :LDR - [0:0]
> -A INPUT -i lo -j ACCEPT 
> -A INPUT -j DROP_INPUT 
> -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT 
> -A INPUT -p tcp -m multiport --dports 80,21,25,53,119,443,873,993,995
> -A INPUT -s 192.168.102.1 -p tcp -m tcp --dport 3493 -j ACCEPT 
> -A INPUT -p tcp -m tcp --dport 22 -j ACCEPT 
> -A INPUT -p udp -m udp --dport 53 -j ACCEPT 
> -A INPUT -p tcp -m tcp --dport 113 -j REJECT --reject-with tcp-reset 
> -A INPUT -p udp -m udp --dport 1194:1196 -j ACCEPT 
> -A INPUT -p icmp -m icmp --icmp-type 8 -j ACCEPT 
> -A INPUT -j LDR 
> -A FORWARD -j FLDR 
> -A DROP_INPUT -s 216.201.112.111 -m comment --comment "deborahsafe Spam" -j DROP 
> -A DROP_INPUT -s 200.49.247.241 -p tcp -m tcp --dport 22 -j DROP 
> -A DROP_INPUT -s 189.70.204.3 -p tcp -m tcp --dport 21 -j DROP 
> -A DROP_INPUT -s 189.70.204.3 -p tcp -m tcp --dport 21 -j DROP 
> -A DROP_INPUT -s 189.70.204.3 -p tcp -m tcp --dport 21 -j DROP 
> -A FLDR -j LOG --log-prefix "DROP [FORWARD]: " --log-level 6 --log-ip-options 
> -A FLDR -j DROP 
> -A LDR -j LOG --log-prefix "DROP [INPUT]: " --log-level 6 --log-ip-options 
> -A LDR -j DROP 
> COMMIT
> # Completed on Tue Aug 19 21:28:13 2008
> 
> 	As you can see, it's a pretty simple set of rules, nothing exotic here.
> 
> > MIBs might give some clue why those segments didn't get accepted. Most 
> > interesting ones are PAWSEstab, TCPAbortOnSyn and InErrs. One can use 
> > /bin/cut to read those from the one-line files if one wants to (however,
> > I attached a script which transposes them to get them somewhat 
> > human-readable). Also having the /proc/net/tcp output from the server 
> > while stalling would be (have been) useful to reveal state info (but I 
> > should have remembered to ask you to run it on both of them :-)). 
> 
>  Ok ;) No problem, when I get the problem, I'll provide you the 
>  requested information.

It would be nice to "watch" them for a while (take snapshots with 
timestamps) during the event, so that it's easy to see increments.
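
A minimal sketch of such timestamped snapshots, assuming the usual 
layout of /proc/net/snmp and /proc/net/netstat (a line of counter names 
followed by a line of values, both with the same prefix). This is not 
the script attached earlier in the thread, and the function names are 
made up for illustration.

```python
import time


def parse_mib(text):
    """Transpose paired name/value lines into {'Prefix.Name': value}."""
    counters = {}
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Lines come in pairs: first the names, then the values.
    for names_line, values_line in zip(lines[0::2], lines[1::2]):
        prefix, names = names_line.split(":", 1)
        _, values = values_line.split(":", 1)
        for name, value in zip(names.split(), values.split()):
            counters[prefix + "." + name] = int(value)
    return counters


def increments(before, after):
    """Return only the counters that changed between two snapshots."""
    return {k: v - before.get(k, 0)
            for k, v in after.items() if v != before.get(k, 0)}


def snapshot(paths=("/proc/net/snmp", "/proc/net/netstat")):
    """Read both files and return (timestamp string, counters dict)."""
    counters = {}
    for path in paths:
        with open(path) as fh:
            counters.update(parse_mib(fh.read()))
    return time.strftime("%H:%M:%S"), counters
```

Calling snapshot() every few seconds during a stall and passing 
consecutive results to increments() would show whether, e.g., 
TcpExt.PAWSEstab, TcpExt.TCPAbortOnSyn or Tcp.InErrs are moving.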

> > Also, I wonder what that [|tcp] hides, e.g., "<nop,nop,timestamp 
> > 15980976 70381399,nop,nop,[|tcp]>" in tcpdump (and that was for an ACK 
> > which doesn't make too much sense to me there). It occurs because 
> > snaplen which was given for tcpdump is small enough to make TCP header 
> > partial.
> 
> 	Hmmm, I don't know. This is complex to me, but I'll apply your script.

Try giving -s<number> among tcpdump parameters, where number is at least 
100 or so.
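
To illustrate the arithmetic behind that number, here is a small sketch. 
It assumes plain Ethernet framing (14 bytes), an IPv4 header without 
options (20 bytes), and that the capture was taken with the 68-byte 
default snaplen of older tcpdump versions; none of that is confirmed for 
the dump in question.

```python
ETH_HDR = 14       # plain Ethernet, no VLAN tag (assumption)
IP_HDR = 20        # IPv4 header without options (assumption)
TCP_BASE_HDR = 20  # fixed part of the TCP header
TCP_MAX_HDR = 60   # fixed part plus the maximum 40 bytes of options


def visible_tcp_option_bytes(snaplen):
    """How many TCP option bytes fit in a capture truncated at snaplen."""
    room = snaplen - ETH_HDR - IP_HDR - TCP_BASE_HDR
    return max(0, min(room, TCP_MAX_HDR - TCP_BASE_HDR))


def min_snaplen_for_full_tcp_header():
    """Smallest snaplen that can never cut a TCP header short."""
    return ETH_HDR + IP_HDR + TCP_MAX_HDR


if __name__ == "__main__":
    print(visible_tcp_option_bytes(68))       # option bytes visible
    print(min_snaplen_for_full_tcp_header())  # bytes needed
```

With a snaplen of 68, only 14 option bytes are visible, which is exactly 
nop,nop,timestamp (12 bytes) plus nop,nop (2 bytes); whatever option 
follows is cut off and printed as [|tcp]. A full TCP header needs 94 
bytes under these assumptions, hence "at least 100 or so".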

Also, it is very useful to have a full set of logs to see what 
corresponds to what, i.e., the tcpdump and /proc/net/tcp from both ends 
(a capture started during the problem is better than nothing, but if you 
can get one that starts from an earlier point too, it would be quite 
nice).

I'll comment on the rest of this mail later on...

-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-20  0:34                                                                                                     ` Dâniel Fraga
  2008-08-20  7:57                                                                                                       ` Ilpo Järvinen
@ 2008-08-20 12:37                                                                                                       ` Ilpo Järvinen
  2008-08-22 21:32                                                                                                         ` Dâniel Fraga
  1 sibling, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-20 12:37 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

[-- Attachment #1: Type: TEXT/PLAIN, Size: 4190 bytes --]

On Tue, 19 Aug 2008, Dâniel Fraga wrote:

> On Tue, 19 Aug 2008 13:38:35 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > Perhaps, though it's not at all clear how it could do that...
> 
> 	I was thinking here of some specific configuration I use.
> For example, I always used the wonder shaper htb script:
> 
> http://lartc.org/howto/lartc.cookbook.ultimate-tc.html#AEN2241
> 
> 	Could HTB mess with frto or cause this problem? Would it be
> useful to disable completely HTB and use just the default scheduler?

Based on irc discussion with davem, there is an htb bug which can cause 
corruption of the retransmitted TCP packets (and then a discard due to 
checksum mismatch). That would also explain the strange headers I noticed 
earlier. There's a patch below (should apply to 2.6.26), please put it at 
least on the host(s) which use htb (I don't know if both the server and 
the client use the wondershaper script or just the client). The different 
failure symptoms (one could be somehow frto related as FRTO is used while 
retransmitting) are also quite well explained by this.

But FRTO is mostly not a suspect based on the tcpdump you provided (no 
FRTO workaround would help in that).

If you run tcpdump with -s0 at the receiver, you get the full payload, and
it is therefore possible to verify checksum correctness.
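
As a rough illustration of what "verifying checksum correctness" on a
full-payload capture involves, here is a minimal Python sketch of the
standard Internet checksum applied to a TCP segment with its IPv4
pseudo-header. The function names and the idea of feeding it raw segment
bytes extracted from a capture are assumptions made for this example; they
are not taken from tcpdump or from the thread.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum over data (zero-padded to an even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    # Fold the carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over the IPv4 pseudo-header plus the TCP header and payload."""
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(segment)))
    return ~ones_complement_sum(pseudo + segment) & 0xFFFF


def tcp_checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """True if the checksum already stored in the segment is consistent."""
    # With a correct checksum in place, the complemented sum comes out as 0.
    return tcp_checksum(src_ip, dst_ip, segment) == 0
```

To use it, pass the source and destination addresses from the IP header
and the complete TCP segment (header plus payload) as captured at the
receiver. A segment truncated by a short snaplen cannot be checked this
way, which is why -s0 matters.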

> > Do you have net namespaces enabled CONFIG_NET_NS in .config?
> 
> 	I couldn't find this specific option:
> 
> fraga@tux /usr/src/linux$ grep CONFIG_NET_NS .config
> fraga@tux /usr/src/linux$ 
> 
> 	But I have those:

I wasn't in error :-); it also took me some time to figure out the right
one. It's quite expected to be off (and even missing from .config) since
it currently depends on !SYSFS.

> > Any netfilter (iptables) rules on server which could cause those packets 
> > to not reach TCP layer?
> 
> 	Here are the complete rules:

...snip...

> 	As you can see, it's a pretty simple set of rules, nothing exotic here.

...agreed.

-- 
 i.


---
From: David Miller <davem@davemloft.net>

pkt_sched: Fix return value corruption in HTB and TBF.

Packet schedulers should only return NET_XMIT_DROP iff
the packet really was dropped.  If the packet does reach
the device after we return NET_XMIT_DROP then TCP can
crash because it depends upon the enqueue path return
values being accurate.

Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 3fb58f4..51c3f68 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -595,11 +595,13 @@ static int htb_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 		kfree_skb(skb);
 		return ret;
 #endif
-	} else if (cl->un.leaf.q->enqueue(skb, cl->un.leaf.q) !=
+	} else if ((ret = cl->un.leaf.q->enqueue(skb, cl->un.leaf.q)) !=
 		   NET_XMIT_SUCCESS) {
-		sch->qstats.drops++;
-		cl->qstats.drops++;
-		return NET_XMIT_DROP;
+		if (ret == NET_XMIT_DROP) {
+			sch->qstats.drops++;
+			cl->qstats.drops++;
+		}
+		return ret;
 	} else {
 		cl->bstats.packets +=
 			skb_is_gso(skb)?skb_shinfo(skb)->gso_segs:1;
@@ -639,11 +641,13 @@ static int htb_requeue(struct sk_buff *skb, struct Qdisc *sch)
 		kfree_skb(skb);
 		return ret;
 #endif
-	} else if (cl->un.leaf.q->ops->requeue(skb, cl->un.leaf.q) !=
+	} else if ((ret = cl->un.leaf.q->ops->requeue(skb, cl->un.leaf.q)) !=
 		   NET_XMIT_SUCCESS) {
-		sch->qstats.drops++;
-		cl->qstats.drops++;
-		return NET_XMIT_DROP;
+		if (ret == NET_XMIT_DROP) {
+			sch->qstats.drops++;
+			cl->qstats.drops++;
+		}
+		return ret;
 	} else
 		htb_activate(q, cl);
 
diff --git a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c
index 0b7d78f..fc6f8f3 100644
--- a/net/sched/sch_tbf.c
+++ b/net/sched/sch_tbf.c
@@ -123,15 +123,8 @@ static int tbf_enqueue(struct sk_buff *skb, struct Qdisc* sch)
 	struct tbf_sched_data *q = qdisc_priv(sch);
 	int ret;
 
-	if (skb->len > q->max_size) {
-		sch->qstats.drops++;
-#ifdef CONFIG_NET_CLS_ACT
-		if (sch->reshape_fail == NULL || sch->reshape_fail(skb, sch))
-#endif
-			kfree_skb(skb);
-
-		return NET_XMIT_DROP;
-	}
+	if (skb->len > q->max_size)
+		return qdisc_reshape_fail(skb, sch);
 
 	if ((ret = q->qdisc->enqueue(skb, q->qdisc)) != 0) {
 		sch->qstats.drops++;
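
The behavioural change in the HTB hunks above can be modelled in user
space. This is only a sketch: the function and class names are made up for
the illustration, the numeric values of the NET_XMIT_* codes are
placeholders, and the assumption that a child qdisc can return a
non-success code such as NET_XMIT_CN while the packet is still queued is
mine, not a statement from the patch description.

```python
# Placeholder values, used only so the model can distinguish the codes.
NET_XMIT_SUCCESS, NET_XMIT_DROP, NET_XMIT_CN = 0, 1, 2


class Stats:
    """Stand-in for the qdisc drop counter."""
    def __init__(self):
        self.drops = 0


def enqueue_before_patch(child_ret, stats):
    """Old logic: any non-success from the child is reported as a drop."""
    if child_ret != NET_XMIT_SUCCESS:
        stats.drops += 1
        return NET_XMIT_DROP
    return NET_XMIT_SUCCESS


def enqueue_after_patch(child_ret, stats):
    """New logic: pass the child's code through, count only real drops."""
    if child_ret != NET_XMIT_SUCCESS:
        if child_ret == NET_XMIT_DROP:
            stats.drops += 1
        return child_ret
    return NET_XMIT_SUCCESS


old, new = Stats(), Stats()
print(enqueue_before_patch(NET_XMIT_CN, old), old.drops)  # 1 1
print(enqueue_after_patch(NET_XMIT_CN, new), new.drops)   # 2 0
```

In the old model the caller is told the packet was dropped even though the
child did not say so, which is the kind of inaccurate return value the
commit message warns about.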

^ permalink raw reply related	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-12  7:46                                                                       ` Thomas Jarosch
  2008-08-12  8:18                                                                         ` David Miller
@ 2008-08-22 21:18                                                                         ` Ilpo Järvinen
  1 sibling, 0 replies; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-22 21:18 UTC (permalink / raw)
  To: Thomas Jarosch
  Cc: David Miller, billfink, fragabr, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

[-- Attachment #1: Type: text/plain, Size: 2460 bytes --]

On Tue, 12 Aug 2008, Thomas Jarosch wrote:

> On Monday, 11. August 2008 23:44:21 David Miller wrote:
> > Trying to come up with a signature for this bogus stuff is both time
> > consuming and having a risk of false positives.  And I really question
> > whether this thing is worth it.
> >
> > The sane thing to do in this case is to declare the box inoperative
> > and that it needs to be fixed to avoid this behavior.
> >
> > Any reasonable congestion control scheme is going to run into problems
> > trying to react to the packet patterns this thing creates.  It is
> > therefore not really limited to FRTO so it really shouldn't be treated
> > like an FRTO problem even though it shows up more pronounced when
> > FRTO is enabled.
> 
> David, I agree with you, though I'm not sure about the end user experience:
> 
> The kernel is an early adopter of FRTO and will be bitten by bugs of other
> TCP implementations like we've experienced. I guess most affected users
> just see stalled or slow connections and won't have the time or knowledge
> to debug this.

This is hardly a big problem. A much bigger problem seems to be that some
distros are based on 2.6.24 and did not take the TCP fixes that went into
2.6.25.7 but not into the 2.6.24.y series, because it was no longer being
updated. Among the reports I've seen, there are hardly any for kernels
other than 2.6.24 (and the ones we do have were worked through at netdev
to fix the bugs / problems).

> A proper warning could help them and the kernel
> developers to get this issue solved as quickly as possible.
> 
> We called the hotline of the ISP several times and they always claimed
> sending big mails with Outlook/Windows works, so it must be linux's fault.
> That view of things is totally biased, but it's something I want to make sure
> people can't get away with easily :-)

I should probably check one day how vista's frto is behaving
to know better... ...but I guess they'll be running into
some problems with big mails pretty soon... ;-)

In the meantime, can you check the attached patches? Besides the kernel
patch, you need to build your own patched iproute2 as well to configure
the features (the ip tool alone is enough, in case the build of some other
part of the toolset fails like it did for me). I tested them somewhat, and
the result seemed to be what I'd expect (I just forced RTOs with some
heavy netem dropping and quickly glanced over the resulting packet
patterns near the RTO).

-- 
 i.

[-- Attachment #2: Type: text/plain, Size: 1764 bytes --]

From b4d1efcf1d4384296d6d6b4f8378f8c408cefc98 Mon Sep 17 00:00:00 2001
From: =?ISO-8859-1?q?Ilpo=20J=E4rvinen?= <ilpo.jarvinen@helsinki.fi>
Date: Tue, 19 Aug 2008 08:20:16 +0300
Subject: [PATCH] tcp/frto: make frto per route configurable
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit

Needs iproute2 support since it isn't able to set RTAX_FEATURES
currently (ie., also the other TCP variant related RTAX_FEATUREs
won't work, they've been unused since the addition in 2003 or
so).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
---
 include/linux/rtnetlink.h |    1 +
 net/ipv4/tcp_input.c      |    4 ++++
 2 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index f4d386c..e628062 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -373,6 +373,7 @@ enum
 #define RTAX_FEATURE_SACK	0x00000002
 #define RTAX_FEATURE_TIMESTAMP	0x00000004
 #define RTAX_FEATURE_ALLFRAG	0x00000008
+#define RTAX_FEATURE_FRTO	0x00000010
 
 struct rta_session
 {
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 1f5e604..4f1cc0e 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1709,11 +1709,15 @@ int tcp_use_frto(struct sock *sk)
 {
 	const struct tcp_sock *tp = tcp_sk(sk);
 	const struct inet_connection_sock *icsk = inet_csk(sk);
+	struct dst_entry *dst = __sk_dst_get(sk);
 	struct sk_buff *skb;
 
 	if (!sysctl_tcp_frto)
 		return 0;
 
+	if (dst && (dst_metric(dst, RTAX_FEATURES) & RTAX_FEATURE_FRTO))
+		return 0;
+
 	/* MTU probe and F-RTO won't really play nicely along currently */
 	if (icsk->icsk_mtup.probe_size)
 		return 0;
-- 
1.5.2.2


[-- Attachment #3: Type: text/plain, Size: 5146 bytes --]

From 59d7878c04eb9571c58baf78bfd07b169d3e5c0d Mon Sep 17 00:00:00 2001
From: =?ISO-8859-1?q?Ilpo=20J=E4rvinen?= <ilpo.jarvinen@helsinki.fi>
Date: Fri, 22 Aug 2008 14:49:00 +0300
Subject: [PATCH] iproute2: enable setting of per route features
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit

The kernel has had an entry for per route RTAX_FEATURES which
was added as unused back in 2003. Allow setting them now.

It seems that it's much more sensible to have the meaning
negated because otherwise the meaning of zero is very ambiguous,
ie., does it mean that feature is turned off or not given.
Besides, this matches what one would expect in the intended
use-case, where we have global settings from sysctl and want
to work-around something per route (ie., disable an otherwise
enabled feature).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
---
 include/linux/rtnetlink.h |    1 +
 ip/iproute.c              |   58 +++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 54 insertions(+), 5 deletions(-)

diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index c1f2d50..354a6f1 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -373,6 +373,7 @@ enum
 #define RTAX_FEATURE_SACK	0x00000002
 #define RTAX_FEATURE_TIMESTAMP	0x00000004
 #define RTAX_FEATURE_ALLFRAG	0x00000008
+#define RTAX_FEATURE_FRTO	0x00000010
 
 struct rta_session
 {
diff --git a/ip/iproute.c b/ip/iproute.c
index 2a8f3f8..d4a90fc 100644
--- a/ip/iproute.c
+++ b/ip/iproute.c
@@ -52,6 +52,20 @@ static const char *mx_names[RTAX_MAX+1] = {
 	[RTAX_FEATURES] = "features",
 	[RTAX_RTO_MIN]	= "rto_min",
 };
+
+struct valname {
+	unsigned int	val;
+	const char	*name;
+};
+
+static const struct valname features[] = {
+	{ RTAX_FEATURE_ECN, "ecn" },
+	{ RTAX_FEATURE_SACK, "sack" },
+	{ RTAX_FEATURE_TIMESTAMP, "timestamps" },
+	{ RTAX_FEATURE_TIMESTAMP, "ts" },
+	{ RTAX_FEATURE_FRTO, "frto"},
+};
+
 static void usage(void) __attribute__((noreturn));
 
 static void usage(void)
@@ -73,7 +87,7 @@ static void usage(void)
 	fprintf(stderr, "           [ rtt TIME ] [ rttvar TIME ]\n");
 	fprintf(stderr, "           [ window NUMBER] [ cwnd NUMBER ] [ initcwnd NUMBER ]\n");
 	fprintf(stderr, "           [ ssthresh NUMBER ] [ realms REALM ] [ src ADDRESS ]\n");
-	fprintf(stderr, "           [ rto_min TIME ]\n");
+	fprintf(stderr, "           [ rto_min TIME ] [ features DISABLED_FEATURES ]\n");
 	fprintf(stderr, "TYPE := [ unicast | local | broadcast | multicast | throw |\n");
 	fprintf(stderr, "          unreachable | prohibit | blackhole | nat ]\n");
 	fprintf(stderr, "TABLE_ID := [ local | main | default | all | NUMBER ]\n");
@@ -83,6 +97,8 @@ static void usage(void)
 	fprintf(stderr, "NHFLAGS := [ onlink | pervasive ]\n");
 	fprintf(stderr, "RTPROTO := [ kernel | boot | static | NUMBER ]\n");
 	fprintf(stderr, "TIME := NUMBER[s|ms|us|ns|j]\n");
+	fprintf(stderr, "DISABLED_FEATURES := sack | timestamps | ts | ecn | frto |\n");
+	fprintf(stderr, "                     [ DISABLED_FEATURES ]\n");
 	exit(-1);
 }
 
@@ -505,10 +521,8 @@ int print_route(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 			if (mxlock & (1<<i))
 				fprintf(fp, " lock");
 
-			if (i != RTAX_RTT && i != RTAX_RTTVAR &&
-			    i != RTAX_RTO_MIN)
-				fprintf(fp, " %u", *(unsigned*)RTA_DATA(mxrta[i]));
-			else {
+			if (i == RTAX_RTT || i == RTAX_RTTVAR ||
+			    i == RTAX_RTO_MIN) {
 				unsigned long long val = *(unsigned*)RTA_DATA(mxrta[i]);
 
 				val *= 1000;
@@ -520,6 +534,16 @@ int print_route(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 					fprintf(fp, " %llums", val/hz);
 				else
 					fprintf(fp, " %.2fms", (float)val/hz);
+			} else if (i == RTAX_FEATURES) {
+				int j;
+				unsigned int f = *(unsigned*)RTA_DATA(mxrta[i]);
+				for (j = 0; j < ARRAY_SIZE(features); j++)
+					if (f & features[j].val) {
+						fprintf(fp, " %s", features[j].name);
+						f &= ~features[j].val;
+					}
+			} else {
+				fprintf(fp, " %u", *(unsigned*)RTA_DATA(mxrta[i]));
 			}
 		}
 	}
@@ -851,6 +875,30 @@ int iproute_modify(int cmd, unsigned flags, int argc, char **argv)
 			if (get_unsigned(&win, *argv, 0))
 				invarg("\"ssthresh\" value is invalid\n", *argv);
 			rta_addattr32(mxrta, sizeof(mxbuf), RTAX_SSTHRESH, win);
+		} else if (matches(*argv, "features") == 0) {
+			int j;
+			unsigned int f = 0;
+			NEXT_ARG();
+			while (1) {
+				for (j = 0; j < ARRAY_SIZE(features); j++) {
+					if (strcmp(*argv, features[j].name) == 0) {
+						f |= features[j].val;
+						if (!NEXT_ARG_OK())
+							goto feat_out;
+						NEXT_ARG();
+						break;
+					}
+				}
+				if (j == ARRAY_SIZE(features)) {
+					if (f)
+						PREV_ARG();
+					break;
+				}
+			}
+feat_out:
+			if (!f)
+				invarg("\"features\" list is invalid\n", *argv);
+			rta_addattr32(mxrta, sizeof(mxbuf), RTAX_FEATURES, f);
 		} else if (matches(*argv, "realms") == 0) {
 			__u32 realm;
 			NEXT_ARG();
-- 
1.5.2.2
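
The negated meaning described in the iproute2 commit message can be
sketched as follows. This is a user-space model only: the bit values for
SACK, TIMESTAMP and FRTO are the ones visible in the patch context, ECN is
assumed to be 0x1, and the function names are hypothetical. The model
covers just the sysctl check and the per-route check; tcp_use_frto()
performs further checks that are left out here.

```python
RTAX_FEATURE_ECN = 0x01        # assumed; not shown in the patch context
RTAX_FEATURE_SACK = 0x02
RTAX_FEATURE_TIMESTAMP = 0x04
RTAX_FEATURE_FRTO = 0x10

# Same name-to-bit table as in the iproute2 patch ("ts" is an alias).
FEATURES = {
    "ecn": RTAX_FEATURE_ECN,
    "sack": RTAX_FEATURE_SACK,
    "timestamps": RTAX_FEATURE_TIMESTAMP,
    "ts": RTAX_FEATURE_TIMESTAMP,
    "frto": RTAX_FEATURE_FRTO,
}


def parse_disabled_features(words):
    """Collect feature bits until the first word that is not a feature name."""
    mask = 0
    for word in words:
        if word not in FEATURES:
            break
        mask |= FEATURES[word]
    if not mask:
        raise ValueError('"features" list is invalid')
    return mask


def frto_allowed(sysctl_tcp_frto, route_features):
    """A set bit on the route disables a feature the sysctl left enabled."""
    if not sysctl_tcp_frto:
        return False
    return not (route_features & RTAX_FEATURE_FRTO)


mask = parse_disabled_features(["frto", "ts", "realms"])
print(hex(mask))              # 0x14
print(frto_allowed(2, mask))  # False
print(frto_allowed(2, 0))     # True
```

A route without a features attribute therefore behaves exactly as the
global sysctl dictates, which is the reason given above for negating the
meaning of the bits.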


^ permalink raw reply related	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-20 12:37                                                                                                       ` Ilpo Järvinen
@ 2008-08-22 21:32                                                                                                         ` Dâniel Fraga
  2008-08-22 21:37                                                                                                           ` David Miller
  0 siblings, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-22 21:32 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Wed, 20 Aug 2008 15:37:13 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> Based on an IRC discussion with davem, there is an HTB bug which can cause
> corruption of retransmitted TCP packets (which are then discarded due to a
> checksum mismatch). That would also explain the strange headers I noticed
> earlier. There's a patch below (it should apply to 2.6.26); please put it
> at least on the host(s) which use HTB (I don't know whether both the server
> and the client use the wondershaper script or just the client). The
> different failure symptoms (one could be somehow FRTO related, as FRTO is
> used while retransmitting) are then also quite well explainable.
> 
> But FRTO is mostly not a suspect based on the tcpdump you provided (no 
> FRTO workaround would help in that).

	Ilpo, I have good news. I decided to disable HTB completely and
the problem seems to have gone away. And frto is enabled, of course. So
the problem was with HTB, not frto as I thought.

	The HTB patches you provided are going to be included in the
next 2.6.27 kernel, right?

	Thank you.

-- 
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-22 21:32                                                                                                         ` Dâniel Fraga
@ 2008-08-22 21:37                                                                                                           ` David Miller
  2008-08-23 14:14                                                                                                             ` Dâniel Fraga
  0 siblings, 1 reply; 116+ messages in thread
From: David Miller @ 2008-08-22 21:37 UTC (permalink / raw)
  To: fragabr
  Cc: ilpo.jarvinen, thomas.jarosch, billfink, netdev, kaber,
	netfilter-devel, kadlec

From: Dâniel Fraga <fragabr@gmail.com>
Date: Fri, 22 Aug 2008 18:32:24 -0300

> On Wed, 20 Aug 2008 15:37:13 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > Based on an IRC discussion with davem, there is an HTB bug which can cause
> > corruption of retransmitted TCP packets (which are then discarded due to a
> > checksum mismatch). That would also explain the strange headers I noticed
> > earlier. There's a patch below (it should apply to 2.6.26); please put it
> > at least on the host(s) which use HTB (I don't know whether both the server
> > and the client use the wondershaper script or just the client). The
> > different failure symptoms (one could be somehow FRTO related, as FRTO is
> > used while retransmitting) are then also quite well explainable.
> > 
> > But FRTO is mostly not a suspect based on the tcpdump you provided (no 
> > FRTO workaround would help in that).
> 
> 	Ilpo, I have good news. I decided to disable HTB completely and
> the problem seems to have gone away. And frto is enabled, of course. So
> the problem was with HTB, not frto as I thought.
> 
> 	The HTB patches you provided are going to be included in the
> next 2.6.27 kernel, right?

Yes, but it's important that you verify that the patch makes the
problem go away when HTB is enabled.  Please make this test if you
can.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-22 21:37                                                                                                           ` David Miller
@ 2008-08-23 14:14                                                                                                             ` Dâniel Fraga
  2008-08-23 14:38                                                                                                               ` Ilpo Järvinen
  0 siblings, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-23 14:14 UTC (permalink / raw)
  To: David Miller
  Cc: ilpo.jarvinen, thomas.jarosch, billfink, netdev, kaber,
	netfilter-devel, kadlec

On Fri, 22 Aug 2008 14:37:09 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> Yes, but it's important that you verify that the patch makes the
> problem go away when HTB is enabled.  Please make this test if you
> can.

	Correct! I tested with the HTB patches and the problem was
solved. ;) Thank you very much.


-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-23 14:14                                                                                                             ` Dâniel Fraga
@ 2008-08-23 14:38                                                                                                               ` Ilpo Järvinen
  2008-08-24 19:38                                                                                                                 ` Dâniel Fraga
  0 siblings, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-23 14:38 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

[-- Attachment #1: Type: TEXT/PLAIN, Size: 414 bytes --]

On Sat, 23 Aug 2008, Dâniel Fraga wrote:

> On Fri, 22 Aug 2008 14:37:09 -0700 (PDT)
> David Miller <davem@davemloft.net> wrote:
> 
> > Yes, but it's important that you verify that the patch makes the
> > problem go away when HTB is enabled.  Please make this test if you
> > can.
> 
> 	Correct! I tested with the HTB patches and the problem was
> solved. ;) Thank you very much.

Thanks for verifying it!

-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-23 14:38                                                                                                               ` Ilpo Järvinen
@ 2008-08-24 19:38                                                                                                                 ` Dâniel Fraga
  2008-08-26 14:10                                                                                                                   ` Ilpo Järvinen
  0 siblings, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-24 19:38 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Sat, 23 Aug 2008 17:38:32 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> Thanks for verifying it!

	Oops! I replied too fast! I just got a stalled connection again!

	Important: these files were generated with the HTB patches applied.

	Here are both tcpdump files:

http://www.abusar.org/htb/dump-mail-server.log
http://www.abusar.org/htb/dump-mail-client.log

	Both readmibs:

http://www.abusar.org/htb/readmibs-server.txt
http://www.abusar.org/htb/readmibs-client.txt

	Here are both cat /proc/net/tcp:

http://www.abusar.org/htb/tcp-server.txt
http://www.abusar.org/htb/tcp-client.txt

	I used the following commands to generate those dumps:

	1) on the server:

tcpdump -s 0 -w dump-mail-server.log -i eth0 host 201.52.214.230

	2) on the client:

tcpdump -s 0 -w dump-mail-client.log -i eth0 host teleporto.abusar.org and port 995

	What happened?

1) the connection was stalled

2) these tcpdumps are the *best ones* I got because, although I started them with
the connection already stalled, the connection suddenly stopped being stalled,
and a few minutes later stalled again...

3) I kept tcpdump running for some more time
	
	Ps: anyway, I noticed that the only services that stall are
nntp, ftp, pop3 and smtp... http never stalls, and neither does ssh.
It seems to affect only "old" protocols :)

	Ps2: anyway, the htb patch seems to help, because the problem
took much longer to happen. With the htb patches the problem happens once a day.
Without the htb patches the problem happens more than once a day.

	Ps3: I really don't understand why "nmap -sS server"
"solves" the stalled connection issue.

	Ps4: sorry for my hasty feedback before. I thought the problem had gone away.
Anyway, I hope this time I provided the best data for you. Thanks.

-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-24 19:38                                                                                                                 ` Dâniel Fraga
@ 2008-08-26 14:10                                                                                                                   ` Ilpo Järvinen
  2008-08-26 14:32                                                                                                                     ` Ilpo Järvinen
  2008-08-26 17:18                                                                                                                     ` Dâniel Fraga
  0 siblings, 2 replies; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-26 14:10 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

[-- Attachment #1: Type: TEXT/PLAIN, Size: 5085 bytes --]

On Sun, 24 Aug 2008, Dâniel Fraga wrote:

> On Sat, 23 Aug 2008 17:38:32 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > Thanks for verifying it!
> 
> 	Oops! I replied too fast! I just got a stalled connection again!
> 
> 	Important: these files were generated with the HTB patches applied.

snip

> 	What happened?
> 
> 1) the connection was stalled
> 
> 2) these tcpdumps are the *best ones* I got

Easy to read indeed :-).

> because, although I started
> them with the connection already stalled, the connection suddenly stopped
> being stalled, and a few minutes later stalled again...

There is more than one TCP flow in your workload btw (so using 
"connection" is a bit more blurry from my/TCP's pov). Some stall and never 
finish, some get immediately through without any stalling and proceed ok. 
So far I've not seen any cases with mixed behavior.

The client seems to be working as expected. It even responds with DSACKs
to SYNACK retransmissions, indicating that it has processed them at the
TCP level. That might break some foreign systems btw (I don't remember if
it was specified, so some TCP implementers may have missed that possibility
and their stack may give up on seeing it happen :-)). I hope that nobody
demands that it be disabled someday (just a side note; it has no relation
to the actual problem).

> 3) I kept tcpdump running for some more time
> 	
> 	Ps: anyway, I noticed that the only services that stall are
> nntp, ftp, pop3 and smtp... http never stalls, and neither does ssh.
> It seems to affect only "old" protocols :)

It could be a userspace related thing.

> 	Ps2: anyway, the htb patch seems to help, because the problem
> took much longer to happen. With the htb patches the problem happens
> once a day. Without the htb patches the problem happens more than
> once a day.

It seems that there could well be more than one problem, with symptoms 
similar enough that they're hard to distinguish without a packet trace.

> 	Ps3: I really don't understand why "nmap -sS server"
> "solves" the stalled connection issue.

Did it solve the stall in this particular case? At least for 995 nothing
earth-shattering happened. I find it hardly related here. Ie., I clearly
see the problematic flows and the non-problematic ones. Neither seems to
have any relation to the nmap generated traffic / timing. There's one
non-problematic 995 flow where the server generates some traffic during
nmap (5 mins after the previous packet was seen for that connection), but
the NAT in between has likely timed out that connection, because no
tear-down resets (or anything else) show up in any tcpdump.

> 	Ps4: sorry for my hasty feedback before. I thought the problem had
> gone away. Anyway, I hope this time I provided the best data for you. Thanks.

No problem. It's quite possible to have lucky periods every now and
then...

A number of packets have a bad tcp cksum at the sender, but that's probably
due to some offloading or so... The receiver side has correct timestamps
however, so it shouldn't be a problem after all. On the bright side, -s 0
allows all timestamps to be visible, and this makes me really perplexed:

S 3102907969:3102907969(0) win 5840 <mss 1460,sackOK,timestamp 37188459 0,nop,wscale 7> (DF)
S 3069527876:3069527876(0) ack 3102907970 win 5792 <mss 1460,sackOK,timestamp 258711279 37188459,nop,wscale 6> (DF)
. ack 1 win 46 <nop,nop,timestamp 37188477 258711279> (DF)
P 1:125(124) ack 1 win 46 <nop,nop,timestamp 37188481 258711279> (DF)
P 1:125(124) ack 1 win 46 <nop,nop,timestamp 37188699 258711279> (DF)
P 1:125(124) ack 1 win 46 <nop,nop,timestamp 37189135 258711279> (DF)
P 1:125(124) ack 1 win 46 <nop,nop,timestamp 37190007 258711279> (DF)
P 1:125(124) ack 1 win 46 <nop,nop,timestamp 37191751 258711279> (DF)
S 3069527876:3069527876(0) ack 3102907970 win 5792 <mss 1460,sackOK,timestamp 258712395 37191751,nop,wscale 6> (DF)
. ack 1 win 46 <nop,nop,timestamp 37192938 258712395,nop,nop,sack sack 1 {0:1} > (DF)
P 1:125(124) ack 1 win 46 <nop,nop,timestamp 37195239 258712395> (DF)

...On the latest SYNACK, ts_recent had been updated by the last packet
with data, so that packet was definitely processed by (some parts of) TCP
at the server; at least it wasn't dropped anywhere in between.
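
The observation about the timestamps can be checked mechanically. The
sketch below parses the timestamp option out of tcpdump text lines copied
from the trace above; the helper names are invented for the example, and
the regular expression assumes the "timestamp TSval TSecr" layout shown in
those lines.

```python
import re

TS_RE = re.compile(r"timestamp (\d+) (\d+)")


def ts_option(line):
    """Return (TSval, TSecr) from a tcpdump text line, or None if absent."""
    m = TS_RE.search(line)
    return (int(m.group(1)), int(m.group(2))) if m else None


def echoes(earlier, later):
    """True if `later` echoes (in TSecr) the TSval carried by `earlier`."""
    return ts_option(later)[1] == ts_option(earlier)[0]


def tsval_gaps(lines):
    """Differences between consecutive TSvals, in timestamp ticks."""
    vals = [ts_option(line)[0] for line in lines]
    return [b - a for a, b in zip(vals, vals[1:])]


# The client's data packet and its retransmissions, from the trace.
data = [
    "P 1:125(124) ack 1 win 46 <nop,nop,timestamp 37188481 258711279> (DF)",
    "P 1:125(124) ack 1 win 46 <nop,nop,timestamp 37188699 258711279> (DF)",
    "P 1:125(124) ack 1 win 46 <nop,nop,timestamp 37189135 258711279> (DF)",
    "P 1:125(124) ack 1 win 46 <nop,nop,timestamp 37190007 258711279> (DF)",
    "P 1:125(124) ack 1 win 46 <nop,nop,timestamp 37191751 258711279> (DF)",
]
# The retransmitted SYNACK from the server, from the trace.
synack_rexmit = ("S 3069527876:3069527876(0) ack 3102907970 win 5792 "
                 "<mss 1460,sackOK,timestamp 258712395 37191751,nop,wscale 6> (DF)")

print(tsval_gaps(data))                 # [218, 436, 872, 1744]
print(echoes(data[-1], synack_rexmit))  # True
```

The doubling gaps are the client's retransmission backoff, and the echoed
value 37191751 is the TSval of the last retransmitted data packet, which
is what shows that the server's TCP saw that packet.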

In order for that to happen, I think req->ts_recent = tmp_opt.rcv_tsval
in tcp_check_req must be reached. It seems likely that there's an early
abort there, because SYNACKs keep being retransmitted. Had a valid socket
been created, the request would have been removed from the list.

ListenOverflows might explain this (it can't be ListenDrops since it's 
equal to ListenOverflows and both get incremented on overflow). Are you 
perhaps short on workers at the userspace server? It would be nice to 
capture those mibs often enough (eg., once per 1s with timestamps) during 
the stall to see what actually gets incremented during the event because 
there's currently so much haystack that finding the needle gets impossible 
(ListenOverflows 47410) :-). Also, the corresponding tcpdump would be 
needed to match the events.
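
One possible way to capture the counters once per second with timestamps
is sketched below. It assumes the counters are read from /proc/net/netstat,
where each group is a header line of names followed by a line of values
with the same prefix; the function names, the default interval and the
sample count are choices made for this example only.

```python
import time


def parse_netstat(text):
    """Turn header/value line pairs into a {"Group.Name": value} dict."""
    counters = {}
    lines = text.splitlines()
    for header, values in zip(lines[0::2], lines[1::2]):
        prefix, names = header.split(":", 1)
        value_prefix, vals = values.split(":", 1)
        if prefix != value_prefix:
            continue  # not a matching header/value pair, skip it
        for name, val in zip(names.split(), vals.split()):
            counters[f"{prefix}.{name}"] = int(val)
    return counters


def changed(prev, cur):
    """Counters whose value differs between two samples, with the delta."""
    return {k: cur[k] - prev[k] for k in cur if k in prev and cur[k] != prev[k]}


def watch(path="/proc/net/netstat", interval=1.0, samples=60):
    """Print a timestamped line for every interval in which something moved."""
    with open(path) as f:
        prev = parse_netstat(f.read())
    for _ in range(samples):
        time.sleep(interval)
        with open(path) as f:
            cur = parse_netstat(f.read())
        diff = changed(prev, cur)
        if diff:
            print(time.strftime("%H:%M:%S"), diff)
        prev = cur
```

Calling watch() on the server while a stall is in progress, with tcpdump
running alongside, would show whether TcpExt.ListenOverflows moves at the
moments the SYNACK retransmissions appear, instead of leaving only the
accumulated total to look at.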


-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-26 14:10                                                                                                                   ` Ilpo Järvinen
@ 2008-08-26 14:32                                                                                                                     ` Ilpo Järvinen
  2008-08-26 17:18                                                                                                                     ` Dâniel Fraga
  1 sibling, 0 replies; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-26 14:32 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

[-- Attachment #1: Type: TEXT/PLAIN, Size: 706 bytes --]

On Tue, 26 Aug 2008, Ilpo Järvinen wrote:

> ListenOverflows might explain this (it can't be ListenDrops since it's 
> equal to ListenOverflows and both get incremented on overflow). Are you 
> perhaps short on workers at the userspace server? It would be nice to 
> capture those mibs often enough (eg., once per 1s with timestamps) during 
> the stall to see what actually gets incremented during the event because 
> there's currently so much haystack that finding the needle gets impossible 
> (ListenOverflows 47410) :-). Also, the corresponding tcpdump would be 
> needed to match the events.

Alternatively, you could strace the userspace server to see whether it
keeps accept()ing connections.

-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-26 14:10                                                                                                                   ` Ilpo Järvinen
  2008-08-26 14:32                                                                                                                     ` Ilpo Järvinen
@ 2008-08-26 17:18                                                                                                                     ` Dâniel Fraga
  2008-08-26 20:40                                                                                                                       ` Ilpo Järvinen
  1 sibling, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-26 17:18 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Tue, 26 Aug 2008 17:10:46 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> There is more than one TCP flow in your workload btw (so using 
> "connection" is a bit more blurry from my/TCP's pov). Some stall and never 
> finish, some get immediately through without any stalling and proceed ok. 
> So far I've not seen any cases with mixed behavior.

	Interesting.

> It could be a userspace related thing.

	Hmmm. I'll try to report this to the dovecot and inn lists.

> It seems that there could well be more than one problem, with symptoms 
> similar enough that they're hard to distinguish without a packet trace.

	Yes, exactly! I think the same.

> Did it solve the stall in this particular case? At least for 995 nothing

	Yes. nmap -sS always solves the problem. Very strange. For me,
nmap -sS is a kind of brute-force attempt to re-establish the normal
behaviour of the server...

	Anyway, I disabled htb and frto and everything is fine for now.
I'll keep investigating this.

> ListenOverflows might explain this (it can't be ListenDrops since it's 
> equal to ListenOverflows and both get incremented on overflow). Are you 
> perhaps short on workers at the userspace server? It would be nice to 

	I use dovecot for mail. I'll post on the dovecot list. If it's
a userspace issue, all the better.

> capture those mibs often enough (eg., once per 1s with timestamps) during 
> the stall to see what actually gets incremented during the event because 
> there's currently so much haystack that finding the needle gets impossible 
> (ListenOverflows 47410) :-). Also, the corresponding tcpdump would be 
> needed to match the events.

	Ok. If I get more useful information, I'll reply.

	Thank you very much!

-- 


* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-26 17:18                                                                                                                     ` Dâniel Fraga
@ 2008-08-26 20:40                                                                                                                       ` Ilpo Järvinen
  2008-08-26 21:17                                                                                                                         ` Dâniel Fraga
  2008-08-28 21:49                                                                                                                         ` Dâniel Fraga
  0 siblings, 2 replies; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-26 20:40 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

[-- Attachment #1: Type: TEXT/PLAIN, Size: 4326 bytes --]

On Tue, 26 Aug 2008, Dâniel Fraga wrote:

> On Tue, 26 Aug 2008 17:10:46 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > There is more than one TCP flow in your workload btw (so using 
> > "connection" is a bit more blurry from my/TCP's pov). Some stall and never 
> > finish, some get immediately through without any stalling and proceed ok. 
> > So far I've not seen any cases with mixed behavior.
> 
> 	Interesting.

If you want to, a tcpdump from a normal, working case wouldn't hurt either, 
to show the "normal pattern" on the network level; that is trivial to 
produce in no time now that you know the commands etc., I guess... :-)

> > It could be userspace related thing.
> 
> 	Hmmm. I'll try to report this to the dovecot and inn lists.

They might not be that interested until we have something more concrete 
than what we know currently... :-)

> > It seems that there could well be more than one problem, with symptoms 
> > similar enough that they're hard to distinguish without a packet trace.
> 
> 	Yes, exactly! I think the same.
> 
> > Did it solve in this particular case? At least for 995 nothing 
> 
> 	Yes. nmap -sS always solves the problem. Very strange. nmap -sS
> for me is kind of brute force attempt to restablish the normal
> behaviour of the server... 

Can you explain a bit more? Does it resolve during the scan or some time 
after it? And more importantly, how do you know that it resolves? I.e., what 
is the normal behavior (be more specific than "it works" :-), how do you 
know that it's working)?

It seems that either we lack some traffic between the parties or simply 
need to find out what the userspace is doing, and in the latter case what 
happens in the network might not be relevant at all. Is there a possibility 
that we miss an alternative route by using the host rule for tcpdump (at 
the server)? Nmap starts at 22:26:26.613098, the last packet in the client 
log is at 22:26:01.452842. Alternatively, port 995 was not the right 
one to track (though there's clearly a problem visible on the network level 
with it too)... :-(

> 	Anyway, I disabled htb and frto and everything is fine for now.
> I'll keep investigating this.

Two points:

HTB shaping could cause drops that are related, but considering what is 
visible in the server end's tcpdump, the userspace's behavior is quite 
relevant.

You might jump to conclusions too quickly every now and then; more
time might be needed to really ensure something is working. Obviously,
if any failure is noticed, it's always a counter-proof even if 
long working periods occur in between.

> > ListenOverflows might explain this (it can't be ListenDrops since it's 
> > equal to ListenOverflows and both get incremented on overflow). Are you 
> > perhaps short on workers at the userspace server? It would be nice to 
> 
> 	I use dovecot por mail. I'll post on the dovecot list. If it's
> an userspace issue, better.

It's not guaranteed that it's _only_ userspace, there could be some kernel 
aspect in the problem too (e.g., related to wakeups or so).

In syscall terms, ListenOverflows means this: int listen(int sockfd, int 
backlog); (see man -S 2 listen) is given some size as backlog for those 
connections that are not yet accept()'ed, and ListenOverflows gets 
incremented when that backlog is exhausted (i.e., if I'm not completely 
wrong :-)).
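
To watch just those two counters instead of the whole dump, something 
like the following rough sketch could be used. It assumes the usual 
/proc/net/netstat layout (a "TcpExt:" line of names followed by a 
"TcpExt:" line of values); the function name listen_counters and the 
optional file argument are made up for illustration.

```shell
#!/bin/sh
# listen_counters [FILE]
# Prints "ListenOverflows ListenDrops" from a /proc/net/netstat-style file.
# The file holds pairs of lines with the same tag: first the counter
# names, then the counter values in the same column order.
listen_counters() {
	awk '
	# First TcpExt line: remember the counter name of every column.
	$1 == "TcpExt:" && !seen {
		for (i = 2; i <= NF; i++) name[i] = $i
		seen = 1
		next
	}
	# Second TcpExt line: pick the values from the matching columns.
	$1 == "TcpExt:" && seen {
		for (i = 2; i <= NF; i++) {
			if (name[i] == "ListenOverflows") o = $i
			if (name[i] == "ListenDrops") d = $i
		}
		print o, d
		exit
	}' "${1:-/proc/net/netstat}"
}
```

Called without an argument it reads /proc/net/netstat; running it once 
per second during a stall would show whether the two counters really 
move together.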

You might want to look into how to make dovecot accept more concurrent 
connections; perhaps login_max_processes_count might be the right one
(I quickly glanced at http://wiki.dovecot.org/LoginProcess), though this is 
somewhat site-configuration dependent according to that page.

> > capture those mibs often enough (eg., once per 1s with timestamps) during 
> > the stall to see what actually gets incremented during the event because 
> > there's currently so much haystack that finding the needle gets impossible 
> > (ListenOverflows 47410) :-). Also, the corresponding tcpdump would be 
> > needed to match the events.
> 
> 	Ok. If I had more useful information, I'll reply.
> 
> 	Thank you very much!

You could try setting up some script which does something along these 
lines and then redirect its output during the event to some file (+ tcpdumping 
the thing obviously):

while [ : ]; do
	date "+%s.%N"
	cat /proc/net/{netstat,snmp}
	sleep 1
done
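
A slightly more packaged variant of the same idea, as a sketch only: it 
appends the snapshots to a file and can stop after a fixed number of 
rounds. The function name capture_mibs and its arguments are 
hypothetical, and the %N format needs GNU date.

```shell
#!/bin/sh
# capture_mibs OUTFILE [COUNT] [INTERVAL] [PROCDIR]
# Appends timestamped snapshots of the netstat and snmp counters to
# OUTFILE, one every INTERVAL seconds (default 1). COUNT=0 (default)
# means "run until killed". PROCDIR defaults to /proc/net.
# The body runs in a subshell so its variables do not leak out.
capture_mibs() (
	out=$1; count=${2:-0}; interval=${3:-1}; dir=${4:-/proc/net}
	n=0
	while [ "$count" -eq 0 ] || [ "$n" -lt "$count" ]; do
		date "+%s.%N"                      # timestamp to match against tcpdump
		cat "$dir/netstat" "$dir/snmp"     # the MIB counters themselves
		sleep "$interval"
		n=$((n + 1))
	done >> "$out"
)
```

For example, "capture_mibs /tmp/mibs.log" started just before 
reproducing the stall and killed afterwards, with tcpdump running in 
parallel so the timestamps can be lined up.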

-- 
 i.


* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-26 20:40                                                                                                                       ` Ilpo Järvinen
@ 2008-08-26 21:17                                                                                                                         ` Dâniel Fraga
  2008-08-27 10:22                                                                                                                           ` Ilpo Järvinen
  2008-08-28 21:49                                                                                                                         ` Dâniel Fraga
  1 sibling, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-26 21:17 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Tue, 26 Aug 2008 23:40:58 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> If you want to, a tcpdump from normal, working case wouldn't hurt either 
> to show the "normal pattern" on network level and that is trivial to 
> produce in no time now that you know the commands etc. I guess... :-)

	Ok, there it is:

http://www.abusar.org/htb/dump-normal.log
	
	Just the port 995... I checked email, then received a message,
checked again, just the normal behaviour.

> They might not be that interested until we have something more concrete 
> than what we know currently... :-)

	Ok :) And you're right, because if I disable frto and htb *and*
the problem goes away, there's a big chance it's something related to
the kernel. Or a mix of kernel and user space problems which happens only
when frto and/or htb are used.

> Can you explain a bit more. Does it resolve during it or some time after 
> it? And more importantly how do you know that it resolves? Ie., what is 
> the normal behavior (be more specific than "it works" :-), how do know 
> that it's working).

	Ok. For example:

1) the connection is normal, then suddenly it stalls. I cannot receive
mail, nor download nntp messages, nor access ftp etc.

2) On my client machine I run "nmap -sS server" and...

3) ...immediately the connection is not stalled anymore.

	Now I remembered one thing and I'd like to ask a question (I
hope it isn't a stupid question): dynticks (tickless) was implemented
for x86-64 in the 2.6.24 kernel and I started to use dynticks in 2.6.24. Could 
it be affecting the server behaviour? I have dynticks enabled on all
my machines, but does it make sense to use it in a server environment?
Could dynticks cause this? So far I don't think so, but... who
knows?

http://kernelnewbies.org/Linux_2_6_24#head-4edc562fa1b9fa8e5da5adaf1beab057237c325d

> It seems that either we lack some traffic between the parties or simply 
> need to find out what the userspace is doing, and in the latter case what 
> happens in the network might not be relevant at all. Is there possibility 
> that we miss an alternative route by using the host rule for tcpdump (at 
> the server)? Nmap starts at 22:26:26.613098, the last packet in the client 
> log is at 22:26:01.452842. Alternatively, the port 995 was not the right 
> one to track (though there's clearly this on network level visible problem 
> with it too)... :-(

	I tracked port 995 because I have problems reading email
over pop3s (995). Should I do it differently with tcpdump?

> You might jump into conclusions too quickly every now and then, more
> time might be needed to really ensure something is working. Obviously
> if any non-workingness is noticed, it's always a counter-proof even if 
> long working periods occur in between.

	Ok. It seems a complex issue. You're right. I need more
patience ;)

> In syscall terms this ListenOverflow means that int listen(int sockfd, int 
> backlog); (see man -S 2 listen) is given some size as backlog for those 
> connections that are not yet accept()'ed, and that is exhausted when the 
> ListenOverflow gets incremented (ie., if I'm not completely wrong :-)).

	Hmm interesting.

> You might want to look on dovecot how to make it accept more concurrent 
> connections, perhaps the login_max_processes_count might the right one
> (I quickly glanced http://wiki.dovecot.org/LoginProcess) though this is 
> somewhat site configuration dependant according to that page.

	Yes, I have login_max_processes_count = 128 (the default) and I
have just a few users (just 10 users), so I think it's not the problem.
 
> You could try setting up some script which does something along these 
> lines and then redirect its during the event to some file (+ tcpdumping 
> the thing obviously):
> 
> while [ : ]; do
> 	date "+%s.%N"
> 	cat /proc/net/{netstat,snmp}
> 	sleep 1
> done

	Ok. You're helping a lot. Thanks Ilpo ;)


-- 


* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-26 21:17                                                                                                                         ` Dâniel Fraga
@ 2008-08-27 10:22                                                                                                                           ` Ilpo Järvinen
  2008-08-27 19:51                                                                                                                             ` Dâniel Fraga
  0 siblings, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-27 10:22 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

[-- Attachment #1: Type: TEXT/PLAIN, Size: 6560 bytes --]

On Tue, 26 Aug 2008, Dâniel Fraga wrote:

> On Tue, 26 Aug 2008 23:40:58 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > If you want to, a tcpdump from normal, working case wouldn't hurt either 
> > to show the "normal pattern" on network level and that is trivial to 
> > produce in no time now that you know the commands etc. I guess... :-)
> 
> 	Ok, there it is:
> 
> http://www.abusar.org/htb/dump-normal.log
> 	
> 	Just the port 995... I checked email, then received a message,
> checked again, just the normal behaviour.

Thanks, those flows (there were again several) look exactly like the 
working connections in the earlier log.

> > They might not be that interested until we have something more concrete 
> > than what we know currently... :-)
> 
> 	Ok :) And you're right, because if I disable frto and htb *and*
> the problem has gone, there's a huge chance to be something related to
> kernel. Or a mix of kernel and user space problem which happens just
> when frto and/or htb are used.
> 
> > Can you explain a bit more. Does it resolve during it or some time after 
> > it? And more importantly how do you know that it resolves? Ie., what is 
> > the normal behavior (be more specific than "it works" :-), how do know 
> > that it's working).
> 
> 	Ok. For example:
> 
> 1) the connection is normal, then suddenly it stalls. I cannot receive
> mail, nor download nntp messages, nor access ftp etc.

...thus there could be other ports that are related as well, do you 
remember what exactly started working in that particular case :-)?

> 2) I do on my client machine a "nmap -sS server" and...
> 
> 3) ...imediatelly the connection is not stalled anymore.

Which of the connections? Mail, nntp, ftp, or the etc.? :-) To the host ip 
which was given for the tcpdump filter, definitely nothing was resumed.

> 	Now I remembered one thing and I'd like to make a question (I
> hope it isn't a stupid question): dynticks (tickless) were implemented
> for x86-64 in 2.6.24 kernel and I started to use dynticks in 2.6.24. Could 
> it be affecting the server behaviour? I use dynticks (enabled) on all
> my machines, but does it make sense to use in a server environment?
> Could the dynticks cause this? Until now, I don't think so, but... who
> knows?
> 
> http://kernelnewbies.org/Linux_2_6_24#head-4edc562fa1b9fa8e5da5adaf1beab057237c325d

I was thinking about that at one point (even thought of asking you about 
this part of the config), but the tcpdump log shows a problem that is 
unlikely to depend on timers in any way (and at least some timer expires 
because the SYNACKs are retransmitted, so it's not some infinite-wait 
bug). I'd like to know what causes that and try to solve it.

Once we know the reasons, we can probably easily determine whether 
there's a need to experiment with the timers. Trying to conquer all problems 
at once, when not even knowing how many problems one is going to find, is 
not that easy either. Besides, I'd be more concerned about the timers on 
the client after seeing that nothing goes over the network while the nmap 
trick resolves the thing.

> > It seems that either we lack some traffic between the parties or simply 
> > need to find out what the userspace is doing, and in the latter case what 
> > happens in the network might not be relevant at all. Is there possibility 
> > that we miss an alternative route by using the host rule for tcpdump (at 
> > the server)? Nmap starts at 22:26:26.613098, the last packet in the client 
> > log is at 22:26:01.452842. Alternatively, the port 995 was not the right 
> > one to track (though there's clearly this on network level visible problem 
> > with it too)... :-(
> 
> 	I tracked the 995 port, because I have problems reading email
> pro pop3s (995). Should I do it different with tcpdump? 

The server's log captured not only 995 traffic but everything else to the 
host with the given ip (including udp which should show the tunnelled 
traffic I guess). Unless there's some other route to that host with 
a different ip, I think we don't have much more to find out in the network 
(besides the potential of missing packets from tcpdump during the syn 
flooding, but it's very unlikely that all packets of some active flow 
would be hit at the same time, so something from a progressing flow would 
still be shown even if some of packets would be missing).

This makes me wonder if the network behavior is at all related to 
the resolving of the problem. The only thing I can think of is that for some 
reason the userspace gets notified about a TCP reset much later than it 
should and therefore is waiting until that happens and can only
then continue.

> > You might jump into conclusions too quickly every now and then, more
> > time might be needed to really ensure something is working. Obviously
> > if any non-workingness is noticed, it's always a counter-proof even if 
> > long working periods occur in between.
> 
> 	Ok. It seems a complex issue. You're right. I need more
> patience ;)

...of course if one wants to comment something to keep others posted 
what's happening, one could always note that "so far all good but I keep 
testing for longer time" (that's what some other people say).

> > In syscall terms this ListenOverflow means that int listen(int sockfd, int 
> > backlog); (see man -S 2 listen) is given some size as backlog for those 
> > connections that are not yet accept()'ed, and that is exhausted when the 
> > ListenOverflow gets incremented (ie., if I'm not completely wrong :-)).
> 
> 	Hmm interesting.
> 
> > You might want to look on dovecot how to make it accept more concurrent 
> > connections, perhaps the login_max_processes_count might the right one
> > (I quickly glanced http://wiki.dovecot.org/LoginProcess) though this is 
> > somewhat site configuration dependant according to that page.
> 
> 	Yes, I have login_max_processes_count = 128 (the default) and I
> have just a few users (just 10 users), so I think it's not the problem.

That would be too easy an explanation, yeah :-). Can you still please check 
next time that there aren't anywhere near that many server processes on the 
server :-).

> > You could try setting up some script which does something along these 
> > lines and then redirect its during the event to some file (+ tcpdumping 
> > the thing obviously):
> > 
> > while [ : ]; do
> > 	date "+%s.%N"
> > 	cat /proc/net/{netstat,snmp}

Adding this wouldn't hurt btw:

cat /proc/net/tcp

> > 	sleep 1
> > done
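
Since the raw /proc/net/tcp dump is hard to read, a rough sketch for 
summarising it per TCP state for one local port (e.g. 995) could look 
like this. The function name tcp_states and the optional file argument 
are made up; the state codes are the usual hex values from the st 
column, and the port is matched against the hex port in local_address.

```shell
#!/bin/sh
# tcp_states PORT [FILE]
# Counts the sockets bound to local PORT per TCP state, reading a
# /proc/net/tcp-style file (FILE defaults to /proc/net/tcp).
tcp_states() {
	port_hex=$(printf '%04X' "$1")     # 995 -> 03E3, as shown in /proc/net/tcp
	awk -v port="$port_hex" '
	BEGIN {
		s["01"] = "ESTABLISHED"; s["02"] = "SYN_SENT";   s["03"] = "SYN_RECV"
		s["04"] = "FIN_WAIT1";   s["05"] = "FIN_WAIT2";  s["06"] = "TIME_WAIT"
		s["07"] = "CLOSE";       s["08"] = "CLOSE_WAIT"; s["09"] = "LAST_ACK"
		s["0A"] = "LISTEN";      s["0B"] = "CLOSING"
	}
	NR > 1 {                           # skip the header line
		split($2, a, ":")              # local_address is ADDR:PORT in hex
		if (a[2] == port) n[$4]++      # $4 is the st (state) column
	}
	END {
		for (st in n) {
			name = (st in s) ? s[st] : st
			print name, n[st]
		}
	}' "${2:-/proc/net/tcp}" | sort
}
```

"tcp_states 995" inside the capture loop would then give one short 
line per state, so it is easier to see whether, say, the SYN_RECV 
count changes around the stall.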
> 
> 	Ok. You're helping a lot. Thanks Ilpo ;)
> 
> 
> 

-- 
 i.


* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-27 10:22                                                                                                                           ` Ilpo Järvinen
@ 2008-08-27 19:51                                                                                                                             ` Dâniel Fraga
  2008-08-27 20:32                                                                                                                               ` Ilpo Järvinen
  0 siblings, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-27 19:51 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Wed, 27 Aug 2008 13:22:22 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> ...thus there could be other ports that are related as well, do you 
> remember what exactly started working in that particular case :-)?

> Which of the connections? Mail, nntp, ftp, or the etc.? :-) To the host ip 
> which was given for the tcpdump filter, definately nothing was resumed.

	Ok. Let's focus on mail:

1) first, my client (Claws-mail -- but it happened with Outlook of
other users too) is working perfectly. I can download new messages. It
connects to port 995 on the server without problems.

2) suddenly it gives me an error message saying that it cannot authenticate
anymore (sorry, I don't have the exact message). If I try again to
download new messages, it gives the same error. The connection to port
995 seems stalled or, better yet, cannot complete successfully. It seems
to time out.

3) the server will stay this way until I run "nmap" against the server.
After that, everything goes back to normal.

	So, the server at point 2 got stalled and only an nmap of the server
can force it to go back to normal behaviour (I discovered by luck that
nmap solves the issue).

> I was think that at a time (even thought of enquiring you about this 
> part of the config), but the tcpdump log shows a problem that is 
> unlikely to depend on timers in any way (and at least some timer expires 
> because the SYNACKs are retransmitted, so it's not in some infinite wait 
> bug). I'd like to know what causes that and try to solve it.

	Ok.

> Once we know the reasons, we can probably easily determinate whether 
> there's need to experiment with the timers. Trying to conquer all problems 
> at once, when not even knowing how many problems one is going to find is 
> not that easy either. Besides, I'd be more concerned about the timers on 
> the client after seeing that nothing goes in the network while the nmap 
> trick resolves the thing.

	Ok.

> The server's log captured not only 995 traffic but everything else to the 
> host with the given ip (including udp which should show the tunnelled 
> traffic I guess). Unless there's some other route to that host with 
	
	Ok, that's because I forgot to restrict traffic to port 995 on
the server. Sorry.

> This makes me wander if the network behavior is at all related to 
> resolving of the problem. Only thing I can think of is that for some 
> reason the userspace gets notified much later than it should about
> TCP reset and therefore is waiting until that happens and can only
> then continue.

	What I can assure you is that other users (who use Microsoft
Outlook Express) had the same problem, so, in this case, we can be
pretty sure it isn't related to the user space client.

> It would be too easy explanation, yeah :-). Can you still please check 
> next time that there aren't even near that many server processes at the 
> server :-).

	Ok, when I get the problem I'll check this.

> Adding this wouldn't hurt btw:
> 
> cat /proc/net/tcp

	Ok.

-- 


* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-27 19:51                                                                                                                             ` Dâniel Fraga
@ 2008-08-27 20:32                                                                                                                               ` Ilpo Järvinen
  2008-08-27 20:50                                                                                                                                 ` Dâniel Fraga
  0 siblings, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-27 20:32 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2897 bytes --]

On Wed, 27 Aug 2008, Dâniel Fraga wrote:

> On Wed, 27 Aug 2008 13:22:22 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > ...thus there could be other ports that are related as well, do you 
> > remember what exactly started working in that particular case :-)?
> 
> > Which of the connections? Mail, nntp, ftp, or the etc.? :-) To the host ip 
> > which was given for the tcpdump filter, definately nothing was resumed.
> 
> 	Ok. Let's focus on mail:
> 
> 1) first, my client (Claws-mail -- but it happened with Outlook of
> other users too) is working perfectly. I can download new messages. It
> connects to port 995 on the server without problems.
> 
> 2) suddenly it gives me an error message, that it cannot authenticate
> anymore (sorry, I don't have the exact message).

The exact message is not that big a deal :-).

> If I try again to
> download new messages, it gives the same error. The connection to port
> 995 seems stalled or, better yet, cannot complete succesfully. It seems
> to time out.
> 
> 3) the server will stay this way, until I do an "nmap" to the server.
> This way, everything goes back to normal.

Ok. Though this all raises more questions than answers... :-( Why isn't 
there any traffic in either of the tcpdumps then (neither in the client's
nor in the server's)?

> 	So, the server at point 2 got stalled and just an nmap server
> can force the server to go back to normal behaviour (I discovered that
> nmap solved the issue by luck).

I guess it might have something to do with the additional 3-way 
handshake that gets attempted but who knows...

> > The server's log captured not only 995 traffic but everything else to the 
> > host with the given ip (including udp which should show the tunnelled 
> > traffic I guess). Unless there's some other route to that host with 
> 	
> 	Ok, that's because I forgot to restrict traffic to port 995 on
> the server. Sorry.

I don't think it was bad at all... :-) I just meant that there wasn't any 
other visible traffic, which is very very strange because nothing port 995 
related (or anything else) seems to happen during the nmap... Which 
network interfaces does the server have? Could things get routed through 
some other iface during the time of trouble (and during the nmap solution)? 
That would explain why it isn't visible in the tcpdump, which is for the 
specific interface.
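
One way to check the interface question, as a sketch: ask the kernel 
which device it would currently use towards the client and compare 
that during and outside the stall. The helper name route_dev is made 
up, and 192.0.2.10 below is only a placeholder for the client's 
address as seen by the server.

```shell
#!/bin/sh
# route_dev: reads "ip route get ADDR" output on stdin and prints the
# outgoing interface, i.e. the word that follows "dev".
route_dev() {
	awk '{
		for (i = 1; i < NF; i++)
			if ($i == "dev") { print $(i + 1); exit }
	}'
}

# Usage (placeholder address):
#   ip route get 192.0.2.10 | route_dev
```

If the result ever differs from the interface given to tcpdump, 
capturing with "tcpdump -i any" instead would avoid missing that 
traffic.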

> > This makes me wander if the network behavior is at all related to 
> > resolving of the problem. Only thing I can think of is that for some 
> > reason the userspace gets notified much later than it should about
> > TCP reset and therefore is waiting until that happens and can only
> > then continue.
> 
> 	What I can assure you is that other users (which use Microsoft
> Outlook Express) had the same problem, so, in this case, we can be
> pretty sure it isn't related to user space client.

Ah, ok.



-- 
 i.


* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-27 20:32                                                                                                                               ` Ilpo Järvinen
@ 2008-08-27 20:50                                                                                                                                 ` Dâniel Fraga
  2008-08-27 21:25                                                                                                                                   ` Ilpo Järvinen
  0 siblings, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-27 20:50 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Wed, 27 Aug 2008 23:32:34 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> Ok. Though this all opens more questions than answers... :-(, why isn't 
> there any traffic in neither of the tcpdumps then (not in the client's
> nor in the server's).

	Very strange. In this thread, I saw the discussion about some
routers messing with traffic and frto related stuff, right? My server
is behind some routers that I don't know (because I'm not the one who
controls these routers). So if the problem is with some of these
routers, I'm afraid we can do nothing about that.

> I don't think it was bad at all... :-) I just meant that there wasn't any 
> other visible traffic, which is very very strange because nothing port 995 
> related (or anything else) seems to happen during the nmap... Which 
> network interfaces the server has? Could things get routed through some 
> other iface during the time of trouble (and during the nmap solution), 
> that would explain why it isn't visible in the tcpdump which is for the 
> specific interface.

fraga@teleporto ~$ ip route list
10.1.0.6 dev tun2  proto kernel  scope link  src 10.1.0.5 
10.195.195.1 dev tun1  proto kernel  scope link  src 10.195.195.2 
192.168.102.0/24 via 10.1.0.6 dev tun2 
200.211.201.0/24 dev eth1  proto kernel  scope link  src 200.211.201.248 
189.38.0.0/16 dev eth0  proto kernel  scope link  src 189.38.18.122 
default via 189.38.18.121 dev eth0 

	Well, if that's the problem I'll be very ashamed for wasting
your time. Anyway, everything should go to the eth0 interface, through
the 189.38.18.121 gateway.

	The eth1 interface (200.211.201.248) is an old interface which
we do not use anymore. So I'm deactivating it right now (I should have done
that a long time ago). Let's see if the problem remains or not
(although I have other servers with multiple interfaces and everything
is fine, since, as far as I understand, what matters is the default
gateway -- I think there's no reason for Linux to send something to
the eth1 interface since there's only one default gateway).

	Anyway, I'm dropping the eth1 interface. I'll wait a few days before
confirming whether that's the problem or not.

	Thanks again.

-- 


* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-27 20:50                                                                                                                                 ` Dâniel Fraga
@ 2008-08-27 21:25                                                                                                                                   ` Ilpo Järvinen
  2008-08-27 21:42                                                                                                                                     ` Dâniel Fraga
  0 siblings, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-27 21:25 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3390 bytes --]

On Wed, 27 Aug 2008, Dâniel Fraga wrote:

> On Wed, 27 Aug 2008 23:32:34 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > Ok. Though this all opens more questions than answers... :-(, why isn't 
> > there any traffic in neither of the tcpdumps then (not in the client's
> > nor in the server's).
> 
> 	Very strange. In this topic, I saw the discussion about some
> routers messing with traffic and frto related stuff, right? My server
> is behind some routers that I don't know (because it's not me who
> controls these routers). So if the problem is with some of these
> routers, I'm afraid we can do nothing about that.

Sure, but that still doesn't explain at all why the expected traffic 
doesn't show up anywhere (i.e., the working mail after the nmap resolution). 
Such a router cannot prevent us from tcpdumping both ends. My point is that 
we would always see the packet at least in the sending end's tcpdump. And, 
if we had the traffic from the end-point tcpdumps, we could trivially 
figure out what the middlebox did to the traffic.

> > I don't think it was bad at all... :-) I just meant that there wasn't any 
> > other visible traffic, which is very very strange because nothing port 995 
> > related (or anything else) seems to happen during the nmap... Which 
> > network interfaces the server has? Could things get routed through some 
> > other iface during the time of trouble (and during the nmap solution), 
> > that would explain why it isn't visible in the tcpdump which is for the 
> > specific interface.
> 
> fraga@teleporto ~$ ip route list
> 10.1.0.6 dev tun2  proto kernel  scope link  src 10.1.0.5 
> 10.195.195.1 dev tun1  proto kernel  scope link  src 10.195.195.2 
> 192.168.102.0/24 via 10.1.0.6 dev tun2 
> 200.211.201.0/24 dev eth1  proto kernel  scope link  src 200.211.201.248 
> 189.38.0.0/16 dev eth0  proto kernel  scope link  src 189.38.18.122 
> default via 189.38.18.121 dev eth0 
> 
> 	Well, if that's the problem I'll be very ashamed for wasting
> your time. Anyway, everything should go to the eth0 interface, through
> the 189.38.18.121 gateway.
>
> 	The eth1 interface (200.211.201.248) is an old interface which
> we do not use anymore. So I'm deactivating it right now (I should have
> done that a long time ago). Let's see if the problem remains or not
> (although I have other servers with multiple interfaces and everything
> is fine, since, as far as I understand, what matters is the default
> gateway -- I think there's no reason for Linux to send something to
> the eth1 interface since there's only one default gateway).

Agreed, it shouldn't happen. Do you have a static setup for the IPs or is 
there a DHCP component which could in theory cause some routing table 
alterations? Again, I find that unlikely, but in the meantime I'm starting 
to run out of ideas as to how the client and server can speak with each 
other without leaving any traces of it (I hope you really used exactly the 
tcpdump command you gave in the mail linking to those stall logs).

> 	Anyway I'm dropping eth1 interface. I'll wait a few days before
> confirming if that's the problem or not.
> 
> 	Thanks again.

There's some NAT somewhere, btw, because the client uses 192.168.0.2 as 
its source address, but I don't think that currently has any relevance (it 
might time out some connections after an idle period, which is usually
of configurable length).


-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-27 21:25                                                                                                                                   ` Ilpo Järvinen
@ 2008-08-27 21:42                                                                                                                                     ` Dâniel Fraga
  2008-08-27 22:24                                                                                                                                       ` Dâniel Fraga
  0 siblings, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-27 21:42 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Thu, 28 Aug 2008 00:25:31 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> Sure, but that still doesn't explain at all why the expected traffic 
> doesn't show up anywhere (i.e., the working mail after the nmap resolution). 
> Such a router cannot prevent us from tcpdumping both ends. My point is that 
> we would always see the packet at least in the sending end's tcpdump. And 
> if we had the traffic from the end-point tcpdumps, we could trivially 
> figure out what the middlebox did to the traffic.

	Ok.

> Agreed, it shouldn't happen. Do you have a static setup for the IPs or is 
> there a DHCP component which could in theory cause some routing table 

	Static setup. But there were traces of a multipath config I
used before and completely forgot (below).

> alterations? Again, I find that unlikely, but in the meantime I'm starting 
> to run out of ideas as to how the client and server can speak with each 
> other without leaving any traces of it (I hope you really used exactly the 
> tcpdump command you gave in the mail linking to those stall logs).

	Yes, you can believe me. Anyway, I disabled the eth1 interface.
And there's more! We never had 2 links on this server, although I was
always prepared to use multipath and ip route policy (since the plan
was to have 2 links). I had 2 sets of commands which I suspect could be
messing everything up:

#ip rule add from ${link0} lookup 1
#ip route add 0/0 via ${gw0} table 1

#ip rule add from ${link1} lookup 2
#ip route add 0/0 via ${gw1} table 2

	I deleted all of this. I would only use it if I had two links from
two ISPs and decided to load balance between them. As that isn't the
case...
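
For anyone checking for the same kind of leftover, here is a rough sketch
that filters "ip rule show" output for rules other than the three defaults.
The function name custom_rules, the sample text and its addresses are made
up for illustration; the sample only assumes the usual
"priority: selector lookup table" layout. A custom rule that points at the
main table would slip through this coarse filter.

```shell
# Print only the policy rules whose table is not local, main or default.
custom_rules() {
    # stdin = text in the format printed by "ip rule show"
    awk '$NF != "local" && $NF != "main" && $NF != "default"'
}

# Hypothetical sample: two leftover rules between the default ones.
sample='0:	from all lookup local
32764:	from 198.51.100.10 lookup 2
32765:	from 192.0.2.10 lookup 1
32766:	from all lookup main
32767:	from all lookup default'

printf '%s\n' "$sample" | custom_rules
```

On a live system the input would come from "ip rule show" itself, piped
into custom_rules.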

> There's some NAT somewhere, btw, because the client uses 192.168.0.2 as 
> its source address, but I don't think that currently has any relevance (it 
> might time out some connections after an idle period, which is usually
> of configurable length).

	Yes, on my client machine, I'm behind a D-Link 524 router.
Anyway, give me some days. I'll test with these new changes for a week
at least before confirming the problem is solved (with both frto
and htb enabled).

	If it's really this old multipath misconfiguration, I apologize
for my error and the lengthy discussion. Anyway, it's good to have it on
record, in case someone makes the same mistake as me.

	Well, let's wait to see if the problem has gone.

	Thank you very much.

-- 
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-27 21:42                                                                                                                                     ` Dâniel Fraga
@ 2008-08-27 22:24                                                                                                                                       ` Dâniel Fraga
  0 siblings, 0 replies; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-27 22:24 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Wed, 27 Aug 2008 18:42:05 -0300
Dâniel Fraga <fragabr@gmail.com> wrote:

> 	If it's really this old multipath misconfiguration, I apologize
> for my error and the lengthy discussion. Anyway, it's good to have it on
> record, in case someone makes the same mistake as me.
> 
> 	Well, let's wait to see if the problem has gone.
> 
> 	Thank you very much.

	Well, you can ignore my previous message. I just got a stalled
connection. Since I do not have time to collect data now, I need to use
nmap to re-establish the connection.

	Ok. So it has nothing to do with multipath.

-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-26 20:40                                                                                                                       ` Ilpo Järvinen
  2008-08-26 21:17                                                                                                                         ` Dâniel Fraga
@ 2008-08-28 21:49                                                                                                                         ` Dâniel Fraga
  2008-08-29 13:07                                                                                                                           ` Ilpo Järvinen
  1 sibling, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-28 21:49 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Tue, 26 Aug 2008 23:40:58 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> while [ : ]; do
> 	date "+%s.%N"
> 	cat /proc/net/{netstat,snmp,tcp}
> 	sleep 1
> done

	Ok; Let's try again, now with more data (I hope):

	1) tcpdump (just port 995):

http://www.abusar.org/stall/dump-client
http://www.abusar.org/stall/dump-server

http://www.abusar.org/stall/dump-server-loopback

	I don't know if loopback is useful, just in case...

	2) the above script :

http://www.abusar.org/stall/script-client-log.txt
http://www.abusar.org/stall/script-server-log.txt

	3) and strace from the client Claws Mail:

http://www.abusar.org/stall/strace-client-claws-mail.txt

	I forgot to use the -r option in strace. But when Claws-mail
stalls, it gives the following multiple times:

read(4, 0xf48704, 4096)                 = -1 EAGAIN (Resource temporarily unavailable)
select(5, [4], [4], NULL, NULL)         = 1 (out [4])
writev(4, [{"xxxxxxxxxxxxxxx"..., 108}], 1) = 108
select(5, [4], [], NULL, NULL)          = 1 (in [4])
read(4, "xxxxxxxxxxxxxxxxxxxxxx"..., 4096) = 108

	Well, I hope this time you have more information and I hope I didn't forget anything. 
If not, let's keep trying.

	Important: these data were collected with frto disabled (0) and htb disabled too. So
it isn't related to frto, nor to htb.

	Thank you!

-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-28 21:49                                                                                                                         ` Dâniel Fraga
@ 2008-08-29 13:07                                                                                                                           ` Ilpo Järvinen
  2008-08-29 17:41                                                                                                                             ` Dâniel Fraga
  2008-08-30  6:56                                                                                                                             ` Dâniel Fraga
  0 siblings, 2 replies; 116+ messages in thread
From: Ilpo Järvinen @ 2008-08-29 13:07 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Thu, 28 Aug 2008, Dâniel Fraga wrote:

> Well, I hope this time you have more information and I hope I 
> didn't forget anything. If not, let's keep trying.

Thanks. It took me a moment to analyze such a sheer amount of data, 
but I'm used to large logs... :-)

Can you check during a "normal" time whether ListenOverflows grows at as 
considerable a rate as during the stall (no need to send that log to me;
just confirming that it doesn't is enough). A little cheat to do that 
for a logfile (the command I used):

grep -A1 "ListenOverflows" <log> | cut -d ' ' -f 21-22 | grep [0-9]
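
The same check can be sketched so that it looks the counter up by name
rather than by column position, which would keep working if the field order
differs between kernels. This assumes the log holds /proc/net/netstat output
in its usual layout (a "TcpExt:" header line followed by a "TcpExt:" value
line); the function name tcpext_counter and the sample values are made up
for illustration.

```shell
# Print one TcpExt counter, looked up by name, from /proc/net/netstat-style text.
tcpext_counter() {
    # $1 = counter name; stdin = header line followed by value line
    awk -v name="$1" '
        $1 == "TcpExt:" && !have_header {      # header: remember each column
            for (i = 2; i <= NF; i++) col[$i] = i
            have_header = 1
            next
        }
        $1 == "TcpExt:" && have_header {       # values: print the wanted one
            if (name in col) print $(col[name])
            have_header = 0
        }
    '
}

# Made-up sample in the same two-line layout.
sample='TcpExt: SyncookiesSent ListenOverflows ListenDrops
TcpExt: 0 10953 10953'

printf '%s\n' "$sample" | tcpext_counter ListenOverflows
```

Fed a log with one snapshot per second, this prints one value per snapshot,
so a growing counter shows up as a rising column of numbers.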

> Important: these data were collected with frto disabled (0) and htb 
> disabled too. So it isn't related to frto, nor to htb.

I kind of assumed/knew that since the htb patch didn't solve it.

...When you use nmap to resolve, is the time always constant or do you run 
it until the situation resolves?

There are constantly 9 items in sk_ack_backlog (i.e., connections which 
have not yet been accept()ed); those connections are in TCP_CLOSE_WAIT. 
Then there are ~7 connections hanging in SYN_RECV which cannot make 
progress (all of them from a single address, besides two flows of yours 
in SYN_RECV).

So I guess that the configured 128 is not related to the number that 
is given to the listen() syscall, as it seems to be 9.
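
A tally like the one above can be sketched from the logged /proc/net/tcp
output. The sketch assumes the usual row layout (slot, local address as
HEXIP:HEXPORT, remote address, state as a hex code, where 03 is SYN_RECV,
08 is CLOSE_WAIT and 0A is LISTEN); the function name count_states and the
sample rows are made up for illustration.

```shell
# Count sockets per TCP state for one local port, from /proc/net/tcp-style text.
count_states() {
    # $1 = local port in decimal; stdin = text in /proc/net/tcp format
    port_hex=$(printf '%04X' "$1")
    awk -v port="$port_hex" '
        BEGIN {
            name["01"] = "ESTABLISHED"; name["03"] = "SYN_RECV"
            name["08"] = "CLOSE_WAIT";  name["0A"] = "LISTEN"
        }
        $1 ~ /^[0-9]+:$/ {                 # data rows start with "N:"
            n = split($2, addr, ":")       # local_address is HEXIP:HEXPORT
            if (toupper(addr[n]) != port) next
            st = toupper($4)
            if (st in name) st = name[st]
            count[st]++
        }
        END { for (s in count) print s, count[s] }
    ' | sort
}

# Made-up sample: port 995 is 03E3 in hex, port 80 is 0050.
sample='  sl  local_address rem_address   st tx_queue rx_queue
   0: 00000000:03E3 00000000:0000 0A 00000000:00000000
   1: 7A1226BD:03E3 0100A8C0:C001 03 00000000:00000000
   2: 7A1226BD:03E3 0100A8C0:C002 08 00000000:00000000
   3: 7A1226BD:03E3 0100A8C0:C003 08 00000000:00000000
   4: 7A1226BD:0050 0100A8C0:C004 01 00000000:00000000'

printf '%s\n' "$sample" | count_states 995
```

For the sample this prints one line per state seen on port 995; the
port 80 row is ignored.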

...Next we need to find out why dovecot is not accept()ing or is doing 
that dead slow (the client's state is hardly significant, so I guess 
it's no longer mandatory to collect it every time)...

Can you provide these so I can familiarize myself a bit with the server's 
environment (no need to wait for the stall):

ps ax | grep dovecot  (or whatever the process is named)
netstat -p -n -l | grep "995"

But you'll mostly have to resort to strace during the stall. I recommend 
tracing just a subset of the syscalls, e.g., at least these:

strace -e trace=accept,listen,close,shutdown,select

...as it would probably not be wise to make a full dump available (since it 
would contain every syscall). Alternatively, you can create one full dump 
for yourself and just grep the relevant parts. You may need to strace
more than one process (all dovecot related).
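
A sketch of how the command line for several processes could be put
together: strace accepts -p more than once, -f follows children and -tt
adds timestamps. The helper name build_strace_cmd is made up, and it only
prints the command instead of running it; the PIDs in the example are
placeholders.

```shell
# Build (but do not run) an strace command that attaches to several PIDs.
build_strace_cmd() {
    # Arguments: the PIDs to attach to
    cmd='strace -f -tt -e trace=accept,listen,close,shutdown,select'
    for pid in "$@"; do
        cmd="$cmd -p $pid"
    done
    printf '%s\n' "$cmd"
}

# Placeholder PIDs; on the server they could come from "pgrep dovecot".
build_strace_cmd 2361 2363
```

Adding -o with a file name to the printed command would send the trace to
a file for grepping afterwards.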


-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-29 13:07                                                                                                                           ` Ilpo Järvinen
@ 2008-08-29 17:41                                                                                                                             ` Dâniel Fraga
  2008-09-01  7:11                                                                                                                               ` Ilpo Järvinen
  2008-08-30  6:56                                                                                                                             ` Dâniel Fraga
  1 sibling, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-29 17:41 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Fri, 29 Aug 2008 16:07:04 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> Can you check during a "normal" time whether ListenOverflows grows at as 
> considerable a rate as during the stall (no need to send that log to me;
> just confirming that it doesn't is enough). A little cheat to do that 
> for a logfile (the command I used):
> 
> grep -A1 "ListenOverflows" <log> | cut -d ' ' -f 21-22 | grep [0-9]

	It does not grow:

10953 10953
10953 10953
10953 10953
10953 10953
10953 10953
10953 10953
10953 10953
10953 10953
10953 10953
10953 10953
10953 10953
10953 10953
10953 10953
10953 10953
10953 10953
10953 10953
10953 10953
10953 10953
10953 10953
10953 10953
10953 10953
10953 10953

	It stays in this value for a long time.

> ...When you use nmap to resolve, is the time always constant or do you run 
> it until the situation resolves?

	The time is constant. It takes just 3 seconds for nmap to
"solve" the problem. I always have to use Ctrl+C to stop nmap before it
completes the scan, because within the first 3 seconds the problem is
"solved".

> There are constantly 9 items in sk_ack_backlog (i.e., connections which 
> have not yet been accept()ed); those connections are in TCP_CLOSE_WAIT. 
> Then there are ~7 connections hanging in SYN_RECV which cannot make 
> progress (all of them from a single address, besides two flows of yours 
> in SYN_RECV).
> 
> So I guess that the configured 128 is not related to the number that 
> is given to the listen() syscall, as it seems to be 9.
> 
> ...Next we need to find out why dovecot is not accept()ing or is doing 
> that dead slow (the client's state is hardly significant, so I guess 
> it's no longer mandatory to collect it every time)...

	Would it be useful if I do the same for port 119? Because inn
(nntp) stalls too. And proftp too. So I'm sure it isn't related to
dovecot, otherwise the other services wouldn't stall too.

> Can you provide these so I can familiarize myself a bit with the server's 
> environment (no need to wait for the stall):
> 
> ps ax | grep dovecot  (or whatever the process is named)

fraga@teleporto ~$ ps ax|grep dovecot
 2361 ?        Ss     0:13 /usr/local/sbin/dovecot
 2363 ?        S      0:07 dovecot-auth
 4751 ?        S      0:00 dovecot-auth -w
 6133 ?        S      0:00 dovecot-auth -w
 6134 ?        S      0:00 dovecot-auth -w
15963 ?        S      0:00 dovecot-auth -w

	The dovecot-auth I use for postfix too.

> netstat -p -n -l | grep "995"

fraga@teleporto ~$ sudo netstat -p -n -l | grep "995"
Password:
tcp        0      0 0.0.0.0:995             0.0.0.0:*       LISTEN      2361/dovecot        

> But you'll mostly have to resort to strace during the stall. I recommend 
> tracing just a subset of the syscalls, e.g., at least these:
> 
> strace -e trace=accept,listen,close,shutdown,select
> 
> ...as it would probably not be wise to make a full dump available (since it 
> would contain every syscall). Alternatively, you can create one full dump 
> for yourself and just grep the relevant parts. You may need to strace
> more than one process (all dovecot related).
	
	Ok, at next stall I'll do that.

	Maybe it's good to strace inn and proftp too, right?

	Don't you think it's interesting that http (apache) and ssh never stall?

-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-29 13:07                                                                                                                           ` Ilpo Järvinen
  2008-08-29 17:41                                                                                                                             ` Dâniel Fraga
@ 2008-08-30  6:56                                                                                                                             ` Dâniel Fraga
  2008-09-01  7:11                                                                                                                               ` Ilpo Järvinen
  1 sibling, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-08-30  6:56 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Fri, 29 Aug 2008 16:07:04 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> ...as it would probably not be wise to make a full dump available (since it 
> would contain every syscall). Alternatively, you can create one full dump 
> for yourself and just grep the relevant parts. You may need to strace
> more than one process (all dovecot related).

	While waiting for a stall, I was thinking: is there any
chance it could be a bug generated by gcc 4.3? I saw that gcc 4.3.0
was released just after 2.6.24 and before 2.6.25...

	I was using gcc 4.3.1 and now 4.3.2... but maybe I could try going
back to gcc 4.2.4 to test...

	Which version of gcc are you developers using?

-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-29 17:41                                                                                                                             ` Dâniel Fraga
@ 2008-09-01  7:11                                                                                                                               ` Ilpo Järvinen
  0 siblings, 0 replies; 116+ messages in thread
From: Ilpo Järvinen @ 2008-09-01  7:11 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Fri, 29 Aug 2008, Dâniel Fraga wrote:

> On Fri, 29 Aug 2008 16:07:04 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > Can you check during a "normal" time whether ListenOverflows grows at as 
> > considerable a rate as during the stall (no need to send that log to me;
> > just confirming that it doesn't is enough). A little cheat to do that 
> > for a logfile (the command I used):
> > 
> > grep -A1 "ListenOverflows" <log> | cut -d ' ' -f 21-22 | grep [0-9]
> 
> 	It does not grow:
> 
> 10953 10953

...snip...

> 	It stays in this value for a long time.

Yeah, a constant one is expected. During the stall it was growing sharply.

> > ...When you use nmap to resolve, is the time always constant or do you run 
> > it until the situation resolves?
> 
> 	The time is constant. It takes just 3 seconds for nmap to
> "solve" the problem. I always have to use Ctrl+C to stop nmap before it
> completes the scan, because within the first 3 seconds the problem is
> "solved".

Thanks (though I hoped the other way around :-)).

> > There are constantly 9 items in sk_ack_backlog (i.e., connections which 
> > have not yet been accept()ed); those connections are in TCP_CLOSE_WAIT. 
> > Then there are ~7 connections hanging in SYN_RECV which cannot make 
> > progress (all of them from a single address, besides two flows of yours 
> > in SYN_RECV).
> > 
> > So I guess that the configured 128 is not related to the number that 
> > is given to the listen() syscall, as it seems to be 9.
> > 
> > ...Next we need to find out why dovecot is not accept()ing or is doing 
> > that dead slow (the client's state is hardly significant, so I guess 
> > it's no longer mandatory to collect it every time)...
> 
> 	Would it be useful if I do the same for port 119? Because inn
> (nntp) stalls too. And proftp too. So I'm sure it isn't related to
> dovecot, otherwise the other services wouldn't stall too.

Sure. Whichever of them you feel is the best choice, but I doubt there's 
much benefit in doing that for many at the same time. Once we find out 
what is happening for one, the others will be the same.

ftp is problematic to tcpdump. Nntp should be fine I guess.

> > Can you provide these so I can familiarize myself a bit with the server's 
> > environment (no need to wait for the stall):
> > 
> > ps ax | grep dovecot  (or whatever the process is named)
> 
> fraga@teleporto ~$ ps ax|grep dovecot
>  2361 ?        Ss     0:13 /usr/local/sbin/dovecot
>  2363 ?        S      0:07 dovecot-auth
>  4751 ?        S      0:00 dovecot-auth -w
>  6133 ?        S      0:00 dovecot-auth -w
>  6134 ?        S      0:00 dovecot-auth -w
> 15963 ?        S      0:00 dovecot-auth -w
> 
> 	The dovecot-auth I use for postfix too.
> 
> > netstat -p -n -l | grep "995"
> 
> fraga@teleporto ~$ sudo netstat -p -n -l | grep "995"
> Password:
> tcp        0      0 0.0.0.0:995             0.0.0.0:*       LISTEN      2361/dovecot        
> 
> > But you'll mostly have to resort to strace during the stall. I recommend 
> > tracing just a subset of the syscalls, e.g., at least these:
> > 
> > strace -e trace=accept,listen,close,shutdown,select
> > 
> > ...as it would probably not be wise to make a full dump available (since it 
> > would contain every syscall). Alternatively, you can create one full dump 
> > for yourself and just grep the relevant parts. You may need to strace
> > more than one process (all dovecot related).
> 	
> 	Ok, at next stall I'll do that.
> 
> 	Maybe it's good to strace inn and proftp too, right?

I'm fine either way. Basically we just want to find out where the server 
processes are waiting when the stall happens. If at least one of them was 
in accept() but never made progress, it's related to wakeup somehow; if not 
in accept(), well, let's reconsider then...

> Don't you think it's interesting that http (apache) and ssh never 
> stall?

It is interesting, yes... but do you have some idea how that would help
to solve the problem (I don't)? The only thing that I can think of is that 
it could be related to setsockopt()s they set differently.

-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-08-30  6:56                                                                                                                             ` Dâniel Fraga
@ 2008-09-01  7:11                                                                                                                               ` Ilpo Järvinen
  2008-09-07  8:17                                                                                                                                 ` Dâniel Fraga
  0 siblings, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-09-01  7:11 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Sat, 30 Aug 2008, Dâniel Fraga wrote:

> On Fri, 29 Aug 2008 16:07:04 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > ...as it would probably not be wise to make a full dump available (since it 
> > would contain every syscall). Alternatively, you can create one full dump 
> > for yourself and just grep the relevant parts. You may need to strace
> > more than one process (all dovecot related).
> 
> 	While waiting for a stall, I was thinking: is there any
> chance it could be a bug generated by gcc 4.3? I saw that gcc 4.3.0
> was released just after 2.6.24 and before 2.6.25...
> 
> 	I was using gcc 4.3.1 and now 4.3.2... but maybe I could try going
> back to gcc 4.2.4 to test...

That's one option. If you do that, you could try catching two flies at the 
same time by selecting something else than tickless.

> 	Which version of gcc are you developers using?

I guess that on x86 most use some recent/semi-recent version by default, 
but there are some with old ones as well, while the non-x86 archs tend 
more often to have somewhat older gccs, I guess.

Anyway, if gcc did something wrong, it is still mostly correct, ie., 
there's just some race (which is likely non-corrupting even). And hitting 
that might not be very easy for some of the devs.

-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-09-01  7:11                                                                                                                               ` Ilpo Järvinen
@ 2008-09-07  8:17                                                                                                                                 ` Dâniel Fraga
  2008-09-08 10:27                                                                                                                                   ` Ilpo Järvinen
  0 siblings, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-09-07  8:17 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Mon, 1 Sep 2008 10:11:25 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> That's one option. If you do that, you could try catching two flies at the 
> same time by selecting something else than tickless.

	Hi Ilpo. I *think* I discovered the source of the problem. It's
not related to gcc, nor to dynticks.

	I'm almost sure it's related to *High Resolution Timer*. I
simply disabled this option and the problems disappeared.

	I'd like to ask you if that makes sense, based on the
problem we've been discussing over these weeks. What's your opinion?

	Thank you.

-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-09-07  8:17                                                                                                                                 ` Dâniel Fraga
@ 2008-09-08 10:27                                                                                                                                   ` Ilpo Järvinen
  2008-09-08 20:20                                                                                                                                     ` Dâniel Fraga
  0 siblings, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-09-08 10:27 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Sun, 7 Sep 2008, Dâniel Fraga wrote:

> On Mon, 1 Sep 2008 10:11:25 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > That's one option. If you do that, you could try catching two flies at the 
> > same time by selecting something else than tickless.
> 
> 	Hi Ilpo. I *think* I discovered the source of the problem. It's
> not related to gcc, nor to dynticks.
> 
> 	I'm almost sure it's related to *High Resolution Timer*. I
> simply disabled this option and the problems disappeared.
> 
> 	I'd like to ask you if that makes sense, based on the
> problem we've been discussing over these weeks. What's your opinion?

It could well be possible, accept() seems to call schedule_timeout() if 
nothing is immediately available (but I don't know well enough what 
ends up being hrtimer'ed when you enable them and what does not)... 
Anyway, how long did you test to confirm it?

Does this explain the 2.6.24->2.6.25 change in behavior as well (i.e., 
did they get enabled there)?

-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-09-08 10:27                                                                                                                                   ` Ilpo Järvinen
@ 2008-09-08 20:20                                                                                                                                     ` Dâniel Fraga
  2008-09-11 13:44                                                                                                                                       ` Ilpo Järvinen
  0 siblings, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-09-08 20:20 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Mon, 8 Sep 2008 13:27:43 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> It could well be possible, accept() seems to call schedule_timeout() if 
> nothing is immediately available (but I don't know well enough what 
> ends up being hrtimer'ed when you enable them and what does not)... 
> Anyway, how long did you test to confirm it?

	It has been five days since I disabled the high resolution timer and
I have not had any problems since.

> Does this explain the 2.6.24->2.6.25 change in behavior as well (i.e., 
> did they get enabled there)?

	Well, as far as I know, what changed in 2.6.25 regarding the high
res timer is the following:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=8f4d37ec073c17e2d4aa8851df5837d798606d6f

	(although if I disable dynticks, the problem persists)

	Although in 2.6.24 we already had, for example, the x86-32/64
arch reunification, and I don't know if it has anything to do with my
problem in 2.6.25... just some thoughts... I wrote that because the
problem doesn't happen on 32 bit machines, but only on x86_64...

	Of course I'm not saying for sure that the high res timer is
causing this. Maybe, as you said before, the problem is much more
complex, and I really don't know what in fact uses the high res timer.

	Anyway, I'll leave high res timer disabled for now until we
discover something new.
	
-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-09-08 20:20                                                                                                                                     ` Dâniel Fraga
@ 2008-09-11 13:44                                                                                                                                       ` Ilpo Järvinen
  2008-09-11 17:30                                                                                                                                         ` Dâniel Fraga
  2008-09-11 18:12                                                                                                                                         ` Dâniel Fraga
  0 siblings, 2 replies; 116+ messages in thread
From: Ilpo Järvinen @ 2008-09-11 13:44 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Mon, 8 Sep 2008, Dâniel Fraga wrote:

> On Mon, 8 Sep 2008 13:27:43 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > It could well be possible, accept seems to call schedule_timeout if 
> > nothing is immediately available (but I don't know well enough what 
> > end up being hrtimer'ed when you enable them and what will not)... 
> > Anyway, how long did you test for that to confirm it?
> 
> 	It has been five days since I disable high resolution timer and
> have not got any problems anymore.
> 
> > Does this explain the 2.6.24->2.6.25 change in behavior as well (ie., 
> > did they got enabled there)?
> 
> 	Well, as far as I know, high res timer related, what changed
> in 2.6.25 is the following:
> 
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=8f4d37ec073c17e2d4aa8851df5837d798606d6f
> 
> 	(although if I disable dynticks, the problem persists)
> 
> 	Although in 2.6.24 we already had, for example the x86-32/64
> arch reunification and I don't know if it has anything to do with my
> problem in 2.6.25... just some thoughts... I wrote that because the
> problem doesn't happen in 32 bit machines, but only in x86_64...
>
> 	Of course I'm not saying for sure that the high res timer is
> causing this. Maybe, as you said before, the problem is much more
> complex and realy don't know what in fact uses high res timer.
> 
> 	Anyway, I'll leave high res timer disabled for now until we
> discover something new.

...I guess it would be possible to remove SCHED_FEAT_HRTICK from
/proc/sys/kernel/sched_features then while keeping the hrtimers
otherwise enabled to test this.

It's possible that hrtimers just affect how easy it is to trigger,
but at least it seems a useful lead until proven otherwise.
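
The suggestion above amounts to clearing one bit in a feature bitmask. A minimal sketch of that operation follows, assuming the file holds a single integer mask on the kernel in question. The bit index used for SCHED_FEAT_HRTICK is a placeholder (an assumption, not taken from this thread) and would have to be looked up in the scheduler source of the running kernel.

```python
from pathlib import Path

# Path as given in the mail above; it is only present on kernels built with
# scheduler debugging and may live elsewhere on other kernel versions.
SCHED_FEATURES = Path("/proc/sys/kernel/sched_features")

# ASSUMPTION: placeholder bit index for SCHED_FEAT_HRTICK. The real position
# depends on the kernel version and must be read from the scheduler source.
HRTICK_BIT = 5


def clear_feature(mask: int, bit: int) -> int:
    """Return mask with the given feature bit cleared."""
    return mask & ~(1 << bit)


def disable_hrtick(path: Path = SCHED_FEATURES, bit: int = HRTICK_BIT) -> int:
    """Read the feature mask, clear one bit, write it back (needs root)."""
    old = int(path.read_text().strip())
    new = clear_feature(old, bit)
    path.write_text(f"{new}\n")
    return new
```

Only `clear_feature` is safe to try anywhere; `disable_hrtick` changes scheduler behaviour and should be run only on a test machine.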


-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-09-11 13:44                                                                                                                                       ` Ilpo Järvinen
@ 2008-09-11 17:30                                                                                                                                         ` Dâniel Fraga
  2008-09-12 10:16                                                                                                                                           ` Ilpo Järvinen
  2008-09-11 18:12                                                                                                                                         ` Dâniel Fraga
  1 sibling, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-09-11 17:30 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Thu, 11 Sep 2008 16:44:20 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> ...I guess it would be possible to remove SCHED_FEAT_HRTICK from
> /proc/sys/kernel/sched_features then while keeping the hrtimers
> otherwise enabled to test this.
> 
> It's possible that hrtimers just affect on how easy it is to trigger
> but at least it seems an useful lead until proven otherwise.

	You're right Ilpo. After days and days without the problem,
today it triggered (but I wasn't online at the time, so I couldn't grab
any data).

	So, you're correct. HRtimers just affect how easy it is to
trigger the issue. In other words: with high resolution timers enabled,
the problem appears more frequently.

	If we could at least find a way to trigger this, we could
test it more easily. The problem is having to wait a long time for it
to happen.

	Just out of curiosity: on your servers, do you use x86_64? It seems
this problem is very specific to x86_64, or appears more often on x86_64
than on x86_32. It never happens on my 32-bit x86 servers.

	


-- 
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-09-11 13:44                                                                                                                                       ` Ilpo Järvinen
  2008-09-11 17:30                                                                                                                                         ` Dâniel Fraga
@ 2008-09-11 18:12                                                                                                                                         ` Dâniel Fraga
  1 sibling, 0 replies; 116+ messages in thread
From: Dâniel Fraga @ 2008-09-11 18:12 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Thu, 11 Sep 2008 16:44:20 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> ...I guess it would be possible to remove SCHED_FEAT_HRTICK from
> /proc/sys/kernel/sched_features then while keeping the hrtimers
> otherwise enabled to test this.
> 
> It's possible that hrtimers just affect on how easy it is to trigger
> but at least it seems an useful lead until proven otherwise.

	Well, I have a new suspect now: ntpd. It seems that when ntpd
syncs the clock, the problem happens (just a guess):

Sep 11 13:55:31 tux ntpd[2652]: synchronized to 143.107.255.15, stratum 2 
Sep 11 13:55:31 tux ntpd[2652]: kernel time sync enabled 0001

	I disabled ntpd (I'll just sync the clock with ntpdate once
at boot) and will see what happens. I think the problem could be related
to this, since "sudo" is affected too and, as far as I know, sudo is very
sensitive to timing.


-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-09-11 17:30                                                                                                                                         ` Dâniel Fraga
@ 2008-09-12 10:16                                                                                                                                           ` Ilpo Järvinen
  2008-09-13 23:31                                                                                                                                             ` Dâniel Fraga
  2008-09-15 19:42                                                                                                                                             ` Dâniel Fraga
  0 siblings, 2 replies; 116+ messages in thread
From: Ilpo Järvinen @ 2008-09-12 10:16 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Thu, 11 Sep 2008, Dâniel Fraga wrote:

> On Thu, 11 Sep 2008 16:44:20 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > ...I guess it would be possible to remove SCHED_FEAT_HRTICK from
> > /proc/sys/kernel/sched_features then while keeping the hrtimers
> > otherwise enabled to test this.
> > 
> > It's possible that hrtimers just affect on how easy it is to trigger
> > but at least it seems an useful lead until proven otherwise.
> 
> 	You're right Ilpo. After days and days without the problem,
> today it triggered (but I wasn't online at the time, so I couldn't grab
> any data).

Thanks. Once we know what the userspace at the server is doing, it might 
make the problem immediately obvious, though I'm a bit afraid that e.g., 
strace might interfere with the problem so that it resolves right away and 
we're again left with nothing...

> 	So, you're correct. HRtimers just affect on how easy it is to
> trigger the issue. In other words: with high resolution timer enabled,
> the problem appears more frequently.
> 
> 	At least if we discovered a way how to trigger this, we could
> test it more easily. The problem is to wait a long time for it to
> happen.
> 
> 	Just a curiosity: on your servers,

I don't really have anything I would call a "server" in the sense you mean. I
might occasionally set one up for a test for a very limited period, but
normally it's just ssh and some other services which I use so rarely that
I'd hardly notice, and that's it. I was planning, however, to set up a
distcc stress test some day using all my spare cpu cycles (I'd like to put
it under kvm, but that got stalled due to some timing issue at the guest
making it go into an infinite loop). Once I get that working I could
probably easily put other test-only stuff into that framework as well.

But there are other people around the world besides us :-), and
afaict this is the only (outstanding) report which relates to accept()
ceasing to work, so I doubt it's a very regularly occurring thing or we
would have heard of it.

> do you use x86_64?

At least on some machines, but as you have discovered, it seems to be
service dependent, so that some processes never get stuck; I might only
run such ones, who knows...

> It seems
> this problem is very specific to x86_64 or appear more often on x86_64
> than x86_32. It never happens on my x86_32 bit servers.

Ok.

-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-09-12 10:16                                                                                                                                           ` Ilpo Järvinen
@ 2008-09-13 23:31                                                                                                                                             ` Dâniel Fraga
  2008-09-16 12:10                                                                                                                                               ` Ilpo Järvinen
  2008-09-15 19:42                                                                                                                                             ` Dâniel Fraga
  1 sibling, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-09-13 23:31 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Fri, 12 Sep 2008 13:16:19 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> Ok.

	Ilpo, except for the DROP [INPUT] lines, does the log below mean something to you?

Sep 13 20:01:21 teleporto vmunix: DROP [INPUT]: IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:00:50:73:6c:4b:e5:08:00 SRC=189.31.180.2 DST=255.255.255.255 LEN=48 TOS=0x00 PREC=0x00 TTL=116 ID=20607 DF PROTO=TCP SPT=4038 DPT=4899 WINDOW=65535 RES=0x00 SYN URGP=0 
Sep 13 20:01:21 teleporto vmunix: DROP [INPUT]: IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:00:50:73:6c:4b:e5:08:00 SRC=189.31.180.2 DST=255.255.255.255 LEN=48 TOS=0x00 PREC=0x00 TTL=116 ID=20616 DF PROTO=TCP SPT=4054 DPT=4899 WINDOW=65535 RES=0x00 SYN URGP=0 
Sep 13 20:01:21 teleporto vmunix: C193.8. S=5.5.5.5 E=8TS00 RC00 T=1 D262DF PROTO=TCP SPT=4146 DPT=4899 WINDOW=65535 RES=0x00 SYN URGP=0 
Sep 13 20:01:22 teleporto vmunix: DROP [INPUT]: IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:00:50:73:6c:4b:e5:08:00 SRC=189.31.180.2 DST=255.255.255.255 LEN=48 TOS=0x00 PREC=0x00 TTL=116 ID=20690 DF PROTO=TCP SPT=4179 DPT=4899 WINDOW=65535 RES=0x00 SYN URGP=0 
Sep 13 20:01:22 teleporto vmunix: DROP [INPUT]: IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:00:50:73:6c:4b:e5:08:00 SRC=189.31.180.2 DST=255.255.255.255 LEN=48 TOS=0x00 PREC=0x00 TTL=116 ID=20698 DF PROTO=TCP SPT=4201 DPT=4899 WINDOW=65535 RES=0x00 SYN URGP=0 
Sep 13 20:01:22 teleporto vmunix: DROP [INPUT]: IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:00:50:73:6c:4b:e5:08:00 SRC=189.31.180.2 DST=255.255.255.255 LEN=48 TOS=0x00 PREC=0x00 TTL=116 ID=20707 DF PROTO=TCP SPT=4231 DPT=4899 WINDOW=65535 RES=0x00 SYN URGP=0 
Sep 13 20:01:22 teleporto vmunix: DROP [INPUT]: IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:00:50:73:6c:4b:e5:08:00 SRC=189.31.180.2 DST=255.255.255.255 LEN=48 TOS=0x00 PREC=0x00 TTL=116 ID=20725 DF PROTO=TCP SPT=4294 DPT=4899 WINDOW=65535 RES=0x00 SYN URGP=0 
Sep 13 20:01:22 teleporto vmunix: OOTPST45 P=89WNO=53 E=x0SNUG= 

	I mean these lines:

Sep 13 20:01:21 teleporto vmunix: C193.8. S=5.5.5.5 E=8TS00 RC00 T=1 D262DF PROTO=TCP SPT=4146 DPT=4899 WINDOW=65535 RES=0x00 SYN URGP=0 
Sep 13 20:01:22 teleporto vmunix: OOTPST45 P=89WNO=53 E=x0SNUG= 

	This was logged during a stall. I didn't collect more data because I had to restore the server as fast as I could.

	If it doesn't help or doesn't mean anything useful, please ignore.

-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-09-12 10:16                                                                                                                                           ` Ilpo Järvinen
  2008-09-13 23:31                                                                                                                                             ` Dâniel Fraga
@ 2008-09-15 19:42                                                                                                                                             ` Dâniel Fraga
  1 sibling, 0 replies; 116+ messages in thread
From: Dâniel Fraga @ 2008-09-15 19:42 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Fri, 12 Sep 2008 13:16:19 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> Ok.

Sep 15 09:22:38 teleporto vmunix: 600 RC00 T=4 D337POOIM YE3CD=3[R=93.812DT1210624LN4 O=x0PE=x0TL5 D0D RT=C NOPEE[ ye]]
Sep 15 09:53:49 teleporto vmunix: 6DO IPT:I=t0OT A=ff:ff:ff:05:36:b50:0SC13192522DT25252525LN4 O=x0PE=x0TL12I=
FPOOTPST51 P=31 IDW650RS00 C Y RP0

	Are these strange kernel messages normal, or is this clearly a bug? They are logged whenever the connection
is stalled.

	Anyway, I'm 50% almost sure that it's something related to ntpd adjusting time. I do not mean that the connection
stalls whenever ntpd syncs the time, but I need a few more weeks to confirm this.

	Basically without ntpd, everything is fine, but when ntpd is running, the stall happens. Maybe the kernel gets confused when 
ntpd changes the time? It shouldn't happen of course. I'll reply in a few weeks. Thanks.


-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-09-13 23:31                                                                                                                                             ` Dâniel Fraga
@ 2008-09-16 12:10                                                                                                                                               ` Ilpo Järvinen
  2008-09-16 14:24                                                                                                                                                 ` Dâniel Fraga
  0 siblings, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-09-16 12:10 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

I've copied stuff from the other mail to here... Sorry for the delay, I
had already looked into it but left it as postponed, and I've been busy
with other things...

On Sat, 13 Sep 2008, Dâniel Fraga wrote:

> 	Ilpo, except for DROP [INPUT] lines, does the log below means something to you?
> 
> Sep 13 20:01:21 teleporto vmunix: DROP [INPUT]: IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:00:50:73:6c:4b:e5:08:00 SRC=189.31.180.2 DST=255.255.255.255 LEN=48 TOS=0x00 PREC=0x00 TTL=116 ID=20607 DF PROTO=TCP SPT=4038 DPT=4899 WINDOW=65535 RES=0x00 SYN URGP=0 
> Sep 13 20:01:21 teleporto vmunix: DROP [INPUT]: IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:00:50:73:6c:4b:e5:08:00 SRC=189.31.180.2 DST=255.255.255.255 LEN=48 TOS=0x00 PREC=0x00 TTL=116 ID=20616 DF PROTO=TCP SPT=4054 DPT=4899 WINDOW=65535 RES=0x00 SYN URGP=0 
> Sep 13 20:01:21 teleporto vmunix: C193.8. S=5.5.5.5 E=8TS00 RC00 T=1 D262DF PROTO=TCP SPT=4146 DPT=4899 WINDOW=65535 RES=0x00 SYN URGP=0 
> Sep 13 20:01:22 teleporto vmunix: DROP [INPUT]: IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:00:50:73:6c:4b:e5:08:00 SRC=189.31.180.2 DST=255.255.255.255 LEN=48 TOS=0x00 PREC=0x00 TTL=116 ID=20690 DF PROTO=TCP SPT=4179 DPT=4899 WINDOW=65535 RES=0x00 SYN URGP=0 
> Sep 13 20:01:22 teleporto vmunix: DROP [INPUT]: IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:00:50:73:6c:4b:e5:08:00 SRC=189.31.180.2 DST=255.255.255.255 LEN=48 TOS=0x00 PREC=0x00 TTL=116 ID=20698 DF PROTO=TCP SPT=4201 DPT=4899 WINDOW=65535 RES=0x00 SYN URGP=0 
> Sep 13 20:01:22 teleporto vmunix: DROP [INPUT]: IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:00:50:73:6c:4b:e5:08:00 SRC=189.31.180.2 DST=255.255.255.255 LEN=48 TOS=0x00 PREC=0x00 TTL=116 ID=20707 DF PROTO=TCP SPT=4231 DPT=4899 WINDOW=65535 RES=0x00 SYN URGP=0 
> Sep 13 20:01:22 teleporto vmunix: DROP [INPUT]: IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:00:50:73:6c:4b:e5:08:00 SRC=189.31.180.2 DST=255.255.255.255 LEN=48 TOS=0x00 PREC=0x00 TTL=116 ID=20725 DF PROTO=TCP SPT=4294 DPT=4899 WINDOW=65535 RES=0x00 SYN URGP=0 
> Sep 13 20:01:22 teleporto vmunix: OOTPST45 P=89WNO=53 E=x0SNUG= 
> 
> 	I mean these lines:
> 
> SRC=189.31.180.2 DST=255.255.255.255 LEN=48 TOS=0x00 PREC=0x00 TTL=116 ID=20616
> C193.8. S=5.5.5.5 E=8TS00 RC00 T=1 D262

...Funny, it's printing every second character of a correct line. How
that can happen is something other people are much more qualified to
answer meaningfully...
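
The observation above can be checked mechanically. The sketch below keeps every second character of the intact fragment quoted above and compares it with the start of the garbled fragment. It stops the comparison where the packet ID begins, since the garbled line evidently came from a packet with a different ID. The helper name is made up for illustration.

```python
def every_second_char(text: str, start: int = 0) -> str:
    """Keep every second character of text, beginning at index start."""
    return text[start::2]


# Tail of an intact DROP line, as quoted above.
intact = ("SRC=189.31.180.2 DST=255.255.255.255 LEN=48 "
          "TOS=0x00 PREC=0x00 TTL=116 ID=20616")

# Start of the garbled line quoted above, cut where the packet ID digits
# (which differ from packet to packet) begin to diverge.
garbled_prefix = "C193.8. S=5.5.5.5 E=8TS00 RC00 T=1 D26"

decimated = every_second_char(intact)
print(decimated)
print(garbled_prefix in decimated)
```

This only confirms the pattern; it says nothing about which part of the logging path drops the characters.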

> Sep 13 20:01:22 teleporto vmunix: OOTPST45 P=89WNO=53 E=x0SNUG= 
> 
> This strange kernel messages are normal or is this clearly something
> buggy? This is logged whenever the connection is stalled.

I've no idea how they get generated.

>        Anyway, I'm 50% almost sure that it's something related to ntpd
> adjusting time. I do not mean that whenever ntpd syncs 
> the time, the connection is stalled but I need a few more weeks to 
> assure this.

How positive are you actually that it's exactly at that time? I.e., have
you really checked that the timing matches? As ntp syncs time every
now and then, I wouldn't be surprised if it happened "always" close
enough to give a false alarm.

"50% almost sure" didn't sound that convincing (whatever it means in the 
first place).

>        Basically without ntpd, everything is fine, but when ntpd is
> running, the stall happens. Maybe the kernel gets confused when 
> ntpd changes the time? It shouldn't happen of course. I'll reply in a 
> few weeks. Thanks.

The only thing I know to ask is: do you have any idea if your ntpd is
hard-stepping the time instead of adjusting the clock's rate a bit (the
latter should keep the clock monotonic, barring potential bugs)?
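
One way to tell a hard step from a rate adjustment from userspace is to watch the offset between the wall clock and a monotonic clock: a step moves only the wall clock, while a rate adjustment affects both alike. A minimal sketch, assuming Python's `time.monotonic()` (which is documented as unaffected by system clock updates); the 0.5 s threshold and the function names are arbitrary choices for illustration.

```python
import time


def wall_clock_offset() -> float:
    """Difference between wall-clock and monotonic time, in seconds."""
    return time.time() - time.monotonic()


def stepped(prev_offset: float, cur_offset: float, threshold: float = 0.5) -> bool:
    """True if the wall clock jumped relative to the monotonic clock."""
    return abs(cur_offset - prev_offset) > threshold


def watch(interval: float = 1.0, rounds: int = 10) -> None:
    """Poll the offset and report any jump larger than the threshold."""
    prev = wall_clock_offset()
    for _ in range(rounds):
        time.sleep(interval)
        cur = wall_clock_offset()
        if stepped(prev, cur):
            print(f"wall clock stepped by {cur - prev:+.3f} s")
        prev = cur
```

Running `watch()` with a large `rounds` value alongside ntpd would log any step together with its size, which could then be compared with the stall times.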

-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-09-16 12:10                                                                                                                                               ` Ilpo Järvinen
@ 2008-09-16 14:24                                                                                                                                                 ` Dâniel Fraga
  2008-09-17 10:23                                                                                                                                                   ` Ilpo Järvinen
  0 siblings, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-09-16 14:24 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Tue, 16 Sep 2008 15:10:33 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> Only thing I know to ask is, do you have any idea if your ntpd is 
> hard-stepping the time instead of adjusting the clock's rate a bit (the 
> latter should keep the clock monotonious besides potential bugs)?

	I assume it's adjusting the clock's rate a bit. Anyway, it's a
pretty simple config:

fraga@teleporto ~$ cat /etc/ntp.conf 
server ntp.usp.br
server ntp.nasa.gov
driftfile /etc/ntp.drift

	And ntpd is running without any special parameters.

	The log messages are as simple as:

Sep 15 03:56:04 teleporto ntpd[2301]: frequency initialized 5.891 PPM from /etc/ntp.drift
Sep 15 03:59:25 teleporto ntpd[2304]: frequency initialized 5.891 PPM from /etc/ntp.drift
Sep 15 04:03:49 teleporto ntpd[2304]: synchronized to 143.107.255.15, stratum 2
Sep 15 04:03:49 teleporto ntpd[2304]: kernel time sync status change 0001
Sep 15 04:10:16 teleporto ntpd[2304]: synchronized to 198.123.30.132, stratum 1
Sep 15 04:11:58 teleporto ntpd[2301]: frequency initialized 5.891 PPM from /etc/ntp.drift
Sep 15 04:16:18 teleporto ntpd[2301]: synchronized to 198.123.30.132, stratum 1
Sep 15 04:16:18 teleporto ntpd[2301]: kernel time sync status change 0001
Sep 15 12:08:53 teleporto ntpd[2301]: kernel time sync status change 4001
Sep 15 12:34:31 teleporto ntpd[2301]: kernel time sync status change 0001
Sep 15 14:34:06 teleporto ntpd[2301]: kernel time sync status change 4001
Sep 15 14:51:12 teleporto ntpd[2301]: kernel time sync status change 0001 
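
To check whether stalls really coincide with ntpd activity, the syslog timestamps can be compared directly. The sketch below takes a few of the ntpd timestamps from the excerpt above and, as a stand-in for a stall time (an assumption), the timestamp of one garbled kernel message reported earlier in the thread. The year is also assumed, since syslog omits it, and the excerpt may not list every ntpd event.

```python
from datetime import datetime


def parse_syslog(stamp: str, year: int = 2008) -> datetime:
    """Parse a 'Sep 15 12:08:53' style stamp; syslog omits the year."""
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def nearest_gap_seconds(event: datetime, others: list[datetime]) -> float:
    """Seconds between event and the closest timestamp in others."""
    return min(abs((event - o).total_seconds()) for o in others)


# ntpd log timestamps copied from the excerpt above.
ntpd_events = [parse_syslog(s) for s in (
    "Sep 15 04:16:18", "Sep 15 12:08:53", "Sep 15 12:34:31",
    "Sep 15 14:34:06", "Sep 15 14:51:12",
)]

# ASSUMPTION: a garbled kernel message timestamp used as the stall time.
stall = parse_syslog("Sep 15 09:53:49")

print(nearest_gap_seconds(stall, ntpd_events))
```

A small gap on every stall would support the ntpd theory; large or scattered gaps would point elsewhere.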

	If I understood correctly what you mean, ntpd adjusts the time gently so as not
to cause huge jumps in the time.

	And we're reaching the conclusion that the timer code in 2.6.25 and above
has something wrong which causes those stalls, since 2.6.24 and below is ok.

	But I'll wait some more time to confirm this, although I'm almost sure it's a timer
related bug which has the side effect of stalling connections.

	And just a question: do you use ntpd on your own desktop?
	
	Thank you!

-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-09-16 14:24                                                                                                                                                 ` Dâniel Fraga
@ 2008-09-17 10:23                                                                                                                                                   ` Ilpo Järvinen
  2008-09-18 20:35                                                                                                                                                     ` Dâniel Fraga
  0 siblings, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-09-17 10:23 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Tue, 16 Sep 2008, Dâniel Fraga wrote:

> On Tue, 16 Sep 2008 15:10:33 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > Only thing I know to ask is, do you have any idea if your ntpd is 
> > hard-stepping the time instead of adjusting the clock's rate a bit (the 
> > latter should keep the clock monotonious besides potential bugs)?
> 
> 	I assume it's adjusting the clock's rate a bit. Anyway, it's a
> pretty simple config:
> 
> fraga@teleporto ~$ cat /etc/ntp.conf 
> server ntp.usp.br
> server ntp.nasa.gov
> driftfile /etc/ntp.drift
> 
> 	And ntpd is running without any special parameters.
> 
> 	The log messages are as simple as:
> 
> Sep 15 03:56:04 teleporto ntpd[2301]: frequency initialized 5.891 PPM from /etc/ntp.drift
> Sep 15 03:59:25 teleporto ntpd[2304]: frequency initialized 5.891 PPM from /etc/ntp.drift
> Sep 15 04:03:49 teleporto ntpd[2304]: synchronized to 143.107.255.15, stratum 2
> Sep 15 04:03:49 teleporto ntpd[2304]: kernel time sync status change 0001
> Sep 15 04:10:16 teleporto ntpd[2304]: synchronized to 198.123.30.132, stratum 1
> Sep 15 04:11:58 teleporto ntpd[2301]: frequency initialized 5.891 PPM from /etc/ntp.drift
> Sep 15 04:16:18 teleporto ntpd[2301]: synchronized to 198.123.30.132, stratum 1
> Sep 15 04:16:18 teleporto ntpd[2301]: kernel time sync status change 0001
> Sep 15 12:08:53 teleporto ntpd[2301]: kernel time sync status change 4001
> Sep 15 12:34:31 teleporto ntpd[2301]: kernel time sync status change 0001
> Sep 15 14:34:06 teleporto ntpd[2301]: kernel time sync status change 4001
> Sep 15 14:51:12 teleporto ntpd[2301]: kernel time sync status change 0001 

I was going to look at where exactly these messages (or actually the ones
you mentioned earlier) originate from in the ntpd source, but haven't yet
had time.

> 	If I understood correctly what do you mean, ntpd adjusts nicely the time to not 
> cause huge differences in the time.

It is definitely the default, if it's even possible to configure ntpd to
just forcibly set the new time (with ntpdate you can choose that with the
-b/-B switch iirc).

> 	And we're reaching the conclusion that the timer code from 2.6.25 
> and above have something wrong, since 2.6.24 and below is ok, which 
> causes those stalls.

There were some other timer related complications in 2.6.25, but it was so
long ago that I hardly remember anything about those anymore (and
I'm not an expert on those things anyway). And it's still a very open
question how that would cause the problem you're seeing.

> 	And just a question: do you use ntpd on your own desktop?

Yes.


-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-09-17 10:23                                                                                                                                                   ` Ilpo Järvinen
@ 2008-09-18 20:35                                                                                                                                                     ` Dâniel Fraga
  2008-09-18 21:04                                                                                                                                                       ` Ilpo Järvinen
  0 siblings, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-09-18 20:35 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Wed, 17 Sep 2008 13:23:28 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> There were some other timer related complications in 2.6.25 but it's so 
> long time ago that I hardly remember anything about those anymore (and 
> I'm not an expert on those things anyway). And it's still very open issue 
> how that would cause the problem you're seeing.

	I opened a bug report to timer developers...
let's see if they can help:

http://bugzilla.kernel.org/show_bug.cgi?id=11588

-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-09-18 20:35                                                                                                                                                     ` Dâniel Fraga
@ 2008-09-18 21:04                                                                                                                                                       ` Ilpo Järvinen
  2008-09-21  3:02                                                                                                                                                         ` Dâniel Fraga
  2008-09-22  4:23                                                                                                                                                         ` Dâniel Fraga
  0 siblings, 2 replies; 116+ messages in thread
From: Ilpo Järvinen @ 2008-09-18 21:04 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Thu, 18 Sep 2008, Dâniel Fraga wrote:

> On Wed, 17 Sep 2008 13:23:28 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > There were some other timer related complications in 2.6.25 but it's so 
> > long time ago that I hardly remember anything about those anymore (and 
> > I'm not an expert on those things anyway). And it's still very open issue 
> > how that would cause the problem you're seeing.
> 
> 	I opened a bug report to timer developers...
> let's see if they can help:
> 
> http://bugzilla.kernel.org/show_bug.cgi?id=11588

Ok. Another potential candidate might be the scheduler (my wording was
sloppy when I used "timer related complications"; time related was what I
really meant)...

Anyway, if/when you succeed in collecting some strace of the server
processes, please let me know (though making a full one available might
not be a wise thing, like I said earlier). After thinking about it a bit, it
might be enough to start strace with -p for all server processes of a
service during a stall and then resolve the stall after some amount of
waiting with nmap (and hope that strace doesn't resolve it by interfering
with something relevant :-); you would see that from the fact that it then
resolves without nmap). That would probably reveal whether the processes were
waiting in accept() or not, and if not, where they were.
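
Attaching strace to every process of a service could be scripted roughly as follows. This is a sketch only: it assumes `pgrep` and `strace` are installed, and the service name and output prefix in the usage note are examples rather than anything prescribed in this thread.

```python
import subprocess


def pids_of(name: str) -> list[int]:
    """Return PIDs of processes whose name matches exactly, via pgrep -x."""
    out = subprocess.run(["pgrep", "-x", name], capture_output=True, text=True)
    return [int(p) for p in out.stdout.split()]


def strace_command(pids: list[int], out_prefix: str) -> list[str]:
    """Build one strace invocation attached to all given PIDs.

    -f follows children, -tt adds microsecond timestamps, and -ff with -o
    writes one output file per traced process (out_prefix.PID).
    """
    cmd = ["strace", "-f", "-tt", "-ff", "-o", out_prefix]
    for pid in pids:
        cmd += ["-p", str(pid)]
    return cmd
```

For example, `subprocess.run(strace_command(pids_of("dovecot"), "/tmp/stall"))` run as root during a stall would leave per-process traces showing whether each process sits in accept() or somewhere else.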

-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-09-18 21:04                                                                                                                                                       ` Ilpo Järvinen
@ 2008-09-21  3:02                                                                                                                                                         ` Dâniel Fraga
  2008-09-22  4:23                                                                                                                                                         ` Dâniel Fraga
  1 sibling, 0 replies; 116+ messages in thread
From: Dâniel Fraga @ 2008-09-21  3:02 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Fri, 19 Sep 2008 00:04:23 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> Anyway, if/when you succeed collecting some strace of the server 
> processes, please let me know (though putting a full one available might 
> not be wise thing like I said earlier). After I thought it a bit, it might 
> be enough the start the strace with -p for all server processes of a 
> service during a stall and then resolve it after some amount of waiting 
> with nmap (and hope that strace doesn't resolve it by interfering 
> something relevant :-), you will see that from the fact that it resolves 
> without nmap then). That would probably reveal if the processes where 
> waiting in accept() or not, and if not, where they were.

	I got a stall and tried to use strace, but even strace couldn't
trace anything. Everything which uses some kind of network connection is
stalled (or because everything is stalled, strace couldn't trace
anything).
	
	I'll try to leave strace running all the time, but I'm afraid
it could prevent the stall. Anyway, I'll test and return soon.


-- 

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-09-18 21:04                                                                                                                                                       ` Ilpo Järvinen
  2008-09-21  3:02                                                                                                                                                         ` Dâniel Fraga
@ 2008-09-22  4:23                                                                                                                                                         ` Dâniel Fraga
  2008-09-22 11:22                                                                                                                                                           ` Ilpo Järvinen
  1 sibling, 1 reply; 116+ messages in thread
From: Dâniel Fraga @ 2008-09-22  4:23 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Fri, 19 Sep 2008 00:04:23 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> Anyway, if/when you succeed collecting some strace of the server
> processes, please let me know (though putting a full one available might
> not be a wise thing, like I said earlier). After I thought about it a bit,
> it might be enough to start the strace with -p for all server processes of
> a service during a stall and then resolve it after some amount of waiting
> with nmap (and hope that strace doesn't resolve it by interfering with
> something relevant :-), you will see that from the fact that it resolves
> without nmap then). That would probably reveal if the processes were
> waiting in accept() or not, and if not, where they were.

	Hi again Ilpo, I waited the whole day for a stall, and
fortunately it happened while I was stracing dovecot and its child
processes. The stall happened at 01:11 (at the end). I hope the trace
contains something useful.

http://www.abusar.org/strace/dovecot.txt.bz2

	I then nmap'ed the server and killed strace.

	I used the following:

strace -t -p 2315 -f -e trace=accept,listen,close,shutdown,select -o dovecot.txt


^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-09-22  4:23                                                                                                                                                         ` Dâniel Fraga
@ 2008-09-22 11:22                                                                                                                                                           ` Ilpo Järvinen
  2008-09-22 16:13                                                                                                                                                             ` Dâniel Fraga
  0 siblings, 1 reply; 116+ messages in thread
From: Ilpo Järvinen @ 2008-09-22 11:22 UTC (permalink / raw)
  To: Dâniel Fraga
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Mon, 22 Sep 2008, Dâniel Fraga wrote:

> On Fri, 19 Sep 2008 00:04:23 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > Anyway, if/when you succeed collecting some strace of the server
> > processes, please let me know (though putting a full one available might
> > not be a wise thing, like I said earlier). After I thought about it a bit,
> > it might be enough to start the strace with -p for all server processes of
> > a service during a stall and then resolve it after some amount of waiting
> > with nmap (and hope that strace doesn't resolve it by interfering with
> > something relevant :-), you will see that from the fact that it resolves
> > without nmap then). That would probably reveal if the processes were
> > waiting in accept() or not, and if not, where they were.
> 
> 	Hi again Ilpo, I waited the whole day for a stall, and
> fortunately it happened while I was stracing dovecot and child
> processes. The stall happened at 01:11 (at the end). I hope that it
> has something useful.

It definitely shows a stall: there are _no_ events between 0:53 and 1:11,
while there isn't any other period like that; every other minute since the
start has some activity going on :-). So this might not be related to
networking at all, as we've kind of already figured out (accept()
definitely has very little to do with it). There weren't any close()s
either, so it looks very stuck on something that's outside of the syscalls
we listed in -e, I suppose...
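
As an illustration of how such a quiet period can be spotted in a trace,
here is a minimal Python sketch. It assumes the log was written by strace
with -t or -tt (and optionally -f with -o, which puts a PID column first),
so each line starts with HH:MM:SS, optionally with a fractional part. The
names find_gaps, parse_seconds and min_gap are made up for this example,
and timestamps that jump back by more than half a day are assumed to mean
the trace crossed midnight.

```python
import re

# Optional PID column (strace -f -o), then HH:MM:SS with an optional
# fractional part (strace -tt).
TS_RE = re.compile(r"^\s*(?:\d+\s+)?(\d{2}):(\d{2}):(\d{2})(?:\.(\d+))?\s")


def parse_seconds(line):
    """Return seconds since midnight for a trace line, or None."""
    m = TS_RE.match(line)
    if not m:
        return None
    hours, minutes, seconds, frac = m.groups()
    total = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
    if frac:
        total += float("0." + frac)
    return total


def find_gaps(lines, min_gap=60.0):
    """Return (gap_seconds, line_before, line_after) for each quiet period."""
    gaps = []
    prev_secs = None
    prev_line = None
    day_offset = 0
    for line in lines:
        secs = parse_seconds(line)
        if secs is None:
            continue  # not a timestamped trace line
        secs += day_offset
        # A large jump backwards is treated as a midnight rollover.
        if prev_secs is not None and secs < prev_secs - 43200:
            day_offset += 86400
            secs += 86400
        if prev_secs is not None and secs - prev_secs >= min_gap:
            gaps.append((secs - prev_secs, prev_line.rstrip(), line.rstrip()))
        prev_secs, prev_line = secs, line
    return gaps
```

Calling find_gaps(open("dovecot.txt")) would list every period of at least
a minute without any traced syscall, together with the last event before
and the first event after it.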

It seems that the next sensible step is to just obtain a full strace to see
what actually took place during those long minutes, if anything (it's
better that you keep that log private and just use grep over it on
request). ...A full strace might grow huge though. Also, for strace use
-tt instead of -t to get more accurate timestamps, and add -T.

When you get the stall next time, please also check that the processes are 
actually sleeping instead of looping like crazy in some buggy userspace 
code :-) (obviously before resolving it with nmap).
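
One way to tell sleeping from looping is to look at /proc/<pid>/stat on
Linux. The following is only a sketch under that assumption: it reads the
state letter (field 3; R is running, S is interruptible sleep, D is
uninterruptible sleep) and the utime and stime counters (fields 14 and 15),
then samples twice to see whether CPU time grows. The helper names
parse_stat, read_stat and looks_busy are invented for this example.

```python
import time


def parse_stat(stat_text):
    """Return (state, cpu_ticks) from the text of /proc/<pid>/stat."""
    # The command name (field 2) is in parentheses and may itself contain
    # spaces or ')', so split on the last ')'.
    _, _, rest = stat_text.rpartition(")")
    fields = rest.split()
    state = fields[0]                               # field 3: state
    cpu_ticks = int(fields[11]) + int(fields[12])   # fields 14+15
    return state, cpu_ticks


def read_stat(pid):
    """Read and parse /proc/<pid>/stat for a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as fh:
        return parse_stat(fh.read())


def looks_busy(pid, interval=1.0):
    """True if the process consumed CPU time during the sampling interval."""
    _, before = read_stat(pid)
    time.sleep(interval)
    _, after = read_stat(pid)
    return after > before
```

A process stuck in a userspace loop would typically show state R and
looks_busy() returning True, while a process blocked in a syscall would
show S or D with no growth in CPU ticks.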

When using nmap to resolve, take note of the exact timestamp (including
seconds). E.g.,
$ date > nmap.ts; nmap ...
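
The same idea as the one-liner above can be sketched in Python when
sub-second precision is wanted for matching against strace -tt output. This
is only an illustration: run_with_timestamp is a made-up helper, and the
command to run (the nmap invocation, whose arguments are not given here) is
passed in by the caller.

```python
import datetime
import subprocess


def run_with_timestamp(cmd, ts_path):
    """Record the local start time in ts_path, then run cmd (a list)."""
    started = datetime.datetime.now().isoformat(timespec="microseconds")
    with open(ts_path, "w") as fh:
        fh.write(started + "\n")
    # Run the command only after the timestamp is on disk.
    return subprocess.run(cmd).returncode
```

The return value is the exit status of the command, and the file holds a
single line with the time just before the command was started.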


-- 
 i.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
  2008-09-22 11:22                                                                                                                                                           ` Ilpo Järvinen
@ 2008-09-22 16:13                                                                                                                                                             ` Dâniel Fraga
  0 siblings, 0 replies; 116+ messages in thread
From: Dâniel Fraga @ 2008-09-22 16:13 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, thomas.jarosch, billfink, Netdev, Patrick Hardy,
	netfilter-devel, kadlec

On Mon, 22 Sep 2008 14:22:12 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> It seems that the next sensible step is to just obtain a full strace to see
> what actually took place during those long minutes, if anything (it's
> better that you keep that log private and just use grep over it on
> request). ...A full strace might grow huge though. Also, for strace use
> -tt instead of -t to get more accurate timestamps, and add -T.
> 
> When you get the stall next time, please also check that the processes are 
> actually sleeping instead of looping like crazy in some buggy userspace 
> code :-) (obviously before resolving it with nmap).
> 
> When using nmap to resolve, take note of the exact timestamp (including
> seconds). E.g.,
> $ date > nmap.ts; nmap ...

	Thanks! Today I'm lucky. I got the stall fast. It seems that it
happens more frequently as more connections are made.

	What should I grep?


^ permalink raw reply	[flat|nested] 116+ messages in thread

end of thread, other threads:[~2008-09-22 16:13 UTC | newest]

Thread overview: 116+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-03-26  8:47 Transfer stalls with NAT under 2.6.24.3 Sven Riedel
2008-03-26  9:24 ` Patrick McHardy
2008-03-26 10:21   ` Sven Riedel
2008-03-26 15:47     ` Patrick McHardy
2008-03-26 18:45       ` Jozsef Kadlecsik
2008-03-26 19:16         ` Krzysztof Oledzki
2008-03-31  6:53         ` Sven Riedel
2008-07-04 14:54           ` TCP connection stalls under 2.6.24.7 Thomas Jarosch
2008-07-04 20:58             ` Jozsef Kadlecsik
2008-07-04 21:04               ` Jozsef Kadlecsik
2008-07-07  9:18               ` Thomas Jarosch
2008-07-07 13:18                 ` Thomas Jarosch
2008-07-10 13:17                   ` Jozsef Kadlecsik
2008-07-10 14:12                     ` Thomas Jarosch
2008-07-10 21:21                       ` Jozsef Kadlecsik
2008-07-11 14:33                         ` Thomas Jarosch
2008-07-15 11:47                           ` Thomas Jarosch
2008-07-15 16:10                             ` Thomas Jarosch
2008-07-15 18:30                               ` Dâniel Fraga
2008-07-31  4:47                                 ` Dâniel Fraga
2008-07-31  7:39                                   ` Ilpo Järvinen
2008-08-02 12:24                                     ` Dâniel Fraga
2008-07-15 20:17                               ` Ilpo Järvinen
2008-07-16  8:07                                 ` Thomas Jarosch
2008-07-16  9:03                                 ` Thomas Jarosch
2008-07-17 13:55                                   ` Ilpo Järvinen
2008-07-17 15:15                                     ` Thomas Jarosch
2008-07-17 15:53                                       ` Ilpo Järvinen
2008-07-18  9:14                                         ` Thomas Jarosch
2008-07-18 13:55                                           ` Ilpo Järvinen
2008-07-18 14:02                                             ` Thomas Jarosch
2008-07-19  7:35                                               ` Ilpo Järvinen
2008-07-25 10:00                                               ` Ilpo Järvinen
2008-07-25 13:00                                                 ` Thomas Jarosch
2008-07-25 14:06                                                   ` Ilpo Järvinen
2008-07-25 15:34                                                     ` Thomas Jarosch
2008-07-31  7:39                                                       ` Thomas Jarosch
2008-07-31 12:44                                                         ` Dâniel Fraga
2008-07-31 13:47                                                           ` Thomas Jarosch
2008-07-31 14:11                                                             ` Dâniel Fraga
2008-08-06 18:53                                                             ` Dâniel Fraga
2008-08-07  6:54                                                               ` Ilpo Järvinen
2008-08-07 11:50                                                                 ` Denys Fedoryshchenko
2008-08-07 12:11                                                                   ` Thomas Jarosch
2008-08-07 12:14                                                                   ` Ilpo Järvinen
2008-08-07 12:23                                                                     ` Denys Fedoryshchenko
2008-08-08  9:56                                                                       ` Ilpo Järvinen
2008-08-08 10:32                                                                         ` Denys Fedoryshchenko
2008-08-07 11:33                                                               ` [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround Ilpo Järvinen
2008-08-08  4:42                                                                 ` Bill Fink
2008-08-08 10:32                                                                   ` Ilpo Järvinen
2008-08-11 21:44                                                                     ` David Miller
2008-08-12  7:46                                                                       ` Thomas Jarosch
2008-08-12  8:18                                                                         ` David Miller
2008-08-12 17:43                                                                           ` Dâniel Fraga
2008-08-12 17:52                                                                             ` Ilpo Järvinen
2008-08-13 17:53                                                                               ` Dâniel Fraga
2008-08-13 18:34                                                                                 ` Ilpo Järvinen
2008-08-15  4:34                                                                                   ` Dâniel Fraga
2008-08-15  7:06                                                                                     ` Ilpo Järvinen
2008-08-15 21:35                                                                                       ` Dâniel Fraga
2008-08-15 22:06                                                                                         ` Ilpo Järvinen
2008-08-15 23:57                                                                                           ` Dâniel Fraga
2008-08-16  2:15                                                                                           ` Dâniel Fraga
2008-08-16  7:10                                                                                             ` Ilpo Järvinen
2008-08-16 19:18                                                                                               ` Ilpo Järvinen
2008-08-17  0:36                                                                                                 ` Dâniel Fraga
2008-08-19 10:38                                                                                                   ` Ilpo Järvinen
2008-08-20  0:34                                                                                                     ` Dâniel Fraga
2008-08-20  7:57                                                                                                       ` Ilpo Järvinen
2008-08-20 12:37                                                                                                       ` Ilpo Järvinen
2008-08-22 21:32                                                                                                         ` Dâniel Fraga
2008-08-22 21:37                                                                                                           ` David Miller
2008-08-23 14:14                                                                                                             ` Dâniel Fraga
2008-08-23 14:38                                                                                                               ` Ilpo Järvinen
2008-08-24 19:38                                                                                                                 ` Dâniel Fraga
2008-08-26 14:10                                                                                                                   ` Ilpo Järvinen
2008-08-26 14:32                                                                                                                     ` Ilpo Järvinen
2008-08-26 17:18                                                                                                                     ` Dâniel Fraga
2008-08-26 20:40                                                                                                                       ` Ilpo Järvinen
2008-08-26 21:17                                                                                                                         ` Dâniel Fraga
2008-08-27 10:22                                                                                                                           ` Ilpo Järvinen
2008-08-27 19:51                                                                                                                             ` Dâniel Fraga
2008-08-27 20:32                                                                                                                               ` Ilpo Järvinen
2008-08-27 20:50                                                                                                                                 ` Dâniel Fraga
2008-08-27 21:25                                                                                                                                   ` Ilpo Järvinen
2008-08-27 21:42                                                                                                                                     ` Dâniel Fraga
2008-08-27 22:24                                                                                                                                       ` Dâniel Fraga
2008-08-28 21:49                                                                                                                         ` Dâniel Fraga
2008-08-29 13:07                                                                                                                           ` Ilpo Järvinen
2008-08-29 17:41                                                                                                                             ` Dâniel Fraga
2008-09-01  7:11                                                                                                                               ` Ilpo Järvinen
2008-08-30  6:56                                                                                                                             ` Dâniel Fraga
2008-09-01  7:11                                                                                                                               ` Ilpo Järvinen
2008-09-07  8:17                                                                                                                                 ` Dâniel Fraga
2008-09-08 10:27                                                                                                                                   ` Ilpo Järvinen
2008-09-08 20:20                                                                                                                                     ` Dâniel Fraga
2008-09-11 13:44                                                                                                                                       ` Ilpo Järvinen
2008-09-11 17:30                                                                                                                                         ` Dâniel Fraga
2008-09-12 10:16                                                                                                                                           ` Ilpo Järvinen
2008-09-13 23:31                                                                                                                                             ` Dâniel Fraga
2008-09-16 12:10                                                                                                                                               ` Ilpo Järvinen
2008-09-16 14:24                                                                                                                                                 ` Dâniel Fraga
2008-09-17 10:23                                                                                                                                                   ` Ilpo Järvinen
2008-09-18 20:35                                                                                                                                                     ` Dâniel Fraga
2008-09-18 21:04                                                                                                                                                       ` Ilpo Järvinen
2008-09-21  3:02                                                                                                                                                         ` Dâniel Fraga
2008-09-22  4:23                                                                                                                                                         ` Dâniel Fraga
2008-09-22 11:22                                                                                                                                                           ` Ilpo Järvinen
2008-09-22 16:13                                                                                                                                                             ` Dâniel Fraga
2008-09-15 19:42                                                                                                                                             ` Dâniel Fraga
2008-09-11 18:12                                                                                                                                         ` Dâniel Fraga
2008-08-15 21:59                                                                                       ` Dâniel Fraga
2008-08-13  8:00                                                                           ` Thomas Jarosch
2008-08-22 21:18                                                                         ` Ilpo Järvinen
2008-08-11 21:41                                                                   ` David Miller
