linux-kernel.vger.kernel.org archive mirror
* TCP LAST-ACK state broken in 2.4.17-pre2
@ 2001-12-10 18:16 Mika Liljeberg
  2001-12-10 18:34 ` kuznet
  2001-12-11  0:13 ` David S. Miller
  0 siblings, 2 replies; 11+ messages in thread
From: Mika Liljeberg @ 2001-12-10 18:16 UTC (permalink / raw)
  To: kuznet, davem; +Cc: linux-kernel

Hi,

I came across the following behavior (sorry, no tcpdump but this should
be easy to reproduce with the right tools):

hostA                 hostB
  --------FIN----------->
  <-----data+FIN---------
  --------ACK-------X       (packet lost)
  <-----data+FIN---------   (retransmit)
  <-----data+FIN---------   (retransmit)
  <-----data+FIN---------   (retransmit)
          ....
  <-----data+FIN---------   (retransmit)
  --------RST----------->

HostA is running Linux 2.4.17-pre2. HostB is running Symbian OS. All the
sequence numbers pan out.

Either LAST-ACK is completely broken or Linux just cannot handle a
FIN-ACK that is piggybacked on a data segment, when received in LAST-ACK
state. It should be acked as an out-of-window segment, as usual.
Finally, the LAST-ACK state times out and Linux responds to the FIN
segment with an RST.

Cheers,

	MikaL


* Re: TCP LAST-ACK state broken in 2.4.17-pre2
  2001-12-10 18:16 TCP LAST-ACK state broken in 2.4.17-pre2 Mika Liljeberg
@ 2001-12-10 18:34 ` kuznet
  2001-12-10 18:54   ` Mika Liljeberg
  2001-12-11  0:13 ` David S. Miller
  1 sibling, 1 reply; 11+ messages in thread
From: kuznet @ 2001-12-10 18:34 UTC (permalink / raw)
  To: Mika Liljeberg; +Cc: davem, linux-kernel

Hello!

> Either LAST-ACK is completely broken or Linux just cannot handle a
> FIN-ACK that is piggybacked on a data segment, when received in LAST-ACK
> state. 

It cannot handle even a pure FIN in this state. :-( My apologies,
it is my fault. Thank you.

Well, you can just add one line to tcp_input.c to repair this.

                }
                /* Fall through */
+       case TCP_LAST_ACK:
        case TCP_ESTABLISHED:
                tcp_data_queue(sk, skb);


Dave, "official" patch will follow later. I must think about
some marginal effect in TCP_CLOSE_WAIT and TCP_CLOSING, which can break
out of switch too. Duh, do specs say something about segments with seqs
above fin? I do not remember.

Alexey
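
A minimal user-space sketch of what the one-line change does, not the
kernel code itself. The enum, process_segment() and queue_data() are
hypothetical names invented for illustration, and the model assumes that
every segment reaching the data path is answered with an ACK. Without the
extra case label, a segment arriving in LAST-ACK leaves the switch
unhandled, which would match the unanswered FIN retransmissions in the
original report.

```c
#include <stdbool.h>

/* Hypothetical subset of connection states, for illustration only. */
enum conn_state { ST_ESTABLISHED, ST_FIN_WAIT1, ST_LAST_ACK };

/* Stand-in for the data path. Assumption of this model: every segment
 * that gets here is answered with an ACK, duplicates of an old FIN
 * included. */
bool queue_data(void)
{
    return true;
}

/* Returns true if the segment ends up being answered with an ACK.
 * with_fix selects whether LAST-ACK falls through to the data path. */
bool process_segment(enum conn_state st, bool with_fix)
{
    switch (st) {
    case ST_LAST_ACK:
        if (!with_fix)
            break;      /* pre-patch model: segment is silently ignored */
        /* Fall through */
    case ST_FIN_WAIT1:
    case ST_ESTABLISHED:
        return queue_data();
    }
    return false;
}
```

In this model process_segment(ST_LAST_ACK, false) reports that no ACK is
sent, while process_segment(ST_LAST_ACK, true) reaches the data path.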


* Re: TCP LAST-ACK state broken in 2.4.17-pre2
  2001-12-10 18:34 ` kuznet
@ 2001-12-10 18:54   ` Mika Liljeberg
  2001-12-11 17:51     ` kuznet
  0 siblings, 1 reply; 11+ messages in thread
From: Mika Liljeberg @ 2001-12-10 18:54 UTC (permalink / raw)
  To: kuznet; +Cc: davem, linux-kernel

kuznet@ms2.inr.ac.ru wrote:
> Well, you can just add one line to tcp_input.c to repair this.

Thanks, that was quick! :)

> Duh, do specs say something about segments with seqs
> above fin? I do not remember.

I don't think they do, aside from that LAST-ACK is a synchronized state.
I.e., if you set RCV.WND to zero after receiving a FIN, any subsequent
out-of-window (below or above) segment will be acked. However, I don't
think it matters much, since above-window packets would in this case
always be caused by a bug in the sender.

> Alexey

	MikaL
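
The acknowledgment rule for out-of-window segments can be sketched from
the segment acceptability test in RFC 793. This is an illustrative
user-space model, not kernel code: in_window(), seg_acceptable() and
must_ack() are hypothetical names, and SEG.LEN is taken to count the FIN
as one sequence number.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if start <= x < start + len in 32-bit sequence space.
 * Unsigned subtraction makes the comparison wrap modulo 2^32. */
bool in_window(uint32_t x, uint32_t start, uint32_t len)
{
    return (uint32_t)(x - start) < len;
}

/* RFC 793 acceptability test for an incoming segment. */
bool seg_acceptable(uint32_t seg_seq, uint32_t seg_len,
                    uint32_t rcv_nxt, uint32_t rcv_wnd)
{
    if (seg_len == 0) {
        if (rcv_wnd == 0)
            return seg_seq == rcv_nxt;
        return in_window(seg_seq, rcv_nxt, rcv_wnd);
    }
    if (rcv_wnd == 0)
        return false;   /* data can never fit a zero window */
    /* Acceptable if the first or the last octet falls in the window. */
    return in_window(seg_seq, rcv_nxt, rcv_wnd) ||
           in_window(seg_seq + seg_len - 1, rcv_nxt, rcv_wnd);
}

/* An unacceptable segment is answered with an ACK unless it carries RST. */
bool must_ack(bool acceptable, bool rst)
{
    return !acceptable && !rst;
}
```

For example, a retransmitted data+FIN segment with SEG.SEQ 255481 and
SEG.LEN 521 against RCV.NXT 256002 is unacceptable for any window size,
so it should draw an ACK rather than silence.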


* Re: TCP LAST-ACK state broken in 2.4.17-pre2
  2001-12-10 18:16 TCP LAST-ACK state broken in 2.4.17-pre2 Mika Liljeberg
  2001-12-10 18:34 ` kuznet
@ 2001-12-11  0:13 ` David S. Miller
  2001-12-11 17:24   ` kuznet
  1 sibling, 1 reply; 11+ messages in thread
From: David S. Miller @ 2001-12-11  0:13 UTC (permalink / raw)
  To: kuznet; +Cc: Mika.Liljeberg, linux-kernel

   From: kuznet@ms2.inr.ac.ru
   Date: Mon, 10 Dec 2001 21:34:47 +0300 (MSK)

   Dave, "official" patch will follow later. I must think about
   some marginal effect in TCP_CLOSE_WAIT and TCP_CLOSING, which can break
   out of switch too. Duh, do specs say something about segments with seqs
   above fin? I do not remember.

A socket in a synchronized state is required to enforce legal sequence
numbers, is it not?


* Re: TCP LAST-ACK state broken in 2.4.17-pre2
  2001-12-11  0:13 ` David S. Miller
@ 2001-12-11 17:24   ` kuznet
  0 siblings, 0 replies; 11+ messages in thread
From: kuznet @ 2001-12-11 17:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: Mika.Liljeberg, linux-kernel

Hello!

> A socket in a synchronized state is required to enforce legal sequence
> numbers, is it not?

They are. :-)

Well, assuming that this is really illegal, we could just add the
missing LAST_ACK next to its relatives CLOSING and CLOSE_WAIT
(where it was accidentally forgotten long ago, I think).
It is a minimal change, and that is good.

But I am looking at the problem from our side: if we do receive such a
packet, what should we do? Earlier we sent an ACK and dropped the
bad segment, or aborted the connection. Now we just blackhole them,
and the bug with the missing case LAST_ACK merely let us see the fact
that we changed behaviour, which is not good. :-)

Alexey


* Re: TCP LAST-ACK state broken in 2.4.17-pre2
  2001-12-10 18:54   ` Mika Liljeberg
@ 2001-12-11 17:51     ` kuznet
  2001-12-12 20:31       ` Mika Liljeberg
  0 siblings, 1 reply; 11+ messages in thread
From: kuznet @ 2001-12-11 17:51 UTC (permalink / raw)
  To: Mika Liljeberg; +Cc: davem, linux-kernel

Hello!

> Thanks, that was quick! :)

If everyone were "quick" in this manner, the Linux kernel would not even boot. :-)

Alexey


* Re: TCP LAST-ACK state broken in 2.4.17-pre2
  2001-12-11 17:51     ` kuznet
@ 2001-12-12 20:31       ` Mika Liljeberg
  2001-12-13 17:59         ` kuznet
  0 siblings, 1 reply; 11+ messages in thread
From: Mika Liljeberg @ 2001-12-12 20:31 UTC (permalink / raw)
  To: kuznet; +Cc: davem, linux-kernel


Alexey,

Looks like there are still problems after applying your quick patch.
Back at the lab we observed a case where the FIN-ACK packet is dropped
and Linux fails to retransmit it. See the attached dump for the details
(Linux is 10.0.5.11). The action ends there, with Linux timing out to
CLOSED state and the remote stuck in FIN-WAIT-2.

Cheers,

	MikaL

[-- Attachment #2: last-ack.txt --]
[-- Type: text/plain, Size: 977 bytes --]

11:11:57.149389 10.0.5.3.3071 > 10.0.5.11.1327: P 254033:255481(1448) ack 1 win 7300 <timestamp 3704210538 8515686,eol> (DF) (ttl 63, id 860, len 1500)
11:11:57.149451 10.0.5.11.1327 > 10.0.5.3.3071: . [tcp sum ok] 1:1(0) ack 255481 win 65160 <nop,nop,timestamp 8544990 3704210538> (DF) (ttl 64, id 30696, len 52)
11:11:57.661595 10.0.5.3.3071 > 10.0.5.11.1327: FP 255481:256001(520) ack 1 win 7300 <timestamp 3705679288 8515686,eol> (DF) (ttl 63, id 861, len 572)
11:11:57.661660 10.0.5.11.1327 > 10.0.5.3.3071: F [tcp sum ok] 1:1(0) ack 256002 win 65160 <nop,nop,timestamp 8545041 3705679288> (DF) (ttl 64, id 30697, len 52)
11:12:11.340666 10.0.5.3.3071 > 10.0.5.11.1327: FP 255481:256001(520) ack 1 win 7300 <timestamp 3727069913 8515686,eol> (DF) (ttl 63, id 863, len 572)
11:12:11.340698 10.0.5.11.1327 > 10.0.5.3.3071: . [tcp sum ok] 2:2(0) ack 256002 win 65160 <nop,nop,timestamp 8546409 3727069913,nop,nop,sack sack 1 {255481:256002} > (DF) (ttl 64, id 30698, len 64)


* Re: TCP LAST-ACK state broken in 2.4.17-pre2
  2001-12-12 20:31       ` Mika Liljeberg
@ 2001-12-13 17:59         ` kuznet
  2001-12-13 19:30           ` Mika Liljeberg
  0 siblings, 1 reply; 11+ messages in thread
From: kuznet @ 2001-12-13 17:59 UTC (permalink / raw)
  To: Mika Liljeberg; +Cc: davem, linux-kernel

Hello!

> Looks like there are still problems

This is not related to that problem.


> 11:11:57.149389 10.0.5.3.3071 > 10.0.5.11.1327: P 254033:255481(1448) ack 1 win 7300 <timestamp 3704210538 8515686,eol> (DF) (ttl 63, id 860, len 1500)
> 11:11:57.149451 10.0.5.11.1327 > 10.0.5.3.3071: . [tcp sum ok] 1:1(0) ack 255481 win 65160 <nop,nop,timestamp 8544990 3704210538> (DF) (ttl 64, id 30696, len 52)
> 11:11:57.661595 10.0.5.3.3071 > 10.0.5.11.1327: FP 255481:256001(520) ack 1 win 7300 <timestamp 3705679288 8515686,eol> (DF) (ttl 63, id 861, len 572)
> 11:11:57.661660 10.0.5.11.1327 > 10.0.5.3.3071: F [tcp sum ok] 1:1(0) ack 256002 win 65160 <nop,nop,timestamp 8545041 3705679288> (DF) (ttl 64, id 30697, len 52)
> 11:12:11.340666 10.0.5.3.3071 > 10.0.5.11.1327: FP 255481:256001(520) ack 1 win 7300 <timestamp 3727069913 8515686,eol> (DF) (ttl 63, id 863, len 572)
> 11:12:11.340698 10.0.5.11.1327 > 10.0.5.3.3071: . [tcp sum ok] 2:2(0) ack 256002 win 65160 <nop,nop,timestamp 8546409 3727069913,nop,nop,sack sack 1 {255481:256002} > (DF) (ttl 64, id 30698, len 64)

Please, make cat /proc/net/tcp at this point. To be honest I do not believe
that tcpdump finishes _here_. When will retransmit timer expire? Taking
into account that 10.0.5.3 has rto of 14 seconds (distance between retransmits
of its FIN :-)), linux can have even more. In the case of such bad connection
closing fin-wait-2 via abort is pretty normal.

Alexey
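
The 14-second figure can be checked from the two timestamps on the
remote's FIN segments in the dump (11:11:57.661595 and 11:12:11.340666).
A small sketch of the arithmetic; tod_usec() and
fin_retransmit_gap_usec() are hypothetical helper names, and the gap is
only an estimate of the remote RTO, since it is measured at the capture
point.

```c
#include <stdint.h>

/* Convert an hh:mm:ss.uuuuuu time of day to microseconds since midnight. */
int64_t tod_usec(int h, int m, int s, int usec)
{
    return ((int64_t)h * 3600 + (int64_t)m * 60 + s) * 1000000 + usec;
}

/* Gap between the first FIN from 10.0.5.3 and its retransmission,
 * using the two timestamps from the dump. */
int64_t fin_retransmit_gap_usec(void)
{
    return tod_usec(11, 12, 11, 340666) - tod_usec(11, 11, 57, 661595);
}
```

The result is 13679071 microseconds, i.e. about 13.7 seconds, consistent
with the rounded 14 seconds quoted above.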


* Re: TCP LAST-ACK state broken in 2.4.17-pre2
  2001-12-13 17:59         ` kuznet
@ 2001-12-13 19:30           ` Mika Liljeberg
  2001-12-13 19:38             ` kuznet
  0 siblings, 1 reply; 11+ messages in thread
From: Mika Liljeberg @ 2001-12-13 19:30 UTC (permalink / raw)
  To: kuznet; +Cc: davem, linux-kernel, mika.liljeberg

kuznet@ms2.inr.ac.ru wrote:
> This is not related to that problem.

I believe you. Nevertheless, it appears to be a problem that happens in
the LAST-ACK state.

> Please, make cat /proc/net/tcp at this point.

I'll do that if it happens again.

> To be honest I do not believe
> that tcpdump finishes _here_. When will retransmit timer expire? Taking
> into account that 10.0.5.3 has rto of 14 seconds (distance between retransmits
> of its FIN :-)), linux can have even more. In the case of such bad connection
> closing fin-wait-2 via abort is pretty normal.

I'm afraid it did end there. :( The data transfer was unidirectional
from the remote towards the Linux machine. During the SYN exchange the
RTT is less than one second. The rest is queuing delay. So Linux should
have a fairly low RTO. There were no FIN retransmissions, I'm sorry to
say.

> Alexey

Cheers,

	MikaL


* Re: TCP LAST-ACK state broken in 2.4.17-pre2
  2001-12-13 19:30           ` Mika Liljeberg
@ 2001-12-13 19:38             ` kuznet
  0 siblings, 0 replies; 11+ messages in thread
From: kuznet @ 2001-12-13 19:38 UTC (permalink / raw)
  To: Mika Liljeberg; +Cc: davem, linux-kernel, mika.liljeberg

Hello!

> have a fairly low RTO. There were no FIN retransmissions, I'm sorry to
> say.

I believe, believe. :-)

It is possible _only_ if rto is at 120 seconds. It is the only case
when retransmissions do not happen and this would be normal behaviour.

For now it is the only hypothesis, and /proc/net/tcp will make clear
whether it is right or not.

Alexey
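
A sketch of how the relevant fields of a /proc/net/tcp row could be read
to test this hypothesis. The struct, parse_tcp_row() and the sample row
are hypothetical; the sample is invented to show what a LAST_ACK socket
with a 120-second timer might look like and is not output from the
machine in question. The field order (sl, local, remote, st,
tx_queue:rx_queue, tr:tm->when, retrnsmt) follows the header line of the
file. The unit of tm->when is assumed here to be jiffies with HZ=100,
which should be verified on the kernel under test.

```c
#include <stdio.h>

struct tcp_row {
    unsigned state;     /* connection state; 0x09 is LAST_ACK */
    unsigned timer;     /* 0 = no timer, 1 = retransmit timer pending */
    unsigned long when; /* time left until the timer fires */
    unsigned retrans;   /* retransmit timeouts so far */
};

/* Hypothetical sample row: 10.0.5.11:1327 -> 10.0.5.3:3071 in LAST_ACK. */
const char *sample_row =
    "   0: 0B05000A:052F 0305000A:0BFF 09 00000001:00000000 "
    "01:00002EE0 00000003";

/* Parse one data row of /proc/net/tcp. Returns 0 on success, -1 if the
 * row does not have the expected shape. */
int parse_tcp_row(const char *line, struct tcp_row *row)
{
    unsigned sl, laddr, lport, raddr, rport, txq, rxq;
    int n = sscanf(line, " %u: %x:%x %x:%x %x %x:%x %x:%lx %x",
                   &sl, &laddr, &lport, &raddr, &rport, &row->state,
                   &txq, &rxq, &row->timer, &row->when, &row->retrans);
    return n == 11 ? 0 : -1;
}
```

For the sample row the parser yields state 9, timer 1 and when 0x2EE0
(12000), which under the HZ=100 assumption would correspond to a
retransmit timer 120 seconds away.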


* Re: TCP LAST-ACK state broken in 2.4.17-pre2
@ 2001-12-13 10:19 Pasi Sarolahti
  0 siblings, 0 replies; 11+ messages in thread
From: Pasi Sarolahti @ 2001-12-13 10:19 UTC (permalink / raw)
  To: linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

Mika wrote:
> Looks like there are still problems after applying your quick patch.
> Back at the lab we observed a case where the FIN-ACK packet is dropped
> and Linux fails to retransmit it. See the attached dump for the details
> (Linux is 10.0.5.11). The action ends there, with Linux timing out to
> CLOSED state and the remote stuck in FIN-WAIT-2.

I think the following might happen: when the receiver gets a FIN and acks it,
it should be in the CLOSE_WAIT or LAST_ACK state, depending on the situation,
right? In tcp_rcv_state_process() the receiver calls ack_snd_check, which
has the following test:

	if (!tcp_ack_scheduled(tp)) {
		/* We sent a data segment already. */
		return;
	}
	__tcp_ack_snd_check(sk, 1);

I think in this situation it may be possible that ack_scheduled is false,
which would mean that the receiver never acks the further FIN segments if
the first FIN-ack is lost. Maybe something like the following might work,
although it looks pretty ugly :-)

	if (!tcp_ack_scheduled(tp) &&
	    (sk->state == TCP_ESTABLISHED ||
	     sk->state == TCP_FIN_WAIT1)) {
		/* We sent a data segment already. */
		return;
	}

(Btw, I'm not on the lkml, so I would like to be cc'd on any further
discussion in this thread)

- - Pasi

- -- 
http://www.cs.helsinki.fi/u/sarolaht/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE8GIDRoNa7NH1G2csRAvoLAKC5JbdYF524KMGKOG7X7jObLIkifgCffIbG
tA/Cr4FqSeWhEArt/mPlHGY=
=KD8M
-----END PGP SIGNATURE-----
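
The difference between the quoted test and the variant proposed in this
message can be modelled as two predicates. This is only a user-space
sketch: the enum and both function names are hypothetical, and whether
the ack-scheduled flag can really be clear in LAST_ACK is the open
question raised above, not something this model settles.

```c
#include <stdbool.h>

/* Hypothetical subset of connection states, for illustration only. */
enum sock_state { S_ESTABLISHED, S_FIN_WAIT1, S_CLOSE_WAIT, S_LAST_ACK };

/* Quoted test: return early (send nothing) whenever no ACK is scheduled,
 * regardless of the connection state. */
bool skip_ack_original(enum sock_state st, bool ack_scheduled)
{
    (void)st;
    return !ack_scheduled;
}

/* Proposed test: return early only in ESTABLISHED or FIN_WAIT1, so the
 * other states always proceed to the ACK check. */
bool skip_ack_proposed(enum sock_state st, bool ack_scheduled)
{
    return !ack_scheduled &&
           (st == S_ESTABLISHED || st == S_FIN_WAIT1);
}
```

With no ACK scheduled in LAST_ACK, the original predicate skips the ACK
and the proposed one does not; in ESTABLISHED both behave the same.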


