All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] TCP: fix a bug that triggers large number of TCP RST by mistake
@ 2011-01-22 19:06 H.K. Jerry Chu
  2011-01-25 21:48 ` David Miller
  0 siblings, 1 reply; 3+ messages in thread
From: H.K. Jerry Chu @ 2011-01-22 19:06 UTC (permalink / raw)
  To: netdev; +Cc: Jerry Chu

From: Jerry Chu <hkchu@google.com>

This patch fixes a bug that causes TCP RST packets to be generated
on otherwise correctly behaved applications, e.g., no unread data
on close,..., etc. To trigger the bug, at least two conditions must
be met:

1. The FIN flag is set on the last data packet, i.e., it's not on a
separate, FIN only packet.
2. The size of the last data chunk on the receive side matches
exactly with the size of buffer posted by the receiver, and the
receiver closes the socket without any further read attempt.

This bug was first noticed on our netperf based testbed for our IW10
proposal to IETF where a large number of RST packets were observed.
netperf's read side code meets the condition 2 above 100%.

Before the fix, tcp_data_queue() will queue the last skb that meets
condition 1 to sk_receive_queue even though it has fully copied out
(skb_copy_datagram_iovec()) the data. Then if condition 2 is also met,
tcp_recvmsg() often returns all the copied out data successfully
without actually consuming the skb, due to a check
"if ((chunk = len - tp->ucopy.len) != 0) {"
and
"len -= chunk;"
after tcp_prequeue_process() that causes "len" to become 0 and an
early exit from the big while loop.

I don't see any reason not to free the skb whose data have been fully
consumed in tcp_data_queue(), regardless of the FIN flag.  We won't
get there if MSG_PEEK is on. Am I missing some arcane cases related
to urgent data?

Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
---
 net/ipv4/tcp_input.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 2549b29..eb7f82e 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4399,7 +4399,7 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 			if (!skb_copy_datagram_iovec(skb, 0, tp->ucopy.iov, chunk)) {
 				tp->ucopy.len -= chunk;
 				tp->copied_seq += chunk;
-				eaten = (chunk == skb->len && !th->fin);
+				eaten = (chunk == skb->len);
 				tcp_rcv_space_adjust(sk);
 			}
 			local_bh_disable();
-- 
1.7.3.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] TCP: fix a bug that triggers large number of TCP RST by mistake
  2011-01-22 19:06 [PATCH] TCP: fix a bug that triggers large number of TCP RST by mistake H.K. Jerry Chu
@ 2011-01-25 21:48 ` David Miller
  2011-01-25 23:48   ` Jerry Chu
  0 siblings, 1 reply; 3+ messages in thread
From: David Miller @ 2011-01-25 21:48 UTC (permalink / raw)
  To: hkchu; +Cc: netdev

From: "H.K. Jerry Chu" <hkchu@google.com>
Date: Sat, 22 Jan 2011 11:06:17 -0800

> From: Jerry Chu <hkchu@google.com>
> 
> This patch fixes a bug that causes TCP RST packets to be generated
> on otherwise correctly behaved applications, e.g., no unread data
> on close,..., etc. To trigger the bug, at least two conditions must
> be met:
> 
> 1. The FIN flag is set on the last data packet, i.e., it's not on a
> separate, FIN only packet.
> 2. The size of the last data chunk on the receive side matches
> exactly with the size of buffer posted by the receiver, and the
> receiver closes the socket without any further read attempt.
> 
> This bug was first noticed on our netperf based testbed for our IW10
> proposal to IETF where a large number of RST packets were observed.
> netperf's read side code meets the condition 2 above 100%.
> 
> Before the fix, tcp_data_queue() will queue the last skb that meets
> condition 1 to sk_receive_queue even though it has fully copied out
> (skb_copy_datagram_iovec()) the data. Then if condition 2 is also met,
> tcp_recvmsg() often returns all the copied out data successfully
> without actually consuming the skb, due to a check
> "if ((chunk = len - tp->ucopy.len) != 0) {"
> and
> "len -= chunk;"
> after tcp_prequeue_process() that causes "len" to become 0 and an
> early exit from the big while loop.
> 
> I don't see any reason not to free the skb whose data have been fully
> consumed in tcp_data_queue(), regardless of the FIN flag.  We won't
> get there if MSG_PEEK is on. Am I missing some arcane cases related
> to urgent data?
> 
> Signed-off-by: H.K. Jerry Chu <hkchu@google.com>

This bug goes as far back as January, 2000 right after the softnet
mega-merge happened via the netdev CVS tree (netdev-vger-cvs GIT
commit 214d457e)

Good work, applied, thanks!

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] TCP: fix a bug that triggers large number of TCP RST by mistake
  2011-01-25 21:48 ` David Miller
@ 2011-01-25 23:48   ` Jerry Chu
  0 siblings, 0 replies; 3+ messages in thread
From: Jerry Chu @ 2011-01-25 23:48 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

On Tue, Jan 25, 2011 at 1:48 PM, David Miller <davem@davemloft.net> wrote:
> From: "H.K. Jerry Chu" <hkchu@google.com>
> Date: Sat, 22 Jan 2011 11:06:17 -0800
>
>> From: Jerry Chu <hkchu@google.com>
>>
>> This patch fixes a bug that causes TCP RST packets to be generated
>> on otherwise correctly behaved applications, e.g., no unread data
>> on close,..., etc. To trigger the bug, at least two conditions must
>> be met:
>>
>> 1. The FIN flag is set on the last data packet, i.e., it's not on a
>> separate, FIN only packet.
>> 2. The size of the last data chunk on the receive side matches
>> exactly with the size of buffer posted by the receiver, and the
>> receiver closes the socket without any further read attempt.
>>
>> This bug was first noticed on our netperf based testbed for our IW10
>> proposal to IETF where a large number of RST packets were observed.
>> netperf's read side code meets the condition 2 above 100%.
>>
>> Before the fix, tcp_data_queue() will queue the last skb that meets
>> condition 1 to sk_receive_queue even though it has fully copied out
>> (skb_copy_datagram_iovec()) the data. Then if condition 2 is also met,
>> tcp_recvmsg() often returns all the copied out data successfully
>> without actually consuming the skb, due to a check
>> "if ((chunk = len - tp->ucopy.len) != 0) {"
>> and
>> "len -= chunk;"
>> after tcp_prequeue_process() that causes "len" to become 0 and an
>> early exit from the big while loop.
>>
>> I don't see any reason not to free the skb whose data have been fully
>> consumed in tcp_data_queue(), regardless of the FIN flag.  We won't
>> get there if MSG_PEEK is on. Am I missing some arcane cases related
>> to urgent data?
>>
>> Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
>
> This bug goes as far back as January, 2000 right after the softnet
> mega-merge happened via the netdev CVS tree (netdev-vger-cvs GIT
> commit 214d457e)

Yes I also tried to trace how long the bug has been there and it seems
to go back to the prehistoric era :). Guess only TCP types got alarmed
by spurious RSTs.

Jerry

>
> Good work, applied, thanks!
>

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2011-01-25 23:48 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-01-22 19:06 [PATCH] TCP: fix a bug that triggers large number of TCP RST by mistake H.K. Jerry Chu
2011-01-25 21:48 ` David Miller
2011-01-25 23:48   ` Jerry Chu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.