linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] tcp: don't update snd_nxt, when a socket is switched from repair mode
@ 2013-11-19 18:10 Andrey Vagin
  2013-11-19 18:20 ` Pavel Emelyanov
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Andrey Vagin @ 2013-11-19 18:10 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, criu, Andrey Vagin, Pavel Emelyanov, Eric Dumazet,
	David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy

snd_nxt must be updated synchronously with sk_send_head.  Otherwise
tp->packets_out may be updated incorrectly, what may bring a kernel panic.

Here is a kernel panic from my host.
[  103.043194] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[  103.044025] IP: [<ffffffff815aaaaf>] tcp_rearm_rto+0xcf/0x150
...
[  146.301158] Call Trace:
[  146.301158]  [<ffffffff815ab7f0>] tcp_ack+0xcc0/0x12c0

Before this panic a tcp socket was restored. This socket had sent and
unsent data in the write queue. Sent data was restored in repair mode,
then the socket was switched from reapair mode and unsent data was
restored. After that the socket was switched back into repair mode.

In that moment we had a socket where write queue looks like this:
snd_una    snd_nxt   write_seq
   |_________|________|
             |
	  sk_send_head

After a second switching from repair mode the state of socket was
changed:

snd_una          snd_nxt, write_seq
   |_________ ________|
             |
	  sk_send_head

This state is inconsistent, because snd_nxt and sk_send_head are not
synchronized.

Bellow you can find a call trace, how packets_out can be incremented
twice for one skb, if snd_nxt and sk_send_head are not synchronized.
In this case packets_out will be always positive, even when
sk_write_queue is empty.

tcp_write_wakeup
	skb = tcp_send_head(sk);
	tcp_fragment
		if (!before(tp->snd_nxt, TCP_SKB_CB(buff)->end_seq))
			tcp_adjust_pcount(sk, skb, diff);
	tcp_event_new_data_sent
		tp->packets_out += tcp_skb_pcount(skb);

I think update of snd_nxt isn't required, when a socket is switched from
repair mode.  Because it's initialized in tcp_connect_init. Then when a
write queue is restored, snd_nxt is incremented in tcp_event_new_data_sent,
so it's always is in consistent state.

I have checked, that the bug is not reproduced with this patch and
all tests about restoring tcp connections work fine.

Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
---
 net/ipv4/tcp_output.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 6728546..fa9f57d 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3093,7 +3093,6 @@ void tcp_send_window_probe(struct sock *sk)
 {
 	if (sk->sk_state == TCP_ESTABLISHED) {
 		tcp_sk(sk)->snd_wl1 = tcp_sk(sk)->rcv_nxt - 1;
-		tcp_sk(sk)->snd_nxt = tcp_sk(sk)->write_seq;
 		tcp_xmit_probe_skb(sk, 0);
 	}
 }
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] tcp: don't update snd_nxt, when a socket is switched from repair mode
  2013-11-19 18:10 [PATCH] tcp: don't update snd_nxt, when a socket is switched from repair mode Andrey Vagin
@ 2013-11-19 18:20 ` Pavel Emelyanov
  2013-11-19 18:29 ` Eric Dumazet
  2013-11-19 21:15 ` David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: Pavel Emelyanov @ 2013-11-19 18:20 UTC (permalink / raw)
  To: Andrey Vagin, David S. Miller
  Cc: netdev, linux-kernel, criu, Eric Dumazet, Alexey Kuznetsov,
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy

On 11/19/2013 10:10 PM, Andrey Vagin wrote:
> 
> Cc: Pavel Emelyanov <xemul@parallels.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
> Cc: James Morris <jmorris@namei.org>
> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
> Cc: Patrick McHardy <kaber@trash.net>
> Signed-off-by: Andrey Vagin <avagin@openvz.org>

Acked-by: Pavel Emelyanov <xemul@parallels.com>

> ---
>  net/ipv4/tcp_output.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 6728546..fa9f57d 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -3093,7 +3093,6 @@ void tcp_send_window_probe(struct sock *sk)
>  {
>  	if (sk->sk_state == TCP_ESTABLISHED) {
>  		tcp_sk(sk)->snd_wl1 = tcp_sk(sk)->rcv_nxt - 1;
> -		tcp_sk(sk)->snd_nxt = tcp_sk(sk)->write_seq;
>  		tcp_xmit_probe_skb(sk, 0);
>  	}
>  }
> 



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] tcp: don't update snd_nxt, when a socket is switched from repair mode
  2013-11-19 18:10 [PATCH] tcp: don't update snd_nxt, when a socket is switched from repair mode Andrey Vagin
  2013-11-19 18:20 ` Pavel Emelyanov
@ 2013-11-19 18:29 ` Eric Dumazet
  2013-11-19 21:15 ` David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: Eric Dumazet @ 2013-11-19 18:29 UTC (permalink / raw)
  To: Andrey Vagin
  Cc: netdev, linux-kernel, criu, Pavel Emelyanov, Eric Dumazet,
	David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy

On Tue, 2013-11-19 at 22:10 +0400, Andrey Vagin wrote:
> snd_nxt must be updated synchronously with sk_send_head.  Otherwise
> tp->packets_out may be updated incorrectly, what may bring a kernel panic.
> 
> Here is a kernel panic from my host.
> [  103.043194] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
> [  103.044025] IP: [<ffffffff815aaaaf>] tcp_rearm_rto+0xcf/0x150
> ...
> [  146.301158] Call Trace:
> [  146.301158]  [<ffffffff815ab7f0>] tcp_ack+0xcc0/0x12c0
> 
> Before this panic a tcp socket was restored. This socket had sent and
> unsent data in the write queue. Sent data was restored in repair mode,
> then the socket was switched from reapair mode and unsent data was
> restored. After that the socket was switched back into repair mode.
> 
> In that moment we had a socket where write queue looks like this:
> snd_una    snd_nxt   write_seq
>    |_________|________|
>              |
> 	  sk_send_head
> 
> After a second switching from repair mode the state of socket was
> changed:
> 
> snd_una          snd_nxt, write_seq
>    |_________ ________|
>              |
> 	  sk_send_head
> 
> This state is inconsistent, because snd_nxt and sk_send_head are not
> synchronized.
> 
> Bellow you can find a call trace, how packets_out can be incremented
> twice for one skb, if snd_nxt and sk_send_head are not synchronized.
> In this case packets_out will be always positive, even when
> sk_write_queue is empty.
> 
> tcp_write_wakeup
> 	skb = tcp_send_head(sk);
> 	tcp_fragment
> 		if (!before(tp->snd_nxt, TCP_SKB_CB(buff)->end_seq))
> 			tcp_adjust_pcount(sk, skb, diff);
> 	tcp_event_new_data_sent
> 		tp->packets_out += tcp_skb_pcount(skb);
> 
> I think update of snd_nxt isn't required, when a socket is switched from
> repair mode.  Because it's initialized in tcp_connect_init. Then when a
> write queue is restored, snd_nxt is incremented in tcp_event_new_data_sent,
> so it's always is in consistent state.
> 
> I have checked, that the bug is not reproduced with this patch and
> all tests about restoring tcp connections work fine.
> 
> Cc: Pavel Emelyanov <xemul@parallels.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
> Cc: James Morris <jmorris@namei.org>
> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
> Cc: Patrick McHardy <kaber@trash.net>
> Signed-off-by: Andrey Vagin <avagin@openvz.org>
> ---
>  net/ipv4/tcp_output.c | 1 -
>  1 file changed, 1 deletion(-)

Fixes: c0e88ff0f256 ("tcp: Repair socket queues")
Acked-by: Eric Dumazet <edumazet@google.com>




^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] tcp: don't update snd_nxt, when a socket is switched from repair mode
  2013-11-19 18:10 [PATCH] tcp: don't update snd_nxt, when a socket is switched from repair mode Andrey Vagin
  2013-11-19 18:20 ` Pavel Emelyanov
  2013-11-19 18:29 ` Eric Dumazet
@ 2013-11-19 21:15 ` David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: David Miller @ 2013-11-19 21:15 UTC (permalink / raw)
  To: avagin
  Cc: netdev, linux-kernel, criu, xemul, edumazet, kuznet, jmorris,
	yoshfuji, kaber

From: Andrey Vagin <avagin@openvz.org>
Date: Tue, 19 Nov 2013 22:10:06 +0400

> snd_nxt must be updated synchronously with sk_send_head.  Otherwise
> tp->packets_out may be updated incorrectly, what may bring a kernel panic.
> 
> Here is a kernel panic from my host.
> [  103.043194] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
> [  103.044025] IP: [<ffffffff815aaaaf>] tcp_rearm_rto+0xcf/0x150
> ...
> [  146.301158] Call Trace:
> [  146.301158]  [<ffffffff815ab7f0>] tcp_ack+0xcc0/0x12c0
> 
> Before this panic a tcp socket was restored. This socket had sent and
> unsent data in the write queue. Sent data was restored in repair mode,
> then the socket was switched from reapair mode and unsent data was
> restored. After that the socket was switched back into repair mode.
> 
> In that moment we had a socket where write queue looks like this:
> snd_una    snd_nxt   write_seq
>    |_________|________|
>              |
> 	  sk_send_head
> 
> After a second switching from repair mode the state of socket was
> changed:
> 
> snd_una          snd_nxt, write_seq
>    |_________ ________|
>              |
> 	  sk_send_head
> 
> This state is inconsistent, because snd_nxt and sk_send_head are not
> synchronized.
> 
> Bellow you can find a call trace, how packets_out can be incremented
> twice for one skb, if snd_nxt and sk_send_head are not synchronized.
> In this case packets_out will be always positive, even when
> sk_write_queue is empty.
> 
> tcp_write_wakeup
> 	skb = tcp_send_head(sk);
> 	tcp_fragment
> 		if (!before(tp->snd_nxt, TCP_SKB_CB(buff)->end_seq))
> 			tcp_adjust_pcount(sk, skb, diff);
> 	tcp_event_new_data_sent
> 		tp->packets_out += tcp_skb_pcount(skb);
> 
> I think update of snd_nxt isn't required, when a socket is switched from
> repair mode.  Because it's initialized in tcp_connect_init. Then when a
> write queue is restored, snd_nxt is incremented in tcp_event_new_data_sent,
> so it's always is in consistent state.
> 
> I have checked, that the bug is not reproduced with this patch and
> all tests about restoring tcp connections work fine.
> 
> Signed-off-by: Andrey Vagin <avagin@openvz.org>

Applied and queued up for -stable, thank you.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2013-11-19 21:15 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-19 18:10 [PATCH] tcp: don't update snd_nxt, when a socket is switched from repair mode Andrey Vagin
2013-11-19 18:20 ` Pavel Emelyanov
2013-11-19 18:29 ` Eric Dumazet
2013-11-19 21:15 ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).