All of lore.kernel.org
 help / color / mirror / Atom feed
* Suggesting patch for tcp_close
       [not found] <CGME20181123072258epcms1p8c41425598751be28f8cc05b91c7ac4b5@epcms1p8>
@ 2018-11-23  7:22 ` 배석진
       [not found]   ` <CGME20181123072258epcms1p8c41425598751be28f8cc05b91c7ac4b5@epcms1p7>
  0 siblings, 1 reply; 8+ messages in thread
From: 배석진 @ 2018-11-23  7:22 UTC (permalink / raw)
  To: netdev, lorenzo; +Cc: 유금환

Dear all,


This is Soukin Bae working on Samsung Elec. Mobile Division.
we have a problem with tcp closing.

in shortly, 
1. on 4-way handshking to close session
2. if ack pkt is not arrived from opposite side
3. then session can't be closed forever


in mobile device, condition 2 can be happend in various case.
like as turn off wifi or mobile data. or bad condition of air network, etc..

this could be occur in both side of connection.
when issue happened during active closing, the session remained with FIN_WAIT1 state.
and at passive closing, remained with LAST_ACK state.

---------------------------------------------------------------------------------------------
below is test result after wifi on/off repetition (without mobile data).
maybe 'Foreign Address' sent the fin-ack when wifi-off state.
so device coun't recieve ack pkt further, and the session is remained permanently.
and their count is growing up. this is resource leak.

### turn on wifi
D:\Test>adb shell netstat -npWae
Proto Recv-Q Send-Q Local Address                                 Foreign Address               State       User       Inode       PID/Program Name
tcp        0      0 127.0.0.1:5037                                0.0.0.0:*                     LISTEN      0          36357       6907/adbd
tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:58660   2404:6800:4008:c00::bc:5228   ESTABLISHED 10041      74347       6523/com.google.android.gms.persistent
tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35148   2404:6800:4004:800::2003:80   LAST_ACK    0          0           -
tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:37512   64:ff9b::3444:f3dc:443        ESTABLISHED 10137      77447       9522/com.samsung.android.game.gos
tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:49294   2404:6800:4005:80c::2004:443  LAST_ACK    0          0           -
tcp6       1      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35260   64:ff9b::34d0:9421:80         LAST_ACK    0          0           -

### turn off wifi
D:\Test>adb shell netstat -npWae
Proto Recv-Q Send-Q Local Address                                 Foreign Address               State       User       Inode       PID/Program Name
tcp        0      0 127.0.0.1:5037                                0.0.0.0:*                     LISTEN      0          36357       6907/adbd
tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35148   2404:6800:4004:800::2003:80   LAST_ACK    0          0           -
tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:49294   2404:6800:4005:80c::2004:443  LAST_ACK    0          0           -
tcp6       1      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35260   64:ff9b::34d0:9421:80         LAST_ACK    0          0           -


---------------------------------------------------------------------------------------------
this is our analysis
when app finished using the socket(tcp session), it calls sock_close.
then tcp_close() makes sk->sk_state to LAST_ACK, and sock to SOCK_DEAD by excute sock_orphan().

11-23 11:40:55.676 [5:      Thread-44:11210] TCP: bsj: tcp_set_state: TCP sk=ffffffc8a789c640, in:80092, State Close Wait -> Last ACK, [2404:6800:4004:800::2003]
11-23 11:40:55.676 [5:      Thread-44:11210] Call trace:
11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff8008ffdda8>] tcp_set_state+0x1b8/0x1f0
11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff8008ffe3f8>] tcp_close+0x484/0x534
11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff800902f830>] inet_release+0x60/0x74
11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff8009074238>] inet6_release+0x30/0x48
11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff8008f35e4c>] __sock_release+0x40/0x104
11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff8008f3bab0>] sock_close+0x18/0x28
11-23 11:40:55.678 [5:      Thread-44:11210] TCP: bsj: sock_orphan: TCP sk=ffffffc8a789c640, in:80092, State Last ACK, [2404:6800:4004:800::2003]


at this point, if the FIN_ACK comes, there's no problem. all is well~
but without that and when turn off wifi,
netd trying to close all the session by calling tcp_abort, sock_diag_destory.

11-23 11:41:38.463 [4:           netd: 5323] TCP: bsj: tcp_abort: SOCK_DEAD!!! : TCP sk=ffffffc8a789c640, in:0, State Last ACK, caller: <sock_diag_destroy>, [2404:6800:4004:800::2003]
11-23 11:41:38.464 [4:           netd: 5323] TCP: bsj: tcp_abort: SOCK_DEAD!!! : TCP sk=ffffffc8a789b840, in:0, State Last ACK, caller: <sock_diag_destroy>, [2404:6800:4005:80c::2004]
11-23 11:41:38.464 [4:           netd: 5323] TCP: bsj: tcp_abort: SOCK_DEAD!!! : TCP sk=ffffffc8a7899c40, in:0, State Last ACK, caller: <sock_diag_destroy>, [64:ff9b::34d0:9421]

but because of this sock was already changed to SOCK_DEAT state by tcp_close(), tcp_done() can't be excuted.
so this session can't be closed.

int tcp_abort(struct sock *sk, int err)
{
	...
	if (!sock_flag(sk, SOCK_DEAD)) {	//// when SOCK_DEAD, tcp_done() be skip.
		...
		sk->sk_error_report(sk);
		if (tcp_need_reset(sk->sk_state))
			tcp_send_active_reset(sk, GFP_ATOMIC);
		tcp_done(sk);
	}
	...
	return 0;
}


---------------------------------------------------------------------------------------------
wh thought that just sk_error_report have to reside in that condition, SOCK_DEAD.
and send-reset and tcp_done should to be always.
we fixed it like as below, and confirmed that issue resolved.
please check this.

Best regards,



>From 30f8d02f9d2ca8820a527444260e6dcf4862db8a Mon Sep 17 00:00:00 2001
From: soukjin bae <soukjin.bae@samsung.com>
Date: Fri, 23 Nov 2018 15:56:53 +0900
Subject: [PATCH] net: close session always when tcp_abort

session before recieve the FIN_ACK couldn't be closed by tcp_abort
they will remained permanently. close them

Signed-off-by: soukjin bae <soukjin.bae@samsung.com>
Signed-off-by: geumhwan you <geumhwan.you@samsung.com>
---
 net/ipv4/tcp.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 9e6bc4d6daa7..faf4a8bbec8e 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3792,11 +3792,12 @@ int tcp_abort(struct sock *sk, int err)
 		/* This barrier is coupled with smp_rmb() in tcp_poll() */
 		smp_wmb();
 		sk->sk_error_report(sk);
-		if (tcp_need_reset(sk->sk_state))
-			tcp_send_active_reset(sk, GFP_ATOMIC);
-		tcp_done(sk);
 	}
 
+	if (tcp_need_reset(sk->sk_state))
+		tcp_send_active_reset(sk, GFP_ATOMIC);
+	tcp_done(sk);
+
 	bh_unlock_sock(sk);
 	local_bh_enable();
 	tcp_write_queue_purge(sk);
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* FW: [Resource Leak] Suggesting patch for tcp_close
       [not found]   ` <CGME20181123072258epcms1p8c41425598751be28f8cc05b91c7ac4b5@epcms1p7>
@ 2018-11-28  1:09     ` 배석진
  2018-11-28  4:57       ` Eric Dumazet
       [not found]       ` <CGME20181123072258epcms1p8c41425598751be28f8cc05b91c7ac4b5@epcms1p1>
  0 siblings, 2 replies; 8+ messages in thread
From: 배석진 @ 2018-11-28  1:09 UTC (permalink / raw)
  To: netdev, lorenzo; +Cc: 배석진, 유금환

Hello all,

nobody concern about this? minor problem? :(
we saw hundreds of not closed tcp session with FIN_WAIT1 and LAST_ACK.

and easily reproduced in wifi network because of android gms... :p
i heard that gms try connect to their server to confirm wifi connection.
and they can't recieve the fin-ack frequently.

thanks,

 
--------- Original Message ---------
Sender : 배석진 <soukjin.bae@samsung.com> Staff Engineer/System개발2그룹(무선)/삼성전자
Date   : 2018-11-23 16:22 (GMT+9)
Title  : Suggesting patch for tcp_close
 
Dear all,
 
 
This is Soukin Bae working on Samsung Elec. Mobile Division.
we have a problem with tcp closing.
 
in shortly, 
1. on 4-way handshking to close session
2. if ack pkt is not arrived from opposite side
3. then session can't be closed forever
 
 
in mobile device, condition 2 can be happend in various case.
like as turn off wifi or mobile data. or bad condition of air network, etc..
 
this could be occur in both side of connection.
when issue happened during active closing, the session remained with FIN_WAIT1 state.
and at passive closing, remained with LAST_ACK state.
 
---------------------------------------------------------------------------------------------
below is test result after wifi on/off repetition (without mobile data).
maybe 'Foreign Address' sent the fin-ack when wifi-off state.
so device coun't recieve ack pkt further, and the session is remained permanently.
and their count is growing up. this is resource leak.
 
### turn on wifi
D:\Test>adb shell netstat -npWae
Proto Recv-Q Send-Q Local Address                                 Foreign Address               State       User       Inode       PID/Program Name
tcp        0      0 127.0.0.1:5037                                0.0.0.0:*                     LISTEN      0          36357       6907/adbd
tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:58660   2404:6800:4008:c00::bc:5228   ESTABLISHED 10041      74347       6523/com.google.android.gms.persistent
tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35148   2404:6800:4004:800::2003:80   LAST_ACK    0          0           -
tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:37512   64:ff9b::3444:f3dc:443        ESTABLISHED 10137      77447       9522/com.samsung.android.game.gos
tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:49294   2404:6800:4005:80c::2004:443  LAST_ACK    0          0           -
tcp6       1      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35260   64:ff9b::34d0:9421:80         LAST_ACK    0          0           -
 
### turn off wifi
D:\Test>adb shell netstat -npWae
Proto Recv-Q Send-Q Local Address                                 Foreign Address               State       User       Inode       PID/Program Name
tcp        0      0 127.0.0.1:5037                                0.0.0.0:*                     LISTEN      0          36357       6907/adbd
tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35148   2404:6800:4004:800::2003:80   LAST_ACK    0          0           -
tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:49294   2404:6800:4005:80c::2004:443  LAST_ACK    0          0           -
tcp6       1      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35260   64:ff9b::34d0:9421:80         LAST_ACK    0          0           -
 
 
---------------------------------------------------------------------------------------------
this is our analysis
when app finished using the socket(tcp session), it calls sock_close.
then tcp_close() makes sk->sk_state to LAST_ACK, and sock to SOCK_DEAD by excute sock_orphan().
 
11-23 11:40:55.676 [5:      Thread-44:11210] TCP: bsj: tcp_set_state: TCP sk=ffffffc8a789c640, in:80092, State Close Wait -> Last ACK, [2404:6800:4004:800::2003]
11-23 11:40:55.676 [5:      Thread-44:11210] Call trace:
11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff8008ffdda8>] tcp_set_state+0x1b8/0x1f0
11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff8008ffe3f8>] tcp_close+0x484/0x534
11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff800902f830>] inet_release+0x60/0x74
11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff8009074238>] inet6_release+0x30/0x48
11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff8008f35e4c>] __sock_release+0x40/0x104
11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff8008f3bab0>] sock_close+0x18/0x28
11-23 11:40:55.678 [5:      Thread-44:11210] TCP: bsj: sock_orphan: TCP sk=ffffffc8a789c640, in:80092, State Last ACK, [2404:6800:4004:800::2003]
 
 
at this point, if the FIN_ACK comes, there's no problem. all is well~
but without that and when turn off wifi,
netd trying to close all the session by calling tcp_abort, sock_diag_destory.
 
11-23 11:41:38.463 [4:           netd: 5323] TCP: bsj: tcp_abort: SOCK_DEAD!!! : TCP sk=ffffffc8a789c640, in:0, State Last ACK, caller: <sock_diag_destroy>, [2404:6800:4004:800::2003]
11-23 11:41:38.464 [4:           netd: 5323] TCP: bsj: tcp_abort: SOCK_DEAD!!! : TCP sk=ffffffc8a789b840, in:0, State Last ACK, caller: <sock_diag_destroy>, [2404:6800:4005:80c::2004]
11-23 11:41:38.464 [4:           netd: 5323] TCP: bsj: tcp_abort: SOCK_DEAD!!! : TCP sk=ffffffc8a7899c40, in:0, State Last ACK, caller: <sock_diag_destroy>, [64:ff9b::34d0:9421]
 
but because of this sock was already changed to SOCK_DEAD state by tcp_close(), tcp_done() can't be excuted.
so this session can't be closed.
 
int tcp_abort(struct sock *sk, int err)
{
        ...
        if (!sock_flag(sk, SOCK_DEAD)) {        //// when SOCK_DEAD, tcp_done() be skip.
                ...
                sk->sk_error_report(sk);
                if (tcp_need_reset(sk->sk_state))
                        tcp_send_active_reset(sk, GFP_ATOMIC);
                tcp_done(sk);
        }
        ...
        return 0;
}
 
 
---------------------------------------------------------------------------------------------
we thought that just sk_error_report have to reside in that condition, SOCK_DEAD.
and send-reset and tcp_done should to be always.
we fixed it like as below, and confirmed that issue resolved.
please check this.
 
Best regards,
 
 
 
From 30f8d02f9d2ca8820a527444260e6dcf4862db8a Mon Sep 17 00:00:00 2001
From: soukjin bae <soukjin.bae@samsung.com>
Date: Fri, 23 Nov 2018 15:56:53 +0900
Subject: [PATCH] net: close session always when tcp_abort
 
session before recieve the FIN_ACK couldn't be closed by tcp_abort
they will remained permanently. close them
 
Signed-off-by: soukjin bae <soukjin.bae@samsung.com>
Signed-off-by: geumhwan yu <geumhwan.yu@samsung.com>
---
 net/ipv4/tcp.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
 
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 9e6bc4d6daa7..faf4a8bbec8e 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3792,11 +3792,12 @@ int tcp_abort(struct sock *sk, int err)
                 /* This barrier is coupled with smp_rmb() in tcp_poll() */
                 smp_wmb();
                 sk->sk_error_report(sk);
-                if (tcp_need_reset(sk->sk_state))
-                        tcp_send_active_reset(sk, GFP_ATOMIC);
-                tcp_done(sk);
         }
 
+        if (tcp_need_reset(sk->sk_state))
+                tcp_send_active_reset(sk, GFP_ATOMIC);
+        tcp_done(sk);
+
         bh_unlock_sock(sk);
         local_bh_enable();
         tcp_write_queue_purge(sk);
-- 
2.13.0
 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: FW: [Resource Leak] Suggesting patch for tcp_close
  2018-11-28  1:09     ` FW: [Resource Leak] " 배석진
@ 2018-11-28  4:57       ` Eric Dumazet
  2018-11-28  4:58         ` Eric Dumazet
       [not found]       ` <CGME20181123072258epcms1p8c41425598751be28f8cc05b91c7ac4b5@epcms1p1>
  1 sibling, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2018-11-28  4:57 UTC (permalink / raw)
  To: soukjin.bae, netdev, lorenzo; +Cc: 유금환



On 11/27/2018 05:09 PM, 배석진 wrote:
> Hello all,
> 
> nobody concern about this? minor problem? :(
> we saw hundreds of not closed tcp session with FIN_WAIT1 and LAST_ACK.

These sessions should have a timer, and eventually disappear.

Do you have a test to demonstrate the issue ?

I know Lorenzo wrote tests, so presumably new tests are needed.

> 
> and easily reproduced in wifi network because of android gms... :p
> i heard that gms try connect to their server to confirm wifi connection.
> and they can't recieve the fin-ack frequently.
> 
> thanks,
> 
>  
> --------- Original Message ---------
> Sender : 배석진 <soukjin.bae@samsung.com> Staff Engineer/System개발2그룹(무선)/삼성전자
> Date   : 2018-11-23 16:22 (GMT+9)
> Title  : Suggesting patch for tcp_close
>  
> Dear all,
>  
>  
> This is Soukin Bae working on Samsung Elec. Mobile Division.
> we have a problem with tcp closing.
>  
> in shortly, 
> 1. on 4-way handshking to close session
> 2. if ack pkt is not arrived from opposite side
> 3. then session can't be closed forever
>  
>  
> in mobile device, condition 2 can be happend in various case.
> like as turn off wifi or mobile data. or bad condition of air network, etc..
>  
> this could be occur in both side of connection.
> when issue happened during active closing, the session remained with FIN_WAIT1 state.
> and at passive closing, remained with LAST_ACK state.
>  
> ---------------------------------------------------------------------------------------------
> below is test result after wifi on/off repetition (without mobile data).
> maybe 'Foreign Address' sent the fin-ack when wifi-off state.
> so device coun't recieve ack pkt further, and the session is remained permanently.
> and their count is growing up. this is resource leak.
>  
> ### turn on wifi
> D:\Test>adb shell netstat -npWae
> Proto Recv-Q Send-Q Local Address                                 Foreign Address               State       User       Inode       PID/Program Name
> tcp        0      0 127.0.0.1:5037                                0.0.0.0:*                     LISTEN      0          36357       6907/adbd
> tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:58660   2404:6800:4008:c00::bc:5228   ESTABLISHED 10041      74347       6523/com.google.android.gms.persistent
> tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35148   2404:6800:4004:800::2003:80   LAST_ACK    0          0           -
> tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:37512   64:ff9b::3444:f3dc:443        ESTABLISHED 10137      77447       9522/com.samsung.android.game.gos
> tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:49294   2404:6800:4005:80c::2004:443  LAST_ACK    0          0           -
> tcp6       1      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35260   64:ff9b::34d0:9421:80         LAST_ACK    0          0           -
>  
> ### turn off wifi
> D:\Test>adb shell netstat -npWae
> Proto Recv-Q Send-Q Local Address                                 Foreign Address               State       User       Inode       PID/Program Name
> tcp        0      0 127.0.0.1:5037                                0.0.0.0:*                     LISTEN      0          36357       6907/adbd
> tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35148   2404:6800:4004:800::2003:80   LAST_ACK    0          0           -
> tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:49294   2404:6800:4005:80c::2004:443  LAST_ACK    0          0           -
> tcp6       1      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35260   64:ff9b::34d0:9421:80         LAST_ACK    0          0           -
>  
>  
> ---------------------------------------------------------------------------------------------
> this is our analysis
> when app finished using the socket(tcp session), it calls sock_close.
> then tcp_close() makes sk->sk_state to LAST_ACK, and sock to SOCK_DEAD by excute sock_orphan().
>  
> 11-23 11:40:55.676 [5:      Thread-44:11210] TCP: bsj: tcp_set_state: TCP sk=ffffffc8a789c640, in:80092, State Close Wait -> Last ACK, [2404:6800:4004:800::2003]
> 11-23 11:40:55.676 [5:      Thread-44:11210] Call trace:
> 11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff8008ffdda8>] tcp_set_state+0x1b8/0x1f0
> 11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff8008ffe3f8>] tcp_close+0x484/0x534
> 11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff800902f830>] inet_release+0x60/0x74
> 11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff8009074238>] inet6_release+0x30/0x48
> 11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff8008f35e4c>] __sock_release+0x40/0x104
> 11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff8008f3bab0>] sock_close+0x18/0x28
> 11-23 11:40:55.678 [5:      Thread-44:11210] TCP: bsj: sock_orphan: TCP sk=ffffffc8a789c640, in:80092, State Last ACK, [2404:6800:4004:800::2003]
>  
>  
> at this point, if the FIN_ACK comes, there's no problem. all is well~
> but without that and when turn off wifi,
> netd trying to close all the session by calling tcp_abort, sock_diag_destory.
>  
> 11-23 11:41:38.463 [4:           netd: 5323] TCP: bsj: tcp_abort: SOCK_DEAD!!! : TCP sk=ffffffc8a789c640, in:0, State Last ACK, caller: <sock_diag_destroy>, [2404:6800:4004:800::2003]
> 11-23 11:41:38.464 [4:           netd: 5323] TCP: bsj: tcp_abort: SOCK_DEAD!!! : TCP sk=ffffffc8a789b840, in:0, State Last ACK, caller: <sock_diag_destroy>, [2404:6800:4005:80c::2004]
> 11-23 11:41:38.464 [4:           netd: 5323] TCP: bsj: tcp_abort: SOCK_DEAD!!! : TCP sk=ffffffc8a7899c40, in:0, State Last ACK, caller: <sock_diag_destroy>, [64:ff9b::34d0:9421]
>  
> but because of this sock was already changed to SOCK_DEAD state by tcp_close(), tcp_done() can't be excuted.
> so this session can't be closed.
>  
> int tcp_abort(struct sock *sk, int err)
> {
>         ...
>         if (!sock_flag(sk, SOCK_DEAD)) {        //// when SOCK_DEAD, tcp_done() be skip.
>                 ...
>                 sk->sk_error_report(sk);
>                 if (tcp_need_reset(sk->sk_state))
>                         tcp_send_active_reset(sk, GFP_ATOMIC);
>                 tcp_done(sk);
>         }
>         ...
>         return 0;
> }
>  
>  
> ---------------------------------------------------------------------------------------------
> we thought that just sk_error_report have to reside in that condition, SOCK_DEAD.
> and send-reset and tcp_done should to be always.
> we fixed it like as below, and confirmed that issue resolved.
> please check this.
>  
> Best regards,
>  
>  
>  
> From 30f8d02f9d2ca8820a527444260e6dcf4862db8a Mon Sep 17 00:00:00 2001
> From: soukjin bae <soukjin.bae@samsung.com>
> Date: Fri, 23 Nov 2018 15:56:53 +0900
> Subject: [PATCH] net: close session always when tcp_abort
>  
> session before recieve the FIN_ACK couldn't be closed by tcp_abort
> they will remained permanently. close them
>  
> Signed-off-by: soukjin bae <soukjin.bae@samsung.com>
> Signed-off-by: geumhwan yu <geumhwan.yu@samsung.com>
> ---
>  net/ipv4/tcp.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>  
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 9e6bc4d6daa7..faf4a8bbec8e 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -3792,11 +3792,12 @@ int tcp_abort(struct sock *sk, int err)
>                  /* This barrier is coupled with smp_rmb() in tcp_poll() */
>                  smp_wmb();
>                  sk->sk_error_report(sk);
> -                if (tcp_need_reset(sk->sk_state))
> -                        tcp_send_active_reset(sk, GFP_ATOMIC);
> -                tcp_done(sk);
>          }
>  
> +        if (tcp_need_reset(sk->sk_state))
> +                tcp_send_active_reset(sk, GFP_ATOMIC);
> +        tcp_done(sk);
> +
>          bh_unlock_sock(sk);
>          local_bh_enable();
>          tcp_write_queue_purge(sk);
> -- 
> 2.13.0
>  
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: FW: [Resource Leak] Suggesting patch for tcp_close
  2018-11-28  4:57       ` Eric Dumazet
@ 2018-11-28  4:58         ` Eric Dumazet
  0 siblings, 0 replies; 8+ messages in thread
From: Eric Dumazet @ 2018-11-28  4:58 UTC (permalink / raw)
  To: soukjin.bae, netdev, lorenzo; +Cc: 유금환



On 11/27/2018 08:57 PM, Eric Dumazet wrote:
> 
> 
> On 11/27/2018 05:09 PM, 배석진 wrote:
>> Hello all,
>>
>> nobody concern about this? minor problem? :(
>> we saw hundreds of not closed tcp session with FIN_WAIT1 and LAST_ACK.
> 
> These sessions should have a timer, and eventually disappear.
> 
> Do you have a test to demonstrate the issue ?
> 
> I know Lorenzo wrote tests, so presumably new tests are needed.

BTW your patch title is confusing.
You are changing tcp_abort(), not tcp_close()

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE:(2) FW: [Resource Leak] Suggesting patch for tcp_close
       [not found]       ` <CGME20181123072258epcms1p8c41425598751be28f8cc05b91c7ac4b5@epcms1p1>
@ 2018-11-28  6:17         ` 배석진
  2018-11-28  7:36           ` (2) " Lorenzo Colitti
                             ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: 배석진 @ 2018-11-28  6:17 UTC (permalink / raw)
  To: Eric Dumazet, netdev, lorenzo
  Cc: 유금환, 배석진

>> we saw hundreds of not closed tcp session with FIN_WAIT1 and LAST_ACK.
> 
> These sessions should have a timer, and eventually disappear.

FIN_WAIT2 and TIME_WAIT have a timer.
but FIN_WAIT1 and LAST_ACK are have too?

> Do you have a test to demonstrate the issue ?
>  
> I know Lorenzo wrote tests, so presumably new tests are needed.

yes, i had a test myself, and attached the result.
step is easy, with android device.

1. remove simcard (no mobile online)
2. connect wifi and check tcp sessions
3. disconnect wifi and check again.
4. repeat step 2~3

as i said, gms make this problem often at port 443 or 5228. 
i've confirm session owner is them by pid, when session was ESTABLISH state.
and i found this at speedguide.net

- Port(s)  : 5228
- Protocol : tcp,udp
- Service  : android
- Details  : Port 5228 is used by the Google Playstore (Android market). 
             Google talk also uses ports 443, 5222 and 5228. 
             Google Chrome user settings sync (facorites, history, passwords) uses port 5228. 
- Related ports : 443 5222


below is report we have received.
i omitted full list shortly. its list up to 900 line.

------ NETSTAT (netstat -npWae) ------
Active Internet connections (established and servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       User  Inode   PID/Program Name
tcp   0      0      127.0.0.1:5037              0.0.0.0:*                   LISTEN      0     617046  2768/adbd
tcp   0      1      192.168.0.112:53828         211.231.100.211:443         FIN_WAIT1   0     0       -
tcp   0      1      192.168.0.112:33908         27.0.236.141:443            FIN_WAIT1   0     0       -
tcp   0      1      192.168.0.112:58480         183.111.43.14:443           LAST_ACK    0     0       -
tcp   0      1      192.168.0.112:56558         203.133.167.207:443         FIN_WAIT1   0     0       -
...                                                                                                  
tcp6  0     32      ::ffff:192.168.0.112:50072  ::ffff:74.125.203.188:5228  FIN_WAIT1   0     0       -
tcp6  0      1      ::ffff:192.168.1.5:59744    ::ffff:216.58.197.238:443   FIN_WAIT1   0     0       -
tcp6  0     32      ::ffff:192.168.0.112:40512  ::ffff:108.177.125.188:5228 FIN_WAIT1   0     0       -
tcp6  0     32      ::ffff:192.168.0.112:46560  ::ffff:108.177.97.188:5228  FIN_WAIT1   0     0       -


> BTW your patch title is confusing.
> You are changing tcp_abort(), not tcp_close()

yes right, i meant tcp session close as title.
but my finger put the underbar himself :o

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: (2) FW: [Resource Leak] Suggesting patch for tcp_close
  2018-11-28  6:17         ` 배석진
@ 2018-11-28  7:36           ` Lorenzo Colitti
       [not found]           ` <CGME20181123072258epcms1p8c41425598751be28f8cc05b91c7ac4b5@epcms1p6>
  2018-11-28 14:40           ` Eric Dumazet
  2 siblings, 0 replies; 8+ messages in thread
From: Lorenzo Colitti @ 2018-11-28  7:36 UTC (permalink / raw)
  To: 배석진
  Cc: Eric Dumazet, netdev, 유금환, Maciej Żenczykowski

On Tue, Nov 27, 2018 at 10:17 PM 배석진 <soukjin.bae@samsung.com> wrote:
> >> we saw hundreds of not closed tcp session with FIN_WAIT1 and LAST_ACK.
> >
> > These sessions should have a timer, and eventually disappear.
>
> FIN_WAIT2 and TIME_WAIT have a timer.
> but FIN_WAIT1 and LAST_ACK are have too?

What harm is caused by these stale sessions? I thought that was the
intended behaviour.

If you look at the original design discussions that led to the
SOCK_DESTROY and tcp_abort patch, the goal of SOCK_DESTROY was not
primarily to clear TCP state, but primarily to unblock applications
that were stuck in blocking socket operations such as read(), write()
and connect. That is the reason why it only calls tcp_done if the
SOCK_DEAD flag is not set. See in particular
https://www.spinics.net/lists/netdev/msg352716.html , where opposition
was voiced to being able to close sockets in TIME_WAIT_STATE. That
said, I don't have a strong opinion on this: whatever works for Eric
works for me.

> > Do you have a test to demonstrate the issue ?
> >
> > I know Lorenzo wrote tests, so presumably new tests are needed.

There are definitely tests that check that SOCK_DESTROY does nothing
for dead sockets (i.e., the current behaviour without this patch).
Those tests are part of the Android conformance test suites (VTS), so
I think taking this patch will cause the device to fail the Android
conformance here:

https://cs.corp.google.com/android/kernel/tests/net/test/sock_diag_test.py?l=663

Soukjin, you will need to modify that test if this patch is applied.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE:(2) (2) FW: [Resource Leak] Suggesting patch for tcp_close
       [not found]           ` <CGME20181123072258epcms1p8c41425598751be28f8cc05b91c7ac4b5@epcms1p6>
@ 2018-11-28  8:49             ` 배석진
  0 siblings, 0 replies; 8+ messages in thread
From: 배석진 @ 2018-11-28  8:49 UTC (permalink / raw)
  To: Lorenzo Colitti
  Cc: Eric Dumazet, netdev, 유금환,
	Maciej Żenczykowski, 배석진

> What harm is caused by these stale sessions? I thought that was the
> intended behaviour.
>

our system stability guys concern about this.
when its count grows up too much, whether it can be harm to system or not.

> If you look at the original design discussions that led to the
> SOCK_DESTROY and tcp_abort patch, the goal of SOCK_DESTROY was not
> primarily to clear TCP state, but primarily to unblock applications
> that were stuck in blocking socket operations such as read(), write()
> and connect. That is the reason why it only calls tcp_done if the
> SOCK_DEAD flag is not set. See in particular
> https://www.spinics.net/lists/netdev/msg352716.html , where opposition
> was voiced to being able to close sockets in TIME_WAIT_STATE. That
> said, I don't have a strong opinion on this: whatever works for Eric
> works for me.
> 

ok, i got your saying about purpose of SOCK_DESTROY.
maybe we have two options.
add a new point that can managing those tcp sessions.
or, just let it them.

thanks!

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: (2) FW: [Resource Leak] Suggesting patch for tcp_close
  2018-11-28  6:17         ` 배석진
  2018-11-28  7:36           ` (2) " Lorenzo Colitti
       [not found]           ` <CGME20181123072258epcms1p8c41425598751be28f8cc05b91c7ac4b5@epcms1p6>
@ 2018-11-28 14:40           ` Eric Dumazet
  2 siblings, 0 replies; 8+ messages in thread
From: Eric Dumazet @ 2018-11-28 14:40 UTC (permalink / raw)
  To: soukjin.bae, Eric Dumazet, netdev, lorenzo; +Cc: 유금환



On 11/27/2018 10:17 PM, 배석진 wrote:
>>> we saw hundreds of not closed tcp session with FIN_WAIT1 and LAST_ACK.
>>
>> These sessions should have a timer, and eventually disappear.
> 
> FIN_WAIT2 and TIME_WAIT have a timer.
> but FIN_WAIT1 and LAST_ACK are have too?

Sure. Otherwise we have a more serious bug than tcp_abort().


> 
>> Do you have a test to demonstrate the issue ?
>>  
>> I know Lorenzo wrote tests, so presumably new tests are needed.
> 
> yes, i had a test myself, and attached the result.
> step is easy, with android device.
> 
> 1. remove simcard (no mobile online)
> 2. connect wifi and check tcp sessions
> 3. disconnect wifi and check again.
> 4. repeat step 2~3
> 

A real test is something that can be automated.

Like a packetdrill test.

> as i said, gms make this problem often at port 443 or 5228. 
> i've confirm session owner is them by pid, when session was ESTABLISH state.
> and i found this at speedguide.net
> 
> - Port(s)  : 5228
> - Protocol : tcp,udp
> - Service  : android
> - Details  : Port 5228 is used by the Google Playstore (Android market). 
>              Google talk also uses ports 443, 5222 and 5228. 
>              Google Chrome user settings sync (facorites, history, passwords) uses port 5228. 
> - Related ports : 443 5222
> 
> 
> below is report we have received.
> i omitted full list shortly. its list up to 900 line.
> 
> ------ NETSTAT (netstat -npWae) ------
> Active Internet connections (established and servers)
> Proto Recv-Q Send-Q Local Address               Foreign Address             State       User  Inode   PID/Program Name
> tcp   0      0      127.0.0.1:5037              0.0.0.0:*                   LISTEN      0     617046  2768/adbd
> tcp   0      1      192.168.0.112:53828         211.231.100.211:443         FIN_WAIT1   0     0       -
> tcp   0      1      192.168.0.112:33908         27.0.236.141:443            FIN_WAIT1   0     0       -
> tcp   0      1      192.168.0.112:58480         183.111.43.14:443           LAST_ACK    0     0       -
> tcp   0      1      192.168.0.112:56558         203.133.167.207:443         FIN_WAIT1   0     0       -
> ...                                                                                                  
> tcp6  0     32      ::ffff:192.168.0.112:50072  ::ffff:74.125.203.188:5228  FIN_WAIT1   0     0       -
> tcp6  0      1      ::ffff:192.168.1.5:59744    ::ffff:216.58.197.238:443   FIN_WAIT1   0     0       -
> tcp6  0     32      ::ffff:192.168.0.112:40512  ::ffff:108.177.125.188:5228 FIN_WAIT1   0     0       -
> tcp6  0     32      ::ffff:192.168.0.112:46560  ::ffff:108.177.97.188:5228  FIN_WAIT1   0     0       -
> 
> 
>> BTW your patch title is confusing.
>> You are changing tcp_abort(), not tcp_close()
> 
> yes right, i meant tcp session close as title.
> but my finger put the underbar himself :o
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-11-29  1:42 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CGME20181123072258epcms1p8c41425598751be28f8cc05b91c7ac4b5@epcms1p8>
2018-11-23  7:22 ` Suggesting patch for tcp_close 배석진
     [not found]   ` <CGME20181123072258epcms1p8c41425598751be28f8cc05b91c7ac4b5@epcms1p7>
2018-11-28  1:09     ` FW: [Resource Leak] " 배석진
2018-11-28  4:57       ` Eric Dumazet
2018-11-28  4:58         ` Eric Dumazet
     [not found]       ` <CGME20181123072258epcms1p8c41425598751be28f8cc05b91c7ac4b5@epcms1p1>
2018-11-28  6:17         ` 배석진
2018-11-28  7:36           ` (2) " Lorenzo Colitti
     [not found]           ` <CGME20181123072258epcms1p8c41425598751be28f8cc05b91c7ac4b5@epcms1p6>
2018-11-28  8:49             ` 배석진
2018-11-28 14:40           ` Eric Dumazet

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.