linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] SUNRPC: fix shutdown of NFS TCP client socket
@ 2023-03-08 19:01 Siddharth Kawar
  2023-03-14 19:29 ` Trond Myklebust
  0 siblings, 1 reply; 3+ messages in thread
From: Siddharth Kawar @ 2023-03-08 19:01 UTC (permalink / raw)
  To: trond.myklebust, anna.schumaker, bfields, chuck.lever, davem,
	kuba, linux-nfs, netdev, linux-kernel
  Cc: Dave Bauer, Siddharth Kawar

Subject: [PATCH] SUNRPC: fix shutdown of NFS TCP client socket

NFS server Duplicate Request Cache (DRC) algorithms rely on NFS clients
reconnecting using the same local TCP port. Unique NFS operations are
identified by the per-TCP connection set of XIDs. This prevents file
corruption when non-idempotent NFS operations are retried.

Currently, NFS client TCP connections are using different local TCP ports
when reconnecting to NFS servers.

After an NFS server initiates shutdown of the TCP connection, the NFS
client's TCP socket is set to NULL after the socket state has reached
TCP_LAST_ACK(9). When reconnecting, the new socket attempts to reuse
the same local port but fails with EADDRNOTAVAIL (99). This forces the
socket to use a different local TCP port to reconnect to the remote NFS
server.

State Transition and Events:
TCP_CLOSING(8)
TCP_LAST_ACK(9)
connect(fail EADDRNOTAVAIL(99))
TCP_CLOSE(7)
bind on new port
connect success

dmesg excerpts showing reconnect switching from local port of 688 to 1014:
[590701.200229] NFS: mkdir(0:61/560857152), testQ
[590701.200231] NFS call  mkdir testQ
[590701.200259] RPC:       xs_tcp_send_request(224) = 0
[590751.883111] RPC:       xs_tcp_state_change client 0000000051f4e000...
[590751.883146] RPC:       state 8 conn 1 dead 0 zapped 1 sk_shutdown 1
[590751.883160] RPC:       xs_data_ready...
[590751.883232] RPC:       xs_tcp_state_change client 0000000051f4e000...
[590751.883235] RPC:       state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
[590751.883238] RPC:       xs_tcp_state_change client 0000000051f4e000...
[590751.883239] RPC:       state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
[590751.883283] RPC:       xs_connect scheduled xprt 0000000051f4e000
[590751.883314] RPC:       xs_bind 0.0.0.0:688: ok (0)
[590751.883321] RPC:       worker connecting xprt 0000000051f4e000 via tcp
			   to 10.101.1.30 (port 2049)
[590751.883330] RPC:       0000000051f4e000 connect status 99 connected 0
			   sock state 7
[590751.883342] RPC:       xs_tcp_state_change client 0000000051f4e000...
[590751.883343] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
[590751.883356] RPC:       xs_connect scheduled xprt 0000000051f4e000
[590751.883383] RPC:       xs_bind 0.0.0.0:1014: ok (0)
[590751.883388] RPC:       worker connecting xprt 0000000051f4e000 via tcp
			   to 10.101.1.30 (port 2049)
[590751.883420] RPC:       0000000051f4e000 connect status 115 connected 0
			   sock state 2
...


State Transition and Events with patch applied:
TCP_CLOSING(8)
TCP_LAST_ACK(9)
TCP_CLOSE(7)
connect(reuse of port succeeds)

dmesg excerpts showing reconnect on same port (936):
[  257.139935] NFS: mkdir(0:59/560857152), testQ
[  257.139937] NFS call  mkdir testQ
...
[  307.822702] RPC:       state 8 conn 1 dead 0 zapped 1 sk_shutdown 1
[  307.822714] RPC:       xs_data_ready...
[  307.822817] RPC:       xs_tcp_state_change client 00000000ce702f14...
[  307.822821] RPC:       state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
[  307.822825] RPC:       xs_tcp_state_change client 00000000ce702f14...
[  307.822826] RPC:       state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
[  307.823606] RPC:       xs_tcp_state_change client 00000000ce702f14...
[  307.823609] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
[  307.823629] RPC:       xs_tcp_state_change client 00000000ce702f14...
[  307.823632] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
[  307.823676] RPC:       xs_connect scheduled xprt 00000000ce702f14
[  307.823704] RPC:       xs_bind 0.0.0.0:936: ok (0)
[  307.823709] RPC:       worker connecting xprt 00000000ce702f14 via tcp
			  to 10.101.1.30 (port 2049)
[  307.823748] RPC:       00000000ce702f14 connect status 115 connected 0
			  sock state 2
...
[  314.916193] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
[  314.916251] RPC:       xs_connect scheduled xprt 00000000ce702f14
[  314.916282] RPC:       xs_bind 0.0.0.0:936: ok (0)
[  314.916292] RPC:       worker connecting xprt 00000000ce702f14 via tcp
			  to 10.101.1.30 (port 2049)
[  314.916342] RPC:       00000000ce702f14 connect status 115 connected 0
			  sock state 2

Fixes: 7c81e6a9d75b (SUNRPC: Tweak TCP socket shutdown in the RPC client)
Signed-off-by: Siddharth Rajendra Kawar <sikawar@microsoft.com>
---
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index d8ee06a9650a..e55e380bc10c 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2094,6 +2094,7 @@ static void xs_tcp_shutdown(struct rpc_xprt *xprt)
		break;
	case TCP_ESTABLISHED:
	case TCP_CLOSE_WAIT:
+	case TCP_LAST_ACK:
		kernel_sock_shutdown(sock, SHUT_RDWR);
		trace_rpc_socket_shutdown(xprt, sock);
		break;

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] SUNRPC: fix shutdown of NFS TCP client socket
  2023-03-08 19:01 [PATCH] SUNRPC: fix shutdown of NFS TCP client socket Siddharth Kawar
@ 2023-03-14 19:29 ` Trond Myklebust
  2023-03-15 19:34   ` Siddharth Kawar
  0 siblings, 1 reply; 3+ messages in thread
From: Trond Myklebust @ 2023-03-14 19:29 UTC (permalink / raw)
  To: Siddharth Kawar
  Cc: Anna Schumaker, Bruce Fields, Charles Edward Lever,
	David S. Miller, kuba, linux-nfs, netdev, linux-kernel,
	Dave Bauer



> On Mar 8, 2023, at 14:01, Siddharth Kawar <Siddharth.Kawar@microsoft.com> wrote:
> 
> [You don't often get email from siddharth.kawar@microsoft.com. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]
> 
> Subject: [PATCH] SUNRPC: fix shutdown of NFS TCP client socket
> 
> NFS server Duplicate Request Cache (DRC) algorithms rely on NFS clients
> reconnecting using the same local TCP port. Unique NFS operations are
> identified by the per-TCP connection set of XIDs. This prevents file
> corruption when non-idempotent NFS operations are retried.
> 
> Currently, NFS client TCP connections are using different local TCP ports
> when reconnecting to NFS servers.
> 
> After an NFS server initiates shutdown of the TCP connection, the NFS
> client's TCP socket is set to NULL after the socket state has reached
> TCP_LAST_ACK(9). When reconnecting, the new socket attempts to reuse
> the same local port but fails with EADDRNOTAVAIL (99). This forces the
> socket to use a different local TCP port to reconnect to the remote NFS
> server.
> 
> State Transition and Events:
> TCP_CLOSING(8)

No: TCP state == 8 is TCP_CLOSE_WAIT

> TCP_LAST_ACK(9)
> connect(fail EADDRNOTAVAIL(99))
> TCP_CLOSE(7)
> bind on new port
> connect success
> 
> dmesg excerpts showing reconnect switching from local port of 688 to 1014:
> [590701.200229] NFS: mkdir(0:61/560857152), testQ
> [590701.200231] NFS call  mkdir testQ
> [590701.200259] RPC:       xs_tcp_send_request(224) = 0
> [590751.883111] RPC:       xs_tcp_state_change client 0000000051f4e000...
> [590751.883146] RPC:       state 8 conn 1 dead 0 zapped 1 sk_shutdown 1
> [590751.883160] RPC:       xs_data_ready...
> [590751.883232] RPC:       xs_tcp_state_change client 0000000051f4e000...
> [590751.883235] RPC:       state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
> [590751.883238] RPC:       xs_tcp_state_change client 0000000051f4e000...
> [590751.883239] RPC:       state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
> [590751.883283] RPC:       xs_connect scheduled xprt 0000000051f4e000
> [590751.883314] RPC:       xs_bind 0.0.0.0:688: ok (0)
> [590751.883321] RPC:       worker connecting xprt 0000000051f4e000 via tcp
>                           to 10.101.1.30 (port 2049)
> [590751.883330] RPC:       0000000051f4e000 connect status 99 connected 0
>                           sock state 7
> [590751.883342] RPC:       xs_tcp_state_change client 0000000051f4e000...
> [590751.883343] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> [590751.883356] RPC:       xs_connect scheduled xprt 0000000051f4e000
> [590751.883383] RPC:       xs_bind 0.0.0.0:1014: ok (0)
> [590751.883388] RPC:       worker connecting xprt 0000000051f4e000 via tcp
>                           to 10.101.1.30 (port 2049)
> [590751.883420] RPC:       0000000051f4e000 connect status 115 connected 0
>                           sock state 2
> ...
> 
> 
> State Transition and Events with patch applied:
> TCP_CLOSING(8)
> TCP_LAST_ACK(9)
> TCP_CLOSE(7)
> connect(reuse of port succeeds)
> 
> dmesg excerpts showing reconnect on same port (936):
> [  257.139935] NFS: mkdir(0:59/560857152), testQ
> [  257.139937] NFS call  mkdir testQ
> ...
> [  307.822702] RPC:       state 8 conn 1 dead 0 zapped 1 sk_shutdown 1
> [  307.822714] RPC:       xs_data_ready...
> [  307.822817] RPC:       xs_tcp_state_change client 00000000ce702f14...
> [  307.822821] RPC:       state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
> [  307.822825] RPC:       xs_tcp_state_change client 00000000ce702f14...
> [  307.822826] RPC:       state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
> [  307.823606] RPC:       xs_tcp_state_change client 00000000ce702f14...
> [  307.823609] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> [  307.823629] RPC:       xs_tcp_state_change client 00000000ce702f14...
> [  307.823632] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> [  307.823676] RPC:       xs_connect scheduled xprt 00000000ce702f14
> [  307.823704] RPC:       xs_bind 0.0.0.0:936: ok (0)
> [  307.823709] RPC:       worker connecting xprt 00000000ce702f14 via tcp
>                          to 10.101.1.30 (port 2049)
> [  307.823748] RPC:       00000000ce702f14 connect status 115 connected 0
>                          sock state 2
> ...
> [  314.916193] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> [  314.916251] RPC:       xs_connect scheduled xprt 00000000ce702f14
> [  314.916282] RPC:       xs_bind 0.0.0.0:936: ok (0)
> [  314.916292] RPC:       worker connecting xprt 00000000ce702f14 via tcp
>                          to 10.101.1.30 (port 2049)
> [  314.916342] RPC:       00000000ce702f14 connect status 115 connected 0
>                          sock state 2
> 
> Fixes: 7c81e6a9d75b (SUNRPC: Tweak TCP socket shutdown in the RPC client)
> Signed-off-by: Siddharth Rajendra Kawar <sikawar@microsoft.com>
> ---
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index d8ee06a9650a..e55e380bc10c 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -2094,6 +2094,7 @@ static void xs_tcp_shutdown(struct rpc_xprt *xprt)
>                break;
>        case TCP_ESTABLISHED:
>        case TCP_CLOSE_WAIT:
> +       case TCP_LAST_ACK:
>                kernel_sock_shutdown(sock, SHUT_RDWR);
>                trace_rpc_socket_shutdown(xprt, sock);
>                break;

The only way to transition from TCP_CLOSE_WAIT to TCP_LAST_ACK is by closing the socket ( see https://www.ibm.com/docs/en/zos/2.1.0?topic=SSLTBW_2.1.0/com.ibm.zos.v2r1.halu101/constatus.html ), so closing it while in TCP_LAST_ACK is redundant.


_________________________________
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] SUNRPC: fix shutdown of NFS TCP client socket
  2023-03-14 19:29 ` Trond Myklebust
@ 2023-03-15 19:34   ` Siddharth Kawar
  0 siblings, 0 replies; 3+ messages in thread
From: Siddharth Kawar @ 2023-03-15 19:34 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Anna Schumaker, Bruce Fields, Charles Edward Lever,
	David S. Miller, kuba, linux-nfs, netdev, linux-kernel,
	Dave Bauer

> [Some people who received this message don't often get email from trondmy@hammerspace.com. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]
> 
> > On Mar 8, 2023, at 14:01, Siddharth Kawar <Siddharth.Kawar@microsoft.com> wrote:
> >
> > [You don't often get email from siddharth.kawar@microsoft.com. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]
> >
> > Subject: [PATCH] SUNRPC: fix shutdown of NFS TCP client socket
> >
> > NFS server Duplicate Request Cache (DRC) algorithms rely on NFS clients
> > reconnecting using the same local TCP port. Unique NFS operations are
> > identified by the per-TCP connection set of XIDs. This prevents file
> > corruption when non-idempotent NFS operations are retried.
> >
> > Currently, NFS client TCP connections are using different local TCP ports
> > when reconnecting to NFS servers.
> >
> > After an NFS server initiates shutdown of the TCP connection, the NFS
> > client's TCP socket is set to NULL after the socket state has reached
> > TCP_LAST_ACK(9). When reconnecting, the new socket attempts to reuse
> > the same local port but fails with EADDRNOTAVAIL (99). This forces the
> > socket to use a different local TCP port to reconnect to the remote NFS
> > server.
> >
> > State Transition and Events:
> > TCP_CLOSING(8)
> 
> No: TCP state == 8 is TCP_CLOSE_WAIT

You are right.  Will fix in v2.

> 
> > TCP_LAST_ACK(9)
> > connect(fail EADDRNOTAVAIL(99))
> > TCP_CLOSE(7)
> > bind on new port
> > connect success
> >
> > dmesg excerpts showing reconnect switching from local port of 688 to 1014:
> > [590701.200229] NFS: mkdir(0:61/560857152), testQ
> > [590701.200231] NFS call  mkdir testQ
> > [590701.200259] RPC:       xs_tcp_send_request(224) = 0
> > [590751.883111] RPC:       xs_tcp_state_change client 0000000051f4e000...
> > [590751.883146] RPC:       state 8 conn 1 dead 0 zapped 1 sk_shutdown 1
> > [590751.883160] RPC:       xs_data_ready...
> > [590751.883232] RPC:       xs_tcp_state_change client 0000000051f4e000...
> > [590751.883235] RPC:       state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
> > [590751.883238] RPC:       xs_tcp_state_change client 0000000051f4e000...
> > [590751.883239] RPC:       state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
> > [590751.883283] RPC:       xs_connect scheduled xprt 0000000051f4e000
> > [590751.883314] RPC:       xs_bind 0.0.0.0:688: ok (0)
> > [590751.883321] RPC:       worker connecting xprt 0000000051f4e000 via tcp
> >                           to 10.101.1.30 (port 2049)
> > [590751.883330] RPC:       0000000051f4e000 connect status 99 connected 0
> >                           sock state 7
> > [590751.883342] RPC:       xs_tcp_state_change client 0000000051f4e000...
> > [590751.883343] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> > [590751.883356] RPC:       xs_connect scheduled xprt 0000000051f4e000
> > [590751.883383] RPC:       xs_bind 0.0.0.0:1014: ok (0)
> > [590751.883388] RPC:       worker connecting xprt 0000000051f4e000 via tcp
> >                           to 10.101.1.30 (port 2049)
> > [590751.883420] RPC:       0000000051f4e000 connect status 115 connected 0
> >                           sock state 2
> > ...
> >
> >
> > State Transition and Events with patch applied:
> > TCP_CLOSING(8)
> > TCP_LAST_ACK(9)
> > TCP_CLOSE(7)
> > connect(reuse of port succeeds)
> >
> > dmesg excerpts showing reconnect on same port (936):
> > [  257.139935] NFS: mkdir(0:59/560857152), testQ
> > [  257.139937] NFS call  mkdir testQ
> > ...
> > [  307.822702] RPC:       state 8 conn 1 dead 0 zapped 1 sk_shutdown 1
> > [  307.822714] RPC:       xs_data_ready...
> > [  307.822817] RPC:       xs_tcp_state_change client 00000000ce702f14...
> > [  307.822821] RPC:       state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
> > [  307.822825] RPC:       xs_tcp_state_change client 00000000ce702f14...
> > [  307.822826] RPC:       state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
> > [  307.823606] RPC:       xs_tcp_state_change client 00000000ce702f14...
> > [  307.823609] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> > [  307.823629] RPC:       xs_tcp_state_change client 00000000ce702f14...
> > [  307.823632] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> > [  307.823676] RPC:       xs_connect scheduled xprt 00000000ce702f14
> > [  307.823704] RPC:       xs_bind 0.0.0.0:936: ok (0)
> > [  307.823709] RPC:       worker connecting xprt 00000000ce702f14 via tcp
> >                          to 10.101.1.30 (port 2049)
> > [  307.823748] RPC:       00000000ce702f14 connect status 115 connected 0
> >                          sock state 2
> > ...
> > [  314.916193] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> > [  314.916251] RPC:       xs_connect scheduled xprt 00000000ce702f14
> > [  314.916282] RPC:       xs_bind 0.0.0.0:936: ok (0)
> > [  314.916292] RPC:       worker connecting xprt 00000000ce702f14 via tcp
> >                          to 10.101.1.30 (port 2049)
> > [  314.916342] RPC:       00000000ce702f14 connect status 115 connected 0
> >                          sock state 2
> >
> > Fixes: 7c81e6a9d75b (SUNRPC: Tweak TCP socket shutdown in the RPC client)
> > Signed-off-by: Siddharth Rajendra Kawar <sikawar@microsoft.com>
> > ---
> > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> > index d8ee06a9650a..e55e380bc10c 100644
> > --- a/net/sunrpc/xprtsock.c
> > +++ b/net/sunrpc/xprtsock.c
> > @@ -2094,6 +2094,7 @@ static void xs_tcp_shutdown(struct rpc_xprt *xprt)
> >                break;
> >        case TCP_ESTABLISHED:
> >        case TCP_CLOSE_WAIT:
> > +       case TCP_LAST_ACK:
> >                kernel_sock_shutdown(sock, SHUT_RDWR);
> >                trace_rpc_socket_shutdown(xprt, sock);
> >                break;
> 
> The only way to transition from TCP_CLOSE_WAIT to TCP_LAST_ACK is by closing the socket ( see https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.ibm.com%2Fdocs%2Fen%2Fzos%2F2.1.0%3Ftopic%3DSSLTBW_2.1.0%2Fcom.ibm.zos.v2r1.halu101%2Fconstatus.html&data=05%7C01%7CSiddharth.Kawar%40microsoft.com%7C96b3c75820714bd3753208db24c266f8%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638144189616407625%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=uuXzgjaPk7YpnY6nO0NgNBPeq5HaA5%2BNLyrNFe%2BcjE8%3D&reserved=0 ), so closing it while in TCP_LAST_ACK is redundant.

You are right. The TCP_LAST_ACK state should not cause the socket to be
closed again. Instead, as in the case of TCP_FIN_WAIT1 and TCP_FIN_WAIT2,
the TCP_LAST_ACK state should not be calling xs_reset_transport. Will fix
in v2.

> 
> 
> _________________________________
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-03-15 19:34 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-08 19:01 [PATCH] SUNRPC: fix shutdown of NFS TCP client socket Siddharth Kawar
2023-03-14 19:29 ` Trond Myklebust
2023-03-15 19:34   ` Siddharth Kawar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).