From: Trond Myklebust <trondmy@hammerspace.com>
To: Siddharth Kawar <Siddharth.Kawar@microsoft.com>
Cc: Anna Schumaker <anna.schumaker@netapp.com>,
Bruce Fields <bfields@fieldses.org>,
Charles Edward Lever <chuck.lever@oracle.com>,
"David S. Miller" <davem@davemloft.net>,
"kuba@kernel.org" <kuba@kernel.org>,
"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Dave Bauer <Dave.Bauer@microsoft.com>
Subject: Re: [PATCH] SUNRPC: fix shutdown of NFS TCP client socket
Date: Tue, 14 Mar 2023 19:29:13 +0000 [thread overview]
Message-ID: <116C9B6B-8990-4D85-8D9C-97764F9AA089@hammerspace.com> (raw)
In-Reply-To: <DM4PR21MB368002F6F7A9A0EB8D7CFC87FCB49@DM4PR21MB3680.namprd21.prod.outlook.com>
> On Mar 8, 2023, at 14:01, Siddharth Kawar <Siddharth.Kawar@microsoft.com> wrote:
>
> [You don't often get email from siddharth.kawar@microsoft.com. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]
>
> Subject: [PATCH] SUNRPC: fix shutdown of NFS TCP client socket
>
> NFS server Duplicate Request Cache (DRC) algorithms rely on NFS clients
> reconnecting using the same local TCP port. Unique NFS operations are
> identified by the per-TCP connection set of XIDs. This prevents file
> corruption when non-idempotent NFS operations are retried.
>
> Currently, NFS client TCP connections are using different local TCP ports
> when reconnecting to NFS servers.
>
> After an NFS server initiates shutdown of the TCP connection, the NFS
> client's TCP socket is set to NULL after the socket state has reached
> TCP_LAST_ACK(9). When reconnecting, the new socket attempts to reuse
> the same local port but fails with EADDRNOTAVAIL (99). This forces the
> socket to use a different local TCP port to reconnect to the remote NFS
> server.
>
> State Transition and Events:
> TCP_CLOSING(8)
No: TCP state == 8 is TCP_CLOSE_WAIT
> TCP_LAST_ACK(9)
> connect(fail EADDRNOTAVAIL(99))
> TCP_CLOSE(7)
> bind on new port
> connect success
>
> dmesg excerpts showing reconnect switching from local port of 688 to 1014:
> [590701.200229] NFS: mkdir(0:61/560857152), testQ
> [590701.200231] NFS call mkdir testQ
> [590701.200259] RPC: xs_tcp_send_request(224) = 0
> [590751.883111] RPC: xs_tcp_state_change client 0000000051f4e000...
> [590751.883146] RPC: state 8 conn 1 dead 0 zapped 1 sk_shutdown 1
> [590751.883160] RPC: xs_data_ready...
> [590751.883232] RPC: xs_tcp_state_change client 0000000051f4e000...
> [590751.883235] RPC: state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
> [590751.883238] RPC: xs_tcp_state_change client 0000000051f4e000...
> [590751.883239] RPC: state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
> [590751.883283] RPC: xs_connect scheduled xprt 0000000051f4e000
> [590751.883314] RPC: xs_bind 0.0.0.0:688: ok (0)
> [590751.883321] RPC: worker connecting xprt 0000000051f4e000 via tcp
> to 10.101.1.30 (port 2049)
> [590751.883330] RPC: 0000000051f4e000 connect status 99 connected 0
> sock state 7
> [590751.883342] RPC: xs_tcp_state_change client 0000000051f4e000...
> [590751.883343] RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> [590751.883356] RPC: xs_connect scheduled xprt 0000000051f4e000
> [590751.883383] RPC: xs_bind 0.0.0.0:1014: ok (0)
> [590751.883388] RPC: worker connecting xprt 0000000051f4e000 via tcp
> to 10.101.1.30 (port 2049)
> [590751.883420] RPC: 0000000051f4e000 connect status 115 connected 0
> sock state 2
> ...
>
>
> State Transition and Events with patch applied:
> TCP_CLOSING(8)
> TCP_LAST_ACK(9)
> TCP_CLOSE(7)
> connect(reuse of port succeeds)
>
> dmesg excerpts showing reconnect on same port (936):
> [ 257.139935] NFS: mkdir(0:59/560857152), testQ
> [ 257.139937] NFS call mkdir testQ
> ...
> [ 307.822702] RPC: state 8 conn 1 dead 0 zapped 1 sk_shutdown 1
> [ 307.822714] RPC: xs_data_ready...
> [ 307.822817] RPC: xs_tcp_state_change client 00000000ce702f14...
> [ 307.822821] RPC: state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
> [ 307.822825] RPC: xs_tcp_state_change client 00000000ce702f14...
> [ 307.822826] RPC: state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
> [ 307.823606] RPC: xs_tcp_state_change client 00000000ce702f14...
> [ 307.823609] RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> [ 307.823629] RPC: xs_tcp_state_change client 00000000ce702f14...
> [ 307.823632] RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> [ 307.823676] RPC: xs_connect scheduled xprt 00000000ce702f14
> [ 307.823704] RPC: xs_bind 0.0.0.0:936: ok (0)
> [ 307.823709] RPC: worker connecting xprt 00000000ce702f14 via tcp
> to 10.101.1.30 (port 2049)
> [ 307.823748] RPC: 00000000ce702f14 connect status 115 connected 0
> sock state 2
> ...
> [ 314.916193] RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> [ 314.916251] RPC: xs_connect scheduled xprt 00000000ce702f14
> [ 314.916282] RPC: xs_bind 0.0.0.0:936: ok (0)
> [ 314.916292] RPC: worker connecting xprt 00000000ce702f14 via tcp
> to 10.101.1.30 (port 2049)
> [ 314.916342] RPC: 00000000ce702f14 connect status 115 connected 0
> sock state 2
>
> Fixes: 7c81e6a9d75b (SUNRPC: Tweak TCP socket shutdown in the RPC client)
> Signed-off-by: Siddharth Rajendra Kawar <sikawar@microsoft.com>
> ---
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index d8ee06a9650a..e55e380bc10c 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -2094,6 +2094,7 @@ static void xs_tcp_shutdown(struct rpc_xprt *xprt)
> break;
> case TCP_ESTABLISHED:
> case TCP_CLOSE_WAIT:
> + case TCP_LAST_ACK:
> kernel_sock_shutdown(sock, SHUT_RDWR);
> trace_rpc_socket_shutdown(xprt, sock);
> break;
The only way to transition from TCP_CLOSE_WAIT to TCP_LAST_ACK is by closing the socket ( see https://www.ibm.com/docs/en/zos/2.1.0?topic=SSLTBW_2.1.0/com.ibm.zos.v2r1.halu101/constatus.html ), so closing it while in TCP_LAST_ACK is redundant.
_________________________________
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
next prev parent reply other threads:[~2023-03-14 19:29 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-08 19:01 [PATCH] SUNRPC: fix shutdown of NFS TCP client socket Siddharth Kawar
2023-03-14 19:29 ` Trond Myklebust [this message]
2023-03-15 19:34 ` Siddharth Kawar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=116C9B6B-8990-4D85-8D9C-97764F9AA089@hammerspace.com \
--to=trondmy@hammerspace.com \
--cc=Dave.Bauer@microsoft.com \
--cc=Siddharth.Kawar@microsoft.com \
--cc=anna.schumaker@netapp.com \
--cc=bfields@fieldses.org \
--cc=chuck.lever@oracle.com \
--cc=davem@davemloft.net \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).