From: Benjamin Coddington <bcodding@redhat.com> To: bcodding@redhat.com Cc: dwysocha@redhat.com, linux-nfs@vger.kernel.org, smayhew@redhat.com, trondmy@hammerspace.com Subject: [STABLE PATCH] SUNRPC: Always drop the XPRT_LOCK on XPRT_CLOSE_WAIT Date: Wed, 6 Feb 2019 06:41:11 -0500 [thread overview] Message-ID: <f3dda1a55b9a5d5a7416925358379608cc3a431e.1549453044.git.bcodding@redhat.com> (raw) In-Reply-To: <f3dda1a55b9a5d5a7416925358379608cc3a431e.1547069858.git.bcodding@redhat.com> This patch is only appropriate for stable kernels v4.16 - v4.19 Since commit 9b30889c548a ("SUNRPC: Ensure we always close the socket after a connection shuts down"), and until commit c544577daddb ("SUNRPC: Clean up transport write space handling"), it is possible for the NFS client to spin in the following tight loop: 269.964083: rpc_task_run_action: task:43@0 flags=5a81 state=0005 status=0 action=call_bind [sunrpc] 269.964083: rpc_task_run_action: task:43@0 flags=5a81 state=0005 status=0 action=call_connect [sunrpc] 269.964083: rpc_task_run_action: task:43@0 flags=5a81 state=0005 status=0 action=call_transmit [sunrpc] 269.964085: xprt_transmit: peer=[10.0.1.82]:2049 xid=0x761d3f77 status=-32 269.964085: rpc_task_run_action: task:43@0 flags=5a81 state=0005 status=-32 action=call_transmit_status [sunrpc] 269.964085: rpc_task_run_action: task:43@0 flags=5a81 state=0005 status=-32 action=call_status [sunrpc] 269.964085: rpc_call_status: task:43@0 status=-32 The issue is that the path through call_transmit_status does not release the XPRT_LOCK when the transmit result is -EPIPE, so the socket cannot be properly shut down. The below commit fixed things up in mainline by unconditionally calling xprt_end_transmit() and releasing the XPRT_LOCK after every pass through call_transmit. However, the entirety of this commit is not appropriate for stable kernels because its original inclusion was part of a series that modifies the sunrpc code to use a different queueing model. As a result, there are machinations within this patch that are not needed for a stable fix and will not make sense without a larger backport of the mainline series. In this patch, we take the slightly modified bit of the mainline patch below, which is to release the XPRT_LOCK on transmission error should we detect that the transport is waiting to close. commit c544577daddb618c7dd5fa7fb98d6a41782f020e upstream Author: Trond Myklebust <trond.myklebust@hammerspace.com> Date: Mon Sep 3 23:39:27 2018 -0400 SUNRPC: Clean up transport write space handling Treat socket write space handling in the same way we now treat transport congestion: by denying the XPRT_LOCK until the transport signals that it has free buffer space. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> The original discussion of the problem is here: https://lore.kernel.org/linux-nfs/20181212135157.4489-1-dwysocha@redhat.com/T/#t This passes my usual cthon and xfstests on NFS as applied on v4.19 mainline. Reported-by: Dave Wysochanski <dwysocha@redhat.com> Suggested-by: Trond Myklebust <trondmy@hammerspace.com> Signed-off-by: Benjamin Coddington <bcodding@redhat.com> --- include/linux/sunrpc/xprt.h | 5 +++++ net/sunrpc/clnt.c | 6 ++++-- 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h index 336fd1a19cca..f30bf500888d 100644 --- a/include/linux/sunrpc/xprt.h +++ b/include/linux/sunrpc/xprt.h @@ -443,6 +443,11 @@ static inline int xprt_test_and_set_connecting(struct rpc_xprt *xprt) return test_and_set_bit(XPRT_CONNECTING, &xprt->state); } +static inline int xprt_close_wait(struct rpc_xprt *xprt) +{ + return test_bit(XPRT_CLOSE_WAIT, &xprt->state); +} + static inline void xprt_set_bound(struct rpc_xprt *xprt) { test_and_set_bit(XPRT_BOUND, &xprt->state); diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index 8ea2f5fadd96..1fc812ba9871 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -1992,13 +1992,15 @@ call_transmit(struct rpc_task *task) static void call_transmit_status(struct rpc_task *task) { + struct rpc_xprt *xprt = task->tk_rqstp->rq_xprt; task->tk_action = call_status; /* * Common case: success. Force the compiler to put this - * test first. + * test first. Or, if any error and xprt_close_wait, + * release the xprt lock so the socket can close. */ - if (task->tk_status == 0) { + if (task->tk_status == 0 || xprt_close_wait(xprt)) { xprt_end_transmit(task); rpc_task_force_reencode(task); return; -- 2.14.3
WARNING: multiple messages have this Message-ID (diff)
From: Benjamin Coddington <bcodding@redhat.com> To: stable@vger.kernel.org Cc: dwysocha@redhat.com, linux-nfs@vger.kernel.org, smayhew@redhat.com, trondmy@hammerspace.com Subject: [STABLE PATCH] SUNRPC: Always drop the XPRT_LOCK on XPRT_CLOSE_WAIT Date: Wed, 6 Feb 2019 06:42:15 -0500 [thread overview] Message-ID: <f3dda1a55b9a5d5a7416925358379608cc3a431e.1549453044.git.bcodding@redhat.com> (raw) Message-ID: <20190206114215.DEKMsSGdBxbPDnpAcS7wUP_pa612AIsfT1wMeQ5uJC8@z> (raw) In-Reply-To: <f3dda1a55b9a5d5a7416925358379608cc3a431e.1547069858.git.bcodding@redhat.com> This patch is only appropriate for stable kernels v4.16 - v4.19 Since commit 9b30889c548a ("SUNRPC: Ensure we always close the socket after a connection shuts down"), and until commit c544577daddb ("SUNRPC: Clean up transport write space handling"), it is possible for the NFS client to spin in the following tight loop: 269.964083: rpc_task_run_action: task:43@0 flags=5a81 state=0005 status=0 action=call_bind [sunrpc] 269.964083: rpc_task_run_action: task:43@0 flags=5a81 state=0005 status=0 action=call_connect [sunrpc] 269.964083: rpc_task_run_action: task:43@0 flags=5a81 state=0005 status=0 action=call_transmit [sunrpc] 269.964085: xprt_transmit: peer=[10.0.1.82]:2049 xid=0x761d3f77 status=-32 269.964085: rpc_task_run_action: task:43@0 flags=5a81 state=0005 status=-32 action=call_transmit_status [sunrpc] 269.964085: rpc_task_run_action: task:43@0 flags=5a81 state=0005 status=-32 action=call_status [sunrpc] 269.964085: rpc_call_status: task:43@0 status=-32 The issue is that the path through call_transmit_status does not release the XPRT_LOCK when the transmit result is -EPIPE, so the socket cannot be properly shut down. The below commit fixed things up in mainline by unconditionally calling xprt_end_transmit() and releasing the XPRT_LOCK after every pass through call_transmit. However, the entirety of this commit is not appropriate for stable kernels because its original inclusion was part of a series that modifies the sunrpc code to use a different queueing model. As a result, there are machinations within this patch that are not needed for a stable fix and will not make sense without a larger backport of the mainline series. In this patch, we take the slightly modified bit of the mainline patch below, which is to release the XPRT_LOCK on transmission error should we detect that the transport is waiting to close. commit c544577daddb618c7dd5fa7fb98d6a41782f020e upstream Author: Trond Myklebust <trond.myklebust@hammerspace.com> Date: Mon Sep 3 23:39:27 2018 -0400 SUNRPC: Clean up transport write space handling Treat socket write space handling in the same way we now treat transport congestion: by denying the XPRT_LOCK until the transport signals that it has free buffer space. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> The original discussion of the problem is here: https://lore.kernel.org/linux-nfs/20181212135157.4489-1-dwysocha@redhat.com/T/#t This passes my usual cthon and xfstests on NFS as applied on v4.19 mainline. Reported-by: Dave Wysochanski <dwysocha@redhat.com> Suggested-by: Trond Myklebust <trondmy@hammerspace.com> Signed-off-by: Benjamin Coddington <bcodding@redhat.com> --- include/linux/sunrpc/xprt.h | 5 +++++ net/sunrpc/clnt.c | 6 ++++-- 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h index 336fd1a19cca..f30bf500888d 100644 --- a/include/linux/sunrpc/xprt.h +++ b/include/linux/sunrpc/xprt.h @@ -443,6 +443,11 @@ static inline int xprt_test_and_set_connecting(struct rpc_xprt *xprt) return test_and_set_bit(XPRT_CONNECTING, &xprt->state); } +static inline int xprt_close_wait(struct rpc_xprt *xprt) +{ + return test_bit(XPRT_CLOSE_WAIT, &xprt->state); +} + static inline void xprt_set_bound(struct rpc_xprt *xprt) { test_and_set_bit(XPRT_BOUND, &xprt->state); diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index 8ea2f5fadd96..1fc812ba9871 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -1992,13 +1992,15 @@ call_transmit(struct rpc_task *task) static void call_transmit_status(struct rpc_task *task) { + struct rpc_xprt *xprt = task->tk_rqstp->rq_xprt; task->tk_action = call_status; /* * Common case: success. Force the compiler to put this - * test first. + * test first. Or, if any error and xprt_close_wait, + * release the xprt lock so the socket can close. */ - if (task->tk_status == 0) { + if (task->tk_status == 0 || xprt_close_wait(xprt)) { xprt_end_transmit(task); rpc_task_force_reencode(task); return; -- 2.14.3
next prev parent reply other threads:[~2019-02-06 11:41 UTC|newest] Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-01-09 21:53 [RFC for-stable PATCH] SUNRPC: Always drop the XPRT_LOCK on XPRT_CLOSE_WAIT Benjamin Coddington 2019-02-06 11:41 ` Benjamin Coddington [this message] 2019-02-06 11:42 ` [STABLE " Benjamin Coddington 2019-02-13 14:25 ` Greg KH
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=f3dda1a55b9a5d5a7416925358379608cc3a431e.1549453044.git.bcodding@redhat.com \ --to=bcodding@redhat.com \ --cc=dwysocha@redhat.com \ --cc=linux-nfs@vger.kernel.org \ --cc=smayhew@redhat.com \ --cc=trondmy@hammerspace.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).