From: Chuck Lever <chuck.lever@oracle.com>
To: Trond Myklebust <trondmy@gmail.com>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH v3 26/44] SUNRPC: Improve latency for interactive tasks
Date: Thu, 27 Dec 2018 14:21:58 -0500
Message-ID: <4D3465FB-041C-4BB1-AB75-03511FA5AAF1@oracle.com>
In-Reply-To: <20180917130335.112832-27-trond.myklebust@hammerspace.com>

> On Sep 17, 2018, at 9:03 AM, Trond Myklebust <trondmy@gmail.com> wrote:
>
> One of the intentions with the priority queues was to ensure that no
> single process can hog the transport. The field task->tk_owner therefore
> identifies the RPC call's origin, and is intended to allow the RPC layer
> to organise queues for fairness.
> This commit therefore modifies the transmit queue to group requests
> by task->tk_owner, and ensures that we round robin among those groups.
>
> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> ---
> include/linux/sunrpc/xprt.h | 1 +
> net/sunrpc/xprt.c | 27 ++++++++++++++++++++++++---
> 2 files changed, 25 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
> index 8c2bb078f00c..e377620b9744 100644
> --- a/include/linux/sunrpc/xprt.h
> +++ b/include/linux/sunrpc/xprt.h
> @@ -89,6 +89,7 @@ struct rpc_rqst {
> };
>
> struct list_head rq_xmit; /* Send queue */
> + struct list_head rq_xmit2; /* Send queue */
>
> void *rq_buffer; /* Call XDR encode buffer */
> size_t rq_callsize;
> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> index 35f5df367591..3e68f35f71f6 100644
> --- a/net/sunrpc/xprt.c
> +++ b/net/sunrpc/xprt.c
> @@ -1052,12 +1052,21 @@ xprt_request_need_enqueue_transmit(struct rpc_task *task, struct rpc_rqst *req)
> void
> xprt_request_enqueue_transmit(struct rpc_task *task)
> {
> - struct rpc_rqst *req = task->tk_rqstp;
> + struct rpc_rqst *pos, *req = task->tk_rqstp;
> struct rpc_xprt *xprt = req->rq_xprt;
>
> if (xprt_request_need_enqueue_transmit(task, req)) {
> spin_lock(&xprt->queue_lock);
> + list_for_each_entry(pos, &xprt->xmit_queue, rq_xmit) {
> + if (pos->rq_task->tk_owner != task->tk_owner)
> + continue;
> + list_add_tail(&req->rq_xmit2, &pos->rq_xmit2);
> + INIT_LIST_HEAD(&req->rq_xmit);
> + goto out;
> + }
> list_add_tail(&req->rq_xmit, &xprt->xmit_queue);
> + INIT_LIST_HEAD(&req->rq_xmit2);
> +out:
> set_bit(RPC_TASK_NEED_XMIT, &task->tk_runstate);
> spin_unlock(&xprt->queue_lock);
> }
> @@ -1073,8 +1082,20 @@ xprt_request_enqueue_transmit(struct rpc_task *task)
> static void
> xprt_request_dequeue_transmit_locked(struct rpc_task *task)
> {
> - if (test_and_clear_bit(RPC_TASK_NEED_XMIT, &task->tk_runstate))
> - list_del(&task->tk_rqstp->rq_xmit);
> + struct rpc_rqst *req = task->tk_rqstp;
> +
> + if (!test_and_clear_bit(RPC_TASK_NEED_XMIT, &task->tk_runstate))
> + return;
> + if (!list_empty(&req->rq_xmit)) {
> + list_del(&req->rq_xmit);
> + if (!list_empty(&req->rq_xmit2)) {
> + struct rpc_rqst *next = list_first_entry(&req->rq_xmit2,
> + struct rpc_rqst, rq_xmit2);
> + list_del(&req->rq_xmit2);
> + list_add_tail(&next->rq_xmit, &next->rq_xprt->xmit_queue);
> + }
> + } else
> + list_del(&req->rq_xmit2);
> }
>
> /**
> --
> 2.17.1

Hi Trond-

I've chased down a couple of remaining regressions with the v4.20 NFS client,
and they seem to be rooted in this commit.

When using sec=krb5, krb5i, or krb5p I found that multi-threaded workloads
trigger a lot of server-side disconnects. This is with TCP and RDMA transports.
An instrumented server shows that the client is under-running the GSS sequence
number window. I monitored the order in which GSS sequence numbers appear on
the wire, and after this commit, the sequence numbers are wildly misordered.

If I revert the hunk in xprt_request_enqueue_transmit, the problem goes away.

I also found that reverting that hunk results in a 3-4% improvement in fio
IOPS rates, as well as improvement in average and maximum latency as reported
by fio.

--
Chuck Lever