From: Chuck Lever <chuck.lever@oracle.com>
To: Trond Myklebust <trondmy@hammerspace.com>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH] SUNRPC: Remove rpc_xprt::tsh_size
Date: Thu, 3 Jan 2019 15:53:56 -0500
Message-ID: <90B38E07-3241-4CCD-A4C8-AB78BADFB0CD@oracle.com>
In-Reply-To: <0331de80b8161f8bf16a92de20049cafb0c228da.camel@hammerspace.com>
> On Jan 3, 2019, at 1:47 PM, Trond Myklebust <trondmy@hammerspace.com> wrote:
>
> On Thu, 2019-01-03 at 13:29 -0500, Chuck Lever wrote:
>> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
>> index d5ce1a8..66b08aa 100644
>> --- a/net/sunrpc/xprtsock.c
>> +++ b/net/sunrpc/xprtsock.c
>> @@ -678,6 +678,31 @@ static void xs_stream_data_receive_workfn(struct work_struct *work)
>>
>> #define XS_SENDMSG_FLAGS (MSG_DONTWAIT | MSG_NOSIGNAL)
>>
>> +static int xs_send_record_marker(struct sock_xprt *transport,
>> +				 const struct rpc_rqst *req)
>> +{
>> +	static struct msghdr msg = {
>> +		.msg_name = NULL,
>> +		.msg_namelen = 0,
>> +		.msg_flags = (XS_SENDMSG_FLAGS | MSG_MORE),
>> +	};
>> +	rpc_fraghdr marker;
>> +	struct kvec iov = {
>> +		.iov_base = &marker,
>> +		.iov_len = sizeof(marker),
>> +	};
>> +	u32 reclen;
>> +
>> +	if (unlikely(!transport->sock))
>> +		return -ENOTSOCK;
>> +	if (req->rq_bytes_sent)
>> +		return 0;
>
> The test needs to use transport->xmit.offset, not req->rq_bytes_sent.
OK, that seems to work better.
> You also need to update transport->xmit.offset on success,
That causes the first 4 bytes of rq_snd_buf not to be sent.
Not updating xmit.offset seems more correct.
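For the archive, here's roughly what I'm trying now. Sketch only, not a
final patch: the same helper with the check switched to
transport->xmit.offset, and with the offset deliberately left alone on
success.

/* Sketch: gate on transport->xmit.offset instead of req->rq_bytes_sent.
 * The offset is intentionally not advanced here, so the marker bytes
 * are never counted against rq_snd_buf.
 */
static int xs_send_record_marker(struct sock_xprt *transport,
				 const struct rpc_rqst *req)
{
	static struct msghdr msg = {
		.msg_flags = (XS_SENDMSG_FLAGS | MSG_MORE),
	};
	rpc_fraghdr marker;
	struct kvec iov = {
		.iov_base = &marker,
		.iov_len = sizeof(marker),
	};

	if (unlikely(!transport->sock))
		return -ENOTSOCK;
	/* A non-zero offset means we are resuming a partially sent
	 * request, so the record marker already went out. */
	if (transport->xmit.offset != 0)
		return 0;

	marker = cpu_to_be32(RPC_LAST_STREAM_FRAGMENT | req->rq_snd_buf.len);
	return kernel_sendmsg(transport->sock, &msg, &iov, 1, iov.iov_len);
}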
> and be
> prepared to handle the case where < sizeof(marker) bytes get
> transmitted due to a write_space condition.
Probably the only recourse is to break the connection.
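Concretely, I'm picturing something along these lines in the send path.
Just a sketch; the exact plumbing, and whether xprt_force_disconnect()
is the right hammer here, is still an open question.

	/* Sketch of caller-side handling, e.g. in xs_tcp_send_request();
	 * error codes are illustrative only. */
	status = xs_send_record_marker(transport, req);
	if (status >= 0 && status < (int)sizeof(rpc_fraghdr)) {
		/* A short write of the 4-byte marker cannot be resumed,
		 * because xmit.offset does not cover the marker bytes.
		 * Kill the connection and let the client retransmit the
		 * whole request on a fresh one. */
		xprt_force_disconnect(req->rq_xprt);
		return -ENOTCONN;
	}
	if (status < 0)
		return status;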
>> +
>> +	reclen = req->rq_snd_buf.len;
>> +	marker = cpu_to_be32(RPC_LAST_STREAM_FRAGMENT | reclen);
>> +	return kernel_sendmsg(transport->sock, &msg, &iov, 1, iov.iov_len);
>
>
> So what does this do for performance? I'd expect that adding another
> dive into the socket layer will come with penalties.
NFSv3 on TCP, sec=sys, 56Gb/s IPoIB, v4.20 + my v4.21 patches
fio, 8KB random, 70% read, 30% write, 16 threads, iodepth=16
Without this patch:
read: IOPS=28.7k, BW=224MiB/s (235MB/s)(11.2GiB/51092msec)
write: IOPS=12.3k, BW=96.3MiB/s (101MB/s)(4918MiB/51092msec)
With this patch:
read: IOPS=28.6k, BW=224MiB/s (235MB/s)(11.2GiB/51276msec)
write: IOPS=12.3k, BW=95.8MiB/s (100MB/s)(4914MiB/51276msec)
Seems like that's in the noise.
--
Chuck Lever