From: Trond Myklebust <trondmy@hammerspace.com>
To: "chuck.lever@oracle.com" <chuck.lever@oracle.com>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH] SUNRPC: Remove rpc_xprt::tsh_size
Date: Fri, 4 Jan 2019 22:44:19 +0000
Message-ID: <e69a8e3965258109772f1adaedebc02d0863f3d9.camel@hammerspace.com>
In-Reply-To: <EE4DED67-E4A7-4F82-A5D0-9137381A2B4A@oracle.com>

On Fri, 2019-01-04 at 16:35 -0500, Chuck Lever wrote:
> > On Jan 3, 2019, at 11:00 PM, Trond Myklebust <trondmy@hammerspace.com> wrote:
> > 
> > On Thu, 2019-01-03 at 17:49 -0500, Chuck Lever wrote:
> > > > On Jan 3, 2019, at 4:35 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> > > > 
> > > > > On Jan 3, 2019, at 4:28 PM, Trond Myklebust <trondmy@hammerspace.com> wrote:
> > > > > 
> > > > > On Thu, 2019-01-03 at 16:07 -0500, Chuck Lever wrote:
> > > > > > > On Jan 3, 2019, at 3:53 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> > > > > > > 
> > > > > > > > On Jan 3, 2019, at 1:47 PM, Trond Myklebust <trondmy@hammerspace.com> wrote:
> > > > > > > > 
> > > > > > > > On Thu, 2019-01-03 at 13:29 -0500, Chuck Lever wrote:
> > > > > > > > > +	reclen = req->rq_snd_buf.len;
> > > > > > > > > +	marker = cpu_to_be32(RPC_LAST_STREAM_FRAGMENT | reclen);
> > > > > > > > > +	return kernel_sendmsg(transport->sock, &msg, &iov, 1, iov.iov_len);
> > > > > > > > 
> > > > > > > > So what does this do for performance? I'd expect that adding
> > > > > > > > another dive into the socket layer will come with penalties.
> > > > > > > 
> > > > > > > NFSv3 on TCP, sec=sys, 56Gb/s IPoIB, v4.20 + my v4.21 patches
> > > > > > > fio, 8KB random, 70% read, 30% write, 16 threads, iodepth=16
> > > > > > > 
> > > > > > > Without this patch:
> > > > > > > 
> > > > > > > read: IOPS=28.7k, BW=224MiB/s (235MB/s)(11.2GiB/51092msec)
> > > > > > > write: IOPS=12.3k, BW=96.3MiB/s (101MB/s)(4918MiB/51092msec)
> > > > > > > 
> > > > > > > With this patch:
> > > > > > > 
> > > > > > > read: IOPS=28.6k, BW=224MiB/s (235MB/s)(11.2GiB/51276msec)
> > > > > > > write: IOPS=12.3k, BW=95.8MiB/s (100MB/s)(4914MiB/51276msec)
> > > > > > > 
> > > > > > > Seems like that's in the noise.
> > > > > > 
> > > > > > Sigh. That's because it was the same kernel. Again, with
> > > > > > feeling:
> > > > > > 
> > > > > > 4.20.0-rc7-00048-g9274254:
> > > > > > read: IOPS=28.6k, BW=224MiB/s (235MB/s)(11.2GiB/51276msec)
> > > > > > write: IOPS=12.3k, BW=95.8MiB/s (100MB/s)(4914MiB/51276msec)
> > > > > > 
> > > > > > 4.20.0-rc7-00049-ga4dea15:
> > > > > > read: IOPS=27.2k, BW=212MiB/s (223MB/s)(11.2GiB/53979msec)
> > > > > > write: IOPS=11.7k, BW=91.1MiB/s (95.5MB/s)(4917MiB/53979msec)
> > > > > > 
> > > > > 
> > > > > So about a 5% reduction in performance?
> > > > 
> > > > On this workload, yes.
> > > > 
> > > > Could send the record marker in xs_send_kvec with the head[0] iovec.
> > > > I'm going to try that next.
> > > 
> > > That helps:
> > > 
> > > Linux 4.20.0-rc7-00049-g664f679 #651 SMP Thu Jan 3 17:35:26 EST
> > > 2019
> > > 
> > >   read: IOPS=28.7k, BW=224MiB/s (235MB/s)(11.2GiB/51185msec)
> > >  write: IOPS=12.3k, BW=96.1MiB/s (101MB/s)(4919MiB/51185msec)
> > > 
> > 
> > Interesting... Perhaps we might be able to eke out a few more percent
> > performance on file writes by also converting xs_send_pagedata() to use
> > a single sock_sendmsg() w/ iov_iter rather than looping through several
> > calls to sendpage()?
> 
> IMO...
> 
> For small requests (say, smaller than 17 pages), packing the head,
> pagevec, and tail into an iov_iter and sending them all via a single
> sock_sendmsg call would likely be efficient.
> 
> For larger requests, other overheads would dominate. And you'd have to
> keep around an iter array that held 257 entries... You could pass a
> large pagevec to sock_sendmsg in smaller chunks.
> 
> Are you thinking of converting xs_sendpages (or even xdr_bufs) to use
> iov_iter directly?

For now, I was thinking of just converting xs_sendpages to call
xdr_alloc_bvec(), and then doing the equivalent of what xs_read_bvec()
does for receives today.
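
Something like the following, as a rough untested sketch for xprtsock.c
(the helper name and the exact flag/gfp choices are just placeholders to
show the shape of it):

static int xs_send_page_data(struct socket *sock, struct xdr_buf *xdr,
			     unsigned int offset)
{
	struct msghdr msg = {
		/* non-blocking; the tail kvec still follows this data */
		.msg_flags = MSG_DONTWAIT | MSG_MORE,
	};
	int err;

	/* Build the bio_vec array shadowing xdr->pages */
	err = xdr_alloc_bvec(xdr, GFP_NOFS);
	if (err < 0)
		return err;

	/* Describe all of the page data with a single iterator... */
	iov_iter_bvec(&msg.msg_iter, WRITE, xdr->bvec,
		      xdr_buf_pagecount(xdr),
		      xdr->page_base + xdr->page_len);
	/* ...and skip whatever a previous partial send already covered */
	iov_iter_advance(&msg.msg_iter, xdr->page_base + offset);

	return sock_sendmsg(sock, &msg);
}

That would turn the loop over sendpage() into a single pass through the
socket layer, reusing the same bvec machinery the receive side already
depends on.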

The next step is to convert xdr_bufs to use bvecs natively instead of
having to allocate them to shadow the array of pages. I believe someone
was working on allowing a single bvec to take an array of pages
(containing contiguous data), which would make that conversion almost
trivial.
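
Purely to illustrate the direction (a hypothetical layout, not a concrete
proposal; the bvec field names are invented):

struct xdr_buf {
	struct kvec	head[1],	/* RPC header + non-page data */
			tail[1];	/* appended after page data */

	struct bio_vec	*bvec;		/* page data described natively, */
	unsigned int	bvec_count;	/* replacing pages[]/page_base/page_len */

	unsigned int	buflen,		/* total length of storage buffer */
			len;		/* length of the XDR encoded message */
};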

The final step would be to do as you say, to pack the kvecs into the
same call to sock_sendmsg() as the bvecs. We might imagine adding a new
type of iov_iter that can iterate over an array of struct iov_iter in
order to deal with this case?
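
i.e. something with roughly this shape, where iov_iter_iter() is entirely
made up and just stands in for that hypothetical "iterator of iterators"
(page_base handling omitted for brevity):

	struct iov_iter sub[3];
	struct msghdr msg = { .msg_flags = MSG_DONTWAIT };
	int err;

	iov_iter_kvec(&sub[0], WRITE, xdr->head, 1, xdr->head->iov_len);
	iov_iter_bvec(&sub[1], WRITE, xdr->bvec, xdr_buf_pagecount(xdr),
		      xdr->page_len);
	iov_iter_kvec(&sub[2], WRITE, xdr->tail, 1, xdr->tail->iov_len);

	/* Hypothetical: walk the three sub-iterators in sequence */
	iov_iter_iter(&msg.msg_iter, WRITE, sub, 3,
		      xdr->head->iov_len + xdr->page_len + xdr->tail->iov_len);

	err = sock_sendmsg(sock, &msg);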

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



Thread overview: 11+ messages
2019-01-03 18:29 [PATCH] SUNRPC: Remove rpc_xprt::tsh_size Chuck Lever
2019-01-03 18:47 ` Trond Myklebust
2019-01-03 20:53   ` Chuck Lever
2019-01-03 21:07     ` Chuck Lever
2019-01-03 21:28       ` Trond Myklebust
2019-01-03 21:35         ` Chuck Lever
2019-01-03 22:49           ` Chuck Lever
2019-01-04  4:00             ` Trond Myklebust
2019-01-04 21:35               ` Chuck Lever
2019-01-04 22:44                 ` Trond Myklebust [this message]
2019-01-10 17:13                   ` Chuck Lever
