linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Chuck Lever <chuck.lever@oracle.com>
To: Bruce Fields <bfields@fieldses.org>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH RFC] nfsd: Fix NFSv4 READ on RDMA when using readv
Date: Fri, 17 Jan 2020 16:48:36 -0500	[thread overview]
Message-ID: <4AD54268-CDD2-4011-8EC8-8B259A1C2FB2@oracle.com> (raw)
In-Reply-To: <20200117214614.GC27294@fieldses.org>



> On Jan 17, 2020, at 4:46 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> 
> On Wed, Jan 15, 2020 at 03:37:33PM -0500, Chuck Lever wrote:
>> svcrdma expects that the READ payload falls precisely into the
>> xdr_buf's page vector. Adding "xdr->iov = NULL" forces
>> xdr_reserve_space() to always use pages from xdr->buf->pages when
>> calling nfsd_readv.
>> 
>> Also, the XDR padding is problematic. For NFS/RDMA Write chunks,
>> the padding needs to be in xdr->buf->tail so that the transport can
>> skip over it. However for NFS/TCP and the NFS/RDMA Reply chunks,
>> the padding has to be retained. Not yet sure how to add this.
>> 
>> Fixes: b04209806384 ("nfsd4: allow exotic read compounds")
>> Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=198053
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>> Howdy Bruce-
>> 
>> I'm struggling with nfsd4_encode_readv().
>> 
>> - for NFS/RDMA Write chunks, the READ payload has to be in
>>  buf->pages. I've fixed that.
>> 
>> - xdr_reserve_space() calls don't need to explicitly align the
>>  @nbytes argument: xdr_reserve_space() already does this?
>> 
>> - the while loop probably won't work if a later READ in the COMPOUND
>>  doesn't start on a page boundary. This isn't a problem until we
>>  run into a Solaris client in forcedirectio mode.
> 
> So the Solaris client sends multiple reads per compound in that case?

If the application uses readv() and the mount is in forcedirectio
mode, I'm told Solaris will send an NFSv4 COMPOUND with multiple
READ operations in it.


>> - the XDR padding doesn't work for NFS/RDMA Write chunks, which are
>>  supposed to skip padding altogether.
> 
> krb5i/p has to treat read data as padded regardless of the transport,
> doesn't it?

Yes. See my reply from earlier today: I think we can get away with
inserting the pad just before wrapping the Reply.


> --b.
> 
>> Do you have suggestions? Thanks in advance.
>> 
>> 
>> fs/nfsd/nfs4xdr.c |   17 +++++++----------
>> 1 file changed, 7 insertions(+), 10 deletions(-)
>> 
>> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
>> index d2dc4c0e22e8..14c68a136b4e 100644
>> --- a/fs/nfsd/nfs4xdr.c
>> +++ b/fs/nfsd/nfs4xdr.c
>> @@ -3519,17 +3519,14 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
>> 	u32 zzz = 0;
>> 	int pad;
>> 
>> +	/* Ensure xdr_reserve_space behaves itself */
>> +	if (xdr->iov == xdr->buf->head) {
>> +		xdr->iov = NULL;
>> +		xdr->end = xdr->p;
>> +	}
>> +
>> 	len = maxcount;
>> 	v = 0;
>> -
>> -	thislen = min_t(long, len, ((void *)xdr->end - (void *)xdr->p));
>> -	p = xdr_reserve_space(xdr, (thislen+3)&~3);
>> -	WARN_ON_ONCE(!p);
>> -	resp->rqstp->rq_vec[v].iov_base = p;
>> -	resp->rqstp->rq_vec[v].iov_len = thislen;
>> -	v++;
>> -	len -= thislen;
>> -
>> 	while (len) {
>> 		thislen = min_t(long, len, PAGE_SIZE);
>> 		p = xdr_reserve_space(xdr, (thislen+3)&~3);
>> @@ -3548,7 +3545,7 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
>> 	read->rd_length = maxcount;
>> 	if (nfserr)
>> 		return nfserr;
>> -	xdr_truncate_encode(xdr, starting_len + 8 + ((maxcount+3)&~3));
>> +	xdr_truncate_encode(xdr, starting_len + 8 + maxcount);
>> 
>> 	tmp = htonl(eof);
>> 	write_bytes_to_xdr_buf(xdr->buf, starting_len    , &tmp, 4);

--
Chuck Lever




      reply	other threads:[~2020-01-17 21:49 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-15 20:37 [PATCH RFC] nfsd: Fix NFSv4 READ on RDMA when using readv Chuck Lever
2020-01-17 18:39 ` Chuck Lever
2020-01-17 21:46 ` J. Bruce Fields
2020-01-17 21:48   ` Chuck Lever [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4AD54268-CDD2-4011-8EC8-8B259A1C2FB2@oracle.com \
    --to=chuck.lever@oracle.com \
    --cc=bfields@fieldses.org \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).