linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC] nfsd: Fix NFSv4 READ on RDMA when using readv
@ 2020-01-15 20:37 Chuck Lever
  2020-01-17 18:39 ` Chuck Lever
  2020-01-17 21:46 ` J. Bruce Fields
  0 siblings, 2 replies; 4+ messages in thread
From: Chuck Lever @ 2020-01-15 20:37 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs

svcrdma expects that the READ payload falls precisely into the
xdr_buf's page vector. Adding "xdr->iov = NULL" forces
xdr_reserve_space() to always use pages from xdr->buf->pages when
calling nfsd_readv.

Also, the XDR padding is problematic. For NFS/RDMA Write chunks,
the padding needs to be in xdr->buf->tail so that the transport can
skip over it. However for NFS/TCP and the NFS/RDMA Reply chunks,
the padding has to be retained. Not yet sure how to add this.

Fixes: b04209806384 ("nfsd4: allow exotic read compounds")
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=198053
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
Howdy Bruce-

I'm struggling with nfsd4_encode_readv().

- for NFS/RDMA Write chunks, the READ payload has to be in
  buf->pages. I've fixed that.

- xdr_reserve_space() calls don't need to explicitly align the
  @nbytes argument: xdr_reserve_space() already does this?

- the while loop probably won't work if a later READ in the COMPOUND
  doesn't start on a page boundary. This isn't a problem until we
  run into a Solaris client in forcedirectio mode.

- the XDR padding doesn't work for NFS/RDMA Write chunks, which are
  supposed to skip padding altogether.

Do you have suggestions? Thanks in advance.


 fs/nfsd/nfs4xdr.c |   17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index d2dc4c0e22e8..14c68a136b4e 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3519,17 +3519,14 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
 	u32 zzz = 0;
 	int pad;
 
+	/* Ensure xdr_reserve_space behaves itself */
+	if (xdr->iov == xdr->buf->head) {
+		xdr->iov = NULL;
+		xdr->end = xdr->p;
+	}
+
 	len = maxcount;
 	v = 0;
-
-	thislen = min_t(long, len, ((void *)xdr->end - (void *)xdr->p));
-	p = xdr_reserve_space(xdr, (thislen+3)&~3);
-	WARN_ON_ONCE(!p);
-	resp->rqstp->rq_vec[v].iov_base = p;
-	resp->rqstp->rq_vec[v].iov_len = thislen;
-	v++;
-	len -= thislen;
-
 	while (len) {
 		thislen = min_t(long, len, PAGE_SIZE);
 		p = xdr_reserve_space(xdr, (thislen+3)&~3);
@@ -3548,7 +3545,7 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
 	read->rd_length = maxcount;
 	if (nfserr)
 		return nfserr;
-	xdr_truncate_encode(xdr, starting_len + 8 + ((maxcount+3)&~3));
+	xdr_truncate_encode(xdr, starting_len + 8 + maxcount);
 
 	tmp = htonl(eof);
 	write_bytes_to_xdr_buf(xdr->buf, starting_len    , &tmp, 4);


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH RFC] nfsd: Fix NFSv4 READ on RDMA when using readv
  2020-01-15 20:37 [PATCH RFC] nfsd: Fix NFSv4 READ on RDMA when using readv Chuck Lever
@ 2020-01-17 18:39 ` Chuck Lever
  2020-01-17 21:46 ` J. Bruce Fields
  1 sibling, 0 replies; 4+ messages in thread
From: Chuck Lever @ 2020-01-17 18:39 UTC (permalink / raw)
  To: Bruce Fields; +Cc: Linux NFS Mailing List



> On Jan 15, 2020, at 3:37 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
> svcrdma expects that the READ payload falls precisely into the
> xdr_buf's page vector. Adding "xdr->iov = NULL" forces
> xdr_reserve_space() to always use pages from xdr->buf->pages when
> calling nfsd_readv.
> 
> Also, the XDR padding is problematic. For NFS/RDMA Write chunks,
> the padding needs to be in xdr->buf->tail so that the transport can
> skip over it. However for NFS/TCP and the NFS/RDMA Reply chunks,
> the padding has to be retained. Not yet sure how to add this.
> 
> Fixes: b04209806384 ("nfsd4: allow exotic read compounds")
> Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=198053
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> Howdy Bruce-
> 
> I'm struggling with nfsd4_encode_readv().
> 
> - for NFS/RDMA Write chunks, the READ payload has to be in
>  buf->pages. I've fixed that.
> 
> - xdr_reserve_space() calls don't need to explicitly align the
>  @nbytes argument: xdr_reserve_space() already does this?
> 
> - the while loop probably won't work if a later READ in the COMPOUND
>  doesn't start on a page boundary. This isn't a problem until we
>  run into a Solaris client in forcedirectio mode.
> 
> - the XDR padding doesn't work for NFS/RDMA Write chunks, which are
>  supposed to skip padding altogether.
> 
> Do you have suggestions? Thanks in advance.

I'm experimenting with an idea I think has been mentioned on list
a few times:

Having the RPC layer and transports deal with the padding of the
xdr_buf->pages vector, and moving that responsibility out of the
NFSD Reply encoder functions. xdr_buf->tail[0] then always begins
on an XDR-aligned boundary.

This should be straightforward for NFSv3. The only two Reply
encoders that are updated are READ and READLINK. I'm starting
with that.

Not quite sure yet how krb5i/krb5p will deal with this. Obviously
the page list pad needs to be inserted before each Reply is
wrapped.

I'll post more as experimentation progresses.


> fs/nfsd/nfs4xdr.c |   17 +++++++----------
> 1 file changed, 7 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> index d2dc4c0e22e8..14c68a136b4e 100644
> --- a/fs/nfsd/nfs4xdr.c
> +++ b/fs/nfsd/nfs4xdr.c
> @@ -3519,17 +3519,14 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
> 	u32 zzz = 0;
> 	int pad;
> 
> +	/* Ensure xdr_reserve_space behaves itself */
> +	if (xdr->iov == xdr->buf->head) {
> +		xdr->iov = NULL;
> +		xdr->end = xdr->p;
> +	}
> +
> 	len = maxcount;
> 	v = 0;
> -
> -	thislen = min_t(long, len, ((void *)xdr->end - (void *)xdr->p));
> -	p = xdr_reserve_space(xdr, (thislen+3)&~3);
> -	WARN_ON_ONCE(!p);
> -	resp->rqstp->rq_vec[v].iov_base = p;
> -	resp->rqstp->rq_vec[v].iov_len = thislen;
> -	v++;
> -	len -= thislen;
> -
> 	while (len) {
> 		thislen = min_t(long, len, PAGE_SIZE);
> 		p = xdr_reserve_space(xdr, (thislen+3)&~3);
> @@ -3548,7 +3545,7 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
> 	read->rd_length = maxcount;
> 	if (nfserr)
> 		return nfserr;
> -	xdr_truncate_encode(xdr, starting_len + 8 + ((maxcount+3)&~3));
> +	xdr_truncate_encode(xdr, starting_len + 8 + maxcount);
> 
> 	tmp = htonl(eof);
> 	write_bytes_to_xdr_buf(xdr->buf, starting_len    , &tmp, 4);
> 

--
Chuck Lever




^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH RFC] nfsd: Fix NFSv4 READ on RDMA when using readv
  2020-01-15 20:37 [PATCH RFC] nfsd: Fix NFSv4 READ on RDMA when using readv Chuck Lever
  2020-01-17 18:39 ` Chuck Lever
@ 2020-01-17 21:46 ` J. Bruce Fields
  2020-01-17 21:48   ` Chuck Lever
  1 sibling, 1 reply; 4+ messages in thread
From: J. Bruce Fields @ 2020-01-17 21:46 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-nfs

On Wed, Jan 15, 2020 at 03:37:33PM -0500, Chuck Lever wrote:
> svcrdma expects that the READ payload falls precisely into the
> xdr_buf's page vector. Adding "xdr->iov = NULL" forces
> xdr_reserve_space() to always use pages from xdr->buf->pages when
> calling nfsd_readv.
> 
> Also, the XDR padding is problematic. For NFS/RDMA Write chunks,
> the padding needs to be in xdr->buf->tail so that the transport can
> skip over it. However for NFS/TCP and the NFS/RDMA Reply chunks,
> the padding has to be retained. Not yet sure how to add this.
> 
> Fixes: b04209806384 ("nfsd4: allow exotic read compounds")
> Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=198053
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> Howdy Bruce-
> 
> I'm struggling with nfsd4_encode_readv().
> 
> - for NFS/RDMA Write chunks, the READ payload has to be in
>   buf->pages. I've fixed that.
> 
> - xdr_reserve_space() calls don't need to explicitly align the
>   @nbytes argument: xdr_reserve_space() already does this?
> 
> - the while loop probably won't work if a later READ in the COMPOUND
>   doesn't start on a page boundary. This isn't a problem until we
>   run into a Solaris client in forcedirectio mode.

So the Solaris client sends multiple reads per compound in that case?

> - the XDR padding doesn't work for NFS/RDMA Write chunks, which are
>   supposed to skip padding altogether.

krb5i/p has to treat read data as padded regardless of the transport,
doesn't it?

--b.

> Do you have suggestions? Thanks in advance.
> 
> 
>  fs/nfsd/nfs4xdr.c |   17 +++++++----------
>  1 file changed, 7 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> index d2dc4c0e22e8..14c68a136b4e 100644
> --- a/fs/nfsd/nfs4xdr.c
> +++ b/fs/nfsd/nfs4xdr.c
> @@ -3519,17 +3519,14 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
>  	u32 zzz = 0;
>  	int pad;
>  
> +	/* Ensure xdr_reserve_space behaves itself */
> +	if (xdr->iov == xdr->buf->head) {
> +		xdr->iov = NULL;
> +		xdr->end = xdr->p;
> +	}
> +
>  	len = maxcount;
>  	v = 0;
> -
> -	thislen = min_t(long, len, ((void *)xdr->end - (void *)xdr->p));
> -	p = xdr_reserve_space(xdr, (thislen+3)&~3);
> -	WARN_ON_ONCE(!p);
> -	resp->rqstp->rq_vec[v].iov_base = p;
> -	resp->rqstp->rq_vec[v].iov_len = thislen;
> -	v++;
> -	len -= thislen;
> -
>  	while (len) {
>  		thislen = min_t(long, len, PAGE_SIZE);
>  		p = xdr_reserve_space(xdr, (thislen+3)&~3);
> @@ -3548,7 +3545,7 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
>  	read->rd_length = maxcount;
>  	if (nfserr)
>  		return nfserr;
> -	xdr_truncate_encode(xdr, starting_len + 8 + ((maxcount+3)&~3));
> +	xdr_truncate_encode(xdr, starting_len + 8 + maxcount);
>  
>  	tmp = htonl(eof);
>  	write_bytes_to_xdr_buf(xdr->buf, starting_len    , &tmp, 4);

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH RFC] nfsd: Fix NFSv4 READ on RDMA when using readv
  2020-01-17 21:46 ` J. Bruce Fields
@ 2020-01-17 21:48   ` Chuck Lever
  0 siblings, 0 replies; 4+ messages in thread
From: Chuck Lever @ 2020-01-17 21:48 UTC (permalink / raw)
  To: Bruce Fields; +Cc: Linux NFS Mailing List



> On Jan 17, 2020, at 4:46 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> 
> On Wed, Jan 15, 2020 at 03:37:33PM -0500, Chuck Lever wrote:
>> svcrdma expects that the READ payload falls precisely into the
>> xdr_buf's page vector. Adding "xdr->iov = NULL" forces
>> xdr_reserve_space() to always use pages from xdr->buf->pages when
>> calling nfsd_readv.
>> 
>> Also, the XDR padding is problematic. For NFS/RDMA Write chunks,
>> the padding needs to be in xdr->buf->tail so that the transport can
>> skip over it. However for NFS/TCP and the NFS/RDMA Reply chunks,
>> the padding has to be retained. Not yet sure how to add this.
>> 
>> Fixes: b04209806384 ("nfsd4: allow exotic read compounds")
>> Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=198053
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>> Howdy Bruce-
>> 
>> I'm struggling with nfsd4_encode_readv().
>> 
>> - for NFS/RDMA Write chunks, the READ payload has to be in
>>  buf->pages. I've fixed that.
>> 
>> - xdr_reserve_space() calls don't need to explicitly align the
>>  @nbytes argument: xdr_reserve_space() already does this?
>> 
>> - the while loop probably won't work if a later READ in the COMPOUND
>>  doesn't start on a page boundary. This isn't a problem until we
>>  run into a Solaris client in forcedirectio mode.
> 
> So the Solaris client sends multiple reads per compound in that case?

If the application uses readv() and the mount is in forcedirectio
mode, I'm told Solaris will send an NFSv4 COMPOUND with multiple
READ operations in it.


>> - the XDR padding doesn't work for NFS/RDMA Write chunks, which are
>>  supposed to skip padding altogether.
> 
> krb5i/p has to treat read data as padded regardless of the transport,
> doesn't it?

Yes. See my reply from earlier today: I think we can get away with
inserting the pad just before wrapping the Reply.


> --b.
> 
>> Do you have suggestions? Thanks in advance.
>> 
>> 
>> fs/nfsd/nfs4xdr.c |   17 +++++++----------
>> 1 file changed, 7 insertions(+), 10 deletions(-)
>> 
>> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
>> index d2dc4c0e22e8..14c68a136b4e 100644
>> --- a/fs/nfsd/nfs4xdr.c
>> +++ b/fs/nfsd/nfs4xdr.c
>> @@ -3519,17 +3519,14 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
>> 	u32 zzz = 0;
>> 	int pad;
>> 
>> +	/* Ensure xdr_reserve_space behaves itself */
>> +	if (xdr->iov == xdr->buf->head) {
>> +		xdr->iov = NULL;
>> +		xdr->end = xdr->p;
>> +	}
>> +
>> 	len = maxcount;
>> 	v = 0;
>> -
>> -	thislen = min_t(long, len, ((void *)xdr->end - (void *)xdr->p));
>> -	p = xdr_reserve_space(xdr, (thislen+3)&~3);
>> -	WARN_ON_ONCE(!p);
>> -	resp->rqstp->rq_vec[v].iov_base = p;
>> -	resp->rqstp->rq_vec[v].iov_len = thislen;
>> -	v++;
>> -	len -= thislen;
>> -
>> 	while (len) {
>> 		thislen = min_t(long, len, PAGE_SIZE);
>> 		p = xdr_reserve_space(xdr, (thislen+3)&~3);
>> @@ -3548,7 +3545,7 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
>> 	read->rd_length = maxcount;
>> 	if (nfserr)
>> 		return nfserr;
>> -	xdr_truncate_encode(xdr, starting_len + 8 + ((maxcount+3)&~3));
>> +	xdr_truncate_encode(xdr, starting_len + 8 + maxcount);
>> 
>> 	tmp = htonl(eof);
>> 	write_bytes_to_xdr_buf(xdr->buf, starting_len    , &tmp, 4);

--
Chuck Lever




^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-01-17 21:49 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-15 20:37 [PATCH RFC] nfsd: Fix NFSv4 READ on RDMA when using readv Chuck Lever
2020-01-17 18:39 ` Chuck Lever
2020-01-17 21:46 ` J. Bruce Fields
2020-01-17 21:48   ` Chuck Lever

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).