All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chuck Lever III <chuck.lever@oracle.com>
To: Dan Aloni <dan.aloni@vastdata.com>
Cc: Anna Schumaker <Anna.Schumaker@netapp.com>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH v2] NFSD: trim reads past NFS_OFFSET_MAX and fix NFSv3 check
Date: Sat, 22 Jan 2022 17:37:16 +0000	[thread overview]
Message-ID: <ADEC85C2-8D72-4E25-A49B-2039C1FF82F2@oracle.com> (raw)
In-Reply-To: <20220122124953.1606281-1-dan.aloni@vastdata.com>

Some additional comments on v2 below. We need to sort the
NFS_OFFSET_MAX v. OFFSET_MAX issue before you send a v3.


> On Jan 22, 2022, at 7:49 AM, Dan Aloni <dan.aloni@vastdata.com> wrote:
> 
> Due to change in client 8cfb9015280d ("NFS: Always provide aligned
> buffers to the RPC read layers"), a read of 0xfff is aligned up to
> server rsize of 0x0fff.

scripts/checkpatch.pl will complain about the way you name the
commit here. It will want:

Due to commit 8cfb9015280d ("NFS: Always provide aligned buffers
to the RPC read layers") on the client, 


> As a result, in a test where the server has a file of size
> 0x7fffffffffffffff, and the client tries to read from the offset
> 0x7ffffffffffff000, the read causes loff_t overflow in the server and it
> returns an NFS code of EINVAL to the client. The client as a result
> indefinitely retries the request.
> 
> This fixes the issue at server side by trimming reads past
> NFS_OFFSET_MAX. It also adds a missing check for out of bound offset
> in NFSv3.

RFC 1813 section 3.3.6 does say this:

>>       It is possible for the server to return fewer than count
>>       bytes of data. If the server returns less than the count
>>       requested and eof set to FALSE, the client should issue
>>       another READ to get the remaining data. A server may
>>       return less data than requested under several
>>       circumstances. The file may have been truncated by another
>>       client or perhaps on the server itself, changing the file
>>       size from what the requesting client believes to be the
>>       case. This would reduce the actual amount of data
>>       available to the client. It is possible that the server
>>       may back off the transfer size and reduce the read request
>>       return. Server resource exhaustion may also occur
>>       necessitating a smaller read return.

Similar language in RFC 8881 section 18.22.4:

>>    If the server returns a "short read" (i.e., fewer data than requested
>>    and eof is set to FALSE), the client should send another READ to get
>>    the remaining data.  A server may return less data than requested
>>    under several circumstances.  The file may have been truncated by
>>    another client or perhaps on the server itself, changing the file
>>    size from what the requesting client believes to be the case.  This
>>    would reduce the actual amount of data available to the client.  It
>>    is possible that the server reduce the transfer size and so return a
>>    short read result.  Server resource exhaustion may also occur in a
>>    short read.

So the server could be returning INVAL and leaving EOF set
to false. That might be triggering the client to retry the
READ. Does the server need to set the EOF flag if the READ
arguments cross the limit of the range that the server can
return? Does the client need to check for this case and
stop retrying? The specs aren't particularly clear on this
matter.


> Fixes: 8cfb9015280d ("NFS: Always provide aligned buffers to the RPC read layers")

It's arguable that you are fixing 8cfb9015280d. I don't
think that commit is actually broken.

Also, the server behavior is wrong even before that commit,
and that commit is a client change... Mentioning this commit
at the top of the patch description is fine, since that is
how you discovered the problem, but I'd prefer if you drop
this line.

The patch does warrant a Cc: stable, though.


> Signed-off-by: Dan Aloni <dan.aloni@vastdata.com>
> ---
> fs/nfsd/nfs4proc.c |  3 +++
> fs/nfsd/vfs.c      | 11 +++++++++++
> 2 files changed, 14 insertions(+)
> 
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index 486c5dba4b65..3b1e395a93b6 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -788,6 +788,9 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> 	trace_nfsd_read_start(rqstp, &cstate->current_fh,
> 			      read->rd_offset, read->rd_length);
> 
> +	if (unlikely(read->rd_offset + read->rd_length > NFS_OFFSET_MAX))
> +		read->rd_length = NFS_OFFSET_MAX - read->rd_offset;
> +

rd_offset is range-checked before the trace point, so I think
this check needs to go before the trace point as well. That
way the adjusted argument values are recorded.

And we need to verify whether NFS_OFFSET_MAX is the correct
constant for this check.


> 	/*
> 	 * If we do a zero copy read, then a client will see read data
> 	 * that reflects the state of the file *after* performing the
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index 738d564ca4ce..4a209f807466 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -1046,6 +1046,16 @@ __be32 nfsd_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
> 	__be32 err;
> 
> 	trace_nfsd_read_start(rqstp, fhp, offset, *count);
> +
> +	if (unlikely(offset > NFS_OFFSET_MAX)) {
> +		/* We can return this according to Section 3.3.6 */

RFC 1813 section 3.3.6 says that READ is permitted to return
NFS3ERR_INVAL, but does not mandate that status code in this
particular case (it's provided for general issues similar to
this). So returning INVAL here is an implementation choice.

Thus mentioning the spec here is IMO perhaps misleading, so
I'd like you to drop this comment.


> +		err = nfserr_inval;
> +		goto error;
> +	}
> +
> +	if (unlikely(offset + *count > NFS_OFFSET_MAX))
> +		*count = NFS_OFFSET_MAX - offset;
> +

Same as above: these range-checks need to go before the trace
point, IMO.


> 	err = nfsd_file_acquire(rqstp, fhp, NFSD_MAY_READ, &nf);
> 	if (err)
> 		return err;
> @@ -1058,6 +1068,7 @@ __be32 nfsd_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
> 
> 	nfsd_file_put(nf);
> 
> +error:
> 	trace_nfsd_read_done(rqstp, fhp, offset, *count);
> 
> 	return err;
> -- 
> 2.23.0


--
Chuck Lever




  reply	other threads:[~2022-01-22 17:37 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-27 18:00 [PATCH] NFS: Always provide aligned buffers to the RPC read layers trondmy
2022-01-18 19:26 ` Dan Aloni
2022-01-18 19:33   ` [PATCH] NFS: fix an infinite request retry in an off-by-one last page read Dan Aloni
2022-01-18 19:43     ` Trond Myklebust
2022-01-21 18:50       ` [PATCH] NFSD: trim reads past NFS_OFFSET_MAX Dan Aloni
2022-01-21 22:32         ` Chuck Lever III
2022-01-22 12:47           ` Dan Aloni
2022-01-22 17:05             ` Chuck Lever III
2022-01-22 18:27               ` Trond Myklebust
2022-01-22 20:15                 ` Chuck Lever III
2022-01-22 20:30                   ` Trond Myklebust
2022-01-23 17:35                     ` Chuck Lever III
2022-01-22 19:01               ` Dan Aloni
2022-01-22 20:33                 ` Chuck Lever III
2022-01-22 12:49           ` [PATCH v2] NFSD: trim reads past NFS_OFFSET_MAX and fix NFSv3 check Dan Aloni
2022-01-22 17:37             ` Chuck Lever III [this message]
2022-01-23  6:29               ` Dan Aloni
2022-01-23  9:50               ` [PATCH v3] " Dan Aloni
2022-01-23 15:29                 ` Trond Myklebust
2022-01-23 17:03                   ` Chuck Lever III
2022-01-26 16:23                     ` Chuck Lever III

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ADEC85C2-8D72-4E25-A49B-2039C1FF82F2@oracle.com \
    --to=chuck.lever@oracle.com \
    --cc=Anna.Schumaker@netapp.com \
    --cc=dan.aloni@vastdata.com \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.