All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chuck Lever III <chuck.lever@oracle.com>
To: "dan.aloni@vastdata.com" <dan.aloni@vastdata.com>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	Trond Myklebust <trondmy@hammerspace.com>,
	Anna Schumaker <Anna.Schumaker@netapp.com>
Subject: Re: [PATCH v3] NFSD: trim reads past NFS_OFFSET_MAX and fix NFSv3 check
Date: Wed, 26 Jan 2022 16:23:02 +0000	[thread overview]
Message-ID: <F94FB612-02EF-4524-97A7-C4DCD1E1902E@oracle.com> (raw)
In-Reply-To: <A1AF622A-6794-4104-8A89-EA5CD9E19D7D@oracle.com>

Hi Dan-

Dropping stable@. No need to copy them on this discussion.

Also, you don't need to actually cc: stable when you repost.
Just add the tag as you did below. Automation will pick up
the patch when it goes into mainline.

More below.


> On Jan 23, 2022, at 12:03 PM, Chuck Lever III <chuck.lever@oracle.com> wrote:
> 
>> 
>> On Jan 23, 2022, at 10:29 AM, Trond Myklebust <trondmy@hammerspace.com> wrote:
>> 
>> On Sun, 2022-01-23 at 11:50 +0200, Dan Aloni wrote:
>>> Due to commit 8cfb9015280d ("NFS: Always provide aligned buffers to
>>> the
>>> RPC read layers") on the client, a read of 0xfff is aligned up to
>>> server
>>> rsize of 0x1000.
>>> 
>>> As a result, in a test where the server has a file of size
>>> 0x7fffffffffffffff, and the client tries to read from the offset
>>> 0x7ffffffffffff000, the read causes loff_t overflow in the server and
>>> it
>>> returns an NFS code of EINVAL to the client. The client as a result
>>> indefinitely retries the request.
>>> 
>>> This fixes the issue at server side by trimming reads past
>>> NFS_OFFSET_MAX. It also adds a missing check for out of bound offset
>>> in
>>> NFSv3, copying a similar check from NFSv4.x.
>>> 
>>> Cc: <stable@vger.kernel.org>
>>> Signed-off-by: Dan Aloni <dan.aloni@vastdata.com>
>>> ---
>>> fs/nfsd/nfs4proc.c | 3 +++
>>> fs/nfsd/vfs.c      | 6 ++++++
>>> 2 files changed, 9 insertions(+)
>>> 
>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>>> index 486c5dba4b65..816bdf212559 100644
>>> --- a/fs/nfsd/nfs4proc.c
>>> +++ b/fs/nfsd/nfs4proc.c
>>> @@ -785,6 +785,9 @@ nfsd4_read(struct svc_rqst *rqstp, struct
>>> nfsd4_compound_state *cstate,
>>>        if (read->rd_offset >= OFFSET_MAX)
>>>                return nfserr_inval;
>>> 
>>> +       if (unlikely(read->rd_offset + read->rd_length > OFFSET_MAX))
>>> +               read->rd_length = OFFSET_MAX - read->rd_offset;
>>> +
>>>        trace_nfsd_read_start(rqstp, &cstate->current_fh,
>>>                              read->rd_offset, read->rd_length);
>>> 
>>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
>>> index 738d564ca4ce..ad4df374433e 100644
>>> --- a/fs/nfsd/vfs.c
>>> +++ b/fs/nfsd/vfs.c
>>> @@ -1045,6 +1045,12 @@ __be32 nfsd_read(struct svc_rqst *rqstp,
>>> struct svc_fh *fhp,
>>>        struct file *file;
>>>        __be32 err;
>>> 
>>> +       if (unlikely(offset >= NFS_OFFSET_MAX))
>>> +               return nfserr_inval;
>> 
>> POSIX does allow you to lseek to the maximum filesize offset (sb-
>>> s_maxbytes), however any subsequent read will return 0 bytes (i.e.
>> eof), whereas a subsequent write will return EFBIG.
> 
> I'm simply trying to clarify your request.
> 
> You've stated that the Linux NFS client does not handle INVAL
> responses, even though both RFC 1813 and 8881 permit it. So
> are you suggesting (here) that the Linux NFS server should
> not return INVAL on READs beyond the filesystem's supported
> maximum file size but instead return a successful 0-byte
> result with EOF set?

After some thought and discussion with Solaris NFS server
engineers, I think this is the best response to a READ
whose range arguments go past the server's advertised
maxfilesize.

So can you please confirm that your final fix does:

1. The range of values that was failing before but shouldn't
   have, now succeeds

2. When the offset is less than maxfilesize but offset+count
   exceeds it, the READ should succeed but return a short
   result and set EOF

3. When the range is completely outside maxfilesize, the
   READ should succeed but return zero bytes and set EOF

And I don't mind if you split the fix over two or three
patches if that makes it more clear.


>>> +
>>> +       if (unlikely(offset + *count > NFS_OFFSET_MAX))
>>> +               *count = NFS_OFFSET_MAX - offset;
>> 
>> Can we please fix this to use the actual per-filesystem file size
>> limit, which is set as sb->s_maxbytes, instead of using NFS_OFFSET_MAX?
>> 
>> Ditto for 'read' above.
> 
> That sounds reasonable.

I'm wondering whether the VFS itself will bound the range
arguments relative to sb->s_maxbytes. I'm told that the
kiocb iterators used in fs/nfsd/vfs.c should do that.

All that NFSD has to do is ensure that the incoming
client values are converted from u64 to loff_t without
underflowing. So comparing @offset with OFFSET_MAX here
seems like the right thing to do?

We just don't want the READ to fail with INVAL.


> But I wonder if there are some other
> places that need the same treatment.

I've confirmed that there /are/ other places that need to
be fixed. I've created a series of patches that will address
those. The first of those was the COMMIT patch I posted
yesterday.

Dan, I'd like to include your READ fixes in that series.


>>> +
>>>        trace_nfsd_read_start(rqstp, fhp, offset, *count);
>>>        err = nfsd_file_acquire(rqstp, fhp, NFSD_MAY_READ, &nf);
>>>        if (err)
> 
> --
> Chuck Lever

--
Chuck Lever




      reply	other threads:[~2022-01-26 16:23 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-27 18:00 [PATCH] NFS: Always provide aligned buffers to the RPC read layers trondmy
2022-01-18 19:26 ` Dan Aloni
2022-01-18 19:33   ` [PATCH] NFS: fix an infinite request retry in an off-by-one last page read Dan Aloni
2022-01-18 19:43     ` Trond Myklebust
2022-01-21 18:50       ` [PATCH] NFSD: trim reads past NFS_OFFSET_MAX Dan Aloni
2022-01-21 22:32         ` Chuck Lever III
2022-01-22 12:47           ` Dan Aloni
2022-01-22 17:05             ` Chuck Lever III
2022-01-22 18:27               ` Trond Myklebust
2022-01-22 20:15                 ` Chuck Lever III
2022-01-22 20:30                   ` Trond Myklebust
2022-01-23 17:35                     ` Chuck Lever III
2022-01-22 19:01               ` Dan Aloni
2022-01-22 20:33                 ` Chuck Lever III
2022-01-22 12:49           ` [PATCH v2] NFSD: trim reads past NFS_OFFSET_MAX and fix NFSv3 check Dan Aloni
2022-01-22 17:37             ` Chuck Lever III
2022-01-23  6:29               ` Dan Aloni
2022-01-23  9:50               ` [PATCH v3] " Dan Aloni
2022-01-23 15:29                 ` Trond Myklebust
2022-01-23 17:03                   ` Chuck Lever III
2022-01-26 16:23                     ` Chuck Lever III [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=F94FB612-02EF-4524-97A7-C4DCD1E1902E@oracle.com \
    --to=chuck.lever@oracle.com \
    --cc=Anna.Schumaker@netapp.com \
    --cc=dan.aloni@vastdata.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=trondmy@hammerspace.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.