All of lore.kernel.org
 help / color / mirror / Atom feed
From: Trond Myklebust <trondmy@hammerspace.com>
To: "dwysocha@redhat.com" <dwysocha@redhat.com>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"anna.schumaker@netapp.com" <anna.schumaker@netapp.com>
Subject: Re: [PATCH 4/4] NFS: Fix fscache read from NFS after cache error
Date: Tue, 29 Jun 2021 12:45:51 +0000	[thread overview]
Message-ID: <321b6e11718979668b5ab129a7048a511a9886a9.camel@hammerspace.com> (raw)
In-Reply-To: <CALF+zOn=p6wuZ_pdifwWtLOUsiArkxBHHVDEnYcxsBdQy1LtVw@mail.gmail.com>

On Tue, 2021-06-29 at 05:17 -0400, David Wysochanski wrote:
> On Mon, Jun 28, 2021 at 8:39 PM Trond Myklebust
> <trondmy@hammerspace.com> wrote:
> > 
> > On Mon, 2021-06-28 at 19:46 -0400, David Wysochanski wrote:
> > > On Mon, Jun 28, 2021 at 5:59 PM Trond Myklebust
> > > <trondmy@hammerspace.com> wrote:
> > > > 
> > > > On Mon, 2021-06-28 at 17:12 -0400, David Wysochanski wrote:
> > > > > On Mon, Jun 28, 2021 at 3:09 PM Trond Myklebust
> > > > > <trondmy@hammerspace.com> wrote:
> > > > > > 
> > > > > > On Mon, 2021-06-28 at 13:39 -0400, Dave Wysochanski wrote:
> > > > > > > Earlier commits refactored some NFS read code and removed
> > > > > > > nfs_readpage_async(), but neglected to properly fixup
> > > > > > > nfs_readpage_from_fscache_complete().  The code path is
> > > > > > > only hit when something unusual occurs with the
> > > > > > > cachefiles
> > > > > > > backing filesystem, such as an IO error or while a cookie
> > > > > > > is being invalidated.
> > > > > > > 
> > > > > > > Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
> > > > > > > ---
> > > > > > >  fs/nfs/fscache.c | 14 ++++++++++++--
> > > > > > >  1 file changed, 12 insertions(+), 2 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
> > > > > > > index c4c021c6ebbd..d308cb7e1dd4 100644
> > > > > > > --- a/fs/nfs/fscache.c
> > > > > > > +++ b/fs/nfs/fscache.c
> > > > > > > @@ -381,15 +381,25 @@ static void
> > > > > > > nfs_readpage_from_fscache_complete(struct page *page,
> > > > > > >                                                void
> > > > > > > *context,
> > > > > > >                                                int error)
> > > > > > >  {
> > > > > > > +       struct nfs_readdesc desc;
> > > > > > > +       struct inode *inode = page->mapping->host;
> > > > > > > +
> > > > > > >         dfprintk(FSCACHE,
> > > > > > >                  "NFS: readpage_from_fscache_complete
> > > > > > > (0x%p/0x%p/%d)\n",
> > > > > > >                  page, context, error);
> > > > > > > 
> > > > > > > -       /* if the read completes with an error, we just
> > > > > > > unlock
> > > > > > > the
> > > > > > > page and let
> > > > > > > -        * the VM reissue the readpage */
> > > > > > >         if (!error) {
> > > > > > >                 SetPageUptodate(page);
> > > > > > >                 unlock_page(page);
> > > > > > > +       } else {
> > > > > > > +               desc.ctx = context;
> > > > > > > +               nfs_pageio_init_read(&desc.pgio, inode,
> > > > > > > false,
> > > > > > > +
> > > > > > > &nfs_async_read_completion_ops);
> > > > > > > +               error = readpage_async_filler(&desc,
> > > > > > > page);
> > > > > > > +               if (error)
> > > > > > > +                       return;
> > > > > > 
> > > > > > This code path can clearly fail too. Why can we not fix
> > > > > > this
> > > > > > code
> > > > > > to
> > > > > > allow it to return that reported error so that we can
> > > > > > handle
> > > > > > the
> > > > > > failure case in nfs_readpage() instead of dead-ending here?
> > > > > > 
> > > > > 
> > > > > Maybe the below patch is what you had in mind?  That way if
> > > > > fscache
> > > > > is enabled, nfs_readpage() should behave the same way as if
> > > > > it's
> > > > > not,
> > > > > for the case where an IO error occurs in the NFS read
> > > > > completion
> > > > > path.
> > > > > 
> > > > > If we call into fscache and we get back that the IO has been
> > > > > submitted,
> > > > > wait until it is completed, so we'll catch any IO errors in
> > > > > the
> > > > > read
> > > > > completion
> > > > > path.  This does not solve the "catch the internal errors",
> > > > > IOW,
> > > > > the
> > > > > ones that show up as pg_error, that will probably require
> > > > > copying
> > > > > pg_error into nfs_open_context.error field.
> > > > > 
> > > > > diff --git a/fs/nfs/read.c b/fs/nfs/read.c
> > > > > index 78b9181e94ba..28e3318080e0 100644
> > > > > --- a/fs/nfs/read.c
> > > > > +++ b/fs/nfs/read.c
> > > > > @@ -357,13 +357,13 @@ int nfs_readpage(struct file *file,
> > > > > struct
> > > > > page
> > > > > *page)
> > > > >         } else
> > > > >                 desc.ctx =
> > > > > get_nfs_open_context(nfs_file_open_context(file));
> > > > > 
> > > > > +       xchg(&desc.ctx->error, 0);
> > > > >         if (!IS_SYNC(inode)) {
> > > > >                 ret = nfs_readpage_from_fscache(desc.ctx,
> > > > > inode,
> > > > > page);
> > > > >                 if (ret == 0)
> > > > > -                       goto out;
> > > > > +                       goto out_wait;
> > > > >         }
> > > > > 
> > > > > -       xchg(&desc.ctx->error, 0);
> > > > >         nfs_pageio_init_read(&desc.pgio, inode, false,
> > > > >                              &nfs_async_read_completion_ops);
> > > > > 
> > > > > @@ -373,6 +373,7 @@ int nfs_readpage(struct file *file,
> > > > > struct
> > > > > page
> > > > > *page)
> > > > > 
> > > > >         nfs_pageio_complete_read(&desc.pgio);
> > > > >         ret = desc.pgio.pg_error < 0 ? desc.pgio.pg_error :
> > > > > 0;
> > > > > +out_wait:
> > > > >         if (!ret) {
> > > > >                 ret = wait_on_page_locked_killable(page);
> > > > >                 if (!PageUptodate(page) && !ret)
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > > > +
> > > > > > > +               nfs_pageio_complete_read(&desc.pgio);
> > > > > > >         }
> > > > > > >  }
> > > > > > > 
> > > > > > 
> > > > > > --
> > > > > > Trond Myklebust
> > > > > > Linux NFS client maintainer, Hammerspace
> > > > > > trond.myklebust@hammerspace.com
> > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > > Yes, please. This avoids that duplication of NFS read code in
> > > > the
> > > > fscache layer.
> > > > 
> > > 
> > > If you mean patch 4 we still need that - I don't see anyway to
> > > avoid it.  The above just will make the fscache enabled
> > > path waits for the IO to complete, same as the non-fscache case.
> > > 
> > 
> > With the above, you can simplify patch 4/4 to just make the page
> > unlock
> > unconditional on the error, no?
> > 
> > i.e.
> >         if (!error)
> >                 SetPageUptodate(page);
> >         unlock_page(page);
> > 
> > End result: the client just does the same check as before and let's
> > the
> > vfs/mm decide based on the status of the PG_uptodate flag what to
> > do
> > next. I'm assuming that a retry won't cause fscache to do another
> > bio
> > attempt?
> > 
> 
> Yes I think you're right and I'm following - let me test it and I'll
> send a v2.
> Then we can drop patch #3 right?
> 
Sounds good. Thanks Dave!

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



  reply	other threads:[~2021-06-29 12:45 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-28 17:38 [PATCH 0/4] Fix a few error paths in nfs_readpage and fscache Dave Wysochanski
2021-06-28 17:39 ` [PATCH 1/4] NFS: Remove unnecessary inode parameter from nfs_pageio_complete_read() Dave Wysochanski
2021-06-28 17:39 ` [PATCH 2/4] NFS: Ensure nfs_readpage returns promptly when internal error occurs Dave Wysochanski
2021-06-28 19:17   ` Trond Myklebust
2021-06-28 20:00     ` David Wysochanski
2021-06-28 22:00       ` Trond Myklebust
2021-06-28 17:39 ` [PATCH 3/4] NFS: Allow internal use of read structs and functions Dave Wysochanski
2021-06-28 17:39 ` [PATCH 4/4] NFS: Fix fscache read from NFS after cache error Dave Wysochanski
2021-06-28 19:09   ` Trond Myklebust
2021-06-28 20:15     ` David Wysochanski
2021-06-28 21:12     ` David Wysochanski
2021-06-28 21:59       ` Trond Myklebust
2021-06-28 23:46         ` David Wysochanski
2021-06-29  0:39           ` Trond Myklebust
2021-06-29  9:17             ` David Wysochanski
2021-06-29 12:45               ` Trond Myklebust [this message]
2021-06-29 13:20                 ` David Wysochanski
2021-06-29 14:54                   ` Trond Myklebust
2021-06-29 15:29                     ` David Wysochanski
2021-06-29 15:50                       ` Trond Myklebust
2021-06-29 15:54                         ` Trond Myklebust

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=321b6e11718979668b5ab129a7048a511a9886a9.camel@hammerspace.com \
    --to=trondmy@hammerspace.com \
    --cc=anna.schumaker@netapp.com \
    --cc=dwysocha@redhat.com \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.