From: Trond Myklebust <trondmy@hammerspace.com>
To: "dwysocha@redhat.com" <dwysocha@redhat.com>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"anna.schumaker@netapp.com" <anna.schumaker@netapp.com>
Subject: Re: [PATCH 4/4] NFS: Fix fscache read from NFS after cache error
Date: Tue, 29 Jun 2021 15:50:13 +0000	[thread overview]
Message-ID: <815639d7a8eff037fa3fabcb8e94f4053c2498d0.camel@hammerspace.com> (raw)
In-Reply-To: <CALF+zOkEQSKbUrmUMn8Bna3WGw1Qm3a0aoz+4XG7=TYsjbfsgg@mail.gmail.com>

On Tue, 2021-06-29 at 11:29 -0400, David Wysochanski wrote:
> On Tue, Jun 29, 2021 at 10:54 AM Trond Myklebust
> <trondmy@hammerspace.com> wrote:
> > 
> > On Tue, 2021-06-29 at 09:20 -0400, David Wysochanski wrote:
> > > On Tue, Jun 29, 2021 at 8:46 AM Trond Myklebust
> > > <trondmy@hammerspace.com> wrote:
> > > > 
> > > > On Tue, 2021-06-29 at 05:17 -0400, David Wysochanski wrote:
> > > > > On Mon, Jun 28, 2021 at 8:39 PM Trond Myklebust
> > > > > <trondmy@hammerspace.com> wrote:
> > > > > > 
> > > > > > On Mon, 2021-06-28 at 19:46 -0400, David Wysochanski wrote:
> > > > > > > On Mon, Jun 28, 2021 at 5:59 PM Trond Myklebust
> > > > > > > <trondmy@hammerspace.com> wrote:
> > > > > > > > 
> > > > > > > > On Mon, 2021-06-28 at 17:12 -0400, David Wysochanski
> > > > > > > > wrote:
> > > > > > > > > On Mon, Jun 28, 2021 at 3:09 PM Trond Myklebust
> > > > > > > > > <trondmy@hammerspace.com> wrote:
> > > > > > > > > > 
> > > > > > > > > > On Mon, 2021-06-28 at 13:39 -0400, Dave Wysochanski
> > > > > > > > > > wrote:
> > > > > > > > > > > Earlier commits refactored some NFS read code and removed
> > > > > > > > > > > nfs_readpage_async(), but neglected to properly fixup
> > > > > > > > > > > nfs_readpage_from_fscache_complete().  The code path is
> > > > > > > > > > > only hit when something unusual occurs with the cachefiles
> > > > > > > > > > > backing filesystem, such as an IO error or while a cookie
> > > > > > > > > > > is being invalidated.
> > > > > > > > > > > 
> > > > > > > > > > > Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
> > > > > > > > > > > ---
> > > > > > > > > > >  fs/nfs/fscache.c | 14 ++++++++++++--
> > > > > > > > > > >  1 file changed, 12 insertions(+), 2 deletions(-)
> > > > > > > > > > > 
> > > > > > > > > > > diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
> > > > > > > > > > > index c4c021c6ebbd..d308cb7e1dd4 100644
> > > > > > > > > > > --- a/fs/nfs/fscache.c
> > > > > > > > > > > +++ b/fs/nfs/fscache.c
> > > > > > > > > > > @@ -381,15 +381,25 @@ static void nfs_readpage_from_fscache_complete(struct page *page,
> > > > > > > > > > >                                                void *context,
> > > > > > > > > > >                                                int error)
> > > > > > > > > > >  {
> > > > > > > > > > > +       struct nfs_readdesc desc;
> > > > > > > > > > > +       struct inode *inode = page->mapping->host;
> > > > > > > > > > > +
> > > > > > > > > > >         dfprintk(FSCACHE,
> > > > > > > > > > >                  "NFS: readpage_from_fscache_complete (0x%p/0x%p/%d)\n",
> > > > > > > > > > >                  page, context, error);
> > > > > > > > > > > 
> > > > > > > > > > > -       /* if the read completes with an error, we just unlock the page and let
> > > > > > > > > > > -        * the VM reissue the readpage */
> > > > > > > > > > >         if (!error) {
> > > > > > > > > > >                 SetPageUptodate(page);
> > > > > > > > > > >                 unlock_page(page);
> > > > > > > > > > > +       } else {
> > > > > > > > > > > +               desc.ctx = context;
> > > > > > > > > > > +               nfs_pageio_init_read(&desc.pgio, inode, false,
> > > > > > > > > > > +                                    &nfs_async_read_completion_ops);
> > > > > > > > > > > +               error = readpage_async_filler(&desc, page);
> > > > > > > > > > > +               if (error)
> > > > > > > > > > > +                       return;
> > > > > > > > > > 
> > > > > > > > > > This code path can clearly fail too. Why can we not fix
> > > > > > > > > > this code to allow it to return that reported error so
> > > > > > > > > > that we can handle the failure case in nfs_readpage()
> > > > > > > > > > instead of dead-ending here?
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > Maybe the below patch is what you had in mind?  That way if
> > > > > > > > > fscache is enabled, nfs_readpage() should behave the same
> > > > > > > > > way as if it's not, for the case where an IO error occurs in
> > > > > > > > > the NFS read completion path.
> > > > > > > > > 
> > > > > > > > > If we call into fscache and we get back that the IO has been
> > > > > > > > > submitted, wait until it is completed, so we'll catch any IO
> > > > > > > > > errors in the read completion path.  This does not solve the
> > > > > > > > > "catch the internal errors", IOW, the ones that show up as
> > > > > > > > > pg_error, that will probably require copying pg_error into
> > > > > > > > > nfs_open_context.error field.
> > > > > > > > > 
> > > > > > > > > diff --git a/fs/nfs/read.c b/fs/nfs/read.c
> > > > > > > > > index 78b9181e94ba..28e3318080e0 100644
> > > > > > > > > --- a/fs/nfs/read.c
> > > > > > > > > +++ b/fs/nfs/read.c
> > > > > > > > > @@ -357,13 +357,13 @@ int nfs_readpage(struct file *file, struct page *page)
> > > > > > > > >         } else
> > > > > > > > >                 desc.ctx = get_nfs_open_context(nfs_file_open_context(file));
> > > > > > > > > 
> > > > > > > > > +       xchg(&desc.ctx->error, 0);
> > > > > > > > >         if (!IS_SYNC(inode)) {
> > > > > > > > >                 ret = nfs_readpage_from_fscache(desc.ctx, inode, page);
> > > > > > > > >                 if (ret == 0)
> > > > > > > > > -                       goto out;
> > > > > > > > > +                       goto out_wait;
> > > > > > > > >         }
> > > > > > > > > 
> > > > > > > > > -       xchg(&desc.ctx->error, 0);
> > > > > > > > >         nfs_pageio_init_read(&desc.pgio, inode, false,
> > > > > > > > >                              &nfs_async_read_completion_ops);
> > > > > > > > > 
> > > > > > > > > @@ -373,6 +373,7 @@ int nfs_readpage(struct file *file, struct page *page)
> > > > > > > > > 
> > > > > > > > >         nfs_pageio_complete_read(&desc.pgio);
> > > > > > > > >         ret = desc.pgio.pg_error < 0 ? desc.pgio.pg_error : 0;
> > > > > > > > > +out_wait:
> > > > > > > > >         if (!ret) {
> > > > > > > > >                 ret = wait_on_page_locked_killable(page);
> > > > > > > > >                 if (!PageUptodate(page) && !ret)
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > > > +
> > > > > > > > > > > +               nfs_pageio_complete_read(&desc.pgio);
> > > > > > > > > > >         }
> > > > > > > > > > >  }
> > > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > --
> > > > > > > > > > Trond Myklebust
> > > > > > > > > > Linux NFS client maintainer, Hammerspace
> > > > > > > > > > trond.myklebust@hammerspace.com
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > > Yes, please. This avoids that duplication of NFS read code in
> > > > > > > > the fscache layer.
> > > > > > > > 
> > > > > > > 
> > > > > > > If you mean patch 4 we still need that - I don't see any way to
> > > > > > > avoid it.  The above just makes the fscache-enabled path wait
> > > > > > > for the IO to complete, same as the non-fscache case.
> > > > > > > 
> > > > > > 
> > > > > > With the above, you can simplify patch 4/4 to just make the page
> > > > > > unlock unconditional on the error, no?
> > > > > > 
> > > > > > i.e.
> > > > > >         if (!error)
> > > > > >                 SetPageUptodate(page);
> > > > > >         unlock_page(page);
> > > > > > 
> > > > > > End result: the client just does the same check as before and
> > > > > > lets the vfs/mm decide what to do next based on the status of the
> > > > > > PG_uptodate flag. I'm assuming that a retry won't cause fscache
> > > > > > to do another bio attempt?
> > > > > > 
> > > > > 
> > > > > Yes I think you're right and I'm following - let me test it and
> > > > > I'll send a v2.
> > > > > Then we can drop patch #3 right?
> > > > > 
> > > > Sounds good. Thanks Dave!
> > > > 
> > > 
> > > This approach works but it differs from the original when an fscache
> > > error occurs.  The original (see below) would call back into NFS to
> > > read from the server, but now we just let the VM handle it.  The VM
> > > will re-issue the read, but will go back into fscache again (because
> > > it's enabled), which may fail again.
> > 
> > How about marking the page on failure, then? I don't believe we
> > currently use PG_owner_priv_1 (a.k.a. PageOwnerPriv1, PageChecked,
> > PagePinned, PageForeign, PageSwapCache, PageXenRemapped) for anything,
> > and according to legend it is supposed to be usable by the fs for
> > page cache pages.
> > 
> > So what say we use SetPageChecked() to mark the page as having failed
> > retrieval from fscache?
> > 
> 
> So this?  I confirm this patch on top of the one I just sent works.
> Want me to merge them together and send a v3?
> 
> Author: Dave Wysochanski <dwysocha@redhat.com>
> Date:   Tue Jun 29 11:10:15 2021 -0400
> 
>     NFS: Mark page with PG_checked if fscache IO completes in error
> 
>     If fscache is enabled and we try to read from fscache, but the
>     IO fails, mark the page with PG_checked.  Then when the VM
>     re-issues the IO, skip over fscache and just read from the server.
> 
>     Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
> 
> diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
> index 0966e147e973..687e98b08994 100644
> --- a/fs/nfs/fscache.c
> +++ b/fs/nfs/fscache.c
> @@ -404,10 +404,12 @@ static void nfs_readpage_from_fscache_complete(struct page *page,
>                  "NFS: readpage_from_fscache_complete (0x%p/0x%p/%d)\n",
>                  page, context, error);
> 
> -       /* if the read completes with an error, unlock the page and let
> -        * the VM reissue the readpage */
> +       /* if the read completes with an error, mark the page with PG_checked,
> +        * unlock the page, and let the VM reissue the readpage */
>         if (!error)
>                 SetPageUptodate(page);
> +       else
> +               SetPageChecked(page);
>         unlock_page(page);
>  }
> 
> @@ -423,6 +425,11 @@ int __nfs_readpage_from_fscache(struct nfs_open_context *ctx,
>                  "NFS: readpage_from_fscache(fsc:%p/p:%p(i:%lx f:%lx)/0x%p)\n",
>                  nfs_i_fscache(inode), page, page->index, page->flags, inode);
> 
> +       if (PageChecked(page)) {
> +               ClearPageChecked(page);
> +               return 1;
> +       }
> +
>         ret = fscache_read_or_alloc_page(nfs_i_fscache(inode),
>                                          page,
>                                          nfs_readpage_from_fscache_complete,

Yes, but how about just changing the above to:

	if (PageChecked(page))
		return 1;
	SetPageChecked(page);

Then you can short-circuit all further checks in
__nfs_readpage_from_fscache() if they've already failed once.

Note that I don't think it is useful to clear PageChecked() once it has
been set. Once a call to nfs_readpage() succeeds, the page will need to
be evicted from the page cache before we can call nfs_readpage() on it
again.


-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



Thread overview: 21+ messages
2021-06-28 17:38 [PATCH 0/4] Fix a few error paths in nfs_readpage and fscache Dave Wysochanski
2021-06-28 17:39 ` [PATCH 1/4] NFS: Remove unnecessary inode parameter from nfs_pageio_complete_read() Dave Wysochanski
2021-06-28 17:39 ` [PATCH 2/4] NFS: Ensure nfs_readpage returns promptly when internal error occurs Dave Wysochanski
2021-06-28 19:17   ` Trond Myklebust
2021-06-28 20:00     ` David Wysochanski
2021-06-28 22:00       ` Trond Myklebust
2021-06-28 17:39 ` [PATCH 3/4] NFS: Allow internal use of read structs and functions Dave Wysochanski
2021-06-28 17:39 ` [PATCH 4/4] NFS: Fix fscache read from NFS after cache error Dave Wysochanski
2021-06-28 19:09   ` Trond Myklebust
2021-06-28 20:15     ` David Wysochanski
2021-06-28 21:12     ` David Wysochanski
2021-06-28 21:59       ` Trond Myklebust
2021-06-28 23:46         ` David Wysochanski
2021-06-29  0:39           ` Trond Myklebust
2021-06-29  9:17             ` David Wysochanski
2021-06-29 12:45               ` Trond Myklebust
2021-06-29 13:20                 ` David Wysochanski
2021-06-29 14:54                   ` Trond Myklebust
2021-06-29 15:29                     ` David Wysochanski
2021-06-29 15:50                       ` Trond Myklebust [this message]
2021-06-29 15:54                         ` Trond Myklebust
