From: Jeff Layton <jlayton@redhat.com>
To: NeilBrown <neilb@suse.com>, Jeff Layton <jlayton@kernel.org>,
	trond.myklebust@primarydata.com, anna.schumaker@netapp.com
Cc: linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH] nfs: track writeback errors with errseq_t
Date: Tue, 29 Aug 2017 06:54:18 -0400
Message-ID: <1504004058.4679.7.camel@redhat.com>
In-Reply-To: <87pobfgo9q.fsf@notabene.neil.brown.name>

On Tue, 2017-08-29 at 11:23 +1000, NeilBrown wrote:
> On Mon, Aug 28 2017, Jeff Layton wrote:
> 
> > On Mon, 2017-08-28 at 09:24 +1000, NeilBrown wrote:
> > > On Fri, Aug 25 2017, Jeff Layton wrote:
> > > 
> > > > On Thu, 2017-07-20 at 15:42 -0400, Jeff Layton wrote:
> > > > > From: Jeff Layton <jlayton@redhat.com>
> > > > > 
> > > > > There is some ambiguity in nfs about how writeback errors are
> > > > > tracked.
> > > > > 
> > > > > > For instance, nfs_pageio_add_request calls mapping_set_error when the
> > > > > > add fails, but we track errors that occur after adding the request
> > > > > > with a dedicated int error in the open context.
> > > > > > 
> > > > > > Now that we have better infrastructure for the vfs layer, this
> > > > > > latter int is now unnecessary. Just have nfs_context_set_write_error set
> > > > > > the error in the mapping when one occurs.
> > > > > > 
> > > > > > Have NFS use file_write_and_wait_range to initiate and wait on writeback
> > > > > > of the data, and then check again after issuing the commit(s).
> > > > > > 
> > > > > > With this, we also don't need to pay attention to the ERROR_WRITE
> > > > > > flag for reporting, and just clear it to indicate to subsequent
> > > > > > writers that they should try to go asynchronous again.
> > > > > > 
> > > > > > In nfs_page_async_flush, sample the error before locking and joining
> > > > > > the requests, and check for errors since that point.
> > > > > 
> > > > > Signed-off-by: Jeff Layton <jlayton@redhat.com>
> > > > > ---
> > > > >  fs/nfs/file.c          | 24 +++++++++++-------------
> > > > >  fs/nfs/inode.c         |  3 +--
> > > > >  fs/nfs/write.c         |  8 ++++++--
> > > > >  include/linux/nfs_fs.h |  1 -
> > > > >  4 files changed, 18 insertions(+), 18 deletions(-)
> > > > > 
> > > > > > I have a baling wire and duct tape solution for testing this with
> > > > > > xfstests (using iptables REJECT targets and soft mounts). This seems to
> > > > > > make nfs do the right thing.
> > > > > 
> > > > > diff --git a/fs/nfs/file.c b/fs/nfs/file.c
> > > > > index 5713eb32a45e..15d3c6faafd3 100644
> > > > > --- a/fs/nfs/file.c
> > > > > +++ b/fs/nfs/file.c
> > > > > @@ -212,25 +212,23 @@ nfs_file_fsync_commit(struct file *file,
> > > > > loff_t start, loff_t end, int datasync)
> > > > >  {
> > > > >  	struct nfs_open_context *ctx =
> > > > > nfs_file_open_context(file);
> > > > >  	struct inode *inode = file_inode(file);
> > > > > -	int have_error, do_resend, status;
> > > > > -	int ret = 0;
> > > > > +	int do_resend, status;
> > > > > +	int ret;
> > > > >  
> > > > >  	dprintk("NFS: fsync file(%pD2) datasync %d\n", file,
> > > > > datasync);
> > > > >  
> > > > >  	nfs_inc_stats(inode, NFSIOS_VFSFSYNC);
> > > > >  	do_resend =
> > > > > test_and_clear_bit(NFS_CONTEXT_RESEND_WRITES, &ctx->flags);
> > > > > -	have_error = test_and_clear_bit(NFS_CONTEXT_ERROR_WRITE,
> > > > > &ctx->flags);
> > > > > -	status = nfs_commit_inode(inode, FLUSH_SYNC);
> > > > > -	have_error |= test_bit(NFS_CONTEXT_ERROR_WRITE, &ctx-
> > > > > > flags);
> > > > > 
> > > > > -	if (have_error) {
> > > > > -		ret = xchg(&ctx->error, 0);
> > > > > -		if (ret)
> > > > > -			goto out;
> > > > > -	}
> > > > > -	if (status < 0) {
> > > > > +	clear_bit(NFS_CONTEXT_ERROR_WRITE, &ctx->flags);
> > > > > +	ret = nfs_commit_inode(inode, FLUSH_SYNC);
> > > > > +
> > > > > +	/* Recheck and advance after the commit */
> > > > > +	status = file_check_and_advance_wb_err(file);
> > > 
> > > > This change makes the code inconsistent with the comment above the
> > > > function, which still references ctx->error.  The intent of the comment
> > > > is still correct, but the details have changed.
> > > 
> > 
> > Good catch. I'll fix that up in a respin.
> > 
> > > Also, there is a call to mapping_set_error() in
> > > nfs_pageio_add_request().
> > > I wonder if that should be changed to
> > >   nfs_context_set_write_error(req->wb_context, desc->pg_error)
> > > ??
> > > 
> > 
> > Trickier question...
> > 
> > I'm not quite sure what semantics we're looking for with
> > NFS_CONTEXT_ERROR_WRITE. I know that it forces writes to be
> > synchronous, but I'm not quite sure why it gets cleared the way it
> > does. It's set on any error but cleared before issuing a commit.
> > 
> > I added a similar flag to Ceph inodes recently, but only clear it when
> > a write succeeds. Wouldn't that make more sense here as well?
> 
> It is a bit hard to wrap one's mind around.
> 
> In the original code (commit 7b159fc18d417980) it looks like:
>  - test-and-clear bit
>  - write and sync
>  - test-bit
> 
> This does, I think, seem safer than "clear on successful write" as the
> writes could complete out-of-order and I wouldn't be surprised if the
> unsuccessful ones completed with an error before the successful one -
> particularly with an error like EDQUOT.
> 
> However the current code does the writes before the test-and-clear, and
> only does the commit afterwards.  That makes it less clear why the
> current sequence is a good idea.
> 
> However ... nfs_file_fsync_commit() is only called if
> filemap_write_and_wait_range() returned with success, so we only clear
> the flag after successful writes(?).
> 
> Oh....
> This patch from me:
> 
> Commit: 2edb6bc3852c ("NFS - fix recent breakage to NFS error handling.")
> 
> seems to have been reverted by
> 
> Commit: 7b281ee02655 ("NFS: fsync() must exit with an error if page writeback failed")
> 
> which probably isn't good.  It appears that this code is very fragile
> and easily broken.
> Maybe we need to work out exactly what is required, and document it - so
> we can stop breaking it.
> Or maybe we need some unit tests.....
> 

Yes, laying out what's necessary for this would be very helpful. We
clearly want to set the flag when an error occurs. Under what
circumstances should we be clearing it?

I'm not sure we can really do much better than clearing it on a
successful write. With Ceph, my thinking was that this is just a hint to
the write submission mechanism, and we generally aren't too concerned if
a few slip past in either direction.
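
To make the "clear on a successful write" idea concrete, here is a
minimal userspace sketch in C. It is only an illustration under stated
assumptions: struct write_ctx and the write_ctx_* helpers are
hypothetical names, not NFS or Ceph code, and completions are assumed to
report 0 or a negative errno-style value.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model of an "error hint" flag. This is an
 * assumption made for illustration, not the NFS or Ceph implementation.
 */
struct write_ctx {
	atomic_bool error_hint;	/* true: last completed write failed */
};

static inline void write_ctx_init(struct write_ctx *ctx)
{
	atomic_init(&ctx->error_hint, false);
}

/*
 * Record a write completion. status is 0 on success or a negative
 * errno-style value on failure (an assumed convention). The hint is set
 * on any error and cleared on any success, in completion order.
 */
static inline void write_ctx_complete(struct write_ctx *ctx, int status)
{
	atomic_store(&ctx->error_hint, status < 0);
}

/* Submission path: should the next write be issued synchronously? */
static inline bool write_ctx_should_sync(struct write_ctx *ctx)
{
	return atomic_load(&ctx->error_hint);
}
```

With this scheme, a failed write that completes just before a successful
one leaves the hint clear, which is the kind of slip described above.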
-- 
Jeff Layton <jlayton@redhat.com>
