linux-kernel.vger.kernel.org archive mirror
From: Jan Kara <jack@suse.cz>
To: Jeff Layton <jlayton@redhat.com>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-ext4@vger.kernel.org,
	linux-cifs@vger.kernel.org, linux-nfs@vger.kernel.org,
	linux-mm@kvack.org, jfs-discussion@lists.sourceforge.net,
	linux-xfs@vger.kernel.org, cluster-devel@redhat.com,
	linux-f2fs-devel@lists.sourceforge.net,
	v9fs-developer@lists.sourceforge.net,
	linux-nilfs@vger.kernel.org, linux-block@vger.kernel.org,
	dhowells@redhat.com, akpm@linux-foundation.org,
	hch@infradead.org, ross.zwisler@linux.intel.com,
	mawilcox@microsoft.com, jack@suse.com, viro@zeniv.linux.org.uk,
	corbet@lwn.net, neilb@suse.de, clm@fb.com, tytso@mit.edu,
	axboe@kernel.dk, josef@toxicpanda.com, hubcap@omnibond.com,
	rpeterso@redhat.com, bo.li.liu@oracle.com
Subject: Re: [PATCH v4 15/27] fs: retrofit old error reporting API onto new infrastructure
Date: Mon, 15 May 2017 12:42:46 +0200
Message-ID: <20170515104246.GC16182@quack2.suse.cz>
In-Reply-To: <20170509154930.29524-16-jlayton@redhat.com>

On Tue 09-05-17 11:49:18, Jeff Layton wrote:
> Now that we have a better way to store and report errors that occur
> during writeback, we need to convert the existing codebase to use it. We
> could just adapt all of the filesystem code and related infrastructure
> to the new API, but that's a lot of churn.
> 
> When it comes to setting errors in the mapping, filemap_set_wb_error is
> a drop-in replacement for mapping_set_error. Turn that function into a
> simple wrapper around the new one.
> 
> Because we want to ensure that writeback errors are always reported at
> fsync time, inject filemap_report_wb_error calls much closer to the
> syscall boundary, in call_fsync.
> 
> For fsync calls (and things like the nfsd equivalent), we either return
> the error that the fsync operation returns, or the one returned by
> filemap_report_wb_error. In both cases, we advance the file->f_wb_err to
> the latest value. This allows us to provide new fsync semantics that
> will return errors that may have occurred previously and been viewed
> via other file descriptors.
> 
> The final piece of the puzzle is what to do about filemap_check_errors
> calls that are being called directly or via filemap_* functions. Here,
> we must take a little "creative license".
> 
> Since we now handle advancing the file->f_wb_err value at the generic
> filesystem layer, we no longer need those callers to clear errors out
> of the mapping or advance an errseq_t.
> 
> A lot of the existing codebase relies on getting an error back
> from those functions when there is a writeback problem, so we do still
> want to have them report writeback errors somehow.
> 
> When reporting writeback errors, we will always report errors that have
> occurred since a particular point in time. With the old writeback error
> reporting, the time we used was "since it was last tested/cleared" which
> is entirely arbitrary and potentially racy. Now, we can at least report
> the latest error that has occurred since an arbitrary point in time
> (represented as a sampled errseq_t value).
> 
> In the case where we don't have a struct file to work with, this patch
> just has the wrappers sample the current mapping->wb_err value, and use
> that as an arbitrary point from which to check for errors.

I think this is really dangerous and we shouldn't do this. You are quite
likely to lose IO errors in such calls because you will ignore all errors
that happened during previous background writeback or even for IO that
managed to complete before we called filemap_fdatawait(). Maybe we need to
keep the original set-clear-bit IO error reporting for these cases, until
we can convert them to fdatawait_range_since()?
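
To make the concern concrete, here is a minimal userspace sketch. It is
not the kernel's errseq_t implementation: the toy_* names and the struct
layout are assumptions made up for illustration, and the real type packs
its state differently. It only models the one property that matters
here, namely that a check reports errors recorded after the stamp was
taken and nothing recorded before it.

```c
#include <assert.h>
#include <errno.h>

/* Toy stand-in for errseq_t (hypothetical, not the kernel layout). */
struct toy_errseq {
	unsigned int seq;	/* bumped on every recorded error */
	int err;		/* most recent negative errno, 0 if none */
};

/* Record a writeback error in the mapping-wide cursor. */
void toy_set(struct toy_errseq *cur, int err)
{
	assert(err < 0);	/* errors are negative errno values */
	cur->seq++;
	cur->err = err;
}

/* Take a stamp of the cursor's current state. */
struct toy_errseq toy_sample(const struct toy_errseq *cur)
{
	return *cur;
}

/* Return the latest error recorded after "since" was taken, else 0. */
int toy_check(const struct toy_errseq *cur, struct toy_errseq since)
{
	if (cur->seq != since.seq || cur->err != since.err)
		return cur->err;
	return 0;
}

/*
 * Models a wrapper that has no struct file: it samples the cursor
 * itself right before checking, so anything recorded earlier (for
 * example by background writeback) is invisible to it.
 */
int toy_wait_sampling_late(const struct toy_errseq *mapping)
{
	struct toy_errseq since = toy_sample(mapping);

	/* ... waiting for outstanding writeback would happen here ... */
	return toy_check(mapping, since);
}

/* Models a caller that took its stamp before the I/O was issued. */
int toy_wait_since(const struct toy_errseq *mapping,
		   struct toy_errseq since)
{
	return toy_check(mapping, since);
}
```

If a stamp is taken, toy_set(&mapping, -EIO) then records a failure, and
only afterwards the two wait helpers run, toy_wait_sampling_late()
returns 0 while toy_wait_since() with the earlier stamp returns -EIO.
The error is lost exactly when it completes before the late sample.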

> That's probably not "correct" in all cases, particularly in the case of
> something like filemap_fdatawait, but I'm not sure it's any worse than
> what we already have, and this gives us a basis from which to work.
> 
> A lot of those callers will likely want to change to a model where they
> sample the errseq_t much earlier (perhaps when starting a transaction),
> store it in an appropriate place and then use that value later when
> checking to see if an error occurred.
> 
> That will almost certainly take some involvement from other subsystem
> maintainers. I'm quite open to adding new API functions to help enable
> this if that would be helpful, but I don't really want to do that until
> I better understand what's needed.
> 
> Signed-off-by: Jeff Layton <jlayton@redhat.com>

...

> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index 5f7317875a67..7ce13281925f 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -187,6 +187,7 @@ static int f2fs_do_sync_file(struct file *file, loff_t start, loff_t end,
>  		.nr_to_write = LONG_MAX,
>  		.for_reclaim = 0,
>  	};
> +	errseq_t since = READ_ONCE(file->f_wb_err);
>  
>  	if (unlikely(f2fs_readonly(inode->i_sb)))
>  		return 0;
> @@ -265,6 +266,8 @@ static int f2fs_do_sync_file(struct file *file, loff_t start, loff_t end,
>  	}
>  
>  	ret = wait_on_node_pages_writeback(sbi, ino);
> +	if (ret == 0)
> +		ret = filemap_check_wb_error(NODE_MAPPING(sbi), since);
>  	if (ret)
>  		goto out;

So this conversion looks wrong and actually points to a larger issue with
the scheme. The problem is that two mappings come into play here:
file_inode(file)->i_mapping, which is the data mapping, and
NODE_MAPPING(sbi), which is the metadata mapping. This is not a problem
specific to f2fs; for example, ext2 also uses this scheme, with the block
device's mapping serving as the metadata mapping. We need to merge error
information from these two mappings, so for the stamping scheme to work
we'd need two stamps stored in struct file: one for the data mapping and
one for the metadata mapping. Or maybe there's some more clever scheme, but
for now I don't see one...
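
A rough userspace sketch of the two-stamp idea follows. Everything in it
is hypothetical: struct toy_file, the toy_* helpers and the field names
are invented for illustration and do not correspond to anything in the
patch set. It contrasts a file that keeps one stamp per mapping with a
variant that compares a stamp taken from the data mapping against the
metadata mapping's cursor, as the f2fs hunk above does.

```c
#include <assert.h>
#include <errno.h>

/* Toy stand-in for errseq_t (hypothetical, not the kernel layout). */
struct toy_errseq {
	unsigned int seq;	/* bumped on every recorded error */
	int err;		/* most recent negative errno, 0 if none */
};

/* Record a writeback error in a mapping's cursor. */
void toy_set(struct toy_errseq *cur, int err)
{
	assert(err < 0);
	cur->seq++;
	cur->err = err;
}

/* Report an error newer than *since, then move *since forward. */
int toy_check_and_advance(const struct toy_errseq *cur,
			  struct toy_errseq *since)
{
	int err = 0;

	if (cur->seq != since->seq || cur->err != since->err) {
		err = cur->err;
		*since = *cur;
	}
	return err;
}

/* Hypothetical open file with one stamp per mapping it depends on. */
struct toy_file {
	struct toy_errseq *data;	/* models the inode's data mapping */
	struct toy_errseq *meta;	/* models the metadata mapping */
	struct toy_errseq data_since;
	struct toy_errseq meta_since;
};

/* Stamp both mappings at open time. */
void toy_open(struct toy_file *f, struct toy_errseq *data,
	      struct toy_errseq *meta)
{
	f->data = data;
	f->meta = meta;
	f->data_since = *data;
	f->meta_since = *meta;
}

/* fsync model: check and advance both stamps, data error first. */
int toy_fsync(struct toy_file *f)
{
	int derr = toy_check_and_advance(f->data, &f->data_since);
	int merr = toy_check_and_advance(f->meta, &f->meta_since);

	return derr ? derr : merr;
}

/* Mismatched variant: data-mapping stamp against the metadata cursor. */
int toy_fsync_one_stamp(struct toy_file *f)
{
	return toy_check_and_advance(f->meta, &f->data_since);
}
```

In this model, suppose the data mapping already carries one old -EIO
when the file is opened and the metadata mapping then records its first
-EIO. The data stamp and the metadata cursor now happen to hold the same
value, so toy_fsync_one_stamp() returns 0 and the new error is missed,
while toy_fsync() returns -EIO once and 0 on the next call. A mismatched
stamp can just as easily produce a spurious error instead, since the two
cursors advance independently.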

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

