Linux-Block Archive on lore.kernel.org
From: Jeff Layton <jlayton@poochiereds.net>
To: Matthew Wilcox <mawilcox@microsoft.com>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	"jlayton@kernel.org" <jlayton@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Al Viro <viro@ZenIV.linux.org.uk>,  Jan Kara <jack@suse.cz>,
	"tytso@mit.edu" <tytso@mit.edu>,
	"axboe@kernel.dk" <axboe@kernel.dk>,
	"ross.zwisler@linux.intel.com" <ross.zwisler@linux.intel.com>,
	"corbet@lwn.net" <corbet@lwn.net>, Chris Mason <clm@fb.com>,
	Josef Bacik <jbacik@fb.com>, David Sterba <dsterba@suse.com>,
	 Carlos Maiolino <cmaiolino@redhat.com>,
	Eryu Guan <eguan@redhat.com>, David Howells <dhowells@redhat.com>,
	 Christoph Hellwig <hch@infradead.org>,
	Liu Bo <bo.li.liu@oracle.com>,
	 "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>
Subject: Re: [PATCH v8 12/18] Documentation: flesh out the section in vfs.txt on storing and reporting writeback errors
Date: Thu, 29 Jun 2017 16:42:14 -0400
Message-ID: <1498768934.5710.7.camel@poochiereds.net>
In-Reply-To: <BY2PR21MB003653755FD85FCE2C49393ECBD20@BY2PR21MB0036.namprd21.prod.outlook.com>

On Thu, 2017-06-29 at 18:21 +0000, Matthew Wilcox wrote:
> From: Jeff Layton [mailto:jlayton@poochiereds.net]
> > On Thu, 2017-06-29 at 10:11 -0700, Darrick J. Wong wrote:
> > > On Thu, Jun 29, 2017 at 09:19:48AM -0400, jlayton@kernel.org wrote:
> > > > +Handling errors during writeback
> > > > +--------------------------------
> > > > +Most applications that utilize the pagecache will periodically call
> > > > +fsync to ensure that data written has made it to the backing store.
> > > 
> > > /me wonders if this sentence ought to be worded more strongly, e.g.
> > > 
> > > "Applications that utilize the pagecache must call a data
> > > synchronization syscall such as fsync, fdatasync, or msync to ensure
> > > that data written has made it to the backing store."
> > 
> > Well...only if they care about the data. There are some that don't. :)
> 
> Also, applications don't "utilize the pagecache"; filesystems use the pagecache.
> Applications may or may not use cached I/O.  How about this:
> 

I meant "applications that do buffered I/O" as opposed to O_DIRECT, but
yeah that's not very clear.


> Applications which care about data integrity and use cached I/O will
> periodically call fsync(), msync() or fdatasync() to ensure that their
> data is durable.
> 
> > What should we do about sync_file_range here? It doesn't currently call
> > any filesystem operations directly, so we don't have a good way to make
> > it selectively use errseq_t handling there.
> > 
> > I could resurrect the FS_* flag for that, though I don't really like
> > that. Should I just go ahead and convert it over to use errseq_t under
> > the theory that most callers will eventually want that anyway?
> 
> I think so.

Ok, I'll leave that for the next pile of patches though.

Here's a revised section:

------------------------------8<--------------------------------
Handling errors during writeback
--------------------------------
Most applications that do buffered I/O will periodically call a file
synchronization call (fsync, fdatasync, msync or sync_file_range) to
ensure that data written has made it to the backing store.  When there
is an error during writeback, they expect that error to be reported when
a file sync request is made.  After an error has been reported on one
request, subsequent requests on the same file descriptor should return
0, unless further writeback errors have occurred since the previous file
synchronization.

Ideally, the kernel would report errors only on file descriptions on
which writes were done that subsequently failed to be written back.  The
generic pagecache infrastructure does not track the file descriptions
that have dirtied each individual page, however, so determining which
file descriptors should get back an error is not possible.

Instead, the generic writeback error tracking infrastructure in the
kernel settles for reporting errors to fsync on all file descriptions
that were open at the time that the error occurred.  In a situation with
multiple writers, all of them will get back an error on a subsequent
fsync, even if all of the writes done through that particular file
descriptor succeeded (or even if there were no writes on that file
descriptor at all).

Filesystems that wish to use this infrastructure should call
mapping_set_error to record the error in the address_space when it
occurs.  Then, after writing back data from the pagecache in their
file->fsync operation, they should call file_check_and_advance_wb_err to
ensure that the struct file's error cursor has advanced to the correct
point in the stream of errors emitted by the backing device(s).
------------------------------8<--------------------------------
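
For illustration only, here is a rough userspace sketch of the reporting
semantics described in the section above. It is not the kernel's errseq_t
code: toy_mapping, toy_file and the toy_* functions are names made up for
this example, and the plain counter is an assumed simplification of the
real error cursor. It only models "every file open at the time of the
error sees it once on its next sync, then gets 0 again".

```c
#include <assert.h>
#include <errno.h>

/* Toy stand-in for an address_space: last error plus a count of errors. */
struct toy_mapping {
	int err;		/* most recent writeback error, 0 if none */
	unsigned int seq;	/* bumped every time an error is recorded */
};

/* Toy stand-in for a struct file: remembers how far it has advanced. */
struct toy_file {
	struct toy_mapping *mapping;
	unsigned int cursor;	/* value of seq at open or at last report */
};

/* Opening a file samples the current position in the error stream. */
void toy_open(struct toy_file *f, struct toy_mapping *m)
{
	f->mapping = m;
	f->cursor = m->seq;
}

/* Hypothetical analogue of mapping_set_error: record a failure. */
void toy_set_error(struct toy_mapping *m, int err)
{
	m->err = err;
	m->seq++;
}

/*
 * Hypothetical analogue of file_check_and_advance_wb_err: report the
 * error if this file has not seen it yet, then advance the cursor so
 * that the same error is not reported to this file a second time.
 */
int toy_check_and_advance(struct toy_file *f)
{
	if (f->cursor == f->mapping->seq)
		return 0;
	f->cursor = f->mapping->seq;
	return f->mapping->err;
}

/* Walk through the multiple-opener case from the text. */
void toy_demo(void)
{
	struct toy_mapping m = { 0, 0 };
	struct toy_file writer, reader, late;

	toy_open(&writer, &m);
	toy_open(&reader, &m);		/* never writes anything */
	toy_set_error(&m, -EIO);	/* writeback fails */
	toy_open(&late, &m);		/* opened after the failure */

	assert(toy_check_and_advance(&writer) == -EIO);
	assert(toy_check_and_advance(&reader) == -EIO);	/* no writes, still sees it */
	assert(toy_check_and_advance(&writer) == 0);	/* reported only once */
	assert(toy_check_and_advance(&late) == 0);	/* was not open at the time */
}
```

Calling toy_demo() from a small test program exercises all four cases.
Keeping the cursor in the per-open structure rather than in the mapping
is what lets each opener be told about the error independently, which is
the property the text relies on.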

Thanks for the review so far!
-- 
Jeff Layton <jlayton@poochiereds.net>

