linux-kernel.vger.kernel.org archive mirror
From: Jan Kara <jack@suse.cz>
To: Jeff Layton <jlayton@redhat.com>
Cc: Jan Kara <jack@suse.cz>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 15/27] fs: retrofit old error reporting API onto new infrastructure
Date: Tue, 23 May 2017 11:05:57 +0200
Message-ID: <20170523090557.GB1119@quack2.suse.cz>
In-Reply-To: <1495480173.2816.21.camel@redhat.com>

On Mon 22-05-17 15:09:33, Jeff Layton wrote:
> On Mon, 2017-05-22 at 19:53 +0200, Jan Kara wrote:
> > On Mon 22-05-17 09:53:21, Jeff Layton wrote:
> > > On Mon, 2017-05-22 at 15:38 +0200, Jan Kara wrote:
> > > > > In the case of something like ext2, could we instead get away with just
> > > > > marking the data mapping of the inode with an error if the metadata
> > > > > writeout fails?
> > > > > 
> > > > > Then we could just have write_inode operations call mapping_set_error on
> > > > > inode->i_mapping when they're going to return an error. That should be
> > > > > functionally equivalent, I'd think.
> > > > > 
> > > > > The catch there is that that requires a 1:1 data:metadata mapping, and
> > > > > I'm not sure that that is the case (or will always be, even if it is
> > > > > now).
> > > > 
> > > > So for ext2 / ext4 in nojournal mode this should work - we track all
> > > > relevant metadata in mapping->private_list. But I cannot really comment
> > > > on other filesystems like f2fs...
> > > > 
> > > 
> > > Actually, I think that may be problematic...
> > > 
> > > We could end up calling ext2_write_inode with sync_mode != WB_SYNC_ALL,
> > > which just dirties the buffer without starting writeback. Then, have VM
> > > subsystem write back the buffer due to memory pressure and have that
> > > fail. Trying to set the error in write_inode would miss that situation.
> > 
> > Two notes here:
> > 
> > 1) Inode is a bad example because there isn't a 1:1 mapping between buffers
> > containing inodes and mappings - one buffer contains several inodes.
> > I wanted to add that for inodes specifically it does not matter as they get
> > special handling, but actually fsync seems to be currently unreliable for
> > them - if we first wrote them in WB_SYNC_NONE mode, they will just be
> > written into the bdev's page cache, but a following fsync(2) will do nothing
> > as they will be clean. Anyway, this is an unrelated problem.
> > 
> 
> Yes, that's what I was trying to articulate above. I'm not sure it's
> unrelated though. Moving to errseq_t based handling there based on the
> blockdev mapping seems like it'd solve that. That does require an extra
> errseq_t though.

Well, it might help with the error handling case, but it doesn't solve the
fundamental problem that the inode buffer doesn't even have to be written
to disk by the time fsync(2) returns.

> (I assume that on ext2 inode writeback, bh->b_page->mapping->host points
> to the bdev inode?)

Yes, it does.

> > 2) For metadata like indirect blocks where you indeed have 1:1 mapping, you
> > can do the error setting in ->end_io handler based on bh->b_assoc_map and
> > that should do what you need, shouldn't it?
> > 
> 
> That would probably work, and I think the mark_buffer_write_io_error
> function that I was adding should already be doing the right thing
> there.

Agreed.
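
To make the idea concrete, here is a rough userspace sketch of what the
end_io-time marking amounts to. The structs are cut-down stand-ins for the
kernel ones (only the fields used here, with a plain int wb_err and
write_io_error as assumptions of this sketch), and mapping_set_error() is
reduced to storing the error, so treat it as an illustration rather than the
code from the series:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel structures; only the fields used here. */
struct address_space { int wb_err; };       /* 0, or a negative errno */
struct page { struct address_space *mapping; };
struct buffer_head {
	struct page *b_page;                /* page holding the buffer (bdev mapping on ext2) */
	struct address_space *b_assoc_map;  /* mapping the metadata buffer is associated with */
	int write_io_error;                 /* stand-in for the buffer's write-error flag */
};

/* Simplified: just remember the error in the mapping. */
static void mapping_set_error(struct address_space *mapping, int err)
{
	if (err)
		mapping->wb_err = err;
}

/* Called from the write completion path when the buffer write failed. */
static void mark_buffer_write_io_error(struct buffer_head *bh)
{
	assert(bh != NULL);
	bh->write_io_error = 1;
	/* Flag the mapping that owns the page (the bdev mapping for metadata). */
	if (bh->b_page && bh->b_page->mapping)
		mapping_set_error(bh->b_page->mapping, -EIO);
	/* Flag the mapping the buffer is associated with, if there is one. */
	if (bh->b_assoc_map)
		mapping_set_error(bh->b_assoc_map, -EIO);
}
```

With a 1:1 buffer, a failed write ends up visible in both the bdev mapping
and the associated data mapping, which is all fsync(2) on that file needs.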

> > If I'm indeed right, then for buffers which have 1:1 mapping we are fine
> > and if we find a solution for inodes, we could avoid the second errseq_t.
> 
> Yeah, I'm just still not seeing a good way to track error in inode
> metadata writeback without an extra errseq_t though. I don't suppose
> that a buffer holding inode metadata has a list of those inodes, does
> it? Then we could walk the list and flag each one with the error.
> Without something like that, I think we're stuck with an extra errseq_t.

No, the buffer doesn't have a list of associated inodes. For ext2/4 it is
doable to actually track down all the inodes, but I don't think we want to
complicate this series by implementing such a mechanism for each filesystem
that needs it. So let's start with a generic solution that uses a second
errseq_t for the metadata mapping. It is somewhat rough (an error in writeback
of any metadata block will fail fsync(2) for all open files), but we can
improve on this later for each fs that cares enough about better error
reporting.
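
As a toy model of the "rough" behaviour, something like the following
userspace sketch shows it. The errseq here is simplified to an error code
plus a counter (the real errseq_t packs things differently), and md_wb_err,
md_since and file_check_metadata_error() are names made up for this example,
not anything in the series:

```c
#include <assert.h>
#include <errno.h>

/* Simplified stand-in for errseq_t: last error plus a change counter. */
struct errseq { int err; unsigned int seq; };

/* Hypothetical: one shared cursor for all metadata writeback errors. */
struct super { struct errseq md_wb_err; };

/* Hypothetical: each open file remembers the counter value it last saw. */
struct file { struct super *sb; unsigned int md_since; };

/* Record a metadata writeback failure (err must be a negative errno). */
static void errseq_record(struct errseq *e, int err)
{
	assert(err < 0);
	e->err = err;
	e->seq++;
}

/* Sample the current counter at open time. */
static void file_open(struct file *f, struct super *sb)
{
	f->sb = sb;
	f->md_since = sb->md_wb_err.seq;
}

/* fsync-time check: report an error recorded since we last looked, once. */
static int file_check_metadata_error(struct file *f)
{
	struct errseq *e = &f->sb->md_wb_err;

	if (f->md_since == e->seq)
		return 0;
	f->md_since = e->seq;
	return e->err;
}
```

A single failed metadata block makes the next check fail for every file that
was open at the time, each exactly once, regardless of which inode the block
belonged to.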

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

