linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ted Ts'o <tytso@mit.edu>
To: Dave Chinner <david@fromorbit.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Jan Kara <jack@suse.cz>,
	linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Hellwig <hch@infradead.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	LKML <linux-kernel@vger.kernel.org>,
	Edward Shishkin <edward@redhat.com>
Subject: Re: [RFC PATCH 0/3] Stop clearing uptodate flag on write IO error
Date: Mon, 23 Jan 2012 16:47:09 -0500	[thread overview]
Message-ID: <20120123214709.GB17974@thunk.org> (raw)
In-Reply-To: <20120123030422.GE15102@dastard>

On Mon, Jan 23, 2012 at 02:04:22PM +1100, Dave Chinner wrote:
> 
> Sure, but the buffer contents are dirty until the IO completes
> successfully and what is on disk matches the contents of the buffer
> in memory. It doesn't magically become clean when we clear the dirty
> bit. We only clear the dirty bit before submitting the IO to stop
> multiple callers from trying to submit it for write at the same
> time. IOWs, the buffer dirty bit doesn't really track the dirty
> state of the buffer correctly.

Doesn't BH_Lock prevent multple callers from submitting it for write
at the same time?  If memory serves, one of the reasons why we cleared
the dirty bit before submitting the write was because we allowed
writers to dirty the buffer_head while the write was "in flight".  Of
course, this is becomes problematic if we're trying to support DIF/DIX.

What if we simply disallow BH_Dirty from being set (and disallow the
modification of the buffer) while the buffer is locked?  Then the
dirty bit would indeed correctly track the state of the buffer
correctly.

> I can only assume that you didn't read what I said about how
> different filesystems can (and do) handle write errors differently.
> Indeed, even within a filesystem there can be different error
> handling methods for different types of write IO errors (e.g.
> transient vs unrecoverable).  Hence there are any number of valid
> error handling strategies that can be added to the above list. One
> size does not fit all...

That's another problem, which is that we need more context than just
!uptodate.  We need to know what sort of write I/O errors occurred, so
we can determine whether it's likely to be transient or permanent.

> The thing is, transient write errors tend to be isolated and go away
> when a retry occurs (think of IO timeouts when multipath failover
> occurs). When non-isolated IO or unrecoverable problems occur (e.g.
> no paths left to fail over onto), critical other metadata reads and
> writes will fail and shut down the filesystem, thereby terminating
> the "try forever" background writeback loop those delayed write
> buffers may be in. So the truth is that "trying forever" on write
> errors can handle a whole class of write IO errors very
> effectively....

So how does XFS decide whether a write should fail and shutdown the
file system, or just "try forever"?

						- Ted

  reply	other threads:[~2012-01-23 21:47 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-01-05 14:40 [RFC PATCH 0/3] Stop clearing uptodate flag on write IO error Jan Kara
2012-01-05 14:40 ` [PATCH 1/3] fs: Convert checks for write IO errors from !buffer_uptodate to buffer_write_io_error Jan Kara
2012-01-05 14:40 ` [PATCH 2/3] fs: Do not clear uptodate flag on write IO error Jan Kara
2012-01-05 14:40 ` [PATCH 3/3] ext2: Replace tests of write IO errors using buffer_uptodate Jan Kara
2012-01-05 22:16 ` [RFC PATCH 0/3] Stop clearing uptodate flag on write IO error Andrew Morton
2012-01-15  2:19 ` Linus Torvalds
2012-01-16 16:01   ` Jan Kara
2012-01-16 18:55     ` Linus Torvalds
2012-01-16 19:06       ` Linus Torvalds
2012-01-17  0:36       ` Dave Chinner
2012-01-17  0:59         ` Linus Torvalds
2012-01-17 10:46           ` Boaz Harrosh
2012-01-23  3:04           ` Dave Chinner
2012-01-23 21:47             ` Ted Ts'o [this message]
2012-01-23 23:49               ` Linus Torvalds
2012-01-24  6:12                 ` Dave Chinner
2012-01-24  7:10                   ` Linus Torvalds
2012-01-24 12:13                     ` Jan Kara
2012-01-24  0:36               ` Dave Chinner
2012-01-26 12:17                 ` Ric Wheeler
2012-01-26 20:51                   ` Jan Kara
2012-01-26 20:58                     ` Ric Wheeler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120123214709.GB17974@thunk.org \
    --to=tytso@mit.edu \
    --cc=akpm@linux-foundation.org \
    --cc=david@fromorbit.com \
    --cc=edward@redhat.com \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).