From: Neil Brown <neilb@suse.de>
To: David Chinner <dgc@sgi.com>
Cc: xfs@oss.sgi.com, hch@infradead.org
Subject: Re: XFS and write barriers.
Date: Fri, 23 Mar 2007 18:49:50 +1100
Message-ID: <17923.34462.210758.852042@notabene.brown>
In-Reply-To: message from David Chinner on Friday March 23

On Friday March 23, dgc@sgi.com wrote:
> On Fri, Mar 23, 2007 at 12:26:31PM +1100, Neil Brown wrote:
> > Secondly, if a barrier write fails due to EOPNOTSUPP, it should be
> > retried without the barrier (after possibly waiting for dependant
> > requests to complete).  This is what other filesystems do, but I
> > cannot find the code in xfs which does this.
> 
> XFS doesn't handle this - I was unaware that the barrier status of the
> underlying block device could change....
> 
> OOC, when did this behaviour get introduced?

Probably when md/raid1 started supporting barriers....

The problem is that this interface is (as far as I can see) undocumented
and not fully specified.

Barriers only make sense inside drive firmware.  Trying to emulate them
in the md layer doesn't make any sense as the filesystem is in a much
better position to do any emulation required.
So as the devices can change underneath md/raid1, it must be able to
fail a barrier request at any point.
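
As a rough illustration of that point (this is not md's actual code;
struct member, struct mirror and mirror_barrier_write are hypothetical
names invented for this sketch), a mirror can only honour a barrier if
every current member can, so the answer may change whenever a member
is replaced:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one mirror member; not a kernel structure. */
struct member {
    bool supports_barrier;
};

/* Hypothetical model of a raid1-style mirror. */
struct mirror {
    struct member *members;
    size_t nr_members;
};

/*
 * Returns 0 if every current member can honour a barrier write,
 * or -EOPNOTSUPP as soon as one member cannot.
 */
int mirror_barrier_write(const struct mirror *m)
{
    assert(m != NULL);

    for (size_t i = 0; i < m->nr_members; i++) {
        if (!m->members[i].supports_barrier)
            return -EOPNOTSUPP;
    }
    return 0;
}
```

In this model, swapping in a member whose supports_barrier is false
makes the very next barrier write fail, even though earlier ones
succeeded.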

The first file systems to use barriers (ext3, reiserfs) submit a
barrier request and if that fails they decide that barriers don't work
any more and use the fall-back mechanism.
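
A minimal userspace sketch of that pattern, assuming a hypothetical
fs_state structure and device hook (none of these names come from
ext3 or reiserfs): the first -EOPNOTSUPP switches barriers off for
good and every later commit takes the fall-back path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-filesystem state, for illustration only. */
struct fs_state {
    bool use_barriers;    /* cleared once a barrier write is refused */
    int barrier_writes;   /* counters exist only to make the demo visible */
    int fallback_writes;
};

/* Hypothetical device hook: returns 0 or a negative errno. */
typedef int (*barrier_write_fn)(void *dev);

/* Example hook: dev points at a bool saying whether barriers work. */
int demo_submit_barrier(void *dev)
{
    return *(bool *)dev ? 0 : -EOPNOTSUPP;
}

int fs_commit(struct fs_state *fs, barrier_write_fn submit_barrier, void *dev)
{
    assert(fs != NULL && submit_barrier != NULL);

    if (fs->use_barriers) {
        int err = submit_barrier(dev);

        if (err == 0) {
            fs->barrier_writes++;
            return 0;
        }
        if (err != -EOPNOTSUPP)
            return err;             /* a genuine I/O error: report it */
        fs->use_barriers = false;   /* barriers don't work any more */
    }

    fs->fallback_writes++;          /* stand-in for the fall-back mechanism */
    return 0;
}
```

Note that in this sketch the flag is never set again, so a device that
later regains barrier support keeps being driven through the fall-back
path.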

That seemed to mesh perfectly with what I needed for md, so I assumed
it was an intended feature of the interface and made md/raid1 depend
on it.


> > This is particularly important for md/raid1 as it is quite possible
> > that barriers will be supported at first, but after a failure and
> > different device on a different controller could be swapped in that
> > does not support barriers.
> 
> I/O errors are not the way this should be handled. What happens if
> the opposite happens? A drive that needs barriers is used as a
> replacement on a filesystem that has barriers disabled because they
> weren't needed? Now a crash can result in filesystem corruption, but
> the filesystem has not been able to warn the admin that this
> situation occurred. 

There should never be a possibility of filesystem corruption.
If a barrier request fails, the filesystem should:
  wait for any dependent requests to complete
  call blkdev_issue_flush
  schedule the write of the 'barrier' block
  call blkdev_issue_flush again.

My understanding is that this sequence is as safe as a barrier, but maybe
not as fast.
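
The four steps above can be sketched in userspace C as follows.  This
is only a model of the ordering: the helpers are hypothetical
stand-ins that record what was asked of the device, issue_flush stands
in for blkdev_issue_flush, and the write is treated as synchronous,
which is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

enum step { WAIT_DEPENDENTS, FLUSH, WRITE_BLOCK };

#define MAX_STEPS 8

/* Hypothetical trace of the operations, in the order they were issued. */
struct trace {
    enum step steps[MAX_STEPS];
    size_t n;
};

static void record(struct trace *t, enum step s)
{
    assert(t->n < MAX_STEPS);
    t->steps[t->n++] = s;
}

/* Stand-ins for the real operations; each one only records that it ran. */
static void wait_for_dependent_requests(struct trace *t)
{
    record(t, WAIT_DEPENDENTS);
}

static void issue_flush(struct trace *t)    /* models blkdev_issue_flush */
{
    record(t, FLUSH);
}

static void write_barrier_block(struct trace *t)
{
    record(t, WRITE_BLOCK);
}

/* Fall-back used after a barrier write has been refused. */
void barrier_fallback(struct trace *t)
{
    assert(t != NULL);

    wait_for_dependent_requests(t); /* 1. earlier requests have completed */
    issue_flush(t);                 /* 2. push them out of the write cache */
    write_barrier_block(t);         /* 3. the block that needed ordering */
    issue_flush(t);                 /* 4. push that block out as well */
}
```

The first flush keeps the 'barrier' block from reaching the media
ahead of the requests it depends on; the second keeps it from
lingering in a volatile cache after the filesystem believes it is
written.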

The patch looks at least believable.  As you can imagine it is awkward
to test thoroughly.

Thanks,
NeilBrown
