All of lore.kernel.org
 help / color / mirror / Atom feed
From: Steven Whitehouse <swhiteho@redhat.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Tejun Heo <tj@kernel.org>, Vivek Goyal <vgoyal@redhat.com>,
	Jan Kara <jack@suse.cz>,
	jaxboe@fusionio.com, James.Bottomley@suse.de,
	linux-fsdevel@vger.kernel.org, linux-scsi@vger.kernel.org,
	tytso@mit.edu, chris.mason@oracle.com,
	konishi.ryusuke@lab.ntt.co.jp
Subject: Re: [RFC] relaxed barrier semantics
Date: Wed, 28 Jul 2010 11:19:57 +0100	[thread overview]
Message-ID: <1280312397.2502.49.camel@localhost> (raw)
In-Reply-To: <20100728092859.GA11096@lst.de>

Hi,

On Wed, 2010-07-28 at 11:28 +0200, Christoph Hellwig wrote:
> On Wed, Jul 28, 2010 at 11:17:06AM +0200, Tejun Heo wrote:
> > Well, if disabling barrier works around the problem for them (which is
> > basically what was suggeseted in the first message), that's not too
> > bad for short term, I think.
> 
> It's a pretty horrible workaround.  Requiring manual mount options to
> get performance out of a setup which could trivially work out of the
> box is a bad workaround.
> 
> > I'll re-read barrier code and see how hard it would be to implement a
> > proper solution.
> 
> If we move all filesystems to non-draining barriers with pre- and post-
> flushes that might actually be a relatively easy first step.  We don't
> have the complications to deal with multiple types of barriers to
> start with, and it'll fix the issue for devices without volatile write
> caches completely.
> 
> I just need some help from the filesystem folks to determine if they
> are safe with them.
> 
> I know for sure that ext3 and xfs are from looking through them.  And
> I know reiserfs is if we make sure it doesn't hit the code path that
> relies on it that is currently enabled by the barrier option.
> 
> I'll just need more feedback from ext4, gfs2, btrfs and nilfs folks.
> That already ends our small list of barrier supporting filesystems, and
> possibly ocfs2, too - although the barrier implementation there seems
> incomplete as it doesn't seem to flush caches in fsync.

GFS2 uses barriers only on journal flushing. There are three reasons for
flushing the journal:

1. Its full and we need more space (or the periodic timer has expired,
and there is at least one transaction to flush)
2. We are doing fsync or a full fs sync
3. We need to release a glock to another node, and that glock has some
journaled blocks associated with it

In case #1, I don't think there is any need to actually issue a flush
along with the barrier - the fs will always be correct in case of a (for
example) power failure and it is only the amount of data which might be
lost which depends on the write cache size. This is basically the same
for any local filesystem.

In case #2 we must always flush

In case #3 we need to be certain that all I/O up to and including the
barrier (and subsequent written back in-place metadata, if any) has
reached the storage device (and is not still lurking in the I/O
elevator) before we release the lock, but there is no actual need to
flush the write cache of the device itself. In other words, we need to
flush the non-shared bit of the stack, but not the shared bit on the
device itself. The same caveats about the amount of data which may be
lost on power failure apply as per case #1.

I have also made the assumption that a barrier issued from one node to
the shared device will affect I/O from all nodes equally. If that is not
the case, then the above will not apply and we must always flush in case
#3.

Currently the code is also waiting for I/O to drain in cases #1 and #3
as well as case #2 since it was simpler to implement all cases the same,
at least to start with.

Also in case #3, if we were to implement a non-flushing barrier, then we
would need to add a barrier after the in-place metadata writeback of the
inode that is being released I think, in order to be sure cross-node
ordering was correct. Hmmm. Maybe we should be doing that anyway....

Steve.



  parent reply	other threads:[~2010-07-28 10:14 UTC|newest]

Thread overview: 155+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-07-27 16:56 [RFC] relaxed barrier semantics Christoph Hellwig
2010-07-27 17:54 ` Jan Kara
2010-07-27 18:35   ` Vivek Goyal
2010-07-27 18:42     ` James Bottomley
2010-07-27 18:51       ` Ric Wheeler
2010-07-27 19:43       ` Christoph Hellwig
2010-07-27 19:38     ` Christoph Hellwig
2010-07-28  8:08     ` Tejun Heo
2010-07-28  8:20       ` Tejun Heo
2010-07-28 13:55         ` Vladislav Bolkhovitin
2010-07-28 14:23           ` Tejun Heo
2010-07-28 14:37             ` James Bottomley
2010-07-28 14:44               ` Tejun Heo
2010-07-28 16:17                 ` Vladislav Bolkhovitin
2010-07-28 16:17               ` Vladislav Bolkhovitin
2010-07-28 16:16             ` Vladislav Bolkhovitin
2010-07-28  8:24       ` Christoph Hellwig
2010-07-28  8:40         ` Tejun Heo
2010-07-28  8:50           ` Christoph Hellwig
2010-07-28  8:58             ` Tejun Heo
2010-07-28  9:00               ` Christoph Hellwig
2010-07-28  9:11                 ` Hannes Reinecke
2010-07-28  9:16                   ` Christoph Hellwig
2010-07-28  9:24                     ` Tejun Heo
2010-07-28  9:38                       ` Christoph Hellwig
2010-07-28  9:28                   ` Steven Whitehouse
2010-07-28  9:35                     ` READ_META semantics, was " Christoph Hellwig
2010-07-28 13:52                       ` Jeff Moyer
2010-07-28  9:17                 ` Tejun Heo
2010-07-28  9:28                   ` Christoph Hellwig
2010-07-28  9:48                     ` Tejun Heo
2010-07-28 10:19                     ` Steven Whitehouse [this message]
2010-07-28 11:45                       ` Christoph Hellwig
2010-07-28 12:47                     ` Jan Kara
2010-07-28 23:00                       ` Christoph Hellwig
2010-07-29 10:45                         ` Jan Kara
2010-07-29 16:54                           ` Joel Becker
2010-07-29 17:02                             ` Christoph Hellwig
2010-07-29 17:02                             ` Christoph Hellwig
2010-07-29  1:44                     ` Ted Ts'o
2010-07-29  2:43                       ` Vivek Goyal
2010-07-29  2:43                       ` Vivek Goyal
2010-07-29  8:42                         ` Christoph Hellwig
2010-07-29 20:02                           ` Vivek Goyal
2010-07-29 20:06                             ` Christoph Hellwig
2010-07-30  3:17                               ` Vivek Goyal
2010-07-30  7:07                                 ` Christoph Hellwig
2010-07-30  7:41                                   ` Vivek Goyal
2010-08-02 18:28                                   ` [RFC PATCH] Flush only barriers (Was: Re: [RFC] relaxed barrier semantics) Vivek Goyal
2010-08-03 13:03                                     ` Christoph Hellwig
2010-08-04 15:29                                       ` Vivek Goyal
2010-08-04 16:21                                         ` Christoph Hellwig
2010-07-29  8:31                       ` [RFC] relaxed barrier semantics Christoph Hellwig
2010-07-29 11:16                         ` Jan Kara
2010-07-29 13:00                         ` extfs reliability Vladislav Bolkhovitin
2010-07-29 13:08                           ` Christoph Hellwig
2010-07-29 14:12                             ` Vladislav Bolkhovitin
2010-07-29 14:34                               ` Jan Kara
2010-07-29 18:20                                 ` Vladislav Bolkhovitin
2010-07-29 18:49                                 ` Vladislav Bolkhovitin
2010-07-29 14:26                           ` Jan Kara
2010-07-29 18:20                             ` Vladislav Bolkhovitin
2010-07-29 18:58                           ` Ted Ts'o
2010-07-29 19:44                       ` [RFC] relaxed barrier semantics Ric Wheeler
2010-07-29 19:49                         ` Christoph Hellwig
2010-07-29 19:56                           ` Ric Wheeler
2010-07-29 19:59                             ` James Bottomley
2010-07-29 20:03                               ` Christoph Hellwig
2010-07-29 20:07                                 ` James Bottomley
2010-07-29 20:11                                   ` Christoph Hellwig
2010-07-30 12:45                                     ` Vladislav Bolkhovitin
2010-07-30 12:56                                       ` Christoph Hellwig
2010-08-04  1:58                                     ` Jamie Lokier
2010-07-30 12:46                                 ` Vladislav Bolkhovitin
2010-07-30 12:57                                   ` Christoph Hellwig
2010-07-30 13:09                                     ` Vladislav Bolkhovitin
2010-07-30 13:12                                       ` Christoph Hellwig
2010-07-30 17:40                                         ` Vladislav Bolkhovitin
2010-07-29 20:58                               ` Ric Wheeler
2010-07-29 22:30                             ` Andreas Dilger
2010-07-29 23:04                               ` Ted Ts'o
2010-07-29 23:08                                 ` Ric Wheeler
2010-07-29 23:08                                 ` Ric Wheeler
2010-07-29 23:28                                 ` James Bottomley
2010-07-29 23:37                                   ` James Bottomley
2010-07-30  0:19                                     ` Ted Ts'o
2010-07-30 12:56                                   ` Vladislav Bolkhovitin
2010-07-30  7:11                                 ` Christoph Hellwig
2010-07-30  7:11                                 ` Christoph Hellwig
2010-07-30 12:56                                 ` Vladislav Bolkhovitin
2010-07-30 13:07                                   ` Tejun Heo
2010-07-30 13:22                                     ` Vladislav Bolkhovitin
2010-07-30 13:27                                       ` Vladislav Bolkhovitin
2010-07-30 13:09                                   ` Christoph Hellwig
2010-07-30 13:25                                     ` Vladislav Bolkhovitin
2010-07-30 13:34                                       ` Christoph Hellwig
2010-07-30 13:44                                         ` Vladislav Bolkhovitin
2010-07-30 14:20                                           ` Christoph Hellwig
2010-07-31  0:47                                             ` Jan Kara
2010-07-31  9:12                                               ` Christoph Hellwig
2010-08-02 13:14                                                 ` Jan Kara
2010-08-02 10:38                                               ` Vladislav Bolkhovitin
2010-08-02 12:48                                                 ` Christoph Hellwig
2010-08-02 19:03                                                   ` xfs rm performance Vladislav Bolkhovitin
2010-08-02 19:18                                                     ` Christoph Hellwig
2010-08-05 19:31                                                       ` Vladislav Bolkhovitin
2010-08-02 19:01                                             ` [RFC] relaxed barrier semantics Vladislav Bolkhovitin
2010-08-02 19:26                                               ` Christoph Hellwig
2010-07-30 12:56                                 ` Vladislav Bolkhovitin
2010-07-31  0:35                         ` Jan Kara
2010-07-29 19:44                       ` Ric Wheeler
2010-08-02 16:47                     ` Ryusuke Konishi
2010-08-02 17:39                     ` Chris Mason
2010-08-05 13:11                       ` Vladislav Bolkhovitin
2010-08-05 13:32                         ` Chris Mason
2010-08-05 14:52                           ` Hannes Reinecke
2010-08-05 14:52                           ` Hannes Reinecke
2010-08-05 15:17                             ` Chris Mason
2010-08-05 17:07                             ` Christoph Hellwig
2010-08-05 19:48                           ` Vladislav Bolkhovitin
2010-08-05 19:48                           ` Vladislav Bolkhovitin
2010-08-05 19:50                             ` Christoph Hellwig
2010-08-05 20:05                               ` Vladislav Bolkhovitin
2010-08-06 14:56                                 ` Hannes Reinecke
2010-08-06 18:38                                   ` Vladislav Bolkhovitin
2010-08-06 23:38                                     ` Christoph Hellwig
2010-08-06 23:34                                   ` Christoph Hellwig
2010-08-05 17:09                         ` Christoph Hellwig
2010-08-05 19:32                           ` Vladislav Bolkhovitin
2010-08-05 19:40                             ` Christoph Hellwig
2010-08-05 13:11                       ` Vladislav Bolkhovitin
2010-07-28 13:56                   ` Vladislav Bolkhovitin
2010-07-28 14:42                 ` Vivek Goyal
2010-07-27 19:37   ` Christoph Hellwig
2010-08-03 18:49   ` [PATCH, RFC 1/2] relaxed cache flushes Christoph Hellwig
2010-08-03 18:51     ` [PATCH, RFC 2/2] dm: support REQ_FLUSH directly Christoph Hellwig
2010-08-04  4:57       ` Kiyoshi Ueda
2010-08-04  8:54         ` Christoph Hellwig
2010-08-05  2:16           ` Jun'ichi Nomura
2010-08-26 22:50             ` Mike Snitzer
2010-08-27  0:40               ` Mike Snitzer
2010-08-27  1:20                 ` Jamie Lokier
2010-08-27  1:43               ` Jun'ichi Nomura
2010-08-27  4:08                 ` Mike Snitzer
2010-08-27  5:52                   ` Jun'ichi Nomura
2010-08-27 14:13                     ` Mike Snitzer
2010-08-30  4:45                       ` Jun'ichi Nomura
2010-08-30  8:33                         ` Tejun Heo
2010-08-30 12:43                           ` Mike Snitzer
2010-08-30 12:45                             ` Tejun Heo
2010-08-06 16:04     ` [PATCH, RFC] relaxed barriers Tejun Heo
2010-08-06 23:34       ` Christoph Hellwig
2010-08-07 10:13       ` [PATCH REPOST " Tejun Heo
2010-08-08 14:31         ` Christoph Hellwig
2010-08-09 14:50           ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1280312397.2502.49.camel@localhost \
    --to=swhiteho@redhat.com \
    --cc=James.Bottomley@suse.de \
    --cc=chris.mason@oracle.com \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=jaxboe@fusionio.com \
    --cc=konishi.ryusuke@lab.ntt.co.jp \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=tj@kernel.org \
    --cc=tytso@mit.edu \
    --cc=vgoyal@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.