From: Christoph Hellwig <hch@infradead.org>
To: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Evgeniy Polyakov <zbr@ioremap.net>,
	ocfs2-devel@oss.oracle.com, Joel Becker <joel.becker@oracle.com>,
	Felix Blyakher <felixb@sgi.com>,
	xfs@oss.sgi.com, Anton Altaparmakov <aia21@cantab.net>,
	linux-ntfs-dev@lists.sourceforge.net,
	OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>,
	linux-ext4@vger.kernel.org, tytso@mit.edu
Subject: Re: [PATCH 07/17] vfs: Introduce new helpers for syncing after writing to O_SYNC file or IS_SYNC inode
Date: Thu, 20 Aug 2009 12:27:29 -0400
Message-ID: <20090820162729.GA24659@infradead.org>
In-Reply-To: <20090820121531.GC16486@duck.novell.com>

On Thu, Aug 20, 2009 at 02:15:31PM +0200, Jan Kara wrote:
> On Wed 19-08-09 12:26:38, Christoph Hellwig wrote:
> > Looks good to me.  Eventually we should use those SYNC_ flags also all
> > through the fsync codepath, but I'll see if I can incorporate that in my
> > planned fsync rewrite.
>   Yes, I thought I'll leave that for later. BTW it should be fairly easy to
> teach generic_sync_file() to do fdatawait() before calling ->fsync() if the
> filesystem sets some flag in inode->i_mapping (or somewhere else) as is
> needed for XFS, btrfs, etc.

Maybe you can help brainstorm, but I still can't see any way in which
the

  - write data
  - write inode
  - wait for data

ordering actually is a benefit in terms of semantics (I agree that it
could be faster in theory, but even that is debatable with today's seek
latencies in disks).

Think about a simple non-journaling filesystem like ext2:

 (1) blocks get allocated during ->write before putting data in
      - this dirties the inode because we update i_block/i_size/etc
 (2) we call fsync (or the O_SYNC handling code for that matter)
      - we start writeout of the data, which takes forever because the
        file is very large
      - then we write out the inode, including the i_size/i_blocks
        update
      - for some reason this gets reordered before the data writeout
        finishes (without that happening there would be no benefit to
        this ordering anyway)
 (3) now we call filemap_fdatawait to wait for data I/O to finish


Now the system crashes between (2) and (3).  After that we do have
stale data in the inode in the area not written yet.
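
The scenario above can be sketched as a toy model.  This is only an
illustration under stated assumptions, not kernel code: ToyDisk,
read_after_crash and the two fsync_* functions are hypothetical names,
i_size is counted in blocks, and "stale" stands for whatever a freshly
allocated block held before.

```python
from dataclasses import dataclass, field

STALE = "stale"  # assumed placeholder for old contents of a newly allocated block


@dataclass
class ToyDisk:
    """On-disk state of a toy non-journaling filesystem (hypothetical model)."""
    i_size: int = 0                              # size in the on-disk inode, in blocks
    blocks: dict = field(default_factory=dict)   # block number -> data that reached disk


def read_after_crash(disk):
    """What a reader sees after reboot: every block inside i_size, written or not."""
    return [disk.blocks.get(n, STALE) for n in range(disk.i_size)]


def fsync_inode_before_wait(nblocks, data_blocks_on_disk):
    """write data, write inode, wait for data; crash before the wait finishes."""
    disk = ToyDisk()
    # Data writeout was started, but only some blocks have landed so far.
    for n in range(data_blocks_on_disk):
        disk.blocks[n] = "new"
    # The inode write got reordered ahead of the remaining data I/O.
    disk.i_size = nblocks
    # Crash here: the wait for data I/O never completed.
    return disk


def fsync_wait_before_inode(nblocks, data_blocks_on_disk):
    """write data, wait for data, write inode; crash while still waiting."""
    disk = ToyDisk()
    for n in range(data_blocks_on_disk):
        disk.blocks[n] = "new"
    # Crash here: the inode update was never issued, so i_size stays 0.
    return disk


print(read_after_crash(fsync_inode_before_wait(4, 1)))
print(read_after_crash(fsync_wait_before_inode(4, 1)))
```

In this model the first ordering leaves three blocks inside i_size that
were never written, while the second exposes nothing because the inode
still has the old size.  It deliberately ignores journaling and i_size
updates from I/O completion.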

Is there some case between that simple filesystem and the i_size update
from the I/O completion handler in XFS/ext4 where this behaviour actually
buys us anything?  Any ext3 magic maybe?
