linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Christoph Hellwig <hch@lst.de>
To: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>,
	Josef Bacik <josef@toxicpanda.com>,
	Goldwyn Rodrigues <rgoldwyn@suse.de>,
	linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org,
	hch@lst.de, johannes.thumshirn@wdc.com, dsterba@suse.com,
	Goldwyn Rodrigues <rgoldwyn@suse.com>
Subject: Re: [PATCH 04/15] iomap: Call inode_dio_end() before generic_write_sync()
Date: Wed, 23 Sep 2020 07:16:58 +0200	[thread overview]
Message-ID: <20200923051658.GA14957@lst.de> (raw)
In-Reply-To: <20200922214934.GC12096@dread.disaster.area>

On Wed, Sep 23, 2020 at 07:49:34AM +1000, Dave Chinner wrote:
> I did point out in the previous thread that this actually means that
> inode_dio_wait() now has inconsistent wait semantics for O_DSYNC
> writes. If it's a pure overwrite and we hit the FUA path, the
> O_DSYNC write will be complete and guaranteed to be on stable storage
> before the IO completes. If the inode is metadata dirty, then the IO
> will now be signalled complete *before* the data and metadata are
> flushed to stable storage.
> 
> Hence, from the perspective of writes to *stable* storage, this
> makes the ordering of O_DSYNC DIO against anything waiting for it to
> complete to be potentially inconsistent at the stable storage level.
> 
> That's an extremely subtle change of behaviour, and something that
> would be largely impossible to test or reproduce. And, really, I
> don't like having this sort of "oh, it should be fine" handwavy
> justification when we are talking about data integrity operations...

... and I replied with a detailed analysis of what it is fine, and
how this just restores the behavior we historically had before
switching to the iomap direct I/O code.  Although if we want to go
into the fine details we did not have the REQ_FUA path back then,
but that does not change the analysis.

  reply	other threads:[~2020-09-23  5:17 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-21 14:43 [PATCH 0/15 v2] BTRFS DIO inode locking/D_SYNC fix Goldwyn Rodrigues
2020-09-21 14:43 ` [PATCH 01/15] fs: remove dio_end_io() Goldwyn Rodrigues
2020-09-22 14:17   ` Josef Bacik
2020-09-21 14:43 ` [PATCH 02/15] btrfs: remove BTRFS_INODE_READDIO_NEED_LOCK Goldwyn Rodrigues
2020-09-22 13:18   ` Christoph Hellwig
2020-09-22 14:17   ` Josef Bacik
2020-09-21 14:43 ` [PATCH 03/15] iomap: Allow filesystem to call iomap_dio_complete without i_rwsem Goldwyn Rodrigues
2020-09-21 15:09   ` Johannes Thumshirn
2020-09-22 13:19     ` hch
2020-09-22 14:17   ` Josef Bacik
2020-09-21 14:43 ` [PATCH 04/15] iomap: Call inode_dio_end() before generic_write_sync() Goldwyn Rodrigues
2020-09-21 15:11   ` Johannes Thumshirn
2020-09-22 13:21   ` Christoph Hellwig
2020-09-22 14:20   ` Josef Bacik
2020-09-22 16:31     ` Darrick J. Wong
2020-09-22 17:25       ` Goldwyn Rodrigues
2020-09-22 21:49       ` Dave Chinner
2020-09-23  5:16         ` Christoph Hellwig [this message]
2020-09-23  5:31           ` Darrick J. Wong
2020-09-23  5:49             ` Christoph Hellwig
2020-09-23  5:59               ` Dave Chinner
2020-09-21 14:43 ` [PATCH 05/15] btrfs: split btrfs_direct_IO to read and write Goldwyn Rodrigues
2020-09-22 13:22   ` Christoph Hellwig
2020-09-22 14:27   ` Josef Bacik
2020-09-21 14:43 ` [PATCH 06/15] btrfs: Move pos increment and pagecache extension to btrfs_buffered_write() Goldwyn Rodrigues
2020-09-22 13:22   ` Christoph Hellwig
2020-09-22 14:30   ` Josef Bacik
2020-09-21 14:43 ` [PATCH 07/15] btrfs: Move FS error state bit early during write Goldwyn Rodrigues
2020-09-22 14:38   ` Josef Bacik
2020-09-23  9:10   ` Nikolay Borisov
2020-09-23 14:07     ` Goldwyn Rodrigues
2020-09-21 14:43 ` [PATCH 08/15] btrfs: Introduce btrfs_write_check() Goldwyn Rodrigues
2020-09-22 13:26   ` Christoph Hellwig
2020-09-22 14:42   ` Josef Bacik
2020-09-21 14:43 ` [PATCH 09/15] btrfs: Introduce btrfs_inode_lock()/unlock() Goldwyn Rodrigues
2020-09-22 14:45   ` Josef Bacik
2020-09-21 14:43 ` [PATCH 10/15] btrfs: Push inode locking and unlocking into buffered/direct write Goldwyn Rodrigues
2020-09-22 14:48   ` Josef Bacik
2020-09-21 14:43 ` [PATCH 11/15] btrfs: Use inode_lock_shared() for direct writes within EOF Goldwyn Rodrigues
2020-09-22 14:52   ` Josef Bacik
2020-09-22 17:33     ` Goldwyn Rodrigues
2020-09-21 14:43 ` [PATCH 12/15] btrfs: Remove dio_sem Goldwyn Rodrigues
2020-09-22 14:52   ` Josef Bacik
2020-09-21 14:43 ` [PATCH 13/15] btrfs: Call iomap_dio_complete() without inode_lock Goldwyn Rodrigues
2020-09-22 15:11   ` Josef Bacik
2020-09-21 14:43 ` [PATCH 14/15] btrfs: Revert 09745ff88d93 ("btrfs: dio iomap DSYNC workaround") Goldwyn Rodrigues
2020-09-22 15:12   ` Josef Bacik
2020-09-21 14:43 ` [PATCH 15/15] iomap: Reinstate lockdep_assert_held in iomap_dio_rw() Goldwyn Rodrigues
2020-09-22 13:26   ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200923051658.GA14957@lst.de \
    --to=hch@lst.de \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=dsterba@suse.com \
    --cc=johannes.thumshirn@wdc.com \
    --cc=josef@toxicpanda.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=rgoldwyn@suse.com \
    --cc=rgoldwyn@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).