From: Dave Chinner <david@fromorbit.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Goldwyn Rodrigues <rgoldwyn@suse.com>,
	Josef Bacik <josef@toxicpanda.com>,
	Johannes Thumshirn <johannes.thumshirn@wdc.com>,
	David Sterba <dsterba@suse.com>,
	"linux-btrfs @ vger . kernel . org" <linux-btrfs@vger.kernel.org>,
	Filipe Manana <fdmanana@gmail.com>,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>
Subject: Re: [RFC PATCH] btrfs: don't call btrfs_sync_file from iomap context
Date: Thu, 17 Sep 2020 16:29:23 +1000
Message-ID: <20200917062923.GV12096@dread.disaster.area>
In-Reply-To: <20200917055232.GA31646@lst.de>

On Thu, Sep 17, 2020 at 07:52:32AM +0200, Christoph Hellwig wrote:
> On Thu, Sep 17, 2020 at 01:09:42PM +1000, Dave Chinner wrote:
> > > > iomap_dio_complete()
> > > >   generic_write_sync()
> > > >     btrfs_file_fsync()
> > > >       inode_lock()
> > > >       <deadlock>
> > > 
> > > Can inode_dio_end() be called before generic_write_sync(), as it is done
> > > in fs/direct-io.c:dio_complete()?
> > 
> > Don't think so.  inode_dio_wait() is supposed to indicate that all
> > DIO is complete, and having the "make it stable" parts of an O_DSYNC
> > DIO still running after inode_dio_wait() returns means that we still
> > have DIO running....
> > 
> > For some filesystems, ensuring the DIO data is stable may involve
> > flushing other data (perhaps we did EOF zeroing before the file
> > extending DIO) and/or metadata to the log, so we need to guarantee
> > these DIO related operations are complete and stable before we say
> > the DIO is done.
> 
> inode_dio_wait really just waits for active I/O that writes to or reads
> from the file.  It does not imply that the I/O is stable, just like
> i_rwsem itself doesn't.

No, but iomap_dio_rw() considers an O_DSYNC write to be incomplete
until it is stable so that it presents consistent behaviour to
anything calling inode_dio_wait().
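
To make the two orderings under discussion concrete, here is a toy
userspace model. It is only a sketch and not kernel code: every name
in it (model_inode, model_dio_begin() and so on) is hypothetical, and
the "sync" step is reduced to setting a flag. It shows what a waiter
on the in-flight DIO count would observe when the sync runs before
the DIO is marked ended, versus after.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of an inode; not a kernel structure. */
struct model_inode {
	int dio_count;		/* number of direct I/Os still in flight */
	bool data_stable;	/* true once the O_DSYNC flush has finished */
};

static void model_dio_begin(struct model_inode *inode)
{
	inode->dio_count++;
	inode->data_stable = false;
}

/* Stand-in for the "make it stable" step of an O_DSYNC write. */
static void model_write_sync(struct model_inode *inode)
{
	inode->data_stable = true;
}

static void model_dio_end(struct model_inode *inode)
{
	assert(inode->dio_count > 0);
	inode->dio_count--;
}

/* A waiter on in-flight DIO would return once the count drops to zero. */
static bool model_dio_wait_would_return(const struct model_inode *inode)
{
	return inode->dio_count == 0;
}

/*
 * Ordering A: sync first, then mark the DIO ended.
 * Returns whether the data is stable at the moment a waiter can wake up.
 */
static bool complete_sync_then_end(struct model_inode *inode)
{
	model_dio_begin(inode);
	model_write_sync(inode);
	model_dio_end(inode);
	return model_dio_wait_would_return(inode) && inode->data_stable;
}

/*
 * Ordering B: mark the DIO ended first, then sync.
 * The waiter can wake up before the sync has run.
 */
static bool complete_end_then_sync(struct model_inode *inode)
{
	bool stable_at_wakeup;

	model_dio_begin(inode);
	model_dio_end(inode);
	stable_at_wakeup = model_dio_wait_would_return(inode) &&
			   inode->data_stable;
	model_write_sync(inode);
	return stable_at_wakeup;
}
```

In this model, ordering A reports stable data at wakeup and ordering B
does not, although both leave the data stable by the time the
completion function returns. Locking and concurrency are deliberately
left out.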

> Various file systems have historically called
> the syncing outside i_rwsem and inode_dio_wait (in fact that is what the
> fs/direct-io.c code does, so XFS did as well until a few years ago), and
> that isn't a problem at all - we just can't return to userspace (or call
> ki_complete for in-kernel users) before the data is stable on disk.

I'm really not caring about userspace here - we use inode_dio_wait()
as an IO completion notification for the purposes of synchronising
internal filesystem state before modifying user data via direct
metadata manipulation. Hence I want sane, consistent, predictable IO
completion notification behaviour regardless of the implementation
path it goes through.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
