From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Goldwyn Rodrigues <rgoldwyn@suse.de>
Cc: linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org,
david@fromorbit.com, hch@lst.de, johannes.thumshirn@wdc.com,
dsterba@suse.com, josef@toxicpanda.com,
Goldwyn Rodrigues <rgoldwyn@suse.com>
Subject: Re: [PATCH 03/14] iomap: Allow filesystem to call iomap_dio_complete without i_rwsem
Date: Fri, 25 Sep 2020 18:49:39 -0700
Message-ID: <20200926014939.GP7964@magnolia>
In-Reply-To: <20200924163922.2547-4-rgoldwyn@suse.de>
On Thu, Sep 24, 2020 at 11:39:10AM -0500, Goldwyn Rodrigues wrote:
> From: Christoph Hellwig <hch@lst.de>
>
> This is to avoid the deadlock caused in btrfs because of O_DIRECT |
> O_DSYNC.
>
> Filesystems such as btrfs require i_rwsem while performing sync on a
> file. iomap_dio_rw() is called under i_rwsem. This leads to a
> deadlock because of:
>
> iomap_dio_complete()
> generic_write_sync()
> btrfs_sync_file()
>
> Separate out iomap_dio_complete() from iomap_dio_rw(), so filesystems
> can call iomap_dio_complete() after unlocking i_rwsem.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Josef Bacik <josef@toxicpanda.com>
> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Seems clunky, but then I don't understand btrfs locking either. :)
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
--D
> ---
> fs/iomap/direct-io.c | 35 ++++++++++++++++++++++++++---------
> include/linux/iomap.h | 5 +++++
> 2 files changed, 31 insertions(+), 9 deletions(-)
>
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index c1aafb2ab990..b88dbfe15118 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -76,7 +76,7 @@ static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap,
> dio->submit.cookie = submit_bio(bio);
> }
>
> -static ssize_t iomap_dio_complete(struct iomap_dio *dio)
> +ssize_t iomap_dio_complete(struct iomap_dio *dio)
> {
> const struct iomap_dio_ops *dops = dio->dops;
> struct kiocb *iocb = dio->iocb;
> @@ -130,6 +130,7 @@ static ssize_t iomap_dio_complete(struct iomap_dio *dio)
>
> return ret;
> }
> +EXPORT_SYMBOL_GPL(iomap_dio_complete);
>
> static void iomap_dio_complete_work(struct work_struct *work)
> {
> @@ -406,8 +407,8 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
> * Returns -ENOTBLK In case of a page invalidation invalidation failure for
> * writes. The callers needs to fall back to buffered I/O in this case.
> */
> -ssize_t
> -iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
> +struct iomap_dio *
> +__iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
> const struct iomap_ops *ops, const struct iomap_dio_ops *dops,
> bool wait_for_completion)
> {
> @@ -421,14 +422,14 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
> struct iomap_dio *dio;
>
> if (!count)
> - return 0;
> + return NULL;
>
> if (WARN_ON(is_sync_kiocb(iocb) && !wait_for_completion))
> - return -EIO;
> + return ERR_PTR(-EIO);
>
> dio = kmalloc(sizeof(*dio), GFP_KERNEL);
> if (!dio)
> - return -ENOMEM;
> + return ERR_PTR(-ENOMEM);
>
> dio->iocb = iocb;
> atomic_set(&dio->ref, 1);
> @@ -558,7 +559,7 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
> dio->wait_for_completion = wait_for_completion;
> if (!atomic_dec_and_test(&dio->ref)) {
> if (!wait_for_completion)
> - return -EIOCBQUEUED;
> + return ERR_PTR(-EIOCBQUEUED);
>
> for (;;) {
> set_current_state(TASK_UNINTERRUPTIBLE);
> @@ -574,10 +575,26 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
> __set_current_state(TASK_RUNNING);
> }
>
> - return iomap_dio_complete(dio);
> + return dio;
>
> out_free_dio:
> kfree(dio);
> - return ret;
> + if (ret)
> + return ERR_PTR(ret);
> + return NULL;
> +}
> +EXPORT_SYMBOL_GPL(__iomap_dio_rw);
> +
> +ssize_t
> +iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
> + const struct iomap_ops *ops, const struct iomap_dio_ops *dops,
> + bool wait_for_completion)
> +{
> + struct iomap_dio *dio;
> +
> + dio = __iomap_dio_rw(iocb, iter, ops, dops, wait_for_completion);
> + if (IS_ERR_OR_NULL(dio))
> + return PTR_ERR_OR_ZERO(dio);
> + return iomap_dio_complete(dio);
> }
> EXPORT_SYMBOL_GPL(iomap_dio_rw);
> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
> index 4d1d3c3469e9..172b3397a1a3 100644
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -13,6 +13,7 @@
> struct address_space;
> struct fiemap_extent_info;
> struct inode;
> +struct iomap_dio;
> struct iomap_writepage_ctx;
> struct iov_iter;
> struct kiocb;
> @@ -258,6 +259,10 @@ struct iomap_dio_ops {
> ssize_t iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
> const struct iomap_ops *ops, const struct iomap_dio_ops *dops,
> bool wait_for_completion);
> +struct iomap_dio *__iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
> + const struct iomap_ops *ops, const struct iomap_dio_ops *dops,
> + bool wait_for_completion);
> +ssize_t iomap_dio_complete(struct iomap_dio *dio);
> int iomap_dio_iopoll(struct kiocb *kiocb, bool spin);
>
> #ifdef CONFIG_SWAP
> --
> 2.26.2
>
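
For readers following along, here is a minimal userspace sketch of the calling
convention the patch introduces, not kernel code. The helpers are simplified
stand-ins for the kernel's ERR_PTR family, and struct mock_dio,
mock_dio_rw_prepare(), mock_dio_complete(), mock_fs_direct_write() and the
inode_locked flag are hypothetical names invented for illustration. It shows
the three possible results of the split-out submission step (NULL, an
error pointer, or a live dio) and a caller that only runs completion after
dropping its lock.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel's err.h helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
static inline int IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}
static inline long PTR_ERR_OR_ZERO(const void *ptr)
{
	return IS_ERR(ptr) ? PTR_ERR(ptr) : 0;
}

/* Hypothetical, heavily reduced model of struct iomap_dio. */
struct mock_dio {
	long size;	/* bytes transferred */
};

/* Models i_rwsem being held by the caller (1) or not (0). */
static int inode_locked;

/*
 * Models __iomap_dio_rw(): NULL for zero-length I/O, an error pointer on
 * failure, otherwise a dio that the caller must pass to completion.
 */
static struct mock_dio *mock_dio_rw_prepare(long count)
{
	struct mock_dio *dio;

	if (!count)
		return NULL;
	if (count < 0)
		return ERR_PTR(-EIO);

	dio = malloc(sizeof(*dio));
	if (!dio)
		return ERR_PTR(-ENOMEM);
	dio->size = count;
	return dio;
}

/*
 * Models iomap_dio_complete(): consumes the dio and returns the byte count.
 * Records whether the lock was still held when completion ran.
 */
static long mock_dio_complete(struct mock_dio *dio, int *was_locked)
{
	long ret;

	assert(dio && !IS_ERR(dio));
	ret = dio->size;
	*was_locked = inode_locked;
	free(dio);
	return ret;
}

/*
 * Models a filesystem write path: submit under the lock, drop the lock,
 * then complete. *completed_locked is only written if completion runs.
 */
static long mock_fs_direct_write(long count, int *completed_locked)
{
	struct mock_dio *dio;

	inode_locked = 1;
	dio = mock_dio_rw_prepare(count);
	inode_locked = 0;

	if (IS_ERR_OR_NULL(dio))
		return PTR_ERR_OR_ZERO(dio);
	return mock_dio_complete(dio, completed_locked);
}
```

In this model, mock_fs_direct_write(4096, &flag) returns 4096 with flag set
to 0, a zero-length request returns 0 without running completion, and a
failed submission returns the negative errno unchanged.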