From: Dave Chinner <david@fromorbit.com>
To: Goldwyn Rodrigues <rgoldwyn@suse.de>
Cc: linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org,
	hch@lst.de, darrick.wong@oracle.com, ruansy.fnst@cn.fujitsu.com,
	Goldwyn Rodrigues <rgoldwyn@suse.com>
Subject: Re: [PATCH 10/13] iomap: use a function pointer for dio submits
Date: Mon, 5 Aug 2019 09:43:21 +1000
Message-ID: <20190804234321.GC7689@dread.disaster.area>
In-Reply-To: <20190802220048.16142-11-rgoldwyn@suse.de>

On Fri, Aug 02, 2019 at 05:00:45PM -0500, Goldwyn Rodrigues wrote:
> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> 
> This helps filesystems to perform tasks on the bio while
> submitting for I/O. Since btrfs requires the position
> we are working on, pass pos to iomap_dio_submit_bio()
> 
> The correct place for submit_io() is not page_ops. Would it
> better to rename the structure to something like iomap_io_ops
> or put it directly under struct iomap?
> 
> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> ---
>  fs/iomap/direct-io.c  | 16 +++++++++++-----
>  include/linux/iomap.h |  1 +
>  2 files changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index 5279029c7a3c..a802e66bf11f 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -59,7 +59,7 @@ int iomap_dio_iopoll(struct kiocb *kiocb, bool spin)
>  EXPORT_SYMBOL_GPL(iomap_dio_iopoll);
>  
>  static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap,
> -		struct bio *bio)
> +		struct bio *bio, loff_t pos)
>  {
>  	atomic_inc(&dio->ref);
>  
> @@ -67,7 +67,13 @@ static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap,
>  		bio_set_polled(bio, dio->iocb);
>  
>  	dio->submit.last_queue = bdev_get_queue(iomap->bdev);
> -	dio->submit.cookie = submit_bio(bio);
> +	if (iomap->page_ops && iomap->page_ops->submit_io) {
> +		iomap->page_ops->submit_io(bio, file_inode(dio->iocb->ki_filp),
> +				pos);
> +		dio->submit.cookie = BLK_QC_T_NONE;
> +	} else {
> +		dio->submit.cookie = submit_bio(bio);
> +	}

I don't really like this at all. Apart from the fact it doesn't work
with block device polling (RWF_HIPRI), the iomap architecture is
supposed to resolve the file offset -> block device + LBA mapping
completely up front and so all that remains to be done is build and
submit the bio(s) to the block device.
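
To make the "resolve everything up front" model concrete, here is a
minimal user-space sketch, not kernel code. The names toy_map, toy_io
and toy_build_io are hypothetical and are not the real struct iomap or
bio interfaces. It assumes a mapping that already names the device and
the device address, so building the I/O is pure arithmetic with no
further filesystem callout:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified extent mapping; NOT the kernel's struct iomap. */
struct toy_map {
	uint64_t file_off;	/* file offset where the extent starts (bytes) */
	uint64_t dev_addr;	/* device byte address of the extent start */
	uint64_t length;	/* extent length (bytes) */
	int dev_id;		/* which block device backs the extent */
};

/* Hypothetical stand-in for a bio: target device, start LBA, size. */
struct toy_io {
	int dev_id;
	uint64_t sector;	/* 512-byte sector on the device */
	uint64_t nr_bytes;
};

/*
 * Build the device I/O for [pos, pos + len) from the mapping alone.
 * The request is clamped to the end of the extent; the caller would
 * ask for a new mapping to cover the remainder.
 */
static struct toy_io toy_build_io(const struct toy_map *m, uint64_t pos,
				  uint64_t len)
{
	struct toy_io io;
	uint64_t delta, avail;

	/* pos must fall inside the mapped extent */
	assert(pos >= m->file_off && pos - m->file_off < m->length);

	delta = pos - m->file_off;
	avail = m->length - delta;

	io.dev_id = m->dev_id;
	io.sector = (m->dev_addr + delta) >> 9;
	io.nr_bytes = len < avail ? len : avail;
	return io;
}
```

In this toy model nothing after the mapping step needs to know the
file position, which is the property a ->submit_io hook taking pos
gives up.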

What I see here is a hack to work around the fact that btrfs has
implemented both file data transformations and device mapping layer
functionality as a filesystem layer between file data bio building
and device bio submission. And as the btrfs file data mapping
(->iomap_begin) is completely unaware that there is further block
mapping to be done before block device bio submission, any generic
code that btrfs uses requires special IO submission hooks rather
than just calling submit_bio().

I'm not 100% sure what the solution here is, but the one thing we
must resist is turning the iomap code into a mess of custom hooks
that only one filesystem uses. We've been taught this lesson time
and time again - the iomap infrastructure exists because stuff like
bufferheads and the old direct IO code ended up so full of special
case code that it ossified and became unmodifiable and
unmaintainable.

We do not want to go down that path again. 

IMO, the iomap IO model needs to be restructured to support post-IO
and pre-IO data verification/calculation/transformation operations
so all the work that needs to be done at the inode/offset context
level can be done in the iomap path before bio submission/after
bio completion. This will allow infrastructure like fscrypt, data
compression, data checksums, etc. to be supported generically, not
just by individual filesystems that provide a ->submit_io hook.
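
One way to picture generic pre-IO/post-IO operations is a list of
stages that the common code runs itself, rather than a per-filesystem
submit hook. The following is a user-space sketch only; toy_stage,
toy_pre_write, toy_post_read and the XOR stage are hypothetical names
invented for illustration, and the XOR transform is a placeholder for
real work such as encryption or checksumming:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-stage operations, run at the inode/offset level. */
struct toy_stage {
	/* transform data before it is submitted for write */
	void (*pre_write)(uint8_t *buf, size_t len, uint64_t pos);
	/* verify/undo the transform after a read completes; 0 on success */
	int (*post_read)(uint8_t *buf, size_t len, uint64_t pos);
};

/* Generic code: apply every stage in order before write submission. */
static void toy_pre_write(const struct toy_stage *st, size_t n,
			  uint8_t *buf, size_t len, uint64_t pos)
{
	for (size_t i = 0; i < n; i++)
		if (st[i].pre_write)
			st[i].pre_write(buf, len, pos);
}

/* Generic code: undo the stages in reverse order after read completion. */
static int toy_post_read(const struct toy_stage *st, size_t n,
			 uint8_t *buf, size_t len, uint64_t pos)
{
	for (size_t i = n; i-- > 0; ) {
		int err = st[i].post_read ?
			  st[i].post_read(buf, len, pos) : 0;
		if (err)
			return err;
	}
	return 0;
}

/* Placeholder stage: XOR each byte with a key derived from its offset. */
static void toy_xor_pre_write(uint8_t *buf, size_t len, uint64_t pos)
{
	for (size_t i = 0; i < len; i++)
		buf[i] ^= (uint8_t)((pos + i) * 31 + 7);
}

static int toy_xor_post_read(uint8_t *buf, size_t len, uint64_t pos)
{
	toy_xor_pre_write(buf, len, pos);	/* XOR is its own inverse */
	return 0;
}
```

The point of the shape is that the stage list is data the generic path
walks, so any filesystem can plug in a stage without the common code
growing a filesystem-specific branch.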

As for btrfs needing to slice and dice bios for multiple
devices?  That should be done via a block device ->make_request
function, not a custom hook in the iomap code.
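
As an illustration of the kind of splitting a device-level mapping
layer performs, here is a user-space sketch assuming a simple RAID0
style layout with a fixed stripe size. This is an assumption made for
the example, not a description of btrfs's actual chunk mapping, and
toy_frag and toy_split are hypothetical names:

```c
#include <stddef.h>
#include <stdint.h>

/* One per-device piece of a larger logical request (hypothetical). */
struct toy_frag {
	unsigned dev;		/* index of the target device */
	uint64_t dev_off;	/* byte offset on that device */
	uint64_t len;		/* bytes in this piece */
};

/*
 * Split the logical range [off, off + len) into per-device fragments,
 * assuming round-robin striping of `stripe` bytes over `ndevs` devices.
 * Returns the number of fragments written to `out` (at most `max`).
 */
static size_t toy_split(uint64_t off, uint64_t len, uint64_t stripe,
			unsigned ndevs, struct toy_frag *out, size_t max)
{
	size_t n = 0;

	while (len > 0 && n < max) {
		uint64_t stripe_nr = off / stripe;
		uint64_t in_stripe = off % stripe;
		uint64_t chunk = stripe - in_stripe;

		if (chunk > len)
			chunk = len;

		out[n].dev = (unsigned)(stripe_nr % ndevs);
		out[n].dev_off = (stripe_nr / ndevs) * stripe + in_stripe;
		out[n].len = chunk;
		n++;

		off += chunk;
		len -= chunk;
	}
	return n;
}
```

The caller hands over a single logical range and never sees the
fragments, which is the separation being argued for: the layer above
submits one request and the device layer decides how to carve it up.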

That's why I don't like this hook - I think hiding data operations
and/or custom bio manipulations in opaque filesystem callouts is
completely the wrong approach to be taking. We need to do these
things in a generic manner so that all filesystems (and block
devices!) that use the iomap infrastructure can take advantage of
them, not just one of them.

Quite frankly, I don't care if it takes more time and work up front,
I'm tired of expedient hacks to merge code quickly repeatedly biting
us on the arse and wasting far more time sorting out than we would
have spent getting it right in the first place.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
