linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Matthew Wilcox <willy@infradead.org>,
	Christoph Hellwig <hch@infradead.org>,
	torvalds@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-bcachefs@vger.kernel.org,
	Christian Brauner <christian@brauner.io>
Subject: Re: [GIT PULL] bcachefs
Date: Wed, 6 Sep 2023 10:57:25 -0700	[thread overview]
Message-ID: <20230906175725.GF28160@frogsfrogsfrogs> (raw)
In-Reply-To: <20230906161002.2ztelmyzgy3pbmcd@moria.home.lan>

On Wed, Sep 06, 2023 at 12:10:02PM -0400, Kent Overstreet wrote:
> On Wed, Sep 06, 2023 at 01:41:22AM +0100, Matthew Wilcox wrote:
> > On Tue, Sep 05, 2023 at 08:00:07PM -0400, Kent Overstreet wrote:
> > > On Tue, Sep 05, 2023 at 06:24:47AM -0700, Christoph Hellwig wrote:
> > > > Hi Kent,
> > > > 
> > > > I thought you'd follow Christians proposal to actually work with
> > > > people to try to use common APIs (i.e. to use iomap once it's been
> > > > even more iter-like, for which I'm still waiting for suggestions),
> > > > and make your new APIs used more widely if they are a good idea
> > > > (which also requires explaining them better) and aim for 6.7 once
> > > > that is done.
> > > 
> > > Christoph, I get that iomap is your pet project and you want to make it
> > > better and see it more widely used.

Christoph quit as maintainer 78 days ago.

> > > But the reasons bcachefs doesn't use iomap have been discussed at
> > > length, and I've posted and talked about the bcachefs equivalents of
> > > that code. You were AWOL on those discussions; you consistently say
> > > "bcachefs should use iomap" and then walk away, so those discussions
> > > haven't moved forwards.
> > > 
> > > To recap, besides being more iterator like (passing data structures
> > > around with iterators, vs. indirect function calls into the filesystem),
> > > bcachefs also hangs a bit more state off the pagecache, due to being
> > > multi device and also using the same data structure for tracking disk
> > > reservations (because why make the buffered write paths look that up
> > > separately?).
> > 
> > I /thought/ the proposal was to use iomap for bcachefs DIO and leave

I wasn't aware of /any/ active effort to make bcachefs use iomap for any
purpose.

> > buffered writes for a different day.  I agree the iomap buffered write
> > path is inappropriate for bcachefs today.  I'd like that to change,
> > but there's a lot of functionality that it would need to support.
> 
> No, I'm not going to convert the bcachefs DIO path to iomap; not as it
> exitss now.
> 
> Right now I've got a clean separation between the VFS level DIO code,
> and the lower level bcachefs code that walks the extents btree and
> issues IOs. I have to consider the iomap approach where the
> loop-up-mappings-and-issue loop is in iomap code but calling into
> filesystem code pretty gross.
> 
> I was talking about this /years/ ago when I did the work to make it
> possible to efficiently split bios - iomap could have taken the approach
> bcachefs did, the prereqs were in place when iomap was started, but it
> didn't happen - iomap ended up being a more conservative approach, a
> cleaned up version of buffer heads and fs/direct-IO.c.

<shrug> iirc the genesis of xfs "iomap" was that it was created to
supply space mappings to pnfs, and then got reused for directio and
pagecache operations.  Later that was hoisted wholesale to the vfs.
Then spectre/meltdown happened and our indirect function call toy got
taken away, and now we're stuck figuring out how to remove them all.

IOWs, individual tactics that each made sense for maintaining the
overall stability of xfs, but overall didn't amount to anything
ambitious.

In the end I'd probably rather do something like this to eliminate all
the indirect function calls:

int
xfs_zero_range(
	struct xfs_inode	*ip,
	loff_t			pos,
	loff_t			len,
	bool			*did_zero)
{
	struct iomap_iter	iter = { };
	struct inode		*inode = VFS_I(ip);
	int			ret = 0;

	iomap_start_zero_range(&iter, inode, pos, len);
	while (iter.len > 0) {
		ret = xfs_buffered_write_iomap_begin(&iter, &iter.write_map,
				&iter.read_map);
		if (ret)
			break;
		len = iomap_zero_range_iter(&iter, did_zero);
		if (len < 0) {
			ret = len;
			break;
		}
		ret = xfs_buffered_write_iomap_end(&iter, len, &iter.write_map);
		if (ret)
			break;
		iomap_iter_advance(&iter, len);
	}
	iomap_end_zero_range(&iter);

	return iter.write > 0 ? iter.write : ret;
}

(I would also rename iter.iomap and iter.srcmap to make it more obvious
which ones get filled out under what circumstances.)

> That's fine, iomap is certainly an improvement over what it was
> replacing, but it would not be an improvement for bcachefs.

Filesystems are free to implement read_ and write_iter as they choose,
whether or not that involves iomap.

> I think it might be more fruitful to look at consolidating the buffered
> IO code in bcachefs and iomap. The conceptual models are a bit closer,
> bcachefs's buffered IO code is just a bit more fully-featured in that it
> does the dirty block tracking in a unified way. That was a project that
> turned out pretty nicely.

I think I'd much rather gradually revise iomap for everyone and cut
bcachefs over when its ready.  I don't think revising iomap is a
reasonable precondition for merging bcachefs, nor do I think it's a good
idea to risk destabilizing bcachefs just to hack in a new dependency
that won't even fit well.

--D

  reply	other threads:[~2023-09-06 17:59 UTC|newest]

Thread overview: 141+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-03  3:25 [GIT PULL] bcachefs Kent Overstreet
2023-09-05 13:24 ` Christoph Hellwig
2023-09-06  0:00   ` Kent Overstreet
2023-09-06  0:41     ` Matthew Wilcox
2023-09-06 16:10       ` Kent Overstreet
2023-09-06 17:57         ` Darrick J. Wong [this message]
2023-09-08  9:37     ` Christoph Hellwig
2023-09-06 19:36 ` Linus Torvalds
2023-09-06 20:02   ` Linus Torvalds
2023-09-06 20:20     ` Linus Torvalds
2023-09-06 21:55       ` Arnaldo Carvalho de Melo
2023-09-06 23:13         ` David Sterba
2023-09-06 23:34           ` Linus Torvalds
2023-09-06 23:46             ` Arnaldo Carvalho de Melo
2023-09-06 23:53               ` Arnaldo Carvalho de Melo
2023-09-06 23:16         ` Linus Torvalds
2023-09-10  0:53       ` Kent Overstreet
2023-09-07 20:37   ` Kent Overstreet
2023-09-07 20:51     ` Linus Torvalds
2023-09-07 23:40   ` Kent Overstreet
2023-09-08  6:29     ` Martin Steigerwald
2023-09-08  9:11     ` Joshua Ashton
2023-09-06 22:28 ` Nathan Chancellor
2023-09-07  0:03   ` Kees Cook
2023-09-07 14:29     ` Chris Mason
2023-09-07 20:39     ` Kent Overstreet
2023-09-08 10:50       ` Brian Foster
2023-09-08 23:05     ` Dave Chinner
2023-09-08 10:59 ` Thank you! (was: Re: [GIT PULL] bcachefs) Martin Steigerwald
  -- strict thread matches above, loose matches on Subject: below --
2023-06-26 21:46 [GIT PULL] bcachefs Kent Overstreet
2023-06-26 23:11 ` Jens Axboe
2023-06-27  0:06   ` Kent Overstreet
2023-06-27  1:13     ` Jens Axboe
2023-06-27  2:05       ` Kent Overstreet
2023-06-27  2:59         ` Jens Axboe
2023-06-27  3:10           ` Kent Overstreet
2023-06-27 17:16           ` Jens Axboe
2023-06-27 20:15             ` Kent Overstreet
2023-06-27 22:05               ` Dave Chinner
2023-06-27 22:41                 ` Kent Overstreet
2023-06-28 14:40                 ` Jens Axboe
2023-06-28 14:48                   ` Thomas Weißschuh
2023-06-28 14:58                     ` Jens Axboe
2023-06-28  3:16               ` Jens Axboe
2023-06-28  4:01                 ` Kent Overstreet
2023-06-28 14:58                   ` Jens Axboe
2023-06-28 15:22                     ` Jens Axboe
2023-06-28 17:56                       ` Kent Overstreet
2023-06-28 20:45                         ` Jens Axboe
2023-06-28 16:57                     ` Jens Axboe
2023-06-28 17:33                       ` Christian Brauner
2023-06-28 17:52                       ` Kent Overstreet
2023-06-28 20:44                         ` Jens Axboe
2023-06-28 21:17                           ` Jens Axboe
2023-06-28 22:13                             ` Kent Overstreet
2023-06-28 22:33                               ` Jens Axboe
2023-06-28 22:55                                 ` Kent Overstreet
2023-06-28 23:14                                   ` Jens Axboe
2023-06-28 23:50                                     ` Kent Overstreet
2023-06-29  1:00                                       ` Dave Chinner
2023-06-29  1:33                                         ` Jens Axboe
2023-06-29 11:18                                           ` Christian Brauner
2023-06-29 14:17                                             ` Kent Overstreet
2023-06-29 15:31                                             ` Kent Overstreet
2023-06-30  9:40                                               ` Christian Brauner
2023-07-06 15:20                                                 ` Kent Overstreet
2023-07-06 16:26                                                   ` Jens Axboe
2023-07-06 16:34                                                     ` Kent Overstreet
2023-06-29  1:29                                       ` Jens Axboe
2023-07-06 20:15                             ` Kent Overstreet
2023-06-28 17:54                     ` Kent Overstreet
2023-06-28 20:54                       ` Jens Axboe
2023-06-28 22:14                         ` Jens Axboe
2023-06-28 23:04                           ` Kent Overstreet
2023-06-28 23:11                             ` Jens Axboe
2023-06-27  2:33       ` Kent Overstreet
2023-06-27  2:59         ` Jens Axboe
2023-06-27  3:19           ` Matthew Wilcox
2023-06-27  3:22             ` Kent Overstreet
2023-06-27  3:52 ` Christoph Hellwig
2023-06-27  4:36   ` Kent Overstreet
2023-07-06 15:56 ` Kent Overstreet
2023-07-06 16:40   ` Josef Bacik
2023-07-06 17:38     ` Kent Overstreet
2023-07-06 19:17       ` Eric Sandeen
2023-07-06 19:31         ` Kent Overstreet
2023-07-06 21:19       ` Darrick J. Wong
2023-07-06 22:43         ` Kent Overstreet
2023-07-07 13:13           ` Jan Kara
2023-07-07 13:52             ` Kent Overstreet
2023-07-07  8:48         ` Christian Brauner
2023-07-07  9:18           ` Kent Overstreet
2023-07-07 16:26             ` James Bottomley
2023-07-07 16:48               ` Kent Overstreet
2023-07-07 17:04                 ` James Bottomley
2023-07-07 17:26                   ` Kent Overstreet
2023-07-08  3:54               ` Matthew Wilcox
2023-07-08  4:10                 ` Kent Overstreet
2023-07-08  4:31                 ` Kent Overstreet
2023-07-08 15:02                   ` Theodore Ts'o
2023-07-08 15:23                     ` Kent Overstreet
2023-07-08 16:42                 ` James Bottomley
2023-07-09  1:16                   ` Kent Overstreet
2023-07-07  9:35           ` Kent Overstreet
2023-07-07  2:04       ` Theodore Ts'o
2023-07-07 12:18       ` Brian Foster
2023-07-07 14:49         ` Kent Overstreet
2023-07-12  2:54   ` Kent Overstreet
2023-07-12 19:48     ` Kees Cook
2023-07-12 19:57       ` Kent Overstreet
2023-07-12 22:10     ` Darrick J. Wong
2023-07-12 23:57       ` Kent Overstreet
2023-08-09  1:27     ` Linus Torvalds
2023-08-10 15:54       ` Kent Overstreet
2023-08-10 16:40         ` Linus Torvalds
2023-08-10 18:02           ` Kent Overstreet
2023-08-10 18:09             ` Linus Torvalds
2023-08-10 17:52         ` Jan Kara
2023-08-11  2:47           ` Kent Overstreet
2023-08-11  8:10             ` Jan Kara
2023-08-11  8:13               ` Christian Brauner
2023-08-10 22:39         ` Darrick J. Wong
2023-08-10 23:47           ` Linus Torvalds
2023-08-11  2:40             ` Jens Axboe
2023-08-11  4:03             ` Kent Overstreet
2023-08-11  5:20               ` Linus Torvalds
2023-08-11  5:29                 ` Kent Overstreet
2023-08-11  5:53                   ` Linus Torvalds
2023-08-11  7:52                     ` Christian Brauner
2023-08-11 14:31                     ` Jens Axboe
2023-08-11  3:45           ` Kent Overstreet
2023-08-21  0:09             ` Dave Chinner
2023-08-10 23:07         ` Matthew Wilcox
2023-08-11 10:54         ` Christian Brauner
2023-08-11 12:58           ` Kent Overstreet
2023-08-14  7:25             ` Christian Brauner
2023-08-14 15:23               ` Kent Overstreet
2023-08-11 13:21           ` Kent Overstreet
2023-08-11 22:56             ` Darrick J. Wong
2023-08-14  7:21             ` Christian Brauner
2023-08-14 15:27               ` Kent Overstreet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230906175725.GF28160@frogsfrogsfrogs \
    --to=djwong@kernel.org \
    --cc=christian@brauner.io \
    --cc=hch@infradead.org \
    --cc=kent.overstreet@linux.dev \
    --cc=linux-bcachefs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).