From: Andreas Gruenbacher <agruenba@redhat.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>,
cluster-devel <cluster-devel@redhat.com>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Dave Chinner <dchinner@redhat.com>
Subject: Re: [PATCH v4 06/11] iomap: Add write_{begin,end} iomap operations
Date: Tue, 15 May 2018 10:16:52 +0200 [thread overview]
Message-ID: <CAHc6FU6kwz4ZsKUdTHjNX5BsLb3Z0csqwP-UV0fHbLNsm70cDg@mail.gmail.com> (raw)
In-Reply-To: <20180515072252.GA23202@lst.de>
On 15 May 2018 at 09:22, Christoph Hellwig <hch@lst.de> wrote:
> On Tue, May 15, 2018 at 11:11:47AM +1000, Dave Chinner wrote:
>> I'm not sure adding a per-page filesystem callout abstraction is
>> what we want in the iomap code. The commit message does not explain
>> why gfs2 needs per-page callouts, nor why the per-page work cannot
>> be done/avoided by running code in the ->iomap_begin callout
>> (e.g. by unstuffing the inode so nothing needs to be done per-page).
>
> Agreed.
>
>> My main concern is that the callouts to the filesystem in iomap are
>> - by design - per IO, not per page. The whole point of using iomap
>> was to get away from doing per-page filesystem operations on
>> multi-page IOs - it's highly inefficient when we only need to call
>> into the filesystem on a per-extent basis, and we simplified the
>> code a *lot* by avoiding such behaviours.
>
> Yes. And from a quick look we can easily handle "stuffed" aka inline
> data with the existing architecture. We just need a new IOMAP_INLINE
> flag for the extent type. Details will need to be worked out how to
> pass that data, either a struct page or virtual address should probably
> do it.
This is about data journaling, it has nothing to do with "stuffed"
files per se. With journaled data writes, we need to do some special
processing for each page so that the page ends up in the journal
before it gets written back to its proper on-disk location. It just
happens that since we need these per-page operations anyway, the
"unstuffing" and "re-stuffing" nicely fits in those operations as
well.
> The other thing gfs2 seems to be doing is handling of journaled
> data. I'm not quite sure how to fit that into the grand overall scheme
> of things, but at least that would only be after the write.
Not sure what you mean by "but at least that would only be after the write".
>> That is, Christoph's patchset pushes us further away from
>> filesystems doing their own per-page processing of page state. It
>> converts the iomap IO path to store it's own per-page state tracking
>> information in page->private, greatly reducing memory overhead (16
>> bytes on 4k page instead of 104 bytes per bufferhead) and
>
> Or in fact zero bytes for block size == PAGE_SIZE
>
>> streamlining the code. This, however, essentially makes the
>> buffered IO path through the iomap infrastructure incompatible with
>> existing bufferhead based filesystems.
>>
>> Depsite this, we can still provide limited support for buffered
>> writes on bufferhead based filesystems through the iomap
>> infrastructure, but I only see that as a mechanism for filesystems
>> moving away from dependencies on bufferheads. It would require a
>> different refactoring of the iomap code - I'd much prefer to keep
>> bufferhead dependent IO paths completely separate (e.g. via a new
>> actor function) to minimise impact and dependencies on the internal
>> bufferhead free iomap IO path....
>>
>> Let's see what Christoph thinks first, though....
>
> gfs2 seems to indeed depend pretty heavily on buffer_heads. This is
> a bit of a bummer, but I'd need to take a deeper look how to offer
> iomap functionality to them without them moving away from buffer_heads
> which isn't really going to be a quick project. In a way the
> write_begin/write_end ops from Andreas facilitate that, as the
> rest of iomap_file_buffered_write and iomap_write_actor are untouched.
>
> So at least temporarily his callouts are what we need, even if the
> actual inline data functionality should be moved out of them for sure,
Been there. An earlier version of the patches did have a separate
stuffed write path and the unstuffing and re-stuffing didn't happen in
those per-page operations. The result was much messier overall.
> but only with a long term plan to move gfs2 away from buffer heads.
I hope we'll get rid of buffer heads in additional parts of gfs2, but
that's not going to happen soon. The main point of those per-page
operations is still to support data journaling though.
> Btw, does gfs2 support blocksizes smallers than PAGE_SIZE?
Yes.
Thanks,
Andreas
next prev parent reply other threads:[~2018-05-15 8:16 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-05-14 15:36 [PATCH v4 00/11] gfs2 iomap write support Andreas Gruenbacher
2018-05-14 15:36 ` [PATCH v4 01/11] gfs2: Update find_metapath comment Andreas Gruenbacher
2018-05-14 15:36 ` [PATCH v4 02/11] gfs2: hole_size improvement Andreas Gruenbacher
2018-05-14 15:36 ` [PATCH v4 03/11] gfs2: gfs2_stuffed_write_end cleanup Andreas Gruenbacher
2018-05-14 15:36 ` [PATCH v4 04/11] gfs2: Remove ordered write mode handling from gfs2_trans_add_data Andreas Gruenbacher
2018-05-14 15:36 ` [PATCH v4 05/11] gfs2: Iomap cleanups and improvements Andreas Gruenbacher
2018-05-14 15:36 ` [PATCH v4 06/11] iomap: Add write_{begin,end} iomap operations Andreas Gruenbacher
2018-05-15 1:11 ` Dave Chinner
2018-05-15 7:22 ` Christoph Hellwig
2018-05-15 8:16 ` Andreas Gruenbacher [this message]
2018-05-18 16:04 ` Christoph Hellwig
2018-05-25 17:58 ` Andreas Grünbacher
2018-05-28 13:02 ` Christoph Hellwig
2018-05-14 15:36 ` [PATCH v4 07/11] gfs2: iomap buffered write support Andreas Gruenbacher
2018-05-14 15:36 ` [PATCH v4 08/11] gfs2: gfs2_extent_length cleanup Andreas Gruenbacher
2018-05-14 15:36 ` [PATCH v4 09/11] gfs2: iomap direct I/O support Andreas Gruenbacher
2018-05-15 7:31 ` Christoph Hellwig
2018-05-16 20:36 ` Andreas Gruenbacher
2018-05-14 15:36 ` [PATCH v4 10/11] gfs2: Remove gfs2_write_{begin,end} Andreas Gruenbacher
2018-05-14 15:36 ` [PATCH v4 11/11] iomap: Complete partial direct I/O writes synchronously Andreas Gruenbacher
2018-05-15 7:24 ` Christoph Hellwig
2018-05-16 20:27 ` Andreas Gruenbacher
2018-05-18 15:56 ` Christoph Hellwig
2018-05-18 18:35 ` Andreas Grünbacher
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAHc6FU6kwz4ZsKUdTHjNX5BsLb3Z0csqwP-UV0fHbLNsm70cDg@mail.gmail.com \
--to=agruenba@redhat.com \
--cc=cluster-devel@redhat.com \
--cc=david@fromorbit.com \
--cc=dchinner@redhat.com \
--cc=hch@lst.de \
--cc=linux-fsdevel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).