linux-mm.kvack.org archive mirror
From: Zhang Yi <yi.zhang@huaweicloud.com>
To: Luis Chamberlain <mcgrof@kernel.org>, Jan Kara <jack@suse.cz>,
	Matthew Wilcox <willy@infradead.org>
Cc: lsf-pc@lists.linux-foundation.org,
	Christoph Hellwig <hch@infradead.org>,
	David Howells <dhowells@redhat.com>,
	"kbus @imap.suse.de>> Keith Busch" <kbusch@kernel.org>,
	Pankaj Raghav <p.raghav@samsung.com>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	yi.zhang@huawei.com, guohanjun@huawei.com
Subject: Re: LSF/MM/BPF 2023 IOMAP conversion status update
Date: Fri, 24 Feb 2023 15:01:37 +0800	[thread overview]
Message-ID: <b1dec5c2-0437-de15-b2f4-13609b4378f0@huaweicloud.com> (raw)
In-Reply-To: <20230208160422.m4d4rx6kg57xm5xk@quack3>

On 2023/2/9 0:04, Jan Kara wrote:
> On Sun 29-01-23 05:06:47, Matthew Wilcox wrote:
>> On Sat, Jan 28, 2023 at 08:46:45PM -0800, Luis Chamberlain wrote:
>>> I'm hoping this *might* be useful to some, but I fear it may leave quite
>>> a bit of folks with more questions than answers as it did for me. And
>>> hence I figured that *this aspect of this topic* perhaps might be a good
>>> topic for LSF.  The end goal would hopefully then be finally enabling us
>>> to document IOMAP API properly and helping with the whole conversion
>>> effort.
>>
>> +1 from me.
>>
>> I've made a couple of abortive efforts to try and convert a "trivial"
>> filesystem like ext2/ufs/sysv/jfs to iomap, and I always get hung up on
>> what the semantics are for get_block_t and iomap_begin().
> 
> Yeah, I'd be also interested in this discussion. In particular as a
> maintainer of part of these legacy filesystems (ext2, udf, isofs).
> 
>>> Perhaps fs/buffer.c could be converted to folios only, and be done
>>> with it. But would we be losing out on something? What would that be?
>>
>> buffer_heads are inefficient for multi-page folios because some of the
>> algorithms are O(n^2) for n being the number of buffers in a folio.
>> It's fine for 8x 512b buffers in a 4k page, but for 512x 4kb buffers in
>> a 2MB folio, it's pretty sticky.  Things like "Read I/O has completed on
>> this buffer, can I mark the folio as Uptodate now?"  For iomap, that's a
>> scan of a 64 byte bitmap up to 512 times; for BHs, it's a loop over 512
>> allocations, looking at one bit in each BH before moving on to the next.
>> Similarly for writeback, iirc.
>>
>> So +1 from me for a "How do we convert 35-ish block based filesystems
>> from BHs to iomap for their buffered & direct IO paths".  There's maybe a
>> separate discussion to be had for "What should the API be for filesystems
>> to access metadata on the block device" because I don't believe the
>> page-cache based APIs are easy for fs authors to use.
> 
> Yeah, so the actual data paths should be relatively easy for these old
> filesystems as they usually don't do anything special (those that do - like
> reiserfs - are deprecated and to be removed). But for metadata we do need
> some convenience functions like - give me block of metadata at this block
> number, make it dirty / clean / uptodate (block granularity dirtying &
> uptodate state is an absolute must for metadata, otherwise we'll have data
> corruption issues). From the more complex functionality we need stuff like:
> lock particular block of metadata (equivalent of buffer lock), track that
> this block is metadata for given inode so that it can be written on
> fsync(2). Then more fancy filesystems like ext4 also need to attach more
> private state to each metadata block but that needs to be dealt with on
> case-by-case basis anyway.
> 

Hello, all.

I'm also interested in this topic, especially the iomap conversion of ext4's
buffered IO paths, and in the discussion of the metadata APIs: the current
buffer_heads can lead to many potential problems and bring a lot of quality
challenges to our products. I look forward to more discussion if I can
attend in person.

Thanks,
Yi.

>> Maybe some related topics are
>> "What testing should we require for some of these ancient filesystems?"
>> "Whose job is it to convert these 35 filesystems anyway, can we just
>> delete some of them?"
> 
> I certainly would not miss some of these filesystems - like minix, sysv, ...
> But before really threatening to remove some of these ancient and long
> untouched filesystems, we should convert at least those we do care about.
> When there's precedent how simple filesystem conversion looks like, it is
> easier to argue about what to do with the ones we don't care about so much.
> 
>> "Is there a lower-performance but easier-to-implement API than iomap
>> for old filesystems that only exist for compatibility reasons?"
> 
> As I wrote above, for metadata there ought to be something as otherwise it
> will be real pain (and no gain really). But I guess the concrete API only
> materializes once we attempt a conversion of some filesystem like ext2.
> I'll try to have a look into that, at least the obvious preparatory steps
> like converting the data paths to iomap.
> 
> 								Honza
> 




Thread overview: 23+ messages
2023-01-29  4:46 LSF/MM/BPF 2023 IOMAP conversion status update Luis Chamberlain
2023-01-29  5:06 ` Matthew Wilcox
2023-01-29  5:39   ` Luis Chamberlain
2023-02-08 16:04   ` Jan Kara
2023-02-24  7:01     ` Zhang Yi [this message]
2023-02-26 20:16     ` Ritesh Harjani
2023-03-16 14:40       ` [RFCv1][WIP] ext2: Move direct-io to use iomap Ritesh Harjani (IBM)
2023-03-16 15:41         ` Darrick J. Wong
2023-03-20 16:11           ` Ritesh Harjani
2023-03-20 13:15         ` Christoph Hellwig
2023-03-20 17:51         ` Jan Kara
2023-03-22  6:34           ` Ritesh Harjani
2023-03-23 11:30             ` Jan Kara
2023-03-23 13:19               ` Ritesh Harjani
2023-03-30  0:02               ` Christoph Hellwig
2023-02-27 19:26     ` LSF/MM/BPF 2023 IOMAP conversion status update Darrick J. Wong
2023-02-27 21:02       ` Matthew Wilcox
2023-02-27 19:47   ` Darrick J. Wong
2023-02-27 20:24     ` Luis Chamberlain
2023-02-27 19:06 ` Darrick J. Wong
2023-02-27 19:58   ` Luis Chamberlain
2023-03-01 16:59 ` Ritesh Harjani
2023-03-01 17:08   ` Darrick J. Wong
