Subject: Re: LSF/MM/BPF 2023 IOMAP conversion status update
To: Luis Chamberlain, Jan Kara, Matthew Wilcox
Cc: lsf-pc@lists.linux-foundation.org, Christoph Hellwig, David Howells,
    Keith Busch, Pankaj Raghav, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org, yi.zhang@huawei.com, guohanjun@huawei.com
References: <20230129044645.3cb2ayyxwxvxzhah@garbanzo>
    <20230208160422.m4d4rx6kg57xm5xk@quack3>
From: Zhang Yi
Date: Fri, 24 Feb 2023 15:01:37 +0800
In-Reply-To: <20230208160422.m4d4rx6kg57xm5xk@quack3>

On 2023/2/9 0:04, Jan Kara wrote:
> On Sun 29-01-23 05:06:47, Matthew Wilcox wrote:
>> On Sat, Jan 28, 2023 at 08:46:45PM -0800, Luis Chamberlain wrote:
>>> I'm hoping this *might* be useful to some, but I fear it may leave quite
>>> a bit of folks with more questions than answers as it did for me. And
>>> hence I figured that *this aspect of this topic* perhaps might be a good
>>> topic for LSF. The end goal would hopefully then be finally enabling us
>>> to document IOMAP API properly and helping with the whole conversion
>>> effort.
>>
>> +1 from me.
>>
>> I've made a couple of abortive efforts to try and convert a "trivial"
>> filesystem like ext2/ufs/sysv/jfs to iomap, and I always get hung up on
>> what the semantics are for get_block_t and iomap_begin().
>
> Yeah, I'd be also interested in this discussion. In particular as a
> maintainer of part of these legacy filesystems (ext2, udf, isofs).
>
>>> Perhaps fs/buffer.c could be converted to folios only, and be done
>>> with it. But would we be losing out on something? What would that be?
>>
>> buffer_heads are inefficient for multi-page folios because some of the
>> algorithms are O(n^2) for n being the number of buffers in a folio.
>> It's fine for 8x 512b buffers in a 4k page, but for 512x 4kb buffers in
>> a 2MB folio, it's pretty sticky. Things like "Read I/O has completed on
>> this buffer, can I mark the folio as Uptodate now?" For iomap, that's a
>> scan of a 64 byte bitmap up to 512 times; for BHs, it's a loop over 512
>> allocations, looking at one bit in each BH before moving on to the next.
>> Similarly for writeback, iirc.
>>
>> So +1 from me for a "How do we convert 35-ish block based filesystems
>> from BHs to iomap for their buffered & direct IO paths". There's maybe a
>> separate discussion to be had for "What should the API be for filesystems
>> to access metadata on the block device" because I don't believe the
>> page-cache based APIs are easy for fs authors to use.
>
> Yeah, so the actual data paths should be relatively easy for these old
> filesystems as they usually don't do anything special (those that do - like
> reiserfs - are deprecated and to be removed). But for metadata we do need
> some convenience functions like - give me block of metadata at this block
> number, make it dirty / clean / uptodate (block granularity dirtying &
> uptodate state is absolute must for metadata, otherwise we'll have data
> corruption issues). From the more complex functionality we need stuff like:
> lock particular block of metadata (equivalent of buffer lock), track that
> this block is metadata for given inode so that it can be written on
> fsync(2). Then more fancy filesystems like ext4 also need to attach more
> private state to each metadata block but that needs to be dealt with on
> case-by-case basis anyway.
>

Hello, all.

I am also interested in this topic, especially in the iomap conversion of
the buffered IO paths for the ext4 filesystem. I am also interested in the
discussion of the metadata APIs: the current buffer_heads can lead to many
potential problems and bring a lot of quality challenges to our products.
I look forward to more discussion if I can attend in person.

Thanks,
Yi.

>> Maybe some related topics are
>> "What testing should we require for some of these ancient filesystems?"
>> "Whose job is it to convert these 35 filesystems anyway, can we just
>> delete some of them?"
>
> I would not certainly miss some more filesystems - like minix, sysv, ...
> But before really threatening to remove some of these ancient and long
> untouched filesystems, we should convert at least those we do care about.
> When there's precedent how simple filesystem conversion looks like, it is
> easier to argue about what to do with the ones we don't care about so much.
>
>> "Is there a lower-performance but easier-to-implement API than iomap
>> for old filesystems that only exist for compatibility reasons?"
>
> As I wrote above, for metadata there ought to be something as otherwise it
> will be real pain (and no gain really). But I guess the concrete API only
> materializes once we attempt a conversion of some filesystem like ext2.
> I'll try to have a look into that, at least the obvious preparatory steps
> like converting the data paths to iomap.
>
> Honza
>
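
P.S. To make the cost argument quoted above (512 buffers in a 2MB folio)
a bit more concrete, here is a rough userspace model in Python. It is
only a sketch under my own assumptions, not kernel code: the function
names, the 64-bit word size and the "walk everything after every
completion" behaviour are all hypothetical simplifications. It counts
how many per-block flags (buffer_head-like) or bitmap words (iomap-like)
are examined while every block of a folio completes read I/O.

```python
# Userspace model only. None of these names are kernel APIs; they are
# hypothetical helpers used to count work, not to do real I/O.

WORD_BITS = 64  # assumed machine word size for the bitmap model


def bh_style_completions(nr_blocks):
    """Per-block records: after each completion, look at one flag in
    every record (no early exit, i.e. the worst case).
    Returns (folio_uptodate, flag_checks)."""
    uptodate = [False] * nr_blocks  # one flag per buffer-head-like record
    checks = 0
    folio_uptodate = False
    for done in range(nr_blocks):
        uptodate[done] = True
        all_set = True
        for flag in uptodate:  # one bit examined per record
            checks += 1
            if not flag:
                all_set = False
        folio_uptodate = all_set
    return folio_uptodate, checks


def bitmap_style_completions(nr_blocks):
    """Per-folio bitmap: after each completion, scan the bitmap one
    word at a time. Returns (folio_uptodate, word_reads)."""
    nr_words = (nr_blocks + WORD_BITS - 1) // WORD_BITS
    words = [0] * nr_words
    # Expected "all bits set" value for each word (last word may be partial).
    full = [(1 << min(WORD_BITS, nr_blocks - i * WORD_BITS)) - 1
            for i in range(nr_words)]
    word_reads = 0
    folio_uptodate = False
    for done in range(nr_blocks):
        words[done // WORD_BITS] |= 1 << (done % WORD_BITS)
        all_set = True
        for word, mask in zip(words, full):  # 64 blocks checked per read
            word_reads += 1
            if word != mask:
                all_set = False
        folio_uptodate = all_set
    return folio_uptodate, word_reads


if __name__ == "__main__":
    for n in (8, 512):
        print(n, bh_style_completions(n), bitmap_style_completions(n))
```

In this model the 8-buffer case is cheap either way, while the
512-buffer case needs 512 * 512 flag checks with per-block records
against 512 scans of an 8-word (64 byte) bitmap. The real kernel code
paths differ in detail, so please read the numbers as an illustration
of the scaling only.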