From: Gerry Liu <gerry@linux.alibaba.com>
Subject: Re: [PATCH V3] erofs: support flattened block device for multi-blob images
Date: Sun, 16 Apr 2023 22:34:26 +0800
To: Jia Zhu <zhujia.zj@bytedance.com>
Cc: linux-kernel@vger.kernel.org, huyue2@coolpad.com, linux-erofs@lists.ozlabs.org
Message-Id: <3FC47462-A324-48D0-A1E2-940BF0016113@linux.alibaba.com>
In-Reply-To: <20230302071751.48425-1-zhujia.zj@bytedance.com>
References: <8be37b4c-5a87-1c10-b0e6-99284e6fd4ca@linux.alibaba.com> <20230302071751.48425-1-zhujia.zj@bytedance.com>
List-Id: Development of Linux EROFS file system

On March 2, 2023, at 15:17, Jia Zhu <zhujia.zj@bytedance.com> wrote:
>
> In order to support mounting a multi-blob container image as a single
> block device, add a flattened block device feature for EROFS.
>
> In this mode, all meta/data contents are mapped into one block address
> space. Users can compose a block device (via nbd/ublk/virtio-blk/
> vhost-user-blk) from multiple sources and mount that block device with
> EROFS directly. This reduces the number of block devices used, and it
> also benefits both VM file passthrough and distributed storage
> scenarios.
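
To make the "one block address space" idea concrete, here is a minimal
user-space sketch of how several blobs could be given start blocks inside
one composed device. The names struct blob_extent and layout_blobs() are
hypothetical, and packing the blobs back to back is only an assumption for
illustration; in a real image the start block of each blob is whatever the
device table records as mapped_blkaddr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one blob inside the composed block device. */
struct blob_extent {
	uint32_t blocks;          /* blob length, in filesystem blocks */
	uint32_t mapped_blkaddr;  /* start block inside the flattened device */
};

/*
 * Assign each blob a start block by packing the blobs back to back,
 * beginning at first_blk. Returns the first block after the last blob.
 * Back-to-back packing is an assumption made for illustration only.
 */
uint32_t layout_blobs(struct blob_extent *blobs, size_t n, uint32_t first_blk)
{
	uint32_t next = first_blk;

	for (size_t i = 0; i < n; i++) {
		/* refuse layouts that would wrap the 32-bit block number */
		assert(blobs[i].blocks <= UINT32_MAX - next);
		blobs[i].mapped_blkaddr = next;
		next += blobs[i].blocks;
	}
	return next;
}
```

Whatever tool composes the nbd/ublk device would then have to expose each
blob at exactly the block offset the image expects.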

> You can test this using the method described in:
> https://github.com/dragonflyoss/image-service/pull/1111
> 1. Compose an (nbd) block device from multiple blobs.
> 2. Mount EROFS on mntdir/.
> 3. Compare the md5sum of the source dir and mntdir/.

> Later, we could also use it to refer to the original tar blobs.
>
> Signed-off-by: Jia Zhu <zhujia.zj@bytedance.com>
> Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>

Tested-by: Jiang Liu <gerry@linux.alibaba.com>

> ---
> v3:
> 1. Move the flatdev check down after all sanity checks. (Jingbo Xu)
> 2. Add Reviewed-by tag.
> ---
>  fs/erofs/data.c     | 8 ++++++--
>  fs/erofs/internal.h | 1 +
>  fs/erofs/super.c    | 5 ++++-
>  3 files changed, 11 insertions(+), 3 deletions(-)

> diff --git a/fs/erofs/data.c b/fs/erofs/data.c
> index e16545849ea7..818f78ce648c 100644
> --- a/fs/erofs/data.c
> +++ b/fs/erofs/data.c
> @@ -197,7 +197,6 @@ int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *map)
>  	struct erofs_device_info *dif;
>  	int id;
>  
> -	/* primary device by default */
>  	map->m_bdev = sb->s_bdev;
>  	map->m_daxdev = EROFS_SB(sb)->dax_dev;
>  	map->m_dax_part_off = EROFS_SB(sb)->dax_part_off;
> @@ -210,12 +209,17 @@ int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *map)
>  			up_read(&devs->rwsem);
>  			return -ENODEV;
>  		}
> +		if (devs->flatdev) {
> +			map->m_pa += blknr_to_addr(dif->mapped_blkaddr);
> +			up_read(&devs->rwsem);
> +			return 0;
> +		}
>  		map->m_bdev = dif->bdev;
>  		map->m_daxdev = dif->dax_dev;
>  		map->m_dax_part_off = dif->dax_part_off;
>  		map->m_fscache = dif->fscache;
>  		up_read(&devs->rwsem);
> -	} else if (devs->extra_devices) {
> +	} else if (devs->extra_devices && !devs->flatdev) {
>  		down_read(&devs->rwsem);
>  		idr_for_each_entry(&devs->tree, dif, id) {
>  			erofs_off_t startoff, length;
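
For readers following the erofs_map_dev() hunk, the flatdev branch can be
modelled in user space roughly as below. This is a sketch, not the kernel
code: the sketch_* names are hypothetical, the 4 KiB block size is an
assumption, and the out-of-range case is an assert here where the kernel
returns -ENODEV.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BLKBITS 12  /* assumed 4 KiB blocks; not taken from the patch */

/* Hypothetical stand-in for the relevant field of struct erofs_device_info. */
struct sketch_device {
	uint32_t mapped_blkaddr;  /* start block of this blob in the flat device */
};

/* Convert a block number to a byte address, like blknr_to_addr() in the hunk. */
uint64_t sketch_blknr_to_addr(uint32_t blk)
{
	return (uint64_t)blk << SKETCH_BLKBITS;
}

/*
 * Flattened mode: a (device id, byte offset within that blob) pair becomes an
 * offset on the single primary block device. Device id 0 means the primary
 * device itself, so its offset is returned unchanged.
 */
uint64_t sketch_flat_map(const struct sketch_device *devs, unsigned int ndevs,
			 unsigned int deviceid, uint64_t pa)
{
	if (deviceid == 0)
		return pa;
	assert(deviceid <= ndevs);  /* the kernel returns -ENODEV instead */
	/* device ids are 1-based, the table is 0-based */
	return pa + sketch_blknr_to_addr(devs[deviceid - 1].mapped_blkaddr);
}
```

The point of the early return in the hunk is that m_bdev stays the primary
block device; only the physical address is shifted.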
> diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
> index 3f3561d37d1b..4fee380a98d9 100644
> --- a/fs/erofs/internal.h
> +++ b/fs/erofs/internal.h
> @@ -81,6 +81,7 @@ struct erofs_dev_context {
>  	struct rw_semaphore rwsem;
>  
>  	unsigned int extra_devices;
> +	bool flatdev;
>  };
>  
>  struct erofs_fs_context {
> diff --git a/fs/erofs/super.c b/fs/erofs/super.c
> index 19b1ae79cec4..0afdfce372b3 100644
> --- a/fs/erofs/super.c
> +++ b/fs/erofs/super.c
> @@ -248,7 +248,7 @@ static int erofs_init_device(struct erofs_buf *buf, struct super_block *sb,
>  		if (IS_ERR(fscache))
>  			return PTR_ERR(fscache);
>  		dif->fscache = fscache;
> -	} else {
> +	} else if (!sbi->devs->flatdev) {
>  		bdev = blkdev_get_by_path(dif->path, FMODE_READ | FMODE_EXCL,
>  					  sb->s_type);
>  		if (IS_ERR(bdev))
> @@ -290,6 +290,9 @@ static int erofs_scan_devices(struct super_block *sb,
>  	if (!ondisk_extradevs)
>  		return 0;
>  
> +	if (!sbi->devs->extra_devices && !erofs_is_fscache_mode(sb))
> +		sbi->devs->flatdev = true;
> +
>  	sbi->device_id_mask = roundup_pow_of_two(ondisk_extradevs + 1) - 1;
>  	pos = le16_to_cpu(dsb->devt_slotoff) * EROFS_DEVT_SLOT_SIZE;
>  	down_read(&sbi->devs->rwsem);
> -- 
> 2.20.1
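
The condition added to erofs_scan_devices() can be restated as a small
predicate. The sketch below is only a paraphrase of the hunk under stated
assumptions: struct sketch_mount and sketch_use_flatdev() are hypothetical
names, and reading extra_devices as "devices supplied at mount time" is an
interpretation rather than something the patch spells out.

```c
#include <stdbool.h>

/* Hypothetical summary of the inputs the erofs_scan_devices() hunk looks at. */
struct sketch_mount {
	unsigned int ondisk_extradevs;  /* extra devices recorded in the image */
	unsigned int extra_devices;     /* assumed: devices supplied at mount time */
	bool fscache_mode;              /* mounted through fscache, not a bdev */
};

/* Returns true when the image would be treated as one flattened device. */
bool sketch_use_flatdev(const struct sketch_mount *m)
{
	if (!m->ondisk_extradevs)
		return false;  /* the function returns before the flatdev check */
	return !m->extra_devices && !m->fscache_mode;
}
```

In other words, flattened mode is implied rather than requested: a multi-blob
image mounted from a block device without any extra devices falls into it.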