linux-erofs.lists.ozlabs.org archive mirror
* Merging multiple erofs file systems on the same block device
@ 2023-05-01 14:09 Daan De Meyer
  2023-05-02  3:17 ` Gao Xiang
  0 siblings, 1 reply; 6+ messages in thread
From: Daan De Meyer @ 2023-05-01 14:09 UTC (permalink / raw)
  To: linux-erofs

Hi,

I've been looking into erofs as an initramfs replacement by using
root=/dev/ram0 to tell the kernel to load the initramfs as a ramdisk.
However, by using a ramdisk instead of the usual compressed cpio, I
would lose the feature where the kernel merges multiple individual
cpios together into a single tmpfs filesystem. Looking at the
documentation for erofs, I noticed that erofs already seems to support
merging multiple erofs filesystems on separate block devices using the
device= cmdline option. Would it be possible to extend this so that
multiple erofs filesystems that follow each other on the same block
device can also be merged? This would allow me to pass multiple erofs
filesystems to the kernel via initrd=. They would be concatenated
into a single buffer, which the kernel would write to a ramdisk
(using root=/dev/ram0) and then mount as erofs on /dev/root. erofs
would notice that there are multiple erofs filesystems on the ramdisk
and overlay them together (perhaps only if a cmdline option is
enabled).
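
For illustration, the detection step proposed here could look roughly
like the Python sketch below. It is not existing kernel behaviour and not
code from this thread. It assumes each image is padded to a whole number
of blocks, that the EROFS superblock sits 1024 bytes into each image with
magic 0xE0F5E1E2, and that `blkszbits` and `blocks` live at byte offsets
12 and 36 of the superblock (worth checking against erofs_fs.h). The
helper names `find_erofs_images` and `make_fake_image` are hypothetical.

```python
import struct

EROFS_SUPER_OFFSET = 1024       # superblock starts 1 KiB into each image
EROFS_SUPER_MAGIC = 0xE0F5E1E2  # little-endian magic at superblock offset 0
SB_BLKSZBITS = 12               # assumed offset of blkszbits (u8)
SB_BLOCKS = 36                  # assumed offset of blocks (le32)


def find_erofs_images(buf: bytes) -> list[tuple[int, int]]:
    """Return (offset, size) for each EROFS image laid back to back in buf."""
    images = []
    off = 0
    while off + EROFS_SUPER_OFFSET + SB_BLOCKS + 4 <= len(buf):
        sb = off + EROFS_SUPER_OFFSET
        (magic,) = struct.unpack_from("<I", buf, sb)
        if magic != EROFS_SUPER_MAGIC:
            break  # no further image starts at this offset
        blkszbits = buf[sb + SB_BLKSZBITS]
        (blocks,) = struct.unpack_from("<I", buf, sb + SB_BLOCKS)
        size = blocks << blkszbits
        if size == 0 or off + size > len(buf):
            break  # truncated or implausible image, stop scanning
        images.append((off, size))
        off += size
    return images


def make_fake_image(blocks: int, blkszbits: int = 12) -> bytes:
    """Build a zero-filled buffer with just enough superblock for the scan."""
    img = bytearray(blocks << blkszbits)
    sb = EROFS_SUPER_OFFSET
    struct.pack_into("<I", img, sb, EROFS_SUPER_MAGIC)
    img[sb + SB_BLKSZBITS] = blkszbits
    struct.pack_into("<I", img, sb + SB_BLOCKS, blocks)
    return bytes(img)
```

The scan stops at the first offset that does not carry the magic, so any
trailing padding after the last image is simply ignored.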

Does this make sense at all?

Cheers,

Daan De Meyer


* Re: Merging multiple erofs file systems on the same block device
  2023-05-01 14:09 Merging multiple erofs file systems on the same block device Daan De Meyer
@ 2023-05-02  3:17 ` Gao Xiang
  2023-05-02 10:03   ` Daan De Meyer
  0 siblings, 1 reply; 6+ messages in thread
From: Gao Xiang @ 2023-05-02  3:17 UTC (permalink / raw)
  To: Daan De Meyer, linux-erofs

Hi,

On 2023/5/1 22:09, Daan De Meyer wrote:
> Hi,
> 
> I've been looking into erofs as an initramfs replacement by using
> root=/dev/ram0 to tell the kernel to load the initramfs as a ramdisk.

Sorry, I'm on vacation now.

May I ask what your detailed use case is?  Sure, you could use
/dev/ram0 as a replacement, but currently it still takes double the
memory compared with initramfs, since ramdisk doesn't support FSDAX
for now (with FSDAX enabled, it wouldn't take double the memory at all.)

Actually I think ramdisk FSDAX is useful and I might bring this up at
the upcoming LSF/MM/BPF 2023.

> However, by using a ramdisk instead of the usual compressed cpio, I
> would lose the feature where the kernel merges multiple individual
> cpios together into a single tmpfs filesystem. Looking at the
> documentation for erofs, I noticed that erofs already seems to support
> merging multiple erofs filesystems on separate block devices using the
> device= cmdline option. Would it be possible to extend this so that
Here `device=` is actually used to refer to separate blobs alongside the
merged metadata.  For example, you could have

   device=/dev/ram1 original tar1
   device=/dev/ram2 original tar2
   /dev/ram0        merged metadata for tar1 + tar2.

which means, if you'd like to merge multiple EROFS filesystems, you
might need another step to build the merged metadata in advance in order
to merge the individual tarballs together.  That metadata could be built
when applying images or at boot (by using a special bootloader with
such functionality.)
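
A minimal sketch of how the layout above maps onto a mount invocation,
assuming the blob devices are passed with the `device=` mount option and
/dev/ram0 holds the merged metadata. The helper name `erofs_mount_argv`
and the mount point are hypothetical, and the order of the `device=`
entries is assumed to follow the order in which the blobs were recorded
when the merged image was built.

```python
def erofs_mount_argv(meta_dev: str, blob_devs: list[str], mountpoint: str) -> list[str]:
    """Compose a mount(8) command line for a merged-metadata EROFS image."""
    # One device= entry per extra blob device, in the given order.
    opts = ["ro"] + [f"device={dev}" for dev in blob_devs]
    return ["mount", "-t", "erofs", "-o", ",".join(opts), meta_dev, mountpoint]


if __name__ == "__main__":
    argv = erofs_mount_argv("/dev/ram0", ["/dev/ram1", "/dev/ram2"], "/mnt")
    print(" ".join(argv))
```

The function only builds the argument list; actually running it needs
root and real devices, so that step is left to the caller.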

EROFS doesn't support stacking multiple filesystems at runtime, since
that would duplicate what overlayfs already does (honestly, you could
consider using overlayfs.)

> multiple erofs filesystems that follow each other on the same block
> device can also be merged? This would allow me to pass multiple erofs
> filesystems to the kernel via initrd=, which would get concatenated
> together into a single buffer, which the kernel would write to a
> ramdisk (using root=/dev/ram0) which the kernel would then have erofs
> mount to /dev/root. erofs would notice that there's multiple erofs
> filesystems on the ramdisk and overlay them together (perhaps only if
> a cmdline option is enabled).

Recently, it became possible to pass a merged EROFS filesystem on a
single block device, but you would still need a step to build the
merged fs tree before mounting.  This feature is used to build the
metadata while reusing the original tar blobs.

> 
> Does this make sense at all?

Hopefully it helps.

Thanks,
Gao Xiang

> 
> Cheers,
> 
> Daan De Meyer


* Re: Merging multiple erofs file systems on the same block device
  2023-05-02  3:17 ` Gao Xiang
@ 2023-05-02 10:03   ` Daan De Meyer
  2023-05-05  5:05     ` Gao Xiang
  0 siblings, 1 reply; 6+ messages in thread
From: Daan De Meyer @ 2023-05-02 10:03 UTC (permalink / raw)
  To: Gao Xiang; +Cc: linux-erofs

> On 2023/5/1 22:09, Daan De Meyer wrote:
> > Hi,
> >
> > I've been looking into erofs as an initramfs replacement by using
> > root=/dev/ram0 to tell the kernel to load the initramfs as a ramdisk.
>
> Sorry, I'm on vacation now.
>
> May I ask what your detailed use case is?  Sure, you could use
> /dev/ram0 as a replacement, but currently it still takes double
> memory compared with initramfs since ramdisk doesn't support FSDAX
> for now (by enabling FSDAX, it won't take double memory at all.)

I'm experimenting with larger initramfses and running into memory
bottlenecks since the entire compressed cpio has to be decompressed
into memory. I was hoping to use erofs as a replacement that could stay
compressed, where only the files that are actually accessed are
decompressed at runtime.

> Actually I think ramdisk FSDAX is useful and I might sync up this on
> the following LSF/MM/BPF 2023.
>
> > However, by using a ramdisk instead of the usual compressed cpio, I
> > would lose the feature where the kernel merges multiple individual
> > cpios together into a single tmpfs filesystem. Looking at the
> > documentation for erofs, I noticed that erofs already seems to support
> > merging multiple erofs filesystems on separate block devices using the
> > device= cmdline option. Would it be possible to extend this so that
> Here `device=` is actually used to refer to separate blobs with the
> merged metadata.  For example, you could have
>
>    device=/dev/ram1 original tar1
>    device=/dev/ram2 original tar2
>    /dev/ram0        merged metadata for tar1 + tar2.
>
> which means, if you'd like to merge multiple EROFS filesystems, you
> might need another step to build a merged metadata in advance in order
> to merge multiple individual tarballs together, which could be built
> when applying images or booting (by using a special bootloader with
> such functionality.)

Ahh, I misunderstood the device= option then.

> EROFS doesn't support stacking multiple fses at runtime since it seems
> a duplicated feature of overlayfs (you could consider using overlayfs
> honestly.)

I would love to use overlayfs, but there's no way to specify to the kernel that
the initrd should be set up as an overlayfs of a set of ram disks. It would be
interesting if I could put multiple filesystems in the initrd and the kernel
would notice and automatically set up an overlayfs of them.

> > multiple erofs filesystems that follow each other on the same block
> > device can also be merged? This would allow me to pass multiple erofs
> > filesystems to the kernel via initrd=, which would get concatenated
> > together into a single buffer, which the kernel would write to a
> > ramdisk (using root=/dev/ram0) which the kernel would then have erofs
> > mount to /dev/root. erofs would notice that there's multiple erofs
> > filesystems on the ramdisk and overlay them together (perhaps only if
> > a cmdline option is enabled).
>
> Recently, a merged EROFS filesystem with a single block device can
> be passed but you might still need a step to build a merged fs tree
> before mounting.  Such feature is used to build a metadata and
> reuse original tar blobs.
>
> >
> > Does this make sense at all?
>
> Hopefully it helps.


* Re: Merging multiple erofs file systems on the same block device
  2023-05-02 10:03   ` Daan De Meyer
@ 2023-05-05  5:05     ` Gao Xiang
  2023-05-05  8:19       ` Daan De Meyer
  0 siblings, 1 reply; 6+ messages in thread
From: Gao Xiang @ 2023-05-05  5:05 UTC (permalink / raw)
  To: Daan De Meyer; +Cc: linux-erofs



On 2023/5/2 18:03, Daan De Meyer wrote:
>> On 2023/5/1 22:09, Daan De Meyer wrote:
>>> Hi,
>>>
>>> I've been looking into erofs as an initramfs replacement by using
>>> root=/dev/ram0 to tell the kernel to load the initramfs as a ramdisk.
>>
>> Sorry, I'm on vacation now.
>>
>> May I ask what your detailed use case is?  Sure, you could use
>> /dev/ram0 as a replacement, but currently it still takes double
>> memory compared with initramfs since ramdisk doesn't support FSDAX
>> for now (by enabling FSDAX, it won't take double memory at all.)
> 
> I'm experimenting with larger initramfses and running into memory
> bottlenecks since the entire compressed cpio has to be decompressed
> into memory. I was hoping to use erofs as a replacement that could stay
> compressed, where only the files that are actually accessed are
> decompressed at runtime.

Sorry for the late reply.

Okay, that makes sense, although FSDAX cannot be used this way, since
mmapped accesses need the decompressed data.

> 
>> Actually I think ramdisk FSDAX is useful and I might sync up this on
>> the following LSF/MM/BPF 2023.
>>
>>> However, by using a ramdisk instead of the usual compressed cpio, I
>>> would lose the feature where the kernel merges multiple individual
>>> cpios together into a single tmpfs filesystem. Looking at the
>>> documentation for erofs, I noticed that erofs already seems to support
>>> merging multiple erofs filesystems on separate block devices using the
>>> device= cmdline option. Would it be possible to extend this so that
>> Here `device=` is actually used to refer to separate blobs with the
>> merged metadata.  For example, you could have
>>
>>     device=/dev/ram1 original tar1
>>     device=/dev/ram2 original tar2
>>     /dev/ram0        merged metadata for tar1 + tar2.
>>
>> which means, if you'd like to merge multiple EROFS filesystems, you
>> might need another step to build a merged metadata in advance in order
>> to merge multiple individual tarballs together, which could be built
>> when applying images or booting (by using a special bootloader with
>> such functionality.)
> 
> Ahh, I misunderstood the device= option then.
> 
>> EROFS doesn't support stacking multiple fses at runtime since it seems
>> a duplicated feature of overlayfs (you could consider using overlayfs
>> honestly.)
> 
> I would love to use overlayfs, but there's no way to specify to the kernel that
> the initrd should be set up as an overlayfs of a set of ram disks. It would be
> interesting if I could put multiple filesystems in the initrd and the kernel
> would notice and automatically set up an overlayfs of them.

I haven't used overlayfs this way, so I'm not sure either.  As a wild
guess, could you specify a ramdisk with a customized init that stacks
overlayfs like this in userspace?  Not sure though...
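
If the stacking is done from a customized init in userspace, the
overlayfs mount could be composed along these lines. This is only a
sketch: the layer paths, the mount point and the helper name
`overlay_mount_argv` are hypothetical, and it assumes each layer has
already been mounted read-only somewhere. overlayfs lists `lowerdir`
entries from the topmost layer to the bottom one, so the input list is
reversed.

```python
def overlay_mount_argv(layers_bottom_to_top: list[str], mountpoint: str) -> list[str]:
    """Compose a read-only overlayfs mount from already-mounted layers."""
    if len(layers_bottom_to_top) < 2:
        # Without an upperdir, some kernels require at least two lower layers.
        raise ValueError("need at least two layers for a lower-only overlay")
    # overlayfs expects the topmost layer first in lowerdir=.
    lowerdir = ":".join(reversed(layers_bottom_to_top))
    return ["mount", "-t", "overlay", "overlay", "-o", f"lowerdir={lowerdir}", mountpoint]


if __name__ == "__main__":
    print(" ".join(overlay_mount_argv(["/run/layer0", "/run/layer1"], "/sysroot")))
```

With no `upperdir`, the resulting overlay is read-only, which matches the
read-only nature of the EROFS layers underneath.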

Thanks,
Gao Xiang


* Re: Merging multiple erofs file systems on the same block device
  2023-05-05  5:05     ` Gao Xiang
@ 2023-05-05  8:19       ` Daan De Meyer
  2023-05-08 11:04         ` Gao Xiang
  0 siblings, 1 reply; 6+ messages in thread
From: Daan De Meyer @ 2023-05-05  8:19 UTC (permalink / raw)
  To: Gao Xiang; +Cc: linux-erofs

> On 2023/5/2 18:03, Daan De Meyer wrote:
> >> On 2023/5/1 22:09, Daan De Meyer wrote:
> >>> Hi,
> >>>
> >>> I've been looking into erofs as an initramfs replacement by using
> >>> root=/dev/ram0 to tell the kernel to load the initramfs as a ramdisk.
> >>
> >> Sorry, I'm on vacation now.
> >>
> >> May I ask what your detailed use case is?  Sure, you could use
> >> /dev/ram0 as a replacement, but currently it still takes double
> >> memory compared with initramfs since ramdisk doesn't support FSDAX
> >> for now (by enabling FSDAX, it won't take double memory at all.)
> >
> > I'm experimenting with larger initramfses and running into memory
> > bottlenecks since the entire compressed cpio has to be decompressed
> > into memory. I was hoping to use erofs as a replacement that could stay
> > compressed, where only the files that are actually accessed are
> > decompressed at runtime.
>
> Sorry for the late reply.
>
> Okay, that makes sense, although FSDAX cannot be used this way since
> decompressed data is needed for mmapped accesses.

Can you clarify why using a ramdisk would take double the memory if FSDAX
is not used?

> >
> >> Actually I think ramdisk FSDAX is useful and I might sync up this on
> >> the following LSF/MM/BPF 2023.
> >>
> >>> However, by using a ramdisk instead of the usual compressed cpio, I
> >>> would lose the feature where the kernel merges multiple individual
> >>> cpios together into a single tmpfs filesystem. Looking at the
> >>> documentation for erofs, I noticed that erofs already seems to support
> >>> merging multiple erofs filesystems on separate block devices using the
> >>> device= cmdline option. Would it be possible to extend this so that
> >> Here `device=` is actually used to refer to separate blobs with the
> >> merged metadata.  For example, you could have
> >>
> >>     device=/dev/ram1 original tar1
> >>     device=/dev/ram2 original tar2
> >>     /dev/ram0        merged metadata for tar1 + tar2.
> >>
> >> which means, if you'd like to merge multiple EROFS filesystems, you
> >> might need another step to build a merged metadata in advance in order
> >> to merge multiple individual tarballs together, which could be built
> >> when applying images or booting (by using a special bootloader with
> >> such functionality.)
> >
> > Ahh, I misunderstood the device= option then.
> >
> >> EROFS doesn't support stacking multiple fses at runtime since it seems
> >> a duplicated feature of overlayfs (you could consider using overlayfs
> >> honestly.)
> >
> > I would love to use overlayfs, but there's no way to specify to the kernel that
> > the initrd should be set up as an overlayfs of a set of ram disks. It would be
> > interesting if I could put multiple filesystems in the initrd and the kernel
> > would notice and automatically set up an overlayfs of them.
>
> I haven't used overlayfs this way, so I'm not sure either.  Yet as a wild
> guess, you could specify a ramdisk with a customized init to stack
> overlayfs like this in userspace?  Not sure though...

This approach generally works, but if you want to enforce that all
mounted filesystems are dm-verity protected, you now need to add verity
data for all these filesystems to the initramfs. That's why we're
looking for a solution where the kernel itself sets up the filesystem
mount of the ramdisk, which is special-cased and doesn't require verity
data to be present. Anyway, the overlayfs issue is not something for
erofs to solve. If at all desirable, it should probably be
filesystem-independent: the kernel would just look for multiple
filesystems in the initrd buffer and set up an overlay mount using each
filesystem it finds.

Cheers,

Daan De Meyer




* Re: Merging multiple erofs file systems on the same block device
  2023-05-05  8:19       ` Daan De Meyer
@ 2023-05-08 11:04         ` Gao Xiang
  0 siblings, 0 replies; 6+ messages in thread
From: Gao Xiang @ 2023-05-08 11:04 UTC (permalink / raw)
  To: Daan De Meyer; +Cc: linux-erofs



On 2023/5/5 16:19, Daan De Meyer wrote:
>> On 2023/5/2 18:03, Daan De Meyer wrote:
>>>> On 2023/5/1 22:09, Daan De Meyer wrote:
>>>>> Hi,
>>>>>
>>>>> I've been looking into erofs as an initramfs replacement by using
>>>>> root=/dev/ram0 to tell the kernel to load the initramfs as a ramdisk.
>>>>
>>>> Sorry, I'm on vacation now.
>>>>
>>>> May I ask what your detailed use case is?  Sure, you could use
>>>> /dev/ram0 as a replacement, but currently it still takes double
>>>> memory compared with initramfs since ramdisk doesn't support FSDAX
>>>> for now (by enabling FSDAX, it won't take double memory at all.)
>>>
>>> I'm experimenting with larger initramfses and running into memory
>>> bottlenecks since the entire compressed cpio has to be decompressed
>>> into memory. I was hoping to use erofs as a replacement that could stay
>>> compressed, where only the files that are actually accessed are
>>> decompressed at runtime.
>>
>> Sorry for the late reply.
>>
>> Okay, that makes sense, although FSDAX cannot be used this way since
>> decompressed data is needed for mmapped accesses.
> 
> Can you clarify why using a ramdisk would take double the memory if FSDAX
> is not used?

You could take a look at
https://www.kernel.org/doc/Documentation/filesystems/ramfs-rootfs-initramfs.txt

The section "ramfs and ramdisk" describes the details: data on a
ramdisk is also copied into the page cache when it is accessed.  FSDAX
could avoid this problem.
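
As a back-of-the-envelope illustration of the "double memory" point, the
sketch below compares the three setups. It is a deliberately crude model
with made-up sizes: it ignores metadata, reclaim and partial reads, and
the "ramdisk-fsdax" mode is hypothetical since ramdisk does not support
FSDAX today. The function name `estimate_resident_mib` is an assumption
for illustration only.

```python
def estimate_resident_mib(uncompressed_mib: float, image_mib: float,
                          accessed_fraction: float, mode: str) -> float:
    """Very rough resident-memory estimate for a RAM-backed root, in MiB."""
    accessed = uncompressed_mib * accessed_fraction
    if mode == "initramfs":
        # The cpio is unpacked into ramfs/tmpfs; files live only in the page cache.
        return uncompressed_mib
    if mode == "ramdisk":
        # The image occupies the ramdisk, and accessed data is additionally
        # copied (decompressed if needed) into the page cache.
        return image_mib + accessed
    if mode == "ramdisk-fsdax":
        # Hypothetical: uncompressed data is mapped directly, no page-cache copy.
        return image_mib
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    # Uncompressed 400 MiB image, everything touched: twice the initramfs cost.
    print(estimate_resident_mib(400, 400, 1.0, "ramdisk"))
    # Compressed 150 MiB image, a quarter of the data touched.
    print(estimate_resident_mib(400, 150, 0.25, "ramdisk"))
```

Under this model a compressed image only wins when a small share of the
data is ever touched, which is the use case described earlier in the
thread.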

> 
>>>
>>>> Actually I think ramdisk FSDAX is useful and I might sync up this on
>>>> the following LSF/MM/BPF 2023.
>>>>
>>>>> However, by using a ramdisk instead of the usual compressed cpio, I
>>>>> would lose the feature where the kernel merges multiple individual
>>>>> cpios together into a single tmpfs filesystem. Looking at the
>>>>> documentation for erofs, I noticed that erofs already seems to support
>>>>> merging multiple erofs filesystems on separate block devices using the
>>>>> device= cmdline option. Would it be possible to extend this so that
>>>> Here `device=` is actually used to refer to separate blobs with the
>>>> merged metadata.  For example, you could have
>>>>
>>>>      device=/dev/ram1 original tar1
>>>>      device=/dev/ram2 original tar2
>>>>      /dev/ram0        merged metadata for tar1 + tar2.
>>>>
>>>> which means, if you'd like to merge multiple EROFS filesystems, you
>>>> might need another step to build a merged metadata in advance in order
>>>> to merge multiple individual tarballs together, which could be built
>>>> when applying images or booting (by using a special bootloader with
>>>> such functionality.)
>>>
>>> Ahh, I misunderstood the device= option then.
>>>
>>>> EROFS doesn't support stacking multiple fses at runtime since it seems
>>>> a duplicated feature of overlayfs (you could consider using overlayfs
>>>> honestly.)
>>>
>>> I would love to use overlayfs, but there's no way to specify to the kernel that
>>> the initrd should be set up as an overlayfs of a set of ram disks. It would be
>>> interesting if I could put multiple filesystems in the initrd and the kernel
>>> would notice and automatically set up an overlayfs of them.
>>
>> I haven't used overlayfs this way, so I'm not sure either.  Yet as a wild
>> guess, you could specify a ramdisk with a customized init to stack
>> overlayfs like this in userspace?  Not sure though...
> 
> This approach generally works, but if you want to enforce all mounted
> filesystems to be dm-verity protected, you now need to add verity data
> for all these filesystems to the initramfs. That's why we're looking
> for a solution where the kernel sets up the filesystem mount of the
> ramdisk which is special cased and doesn't require verity data to be
> present. Anyway, the overlayfs issue is not something for erofs to
> solve. If at all desirable, it should probably be filesystem
> independent where the kernel just looks for multiple filesystems in
> the initrd buffer and sets up an overlay mount using each found
> filesystem.

I'm not sure whether you could still specify one root filesystem
protected by dm-verity, and have its customized init mount the
filesystems of the remaining layers and then stack them with overlayfs.

But yes, you could also enhance the kernel init code.

Thanks,
Gao Xiang

> 
> Cheers,
> 
> Daan De Meyer

