linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Amir Goldstein <amir73il@gmail.com>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: Alexander Larsson <alexl@redhat.com>,
	gscrivan@redhat.com, brauner@kernel.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	david@fromorbit.com, viro@zeniv.linux.org.uk,
	Vivek Goyal <vgoyal@redhat.com>,
	Josef Bacik <josef@toxicpanda.com>,
	Gao Xiang <hsiangkao@linux.alibaba.com>,
	Jingbo Xu <jefflexu@linux.alibaba.com>
Subject: Re: [PATCH v3 0/6] Composefs: an opportunistically sharing verified image filesystem
Date: Mon, 6 Feb 2023 15:30:58 +0200	[thread overview]
Message-ID: <CAOQ4uxiW0=DJpRAu90pJic0qu=pS6f2Eo7v-Uw3pmd0zsvFuuw@mail.gmail.com> (raw)
In-Reply-To: <CAJfpegtRacAoWdhVxCE8gpLVmQege4yz8u11mvXCs2weBBQ4jg@mail.gmail.com>

> > > My little request again, could you help benchmark on your real workload
> > > rather than "ls -lR" stuff?  If your hard KPI is really what as you
> > > said, why not just benchmark the real workload now and write a detailed
> > > analysis to everyone to explain it's a _must_ that we should upstream
> > > a new stacked fs for this?
> > >
> >
> > I agree that benchmarking the actual KPI (boot time) will have
> > a much stronger impact and help to build a much stronger case
> > for composefs if you can prove that the boot time difference really matters.
> >
> > In order to test boot time on fair grounds, I prepared for you a POC
> > branch with overlayfs lazy lookup:
> > https://github.com/amir73il/linux/commits/ovl-lazy-lowerdata
>
> Sorry about being late to the party...
>
> Can you give a little detail about what exactly this does?
>

Consider a container image distribution system, with base images
and derived images and instruction on how to compose these images
using overlayfs or other methods.

Consider a derived image L3 that depends on images L2, L1.

With the composefs methodology, the image distribution server splits
each image is split into metadata only (metacopy) images M3, M2, M1
and their underlying data images containing content addressable blobs
D3, D2, D1.

The image distribution server goes on to merge the metadata layers
on the server, so U3 = M3 + M2 + M1.

In order to start image L3, the container client will unpack the data layers
D3, D2, D1 to local fs normally, but the server merged U3 metadata image
will be distributed as a read-only fsverity signed image that can be mounted
by mount -t composefs U3.img (much like mount -t erofs -o loop U3.img).

The composefs image format contains "redirect" instruction to the data blob
path and an fsverity signature that can be used to verify the redirected data
content.

When composefs authors proposed to merge composefs, Gao and me
pointed out that the same functionality can be achieved with minimal changes
using erofs+overlayfs.

Composefs authors have presented ls -lR time and memory usage benchmarks
that demonstrate how composefs performs better that erofs+overlayfs in
this workload and explained that the lookup of the data blobs is what takes
the extra time and memory in the erofs+overlayfs ls -lR test.

The lazyfollow POC optimizes-out the lowerdata lookup for the ls -lR
benchmark, so that composefs could be compared to erofs+overlayfs.

To answer Alexander's question:

> Cool. I'll play around with this. Does this need to be an opt-in
> option in the final version? It feels like this could be useful to
> improve performance in general for overlayfs, for example when
> metacopy is used in container layers.

I think lazyfollow could be enabled by default after we hashed out
all the bugs and corner cases and most importantly remove the
POC limitation of lower-only overlay.

The feedback that composefs authors are asking from you
is whether you will agree to consider adding the "lazyfollow
lower data" optimization and "fsverity signature for metacopy"
feature to overlayfs?

If you do agree, the I think they should invest their resources
in making those improvements to overlayfs and perhaps
other improvements to erofs, rather than proposing a new
specialized filesystem.

Thanks,
Amir.

  reply	other threads:[~2023-02-06 13:31 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-20 15:23 [PATCH v3 0/6] Composefs: an opportunistically sharing verified image filesystem Alexander Larsson
2023-01-20 15:23 ` [PATCH v3 1/6] fsverity: Export fsverity_get_digest Alexander Larsson
2023-01-20 15:23 ` [PATCH v3 2/6] composefs: Add on-disk layout header Alexander Larsson
2023-01-20 15:23 ` [PATCH v3 3/6] composefs: Add descriptor parsing code Alexander Larsson
2023-01-20 15:23 ` [PATCH v3 4/6] composefs: Add filesystem implementation Alexander Larsson
2023-01-20 15:23 ` [PATCH v3 5/6] composefs: Add documentation Alexander Larsson
2023-01-21  2:19   ` Bagas Sanjaya
2023-01-20 15:23 ` [PATCH v3 6/6] composefs: Add kconfig and build support Alexander Larsson
2023-01-20 19:44 ` [PATCH v3 0/6] Composefs: an opportunistically sharing verified image filesystem Amir Goldstein
2023-01-20 22:18   ` Giuseppe Scrivano
2023-01-21  3:08     ` Gao Xiang
2023-01-21 16:19       ` Giuseppe Scrivano
2023-01-21 17:15         ` Gao Xiang
2023-01-21 22:34           ` Giuseppe Scrivano
2023-01-22  0:39             ` Gao Xiang
2023-01-22  9:01               ` Giuseppe Scrivano
2023-01-22  9:32                 ` Giuseppe Scrivano
2023-01-24  0:08                   ` Gao Xiang
2023-01-21 10:57     ` Amir Goldstein
2023-01-21 15:01       ` Giuseppe Scrivano
2023-01-21 15:54         ` Amir Goldstein
2023-01-21 16:26           ` Gao Xiang
2023-01-23 17:56   ` Alexander Larsson
2023-01-23 23:59     ` Gao Xiang
2023-01-24  3:24     ` Amir Goldstein
2023-01-24 13:10       ` Alexander Larsson
2023-01-24 14:40         ` Gao Xiang
2023-01-24 19:06         ` Amir Goldstein
2023-01-25  4:18           ` Dave Chinner
2023-01-25  8:32             ` Amir Goldstein
2023-01-25 10:08               ` Alexander Larsson
2023-01-25 10:43                 ` Amir Goldstein
2023-01-25 10:39               ` Giuseppe Scrivano
2023-01-25 11:17                 ` Amir Goldstein
2023-01-25 12:30                   ` Giuseppe Scrivano
2023-01-25 12:46                     ` Amir Goldstein
2023-01-25 13:10                       ` Giuseppe Scrivano
2023-01-25 18:07                         ` Amir Goldstein
2023-01-25 19:45                           ` Giuseppe Scrivano
2023-01-25 20:23                             ` Amir Goldstein
2023-01-25 20:29                               ` Amir Goldstein
2023-01-27 15:57                               ` Vivek Goyal
2023-01-25 15:24                       ` Christian Brauner
2023-01-25 16:05                         ` Giuseppe Scrivano
2023-01-25  9:37           ` Alexander Larsson
2023-01-25 10:05             ` Gao Xiang
2023-01-25 10:15               ` Alexander Larsson
2023-01-27 10:24                 ` Gao Xiang
2023-02-01  4:28                   ` Jingbo Xu
2023-02-01  7:44                     ` Amir Goldstein
2023-02-01  8:59                       ` Jingbo Xu
2023-02-01  9:52                         ` Alexander Larsson
2023-02-01 12:39                           ` Jingbo Xu
2023-02-01  9:46                     ` Alexander Larsson
2023-02-01 10:01                       ` Gao Xiang
2023-02-01 11:22                         ` Gao Xiang
2023-02-02  6:37                           ` Amir Goldstein
2023-02-02  7:17                             ` Gao Xiang
2023-02-02  7:37                               ` Gao Xiang
2023-02-03 11:32                                 ` Alexander Larsson
2023-02-03 12:46                                   ` Amir Goldstein
2023-02-03 15:09                                     ` Gao Xiang
2023-02-05 19:06                                       ` Amir Goldstein
2023-02-06  7:59                                         ` Amir Goldstein
2023-02-06 10:35                                         ` Miklos Szeredi
2023-02-06 13:30                                           ` Amir Goldstein [this message]
2023-02-06 16:34                                             ` Miklos Szeredi
2023-02-06 17:16                                               ` Amir Goldstein
2023-02-06 18:17                                                 ` Amir Goldstein
2023-02-06 19:32                                                 ` Miklos Szeredi
2023-02-06 20:06                                                   ` Amir Goldstein
2023-02-07  8:12                                                     ` Alexander Larsson
2023-02-06 12:51                                         ` Alexander Larsson
2023-02-07  8:12                                         ` Jingbo Xu
2023-02-06 12:43                                     ` Alexander Larsson
2023-02-06 13:27                                       ` Gao Xiang
2023-02-06 15:31                                         ` Alexander Larsson
2023-02-01 12:06                       ` Jingbo Xu
2023-02-02  4:57                       ` Jingbo Xu
2023-02-02  4:59                         ` Jingbo Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOQ4uxiW0=DJpRAu90pJic0qu=pS6f2Eo7v-Uw3pmd0zsvFuuw@mail.gmail.com' \
    --to=amir73il@gmail.com \
    --cc=alexl@redhat.com \
    --cc=brauner@kernel.org \
    --cc=david@fromorbit.com \
    --cc=gscrivan@redhat.com \
    --cc=hsiangkao@linux.alibaba.com \
    --cc=jefflexu@linux.alibaba.com \
    --cc=josef@toxicpanda.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=vgoyal@redhat.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).