All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexander Larsson <alexl@redhat.com>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
	gscrivan@redhat.com, brauner@kernel.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	david@fromorbit.com, viro@zeniv.linux.org.uk,
	Vivek Goyal <vgoyal@redhat.com>,
	Josef Bacik <josef@toxicpanda.com>,
	Gao Xiang <hsiangkao@linux.alibaba.com>,
	Jingbo Xu <jefflexu@linux.alibaba.com>
Subject: Re: [PATCH v3 0/6] Composefs: an opportunistically sharing verified image filesystem
Date: Mon, 6 Feb 2023 13:51:23 +0100	[thread overview]
Message-ID: <CAL7ro1ETbWte1dsLY0kFsKdbw5POAahx55Hsk2nNvgGXAWE-CQ@mail.gmail.com> (raw)
In-Reply-To: <CAOQ4uximQZ_DL1atbrCg0bQ8GN8JfrEartxDSP+GB_hFvYQOhg@mail.gmail.com>

On Sun, Feb 5, 2023 at 8:06 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> > >>> Apart from that, I still fail to get some thoughts (apart from
> > >>> unprivileged
> > >>> mounts) how EROFS + overlayfs combination fails on automative real
> > >>> workloads
> > >>> aside from "ls -lR" (readdir + stat).
> > >>>
> > >>> And eventually we still need overlayfs for most use cases to do
> > >>> writable
> > >>> stuffs, anyway, it needs some words to describe why such < 1s
> > >>> difference is
> > >>> very very important to the real workload as you already mentioned
> > >>> before.
> > >>>
> > >>> And with overlayfs lazy lookup, I think it can be close to ~100ms or
> > >>> better.
> > >>>
> > >>
> > >> If we had an overlay.fs-verity xattr, then I think there are no
> > >> individual features lacking for it to work for the automotive usecase
> > >> I'm working on. Nor for the OCI container usecase. However, the
> > >> possibility of doing something doesn't mean it is the better technical
> > >> solution.
> > >>
> > >> The container usecase is very important in real world Linux use today,
> > >> and as such it makes sense to have a technically excellent solution for
> > >> it, not just a workable solution. Obviously we all have different
> > >> viewpoints of what that is, but these are the reasons why I think a
> > >> composefs solution is better:
> > >>
> > >> * It is faster than all other approaches for the one thing it actually
> > >> needs to do (lookup and readdir performance). Other kinds of
> > >> performance (file i/o speed, etc) is up to the backing filesystem
> > >> anyway.
> > >>
> > >> Even if there are possible approaches to make overlayfs perform better
> > >> here (the "lazy lookup" idea) it will not reach the performance of
> > >> composefs, while further complicating the overlayfs codebase. (btw, did
> > >> someone ask Miklos what he thinks of that idea?)
> > >>
> > >
> > > Well, Miklos was CCed (now in TO:)
> > > I did ask him specifically about relaxing -ouserxarr,metacopy,redirect:
> > > https://lore.kernel.org/linux-unionfs/20230126082228.rweg75ztaexykejv@wittgenstein/T/#mc375df4c74c0d41aa1a2251c97509c6522487f96
> > > but no response on that yet.
> > >
> > > TBH, in the end, Miklos really is the one who is going to have the most
> > > weight on the outcome.
> > >
> > > If Miklos is interested in adding this functionality to overlayfs, you are going
> > > to have a VERY hard sell, trying to merge composefs as an independent
> > > expert filesystem. The community simply does not approve of this sort of
> > > fragmentation unless there is a very good reason to do that.
> > >
> > >> For the automotive usecase we have strict cold-boot time requirements
> > >> that make cold-cache performance very important to us. Of course, there
> > >> is no simple time requirements for the specific case of listing files
> > >> in an image, but any improvement in cold-cache performance for both the
> > >> ostree rootfs and the containers started during boot will be worth its
> > >> weight in gold trying to reach these hard KPIs.
> > >>
> > >> * It uses less memory, as we don't need the extra inodes that comes
> > >> with the overlayfs mount. (See profiling data in giuseppes mail[1]).
> > >
> > > Understood, but we will need profiling data with the optimized ovl
> > > (or with the single blob hack) to compare the relevant alternatives.
> >
> > My little request again, could you help benchmark on your real workload
> > rather than "ls -lR" stuff?  If your hard KPI is really what as you
> > said, why not just benchmark the real workload now and write a detailed
> > analysis to everyone to explain it's a _must_ that we should upstream
> > a new stacked fs for this?
> >
>
> I agree that benchmarking the actual KPI (boot time) will have
> a much stronger impact and help to build a much stronger case
> for composefs if you can prove that the boot time difference really matters.

I will not be able to produce any full comparisons of a car booting
with this. First of all its customer internal data, and secondly its
not something that is currently at a stage that is finished enough to
do such a benchmark. For this discussion, consider it more a weak
example of why cold-cache performance is important in many cases.

> In order to test boot time on fair grounds, I prepared for you a POC
> branch with overlayfs lazy lookup:
> https://github.com/amir73il/linux/commits/ovl-lazy-lowerdata

Cool. I'll play around with this. Does this need to be an opt-in
option in the final version? It feels like this could be useful to
improve performance in general for overlayfs, for example when
metacopy is used in container layers.

> It is very lightly tested, but should be sufficient for the benchmark.
> Note that:
> 1. You need to opt-in with redirect_dir=lazyfollow,metacopy=on
> 2. The lazyfollow POC only works with read-only overlay that
>     has two lower dirs (1 metadata layer and one data blobs layer)
> 3. The data layer must be a local blockdev fs (i.e. not a network fs)
> 4. Only absolute path redirects are lazy (e.g. "/objects/cc/3da...")
>
> These limitations could be easily lifted with a bit more work.
> If any of those limitations stand in your way for running the benchmark
> let me know and I'll see what I can do.
>
> If there is any issue with the POC branch, please let me know.
>
> Thanks,
> Amir.
>


-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Alexander Larsson                                Red Hat, Inc
       alexl@redhat.com         alexander.larsson@gmail.com


  parent reply	other threads:[~2023-02-06 12:52 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-20 15:23 [PATCH v3 0/6] Composefs: an opportunistically sharing verified image filesystem Alexander Larsson
2023-01-20 15:23 ` [PATCH v3 1/6] fsverity: Export fsverity_get_digest Alexander Larsson
2023-01-20 15:23 ` [PATCH v3 2/6] composefs: Add on-disk layout header Alexander Larsson
2023-01-20 15:23 ` [PATCH v3 3/6] composefs: Add descriptor parsing code Alexander Larsson
2023-01-20 15:23 ` [PATCH v3 4/6] composefs: Add filesystem implementation Alexander Larsson
2023-01-20 15:23 ` [PATCH v3 5/6] composefs: Add documentation Alexander Larsson
2023-01-21  2:19   ` Bagas Sanjaya
2023-01-20 15:23 ` [PATCH v3 6/6] composefs: Add kconfig and build support Alexander Larsson
2023-01-20 19:44 ` [PATCH v3 0/6] Composefs: an opportunistically sharing verified image filesystem Amir Goldstein
2023-01-20 22:18   ` Giuseppe Scrivano
2023-01-21  3:08     ` Gao Xiang
2023-01-21 16:19       ` Giuseppe Scrivano
2023-01-21 17:15         ` Gao Xiang
2023-01-21 22:34           ` Giuseppe Scrivano
2023-01-22  0:39             ` Gao Xiang
2023-01-22  9:01               ` Giuseppe Scrivano
2023-01-22  9:32                 ` Giuseppe Scrivano
2023-01-24  0:08                   ` Gao Xiang
2023-01-21 10:57     ` Amir Goldstein
2023-01-21 15:01       ` Giuseppe Scrivano
2023-01-21 15:54         ` Amir Goldstein
2023-01-21 16:26           ` Gao Xiang
2023-01-23 17:56   ` Alexander Larsson
2023-01-23 23:59     ` Gao Xiang
2023-01-24  3:24     ` Amir Goldstein
2023-01-24 13:10       ` Alexander Larsson
2023-01-24 14:40         ` Gao Xiang
2023-01-24 19:06         ` Amir Goldstein
2023-01-25  4:18           ` Dave Chinner
2023-01-25  8:32             ` Amir Goldstein
2023-01-25 10:08               ` Alexander Larsson
2023-01-25 10:43                 ` Amir Goldstein
2023-01-25 10:39               ` Giuseppe Scrivano
2023-01-25 11:17                 ` Amir Goldstein
2023-01-25 12:30                   ` Giuseppe Scrivano
2023-01-25 12:46                     ` Amir Goldstein
2023-01-25 13:10                       ` Giuseppe Scrivano
2023-01-25 18:07                         ` Amir Goldstein
2023-01-25 19:45                           ` Giuseppe Scrivano
2023-01-25 20:23                             ` Amir Goldstein
2023-01-25 20:29                               ` Amir Goldstein
2023-01-26  5:26                                 ` userns mount and metacopy redirects (Was: Re: [PATCH v3 0/6] Composefs: an opportunistically sharing verified image filesystem) Amir Goldstein
2023-01-26  8:22                                   ` Christian Brauner
2023-01-27 15:57                               ` [PATCH v3 0/6] Composefs: an opportunistically sharing verified image filesystem Vivek Goyal
2023-01-25 15:24                       ` Christian Brauner
2023-01-25 16:05                         ` Giuseppe Scrivano
2023-01-25  9:37           ` Alexander Larsson
2023-01-25 10:05             ` Gao Xiang
2023-01-25 10:15               ` Alexander Larsson
2023-01-27 10:24                 ` Gao Xiang
2023-02-01  4:28                   ` Jingbo Xu
2023-02-01  7:44                     ` Amir Goldstein
2023-02-01  8:59                       ` Jingbo Xu
2023-02-01  9:52                         ` Alexander Larsson
2023-02-01 12:39                           ` Jingbo Xu
2023-02-01  9:46                     ` Alexander Larsson
2023-02-01 10:01                       ` Gao Xiang
2023-02-01 11:22                         ` Gao Xiang
2023-02-02  6:37                           ` Amir Goldstein
2023-02-02  7:17                             ` Gao Xiang
2023-02-02  7:37                               ` Gao Xiang
2023-02-03 11:32                                 ` Alexander Larsson
2023-02-03 12:46                                   ` Amir Goldstein
2023-02-03 15:09                                     ` Gao Xiang
2023-02-05 19:06                                       ` Amir Goldstein
2023-02-06  7:59                                         ` Amir Goldstein
2023-02-06 10:35                                         ` Miklos Szeredi
2023-02-06 13:30                                           ` Amir Goldstein
2023-02-06 16:34                                             ` Miklos Szeredi
2023-02-06 17:16                                               ` Amir Goldstein
2023-02-06 18:17                                                 ` Amir Goldstein
2023-02-06 19:32                                                 ` Miklos Szeredi
2023-02-06 20:06                                                   ` Amir Goldstein
2023-02-07  8:12                                                     ` Alexander Larsson
2023-04-03 19:00                                                 ` Lazy lowerdata lookup and data-only layers (Was: Re: Composefs:) Amir Goldstein
2023-04-11 15:50                                                   ` Miklos Szeredi
2023-04-12 14:06                                                     ` Amir Goldstein
2023-04-12 14:20                                                       ` Miklos Szeredi
2023-04-12 15:41                                                         ` Amir Goldstein
2023-02-06 12:51                                         ` Alexander Larsson [this message]
2023-02-07  8:12                                         ` [PATCH v3 0/6] Composefs: an opportunistically sharing verified image filesystem Jingbo Xu
2023-02-06 12:43                                     ` Alexander Larsson
2023-02-06 13:27                                       ` Gao Xiang
2023-02-06 15:31                                         ` Alexander Larsson
2023-02-01 12:06                       ` Jingbo Xu
2023-02-02  4:57                       ` Jingbo Xu
2023-02-02  4:59                         ` Jingbo Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAL7ro1ETbWte1dsLY0kFsKdbw5POAahx55Hsk2nNvgGXAWE-CQ@mail.gmail.com \
    --to=alexl@redhat.com \
    --cc=amir73il@gmail.com \
    --cc=brauner@kernel.org \
    --cc=david@fromorbit.com \
    --cc=gscrivan@redhat.com \
    --cc=hsiangkao@linux.alibaba.com \
    --cc=jefflexu@linux.alibaba.com \
    --cc=josef@toxicpanda.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=vgoyal@redhat.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.