linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Alexander Larsson <alexl@redhat.com>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	gscrivan@redhat.com, brauner@kernel.org, viro@zeniv.linux.org.uk,
	Vivek Goyal <vgoyal@redhat.com>,
	Miklos Szeredi <miklos@szeredi.hu>
Subject: Re: [PATCH v3 0/6] Composefs: an opportunistically sharing verified image filesystem
Date: Wed, 25 Jan 2023 15:18:35 +1100	[thread overview]
Message-ID: <20230125041835.GD937597@dread.disaster.area> (raw)
In-Reply-To: <CAOQ4uxhGX9NVxwsiBMP0q21ZRot6-UA0nGPp1wGNjgmKBjjBBA@mail.gmail.com>

On Tue, Jan 24, 2023 at 09:06:13PM +0200, Amir Goldstein wrote:
> On Tue, Jan 24, 2023 at 3:13 PM Alexander Larsson <alexl@redhat.com> wrote:
> > On Tue, 2023-01-24 at 05:24 +0200, Amir Goldstein wrote:
> > > On Mon, Jan 23, 2023 at 7:56 PM Alexander Larsson <alexl@redhat.com>
> > > wrote:
> > > > On Fri, 2023-01-20 at 21:44 +0200, Amir Goldstein wrote:
> > > > > On Fri, Jan 20, 2023 at 5:30 PM Alexander Larsson
> > > > > <alexl@redhat.com>
> > > > > wrote:
> > I'm not sure why the dentry cache case would be more important?
> > Starting a new container will very often not have cached the image.
> >
> > To me the interesting case is for a new image, but with some existing
> > page cache for the backing files directory. That seems to model staring
> > a new image in an active container host, but its somewhat hard to test
> > that case.
> >
> 
> ok, you can argue that faster cold cache ls -lR is important
> for starting new images.
> I think you will be asked to show a real life container use case where
> that benchmark really matters.

I've already described the real world production system bottlenecks
that composefs is designed to overcome in a previous thread.

Please go back an read this:

https://lore.kernel.org/linux-fsdevel/20230118002242.GB937597@dread.disaster.area/

Cold cache performance dominates the runtime of short lived
containers as well as high density container hosts being run to
their container level memory limits. `ls -lR` is just a
microbenchmark that demonstrates how much better composefs cold
cache behaviour is than the alternatives being proposed....

This might also help explain why my initial review comments focussed
on getting rid of optional format features, straight lining the
processing, changing the format or search algorithms so more
sequential cacheline accesses occurred resulting in less memory
stalls, etc. i.e. reductions in cold cache lookup overhead will
directly translate into faster container workload spin up.

> > > > This isn't all that strange, as overlayfs does a lot more work for
> > > > each lookup, including multiple name lookups as well as several
> > > > xattr
> > > > lookups, whereas composefs just does a single lookup in a pre-
> > > > computed
> > >
> > > Seriously, "multiple name lookups"?
> > > Overlayfs does exactly one lookup for anything but first level
> > > subdirs
> > > and for sparse files it does the exact same lookup in /objects as
> > > composefs.
> > > Enough with the hand waving please. Stick to hard facts.
> >
> > With the discussed layout, in a stat() call on a regular file,
> > ovl_lookup() will do lookups on both the sparse file and the backing
> > file, whereas cfs_dir_lookup() will just map some page cache pages and
> > do a binary search.
> >
> > Of course if you actually open the file, then cfs_open_file() would do
> > the equivalent lookups in /objects. But that is often not what happens,
> > for example in "ls -l".
> >
> > Additionally, these extra lookups will cause extra memory use, as you
> > need dentries and inodes for the erofs/squashfs inodes in addition to
> > the overlay inodes.
> 
> I see. composefs is really very optimized for ls -lR.

No, composefs is optimised for minimal namespace and inode
resolution overhead. 'ls -lR' does a lot of these operations, and
therefore you see the efficiency of the design being directly
exposed....

> Now only need to figure out if real users start a container and do ls -lR
> without reading many files is a real life use case.

I've been using 'ls -lR' and 'find . -ctime 1' to benchmark cold
cache directory iteration and inode lookup performance for roughly
20 years. The benchmarks I run *never* read file data, nor is that
desired - they are pure directory and inode lookup micro-benchmarks
used to analyse VFS and filesystem directory and inode lookup
performance.

I have been presenting such measurements and patches improving
performance of these microbnechmarks to the XFS and fsdevel lists
over 15 years and I have *never* had to justify that what I'm
measuring is a "real world workload" to anyone. Ever.

Complaining about real world relevancy of the presented benchmark
might be considered applying a double standard, wouldn't you agree?

-Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2023-01-25  4:18 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-20 15:23 [PATCH v3 0/6] Composefs: an opportunistically sharing verified image filesystem Alexander Larsson
2023-01-20 15:23 ` [PATCH v3 1/6] fsverity: Export fsverity_get_digest Alexander Larsson
2023-01-20 15:23 ` [PATCH v3 2/6] composefs: Add on-disk layout header Alexander Larsson
2023-01-20 15:23 ` [PATCH v3 3/6] composefs: Add descriptor parsing code Alexander Larsson
2023-01-20 15:23 ` [PATCH v3 4/6] composefs: Add filesystem implementation Alexander Larsson
2023-01-20 15:23 ` [PATCH v3 5/6] composefs: Add documentation Alexander Larsson
2023-01-21  2:19   ` Bagas Sanjaya
2023-01-20 15:23 ` [PATCH v3 6/6] composefs: Add kconfig and build support Alexander Larsson
2023-01-20 19:44 ` [PATCH v3 0/6] Composefs: an opportunistically sharing verified image filesystem Amir Goldstein
2023-01-20 22:18   ` Giuseppe Scrivano
2023-01-21  3:08     ` Gao Xiang
2023-01-21 16:19       ` Giuseppe Scrivano
2023-01-21 17:15         ` Gao Xiang
2023-01-21 22:34           ` Giuseppe Scrivano
2023-01-22  0:39             ` Gao Xiang
2023-01-22  9:01               ` Giuseppe Scrivano
2023-01-22  9:32                 ` Giuseppe Scrivano
2023-01-24  0:08                   ` Gao Xiang
2023-01-21 10:57     ` Amir Goldstein
2023-01-21 15:01       ` Giuseppe Scrivano
2023-01-21 15:54         ` Amir Goldstein
2023-01-21 16:26           ` Gao Xiang
2023-01-23 17:56   ` Alexander Larsson
2023-01-23 23:59     ` Gao Xiang
2023-01-24  3:24     ` Amir Goldstein
2023-01-24 13:10       ` Alexander Larsson
2023-01-24 14:40         ` Gao Xiang
2023-01-24 19:06         ` Amir Goldstein
2023-01-25  4:18           ` Dave Chinner [this message]
2023-01-25  8:32             ` Amir Goldstein
2023-01-25 10:08               ` Alexander Larsson
2023-01-25 10:43                 ` Amir Goldstein
2023-01-25 10:39               ` Giuseppe Scrivano
2023-01-25 11:17                 ` Amir Goldstein
2023-01-25 12:30                   ` Giuseppe Scrivano
2023-01-25 12:46                     ` Amir Goldstein
2023-01-25 13:10                       ` Giuseppe Scrivano
2023-01-25 18:07                         ` Amir Goldstein
2023-01-25 19:45                           ` Giuseppe Scrivano
2023-01-25 20:23                             ` Amir Goldstein
2023-01-25 20:29                               ` Amir Goldstein
2023-01-27 15:57                               ` Vivek Goyal
2023-01-25 15:24                       ` Christian Brauner
2023-01-25 16:05                         ` Giuseppe Scrivano
2023-01-25  9:37           ` Alexander Larsson
2023-01-25 10:05             ` Gao Xiang
2023-01-25 10:15               ` Alexander Larsson
2023-01-27 10:24                 ` Gao Xiang
2023-02-01  4:28                   ` Jingbo Xu
2023-02-01  7:44                     ` Amir Goldstein
2023-02-01  8:59                       ` Jingbo Xu
2023-02-01  9:52                         ` Alexander Larsson
2023-02-01 12:39                           ` Jingbo Xu
2023-02-01  9:46                     ` Alexander Larsson
2023-02-01 10:01                       ` Gao Xiang
2023-02-01 11:22                         ` Gao Xiang
2023-02-02  6:37                           ` Amir Goldstein
2023-02-02  7:17                             ` Gao Xiang
2023-02-02  7:37                               ` Gao Xiang
2023-02-03 11:32                                 ` Alexander Larsson
2023-02-03 12:46                                   ` Amir Goldstein
2023-02-03 15:09                                     ` Gao Xiang
2023-02-05 19:06                                       ` Amir Goldstein
2023-02-06  7:59                                         ` Amir Goldstein
2023-02-06 10:35                                         ` Miklos Szeredi
2023-02-06 13:30                                           ` Amir Goldstein
2023-02-06 16:34                                             ` Miklos Szeredi
2023-02-06 17:16                                               ` Amir Goldstein
2023-02-06 18:17                                                 ` Amir Goldstein
2023-02-06 19:32                                                 ` Miklos Szeredi
2023-02-06 20:06                                                   ` Amir Goldstein
2023-02-07  8:12                                                     ` Alexander Larsson
2023-02-06 12:51                                         ` Alexander Larsson
2023-02-07  8:12                                         ` Jingbo Xu
2023-02-06 12:43                                     ` Alexander Larsson
2023-02-06 13:27                                       ` Gao Xiang
2023-02-06 15:31                                         ` Alexander Larsson
2023-02-01 12:06                       ` Jingbo Xu
2023-02-02  4:57                       ` Jingbo Xu
2023-02-02  4:59                         ` Jingbo Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230125041835.GD937597@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=alexl@redhat.com \
    --cc=amir73il@gmail.com \
    --cc=brauner@kernel.org \
    --cc=gscrivan@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=vgoyal@redhat.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).