linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexander Larsson <alexl@redhat.com>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Miklos Szeredi <mszeredi@redhat.com>,
	linux-unionfs@vger.kernel.org,  linux-fsdevel@vger.kernel.org,
	stable@vger.kernel.org
Subject: Re: [PATCH v2] ovl: require xwhiteout feature flag on layer roots
Date: Mon, 22 Jan 2024 11:44:11 +0100	[thread overview]
Message-ID: <0473f5389dd1ada08c73479612a4e054c8023d94.camel@redhat.com> (raw)
In-Reply-To: <CAOQ4uxj_EWqa716+9xxu0zEd-ziEFpoGsv2OggUrb8_eGGkDDw@mail.gmail.com>

On Mon, 2024-01-22 at 10:38 +0200, Amir Goldstein wrote:
> On Fri, Jan 19, 2024 at 6:35 PM Alexander Larsson <alexl@redhat.com>
> wrote:
> > 
> > On Fri, 2024-01-19 at 13:08 +0200, Amir Goldstein wrote:
> > > On Fri, Jan 19, 2024 at 12:14 PM Miklos Szeredi
> > > <mszeredi@redhat.com>
> > > wrote:
> > > 
> > > 
> > > Do you want me to fix/test and send this to Linus?
> > > 
> > > Alex, can we add your RVB to v2?
> > 
> > I ran into an issue converting composefs to use this.
> > 
> > Suppose we have a chroot of files containing some upper dirs and we
> > want to make a composefs of this. For example, say
> > /foo/lower/dir/whiteout is a traditional whiteout.
> > 
> > Previously, what happened is that I marked the whiteout file with
> > trusted.overlay.overlay.whiteout, and the /foo/lower/dir with
> > trusted.overlay.overlay.whiteouts.
> > 
> > Them when I mounted then entire chroot with overlayfs these xattrs
> > would get unescaped and I would get a $mnt/foo/lower/dir/whiteout
> > with
> > a trusted.overlay.whiteout xattr, and a $mnt/foo/lower/dir with a
> > trusted.overlay.whiteout. When I then mounted another overlayfs
> > with a
> > lowerdir of $mnt/foo/lower it would treat the whiteout as a
> > xwhiteout.
> > 
> > However, now I need the lowerdir toplevel dir to also have a
> > trusted.overlay.whiteouts xattr. But when I'm converting the entire
> > chroot I do not know which of the directories is going to be used
> > as
> > the toplevel lower dir, so I don't know where to put this marker.
> > 
> > The only solution I see is to put it on *all* parent directories.
> > Is
> > there a better approach here?
> > 
> 
> Alex,
> 
> As you can see, I posted v3 with an alternative approach that would
> not
> require marking all possible lower layer roots.
> 
> However, I cannot help wondering if it wouldn't be better practice,
> when
> composing layers, to always be explicit, per-directory about whether
> the
> composed directory is a "base" or a "diff" layer.
> 
> Isn't this information always known at composing time?

Currently, composefs images are not layered as such. They normally only
have one or more lowerdata layers, and then the actual image as a
single lowerdir, and on top of that an optional upper if you want some
kind of writability. 

But, when composing the composefs the content of the image is opaque to
us. We're just given a directory with some files in it for the image.
It might contain some other lowerdirs, but the details are not know to
us at compose time.

That said, I can see that in some cases it may make sense to use
multiple lower dirs, for example when using composefs for multi-layer
container images. When generating such layers we typically have the
lower levels available, so we could probably extract the base/dir
status for each directory.

> In legacy overlayfs, there is an explicit mark for "this is a base
> dir" -
> namely, the opaque xattr, but there is no such explicit mark on
> directories without an entry with the same name in layers below them.
> 
> The lack of explicit mark "merge" vs. "opaque" in all directories in
> all
> the layers had led to problems in the past, for example, this is the
> reason that this fix was needed:
> 
>   b79e05aaa166 ovl: no direct iteration for dir with origin xattr
> 
> In conclusion, since composefs is the first tool, that I know of, to
> compose "non-legacy" overlayfs layers (i.e. with overlay xattrs),
> I think the correct design decision would mark every directory in
> every layer explicitly as at exactly one of "merge"/"opaque".
> 
> Note that non-dir are always marked explicitly as "metacopy",
> so there is no ambiguity with non-dirs and we also error out
> if a non-dir stack does not end with an "opaque" entry.
> 
> Additionally, when composing layers, since all the children of
> a directory should be explicitly marked as "merge" vs. "opaque"
> then the parent's "impure" (meaning contains "merge" children)
> can also be set at composing time.
> 
> Failing to set "impure" correctly when composing layers could
> result in wrong readdir d_ino results.
> 
> My proposition in v3 for an explicit mark was to
> "reinterpret the opaque xattr from boolean to enum".
> My proposal included only the states 'y' (opaque) and 'x' (contains
> xwhiteouts), but for composefs, I would extend this to also mark
> a merge dir explicitly with opaque='n' and explicitly mark all the
> directories in a "base layer" with opaque='y'.
> 
> Implementation-wise, composefs could start by marking each directory
> with either 'y'/'n' state based on the lowerstack, and if xwhiteout
> entries
> are added, 'n' state could be changed to 'x' state.

We never add xwhiteout entries to the composefs, all directories in the
base layer would be overlay=n or y, and any directory containing an
escaped xwhiteout would have an escaped opaque xattr with 'x'.

> What do you think?

I feel it may be overkill, as most composefs images would be a one-
layer thing, and adding opaque=n to every directory in the lowermost
layer would just waste space and makes little sense (being in the
lowest layer means we already know if the directory is merged or not).
However, I think it may make sense to be able to mark non-lowest-layer
directories with either n or y.

> Does it make sense from composefs POV?
> Am I correct to assume that at composing time, every directory
> state is known (base 'y' vs. diff 'n')?
> 
> Thanks,
> Amir.
> 

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
=-=-=
 Alexander Larsson                                            Red Hat,
Inc 
       alexl@redhat.com            alexander.larsson@gmail.com 
He's a globe-trotting crooked cyborg in drag. She's an elegant 
streetsmart socialite prone to fits of savage, blood-crazed rage. They 
fight crime! 


  reply	other threads:[~2024-01-22 10:44 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-19 10:14 [PATCH v2] ovl: require xwhiteout feature flag on layer roots Miklos Szeredi
2024-01-19 11:08 ` Amir Goldstein
2024-01-19 11:20   ` Miklos Szeredi
2024-01-19 12:19     ` Amir Goldstein
2024-01-19 12:29   ` Alexander Larsson
2024-01-19 14:12     ` Amir Goldstein
2024-01-19 16:35   ` Alexander Larsson
2024-01-19 19:06     ` Amir Goldstein
2024-01-19 20:29       ` Miklos Szeredi
2024-01-20 10:32         ` Amir Goldstein
2024-01-22 10:01           ` Alexander Larsson
2024-01-22  8:38     ` Amir Goldstein
2024-01-22 10:44       ` Alexander Larsson [this message]
2024-01-22 11:30         ` Amir Goldstein

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0473f5389dd1ada08c73479612a4e054c8023d94.camel@redhat.com \
    --to=alexl@redhat.com \
    --cc=amir73il@gmail.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-unionfs@vger.kernel.org \
    --cc=mszeredi@redhat.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).