linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Amir Goldstein <amir73il@gmail.com>
To: Hao Luo <haoluo@google.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	overlayfs <linux-unionfs@vger.kernel.org>,
	Christian Brauner <brauner@kernel.org>,
	Greg KH <gregkh@linuxfoundation.org>
Subject: Re: Overlayfs with writable lower layer
Date: Thu, 15 Sep 2022 13:54:34 +0300	[thread overview]
Message-ID: <CAOQ4uxi3muNRfKtvt6U8gQRiVCfGghKdxsGhLjbhL7GPrrs+ZA@mail.gmail.com> (raw)
In-Reply-To: <CA+khW7gS+=D6F3x9k+=8juknzooxjZyqwAMDrEY0NrR2kYAjMQ@mail.gmail.com>

On Wed, Sep 14, 2022 at 10:33 PM Hao Luo <haoluo@google.com> wrote:
>
> On Wed, Sep 14, 2022 at 12:23 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Wed, Sep 14, 2022 at 9:00 PM Hao Luo <haoluo@google.com> wrote:
> > >
> > > On Tue, Sep 13, 2022 at 8:46 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > > >
> > > > On Tue, Sep 13, 2022 at 11:33 PM Hao Luo <haoluo@google.com> wrote:
> > > > >
> > > > > On Tue, Sep 13, 2022 at 11:54 AM Amir Goldstein <amir73il@gmail.com> wrote:
> [...]
> > > > There are probably some other limitations at the moment
> > > > related to pseudo filesystems that prevent them from being
> > > > used as upper and/or lower fs in overlayfs.
> > > >
> > > > We will need to check what those limitations are and whether
> > > > those limitations could be lifted for your specific use case.
> > > >
> > >
> > > How can we approach this? Maybe I can send my patch that adds tmp dir,
> > > tmp files and xattr, attr to upstream as RFC, so you can take a look?
> > >
> >
> > I don't think I need your fs to test.
> > The only thing special in this setup as far as I can tell is the dynamic
> > cgroupfs (or cgroup2?) lower dirs.
> >
> > IIUC, everything worked for you except for oddities related to
> > lower directories not appearing and not disappearing from the union.
> > Is that correct? is that the only thing that you need a fix for?
> >
>
> Yes, that's correct.
>
> > > > > Further, directory B could disappear from lower. When that happens, I
> > > > > think there are two possible behaviors:
> > > > >  - make 'file' disappear from union as well;
> > > > >  - make 'file' and its directory accessible as well.
> > > > >
> > > > > In behavior 1, it will look like
> > > > > $ tree union
> > > > > .
> > > > > └── A
> > > > >     └── lower1
> > > > >
> > > > > In behavior 2, it will look like
> > > > > $ tree union
> > > > > .
> > > > > └── A
> > > > >     ├── B
> > > > >     │   └── file
> > > > >     └── lower1
> > > > >
> > > > > IMHO, behavior 1 works better in my use case. But if the FS experts
> > > > > think behavior 2 makes more sense, I can work around.
> > > > >
> > > >
> > > > Something that I always wanted to try is to get rid of the duplicated
> > > > upper fs hierarchy.
> > > >
> > > > It's a bit complicated to explain the details, but if your use case
> > > > does not involve any directory renames(?), then the upper path
> > > > for the merge directories can be index based and not hierarchical.
> > > >
> > >
> > > Yeah, I don't expect directory renaming. But I can't say if there is
> > > anyone trying to do that by accident, or by bad intention.
> > >
> >
> > Your fs will return an error for rename if you did not implement it.
> >
> > Anyway, if you can accept behavior 2, it is much more simple.
> > This other idea is very vague and not simple, so better not risk it.
> >
> > If you confirm that you only need to get uptodate view of
> > lower dirs in union, then I will look for the patches that I have
> > and see if they can help you.
> >
>
> Yes, I acknowledge that behavior 2 works for me.

OK. I took a closer look and there are some challenges.
Nothing that cannot be fixed if you are willing to do the work.
I will try to explain the challenges and possible solutions.

Current overlayfs code assumes in many places that the
lower fs is not being changed at all while overlayfs is mounted.
As overlayfs.rst says:

"Changes to the underlying filesystems while part of a mounted overlay
filesystem are not allowed.  If the underlying filesystem is changed,
the behavior of the overlay is undefined, though it will not result in
a crash or deadlock."

One of the most visible impacts of changes to lower later
is that the merge dir cache is not invalidated, which is the
immediate reason that you are seeing the ghost lower dir A/B
in the union even if you did not create file A/B/file.

You can check if this hack fixes your first order problem:

diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 78f62cc1797b..4eb6fcf341de 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -326,7 +326,7 @@ static void ovl_dir_reset(struct file *file)
        struct dentry *dentry = file->f_path.dentry;
        bool is_real;

-       if (cache && ovl_dentry_version_get(dentry) != cache->version) {
+       if (cache /*&& ovl_dentry_version_get(dentry) != cache->version*/) {
                ovl_cache_put(od, dentry);
                od->cache = NULL;
                od->cursor = NULL;
---

If it does, it may be acceptable to add that behavior as a mount option.

But it doesn't end here, there is also lookup cache and possibly other
issues as well related to merge dirs with ghosted lower.

If you did create file A/B/file then trying to list A/B after A/B
has gone from lower fs,  may depend on the lower fs behavior.
Some of the issues are not related to overlayfs but to cgroupfs.

For "standard" Linux fs, if you keep an open fd to a directory,
that directory can be removed and then if you try to readdir from
the open fd, or use the fd in one of the XXXat() syscalls,
you will get ENOENT, because of the IS_DEADDIR(dir) checks
in the vfs.

Do you get this behavior with an open fd on a cgroupfs dir
that has disappeared? Please check.

I think that ovl_iterate() can be made more tolerant to
ENOENT when iterating a merge dir with ghosted lower dir.
If you run into this error when trying to list A/B, find out
the place in the code that returns the error and I'll see
if that error may be relaxed.

The patches that I have are doing something different.
The idea is that overlayfs can watch for lower fs changes using
fsnotify() callbacks and do "something" proactive when they happen.

My Overlayfs watch [1] patches do "something" else - they
record the changes to lower fs when they happen, but they
demonstrate the basic concept of watching changes in lower fs.

[1] https://github.com/amir73il/overlayfs/wiki/Overlayfs-watch

The "something" that overlayfs could do when a lower dir
is removed is to invalidate the caches of the union dir and
everything under it.

There is one other small problem with this method w.r.t
lower cgroupfs - cgroupfs does not call any fsnotify callbacks when
directories disappear...

cgroupfs is an instance of kernfs.
kenfs is calling the fsnotify_modify() hook when kernel changes
the content of a file:

d911d9874801 kernfs: make kernfs_notify() trigger inotify events too

but it does not call fsnotify_rmdir/mkdir/delete/create() like other pseudo
fs do (debugfs, configfs, tracefs, ...) when directories appear/disappear -
at least I don't think that it does.

Please run inotifywatch on cgroupfs and find out for yourself.

Hope that some of the info here can help you move forward.
Most of it you can probably ignore.

Thanks,
Amir.

  reply	other threads:[~2022-09-15 10:54 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-25 15:30 [PATCH RESEND V12 0/8] fuse: Add support for passthrough read/write Alessio Balsini
2021-01-25 15:30 ` [PATCH RESEND V12 1/8] fs: Generic function to convert iocb to rw flags Alessio Balsini
2021-01-25 16:46   ` Alessio Balsini
2021-03-24  7:43     ` Rokudo Yan
2021-03-24 14:02       ` Alessio Balsini
2021-01-25 15:30 ` [PATCH RESEND V12 2/8] fuse: 32-bit user space ioctl compat for fuse device Alessio Balsini
     [not found]   ` <CAMAHBGzkfEd9-1u0iKXp65ReJQgUi_=4sMpmfkwEOaMp6Ux7pg@mail.gmail.com>
2021-01-27 13:40     ` Alessio Balsini
     [not found]       ` <CAMAHBGwpKW+30kNQ_Apt8A-FTmr94hBOzkT21cjEHHW+t7yUMQ@mail.gmail.com>
2021-01-28 14:15         ` Alessio Balsini
2021-02-05  9:54           ` Peng Tao
2021-03-16 18:57           ` Arnd Bergmann
2021-02-17 10:21   ` Miklos Szeredi
2021-03-01 12:26     ` Alessio Balsini
2021-03-16 18:53   ` Arnd Bergmann
2021-03-18 16:13     ` Alessio Balsini
2021-03-18 21:15       ` Arnd Bergmann
2021-03-19 15:21         ` Alessio Balsini
2021-01-25 15:30 ` [PATCH RESEND V12 3/8] fuse: Definitions and ioctl for passthrough Alessio Balsini
2021-02-17 13:41   ` Miklos Szeredi
2021-02-19  7:05     ` Peng Tao
2021-02-19  8:40       ` Miklos Szeredi
2021-03-01 17:05         ` Alessio Balsini
2022-09-08 15:36           ` Amir Goldstein
2022-09-09 19:07             ` Miklos Szeredi
2022-09-10  8:52               ` Amir Goldstein
2022-09-10 13:03                 ` Bernd Schubert
2022-09-12  9:29                 ` Miklos Szeredi
2022-09-12 12:29                   ` Amir Goldstein
2022-09-12 13:03                     ` Miklos Szeredi
2022-09-12 13:05                       ` Miklos Szeredi
2022-09-12 13:26                       ` Amir Goldstein
2022-09-12 14:22                         ` Miklos Szeredi
2022-09-12 15:39                           ` Amir Goldstein
2022-09-12 17:43                             ` Hao Luo
2022-09-12 18:28                               ` Overlayfs with writable lower layer Amir Goldstein
2022-09-13 18:26                                 ` Hao Luo
2022-09-13 18:54                                   ` Amir Goldstein
2022-09-13 20:33                                     ` Hao Luo
2022-09-14  3:46                                       ` Amir Goldstein
2022-09-14 18:00                                         ` Hao Luo
2022-09-14 19:23                                           ` Amir Goldstein
2022-09-14 19:33                                             ` Hao Luo
2022-09-15 10:54                                               ` Amir Goldstein [this message]
2023-05-12 19:37                     ` [PATCH RESEND V12 3/8] fuse: Definitions and ioctl for passthrough Amir Goldstein
2023-05-15  7:29                       ` Miklos Szeredi
2023-05-15 14:00                         ` Amir Goldstein
2023-05-15 20:16                           ` [fuse-devel] " Nikolaus Rath
2023-05-15 21:11                             ` Bernd Schubert
2023-05-15 21:45                               ` Paul Lawrence
2023-05-16  8:43                                 ` Miklos Szeredi
2023-05-16 10:16                                   ` Nikolaus Rath
2023-05-16  8:48                                 ` Amir Goldstein
2021-01-25 15:30 ` [PATCH RESEND V12 4/8] fuse: Passthrough initialization and release Alessio Balsini
2021-02-17 13:52   ` Miklos Szeredi
2021-05-05 12:21     ` Amir Goldstein
2021-05-17 11:36       ` Alessio Balsini
2021-05-17 13:21         ` Amir Goldstein
2021-01-25 15:30 ` [PATCH RESEND V12 5/8] fuse: Introduce synchronous read and write for passthrough Alessio Balsini
2021-02-17 14:00   ` Miklos Szeredi
2021-01-25 15:30 ` [PATCH RESEND V12 6/8] fuse: Handle asynchronous read and write in passthrough Alessio Balsini
2021-01-25 15:30 ` [PATCH RESEND V12 7/8] fuse: Use daemon creds in passthrough mode Alessio Balsini
2021-02-05  9:23   ` Peng Tao
2021-02-05 11:21     ` Alessio Balsini
2021-01-25 15:30 ` [PATCH RESEND V12 8/8] fuse: Introduce passthrough for mmap Alessio Balsini
2021-02-17 14:05   ` Miklos Szeredi
2021-04-01 11:24     ` Alessio Balsini
2021-11-18 18:31 ` [PATCH RESEND V12 0/8] fuse: Add support for passthrough read/write Amir Goldstein

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAOQ4uxi3muNRfKtvt6U8gQRiVCfGghKdxsGhLjbhL7GPrrs+ZA@mail.gmail.com \
    --to=amir73il@gmail.com \
    --cc=brauner@kernel.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=haoluo@google.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-unionfs@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).