linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Amir Goldstein <amir73il@gmail.com>
To: Sargun Dhillon <sargun@sargun.me>
Cc: overlayfs <linux-unionfs@vger.kernel.org>,
	Miklos Szeredi <miklos@szeredi.hu>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Giuseppe Scrivano <gscrivan@redhat.com>,
	Vivek Goyal <vgoyal@redhat.com>,
	Daniel J Walsh <dwalsh@redhat.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	David Howells <dhowells@redhat.com>
Subject: Re: [PATCH v2 3/4] overlay: Add the ability to remount volatile directories when safe
Date: Fri, 27 Nov 2020 15:04:59 +0200	[thread overview]
Message-ID: <CAOQ4uxhE4qQrRn2gVs+EoyyCQdGCBZLHdfKf+oDBvii0m3aX4Q@mail.gmail.com> (raw)
In-Reply-To: <20201127092058.15117-4-sargun@sargun.me>

On Fri, Nov 27, 2020 at 11:21 AM Sargun Dhillon <sargun@sargun.me> wrote:
>
> Overlayfs added the ability to setup mounts where all syncs could be
> short-circuted in (2a99ddacee43: ovl: provide a mount option "volatile").
>
> A user might want to remount this fs, but we do not let the user because
> of the "incompat" detection feature. In the case of volatile, it is safe
> to do something like[1]:
>
> $ sync -f /root/upperdir
> $ rm -rf /root/workdir/incompat/volatile
>
> There are two ways to go about this. You can call sync on the underlying
> filesystem, check the error code, and delete the dirty file if everything
> is clean. If you're running lots of containers on the same filesystem, or
> you want to avoid all unnecessary I/O, this may be suboptimal.
>
> Alternatively, you can blindly delete the dirty file, and "hope for the
> best".
>
> This patch introduces transparent functionality to check if it is
> (relatively) safe to reuse the upperdir. It ensures that the filesystem
> hasn't been remounted, the system hasn't been rebooted, nor has the
> overlayfs code changed. Since the structure is explicitly not meant to be
> used between different versions of the code, its stability does not matter
> so much.
>
> [1]: https://lore.kernel.org/linux-unionfs/CAOQ4uxhKr+j5jFyEC2gJX8E8M19mQ3CqdTYaPZOvDQ9c0qLEzw@mail.gmail.com/T/#m6abe713e4318202ad57f301bf28a414e1d824f9c
>
> Signed-off-by: Sargun Dhillon <sargun@sargun.me>
> Cc: linux-fsdevel@vger.kernel.org
> Cc: linux-unionfs@vger.kernel.org
> Cc: Miklos Szeredi <miklos@szeredi.hu>
> Cc: Amir Goldstein <amir73il@gmail.com>
> Cc: Vivek Goyal <vgoyal@redhat.com>

You may add:
Reviewed-by: Amir Goldstein <amir73il@gmail.com>

> ---
>  Documentation/filesystems/overlayfs.rst | 18 +++--
>  fs/overlayfs/overlayfs.h                | 37 +++++++++-
>  fs/overlayfs/readdir.c                  | 98 ++++++++++++++++++++++---
>  fs/overlayfs/super.c                    | 73 +++++++++++++-----
>  fs/overlayfs/util.c                     |  2 +
>  5 files changed, 190 insertions(+), 38 deletions(-)
>
> diff --git a/Documentation/filesystems/overlayfs.rst b/Documentation/filesystems/overlayfs.rst
> index c6e30c1bc2f2..b485fdb65b85 100644
> --- a/Documentation/filesystems/overlayfs.rst
> +++ b/Documentation/filesystems/overlayfs.rst
> @@ -579,13 +579,17 @@ example, upon detecting a fault, ext4 can be configured to panic.
>  The advantage of mounting with the "volatile" option is that all forms of
>  sync calls to the upper filesystem are omitted.
>
> -When overlay is mounted with "volatile" option, the directory
> -"$workdir/work/incompat/volatile" is created.  During next mount, overlay
> -checks for this directory and refuses to mount if present. This is a strong
> -indicator that user should throw away upper and work directories and create
> -fresh one. In very limited cases where the user knows that the system has
> -not crashed and contents of upperdir are intact, The "volatile" directory
> -can be removed.
> +When overlay is mounted with the "volatile" option, the directory
> +"$workdir/work/incompat/volatile" is created.  This acts as a indicator
> +that the user should throw away upper and work directories and create fresh
> +ones.  In some cases, the overlayfs can detect if the upperdir can be
> +reused safely in a subsequent volatile mounts, and mounting will proceed as
> +normal.  If the filesystem is unable to determine if this is safe (due to a
> +reboot, upgraded kernel code, or loss of checkpoint, etc...), the user may
> +bypass these safety checks and remove the "volatile" directory if they know
> +the system did not encounter a fault and the contents of the upperdir are
> +intact. Then, the user can remount the filesystem as normal.
> +
>
>  Testsuite
>  ---------
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index f8880aa2ba0e..de694ee99d7c 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -32,8 +32,13 @@ enum ovl_xattr {
>         OVL_XATTR_NLINK,
>         OVL_XATTR_UPPER,
>         OVL_XATTR_METACOPY,
> +       OVL_XATTR_VOLATILE,
>  };
>
> +#define OVL_INCOMPATDIR_NAME "incompat"
> +#define OVL_VOLATILEDIR_NAME "volatile"
> +#define OVL_VOLATILE_DIRTY_NAME "dirty"
> +
>  enum ovl_inode_flag {
>         /* Pure upper dir that may contain non pure upper entries */
>         OVL_IMPURE,
> @@ -57,6 +62,31 @@ enum {
>         OVL_XINO_ON,
>  };
>
> +/*
> + * This is copied into the volatile xattr, and the user does not interact with
> + * it. There is no stability requirement, as a reboot explicitly invalidates
> + * a volatile workdir. It is explicitly meant not to be a stable api.
> + *
> + * Although this structure isn't meant to be stable it is exposed to potentially
> + * unprivileged users. We don't do any kind of cryptographic operations with
> + * the structure, so it could be tampered with, or inspected. Don't put
> + * kernel memory pointers in it, or anything else that could cause problems,
> + * or information disclosure.
> + */
> +struct ovl_volatile_info {
> +       /*
> +        * This uniquely identifies a boot, and is reset if overlayfs itself
> +        * is reloaded. Therefore we check our current / known boot_id
> +        * against this before looking at any other fields to validate:
> +        * 1. Is this datastructure laid out in the way we expect? (Overlayfs
> +        *    module, reboot, etc...)
> +        * 2. Could something have changed (like the s_instance_id counter
> +        *    resetting)
> +        */
> +       uuid_t          ovl_boot_id;    /* Must stay first member */
> +       u64             s_instance_id;
> +} __packed;
> +
>  /*
>   * The tuple (fh,uuid) is a universal unique identifier for a copy up origin,
>   * where:
> @@ -422,8 +452,8 @@ void ovl_cleanup_whiteouts(struct dentry *upper, struct list_head *list);
>  void ovl_cache_free(struct list_head *list);
>  void ovl_dir_cache_free(struct inode *inode);
>  int ovl_check_d_type_supported(struct path *realpath);
> -int ovl_workdir_cleanup(struct inode *dir, struct vfsmount *mnt,
> -                       struct dentry *dentry, int level);
> +int ovl_workdir_cleanup(struct ovl_fs *ofs, struct inode *dir,
> +                       struct vfsmount *mnt, struct dentry *dentry, int level);
>  int ovl_indexdir_cleanup(struct ovl_fs *ofs);
>
>  /* inode.c */
> @@ -520,3 +550,6 @@ int ovl_set_origin(struct dentry *dentry, struct dentry *lower,
>
>  /* export.c */
>  extern const struct export_operations ovl_export_operations;
> +
> +/* super.c */
> +extern uuid_t ovl_boot_id;
> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> index 01620ebae1bd..7b66fbb20261 100644
> --- a/fs/overlayfs/readdir.c
> +++ b/fs/overlayfs/readdir.c
> @@ -1080,10 +1080,78 @@ int ovl_check_d_type_supported(struct path *realpath)
>
>         return rdd.d_type_supported;
>  }
> +static int ovl_verify_volatile_info(struct ovl_fs *ofs,
> +                                   struct dentry *volatiledir)
> +{
> +       int err;
> +       struct ovl_volatile_info info;
> +
> +       if (!volatiledir->d_inode)
> +               return 0;
> +
> +       if (!ofs->config.ovl_volatile) {
> +               pr_debug("Mount is not volatile; upperdir is marked volatile\n");
> +               return -EINVAL;
> +       }
> +
> +       err = ovl_do_getxattr(ofs, volatiledir, OVL_XATTR_VOLATILE, &info,
> +                             sizeof(info));
> +       if (err < 0) {
> +               pr_debug("Unable to read volatile xattr: %d\n", err);
> +               return -EINVAL;
> +       }
> +
> +       if (err != sizeof(info)) {
> +               pr_debug("%s xattr on-disk size is %d expected to read %zd\n",
> +                        ovl_xattr(ofs, OVL_XATTR_VOLATILE), err, sizeof(info));
> +               return -EINVAL;
> +       }
> +
> +       if (!uuid_equal(&ovl_boot_id, &info.ovl_boot_id)) {
> +               pr_debug("boot id has changed (reboot or module reloaded)\n");
> +               return -EINVAL;
> +       }
> +
> +       if (volatiledir->d_sb->s_instance_id != info.s_instance_id) {
> +               pr_debug("workdir has been unmounted and remounted\n");
> +               return -EINVAL;
> +       }
> +
> +       return 1;
> +}
>
> -#define OVL_INCOMPATDIR_NAME "incompat"
> +/*
> + * ovl_check_incompat checks this specific incompat entry for incompatibility.
> + * If it is found to be incompatible -EINVAL will be returned.
> + *
> + * If the directory should be preserved, then this function returns 1.
> + */
> +static int ovl_check_incompat(struct ovl_fs *ofs, struct ovl_cache_entry *p,
> +                             struct path *path)
> +{
> +       int err = -EINVAL;
> +       struct dentry *d;
> +
> +       if (!strcmp(p->name, OVL_VOLATILEDIR_NAME)) {
> +               d = lookup_one_len(p->name, path->dentry, p->len);
> +               if (IS_ERR(d))
> +                       return PTR_ERR(d);
> +
> +               err = ovl_verify_volatile_info(ofs, d);
> +               dput(d);
> +       }
> +
> +       if (err == -EINVAL)
> +               pr_err("incompat feature '%s' cannot be mounted\n", p->name);
> +       else
> +               pr_debug("incompat '%s' handled: %d\n", p->name, err);
> +
> +       dput(d);

Letfover.

Thanks,
Amir.

  parent reply	other threads:[~2020-11-27 13:05 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-27  9:20 [PATCH v2 0/4] Make overlayfs volatile mounts reusable Sargun Dhillon
2020-11-27  9:20 ` [PATCH v2 1/4] fs: Add s_instance_id field to superblock for unique identification Sargun Dhillon
2020-11-27  9:20 ` [PATCH v2 2/4] overlay: Document current outstanding shortcoming of volatile Sargun Dhillon
2020-11-27 12:52   ` Amir Goldstein
2020-11-27 22:11     ` Sargun Dhillon
2020-11-28  2:01       ` Jeff Layton
2020-11-28  4:45         ` Sargun Dhillon
2020-11-28  7:12           ` Amir Goldstein
2020-11-28  8:52             ` Sargun Dhillon
2020-11-28  9:04               ` Amir Goldstein
2020-12-01 11:09               ` Sargun Dhillon
2020-12-01 11:29                 ` Amir Goldstein
2020-12-01 13:01                 ` Jeff Layton
2020-12-01 15:24                   ` Vivek Goyal
2020-12-01 16:10                     ` Jeff Layton
2020-11-28 12:04           ` Jeff Layton
2020-11-28  8:56       ` Amir Goldstein
2020-11-28  9:06         ` Amir Goldstein
2020-11-27  9:20 ` [PATCH v2 3/4] overlay: Add the ability to remount volatile directories when safe Sargun Dhillon
2020-11-27 11:09   ` kernel test robot
2020-11-27 13:04   ` Amir Goldstein [this message]
2020-12-07 11:39   ` Dan Carpenter
2020-11-27  9:20 ` [PATCH v2 4/4] overlay: Add rudimentary checking of writeback errseq on volatile remount Sargun Dhillon
2020-11-30 18:43   ` Vivek Goyal
2020-11-30 19:15   ` Vivek Goyal
2020-12-05  9:13     ` Amir Goldstein
2020-12-05 13:51       ` Jeff Layton
2020-12-05 14:51         ` Amir Goldstein
2020-11-30 19:33   ` Vivek Goyal
2020-12-01 11:56     ` Sargun Dhillon
2020-12-01 12:45       ` Jeff Layton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAOQ4uxhE4qQrRn2gVs+EoyyCQdGCBZLHdfKf+oDBvii0m3aX4Q@mail.gmail.com \
    --to=amir73il@gmail.com \
    --cc=dhowells@redhat.com \
    --cc=dwalsh@redhat.com \
    --cc=gscrivan@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-unionfs@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=sargun@sargun.me \
    --cc=vgoyal@redhat.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).