linux-unionfs.vger.kernel.org archive mirror
From: Amir Goldstein <amir73il@gmail.com>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: overlayfs <linux-unionfs@vger.kernel.org>,
	Miklos Szeredi <miklos@szeredi.hu>,
	Giuseppe Scrivano <gscrivan@redhat.com>,
	Daniel J Walsh <dwalsh@redhat.com>
Subject: Re: [PATCH v7] overlayfs: Provide a mount option "volatile" to skip sync
Date: Tue, 1 Sep 2020 11:22:26 +0300
Message-ID: <CAOQ4uxi6Hc4gNwCiogBG+FeeW-bAUd-ZsW2X=TPJ+6JZCbodVQ@mail.gmail.com>
In-Reply-To: <20200831181529.GA1193654@redhat.com>

On Mon, Aug 31, 2020 at 9:15 PM Vivek Goyal <vgoyal@redhat.com> wrote:
>
> Container folks are complaining that dnf/yum issue too many syncs while
> installing packages and this slows down the image build. Build
> requirement is such that they don't care if a node goes down while
> build was still going on. In that case, they will simply throw away
> unfinished layer and start new build. So they don't care about syncing
> intermediate state to the disk and hence don't want to pay the price
> associated with sync.
>
> So they are asking for mount options where they can disable sync on overlay
> mount point.
>
> They primarily seem to have two use cases.
>
> - For building images, they will mount overlay with "volatile" and then
>   sync upper layer after unmounting overlay and reuse upper as lower for
>   next layer.
>
> - For running containers, they don't seem to care about syncing upper
>   layer because if node goes down, they will simply throw away upper
>   layer and create a fresh one.
>
> So this patch provides a mount option "volatile" which disables all forms
> of sync. Now it is caller's responsibility to throw away upper if
> system crashes or shuts down and start fresh.
>
> With "volatile", I am seeing roughly 20% speed up in my VM where I am just
> installing emacs in an image. Installation time drops from 31 seconds to
> 25 seconds when the "volatile" option is used. This is for the case of
> building on top of an image where all packages are already cached. That
> way I take the network operations latency out of the measurement.
>
> Giuseppe is also looking to cut down on number of iops done on the
> disk. He is complaining that often in cloud their VMs are throttled
> if they cross the limit. This option can help them where they reduce
> number of iops (by cutting down on frequent sync and writebacks).
>
> Changes from v6:
> - Got rid of logic to check for volatile/dirty file. Now Amir's
>   patch checks for presence of incompat/volatile directory and errors
>   out if present. User is now required to remove volatile
>   directory. (Amir).
>
> Changes from v5:
> - Added support to detect that previous overlay was mounted with
>   "volatile" option and fail mount. (Miklos and Amir).
>
> Changes from v4:
> - Dropped support for sync=fs (Miklos)
> - Renamed "sync=off" to "volatile". (Miklos)
>
> Changes from v3:
> - Used only enums and dropped bit flags (Amir Goldstein)
> - Dropped error when conflicting sync options provided. (Amir Goldstein)
>
> Changes from v2:
> - Added helper functions (Amir Goldstein)
> - Used enums to keep sync state (Amir Goldstein)
>
> Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> ---

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

See one suggestion below, but you may ignore it...


> +/*
> + * Creates $workdir/work/incompat/volatile/dirty file if it is not
> + * already present.
> + */
> +static int ovl_create_volatile_dirty(struct ovl_fs *ofs)
> +{
> +       struct dentry *parent, *child;
> +       char *name;
> +       int i, len, err;
> +       char *dirty_path[] = {OVL_WORKDIR_NAME, "incompat", "volatile", "dirty"};

Technically, you are calling this right after creating OVL_WORKDIR_NAME, so you
could start from ofs->workdir and drop the first level, but as you wrote it this
function could also be called after the assignment
ovl->workdir = ovl->indexdir,
so it is probably safer to start with ofs->workbasedir as you did.

> +       int nr_elems = ARRAY_SIZE(dirty_path);
> +
> +       err = 0;
> +       parent = ofs->workbasedir;
> +       dget(parent);
> +
> +       for (i = 0; i < nr_elems; i++) {
> +               name = dirty_path[i];
> +               len = strlen(name);
> +               inode_lock_nested(parent->d_inode, I_MUTEX_PARENT);
> +               child = lookup_one_len(name, parent, len);
> +               if (IS_ERR(child)) {
> +                       err = PTR_ERR(child);
> +                       goto out_unlock;
> +               }
> +
> +               if (!child->d_inode) {
> +                       unsigned short ftype;
> +
> +                       ftype = (i == (nr_elems - 1)) ? S_IFREG : S_IFDIR;
> +                       child = ovl_create_real(parent->d_inode, child,
> +                                               OVL_CATTR(ftype | 0));
> +                       if (IS_ERR(child)) {
> +                               err = PTR_ERR(child);
> +                               goto out_unlock;
> +                       }
> +               }
> +
> +               inode_unlock(parent->d_inode);
> +               dput(parent);
> +               parent = child;
> +               child = NULL;
> +       }
> +
> +       dput(parent);
> +       return err;
> +
> +out_unlock:
> +       inode_unlock(parent->d_inode);
> +       dput(parent);
> +       return err;
> +}
> +

I think a helper ovl_test_create() along the lines of the helper found on
my ovl-features branch could make this code a lot easier to follow.
Note that the helper in that branch is not ready to be cherry-picked
as is - it needs changes, so take it or leave it.
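
To illustrate the shape such a helper gives the loop, here is a rough
userspace analogue. It is not the helper from the ovl-features branch and
it is not kernel code: test_create() and create_volatile_dirty() are
hypothetical names, openat()/mkdirat() stand in for lookup_one_len() and
ovl_create_real(), and the modes 0700/0600 are an assumption made so that
an unprivileged process can walk the path (the patch creates with mode 0).

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical "test and create" helper: open `name` under `parentfd`,
 * creating it first if it does not exist yet.
 * Returns a new fd, or -1 with errno set.
 */
static int test_create(int parentfd, const char *name, int is_dir)
{
	if (!is_dir)
		/* O_CREAT without O_EXCL: an existing file is reused. */
		return openat(parentfd, name,
			      O_CREAT | O_WRONLY | O_CLOEXEC, 0600);

	/* An existing directory is fine; any other mkdirat() error is not. */
	if (mkdirat(parentfd, name, 0700) < 0 && errno != EEXIST)
		return -1;
	return openat(parentfd, name, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
}

/* Create base/work/incompat/volatile/dirty, one component at a time. */
static int create_volatile_dirty(const char *base)
{
	static const char *const path[] = {
		"work", "incompat", "volatile", "dirty"
	};
	const size_t n = sizeof(path) / sizeof(path[0]);
	int parent = open(base, O_RDONLY | O_DIRECTORY | O_CLOEXEC);

	if (parent < 0)
		return -1;

	for (size_t i = 0; i < n; i++) {
		/* Only the last component is a regular file. */
		int child = test_create(parent, path[i], i != n - 1);
		int saved = errno;

		close(parent);
		if (child < 0) {
			errno = saved;	/* report the create/open error */
			return -1;
		}
		parent = child;
	}

	close(parent);
	return 0;
}
```

With the lookup-or-create step factored out, the loop body only has to
hand the reference over from parent to child, and there is a single
error exit instead of a separate out_unlock path.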

Thanks,
Amir.

