From: Amir Goldstein <amir73il@gmail.com>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: overlayfs <linux-unionfs@vger.kernel.org>,
Miklos Szeredi <miklos@szeredi.hu>,
Giuseppe Scrivano <gscrivan@redhat.com>,
Daniel J Walsh <dwalsh@redhat.com>
Subject: Re: [PATCH v7] overlayfs: Provide a mount option "volatile" to skip sync
Date: Tue, 1 Sep 2020 11:22:26 +0300
Message-ID: <CAOQ4uxi6Hc4gNwCiogBG+FeeW-bAUd-ZsW2X=TPJ+6JZCbodVQ@mail.gmail.com>
In-Reply-To: <20200831181529.GA1193654@redhat.com>
On Mon, Aug 31, 2020 at 9:15 PM Vivek Goyal <vgoyal@redhat.com> wrote:
>
> Container folks are complaining that dnf/yum issues too many syncs while
> installing packages and this slows down the image build. The build
> requirement is such that they don't care if a node goes down while the
> build is still going on. In that case, they will simply throw away the
> unfinished layer and start a new build. So they don't care about syncing
> intermediate state to the disk and hence don't want to pay the price
> associated with sync.
>
> So they are asking for mount options where they can disable sync on overlay
> mount point.
>
> They primarily seem to have two use cases.
>
> - For building images, they will mount overlay with "volatile" and then sync
> the upper layer after unmounting overlay and reuse upper as lower for the next
> layer.
>
> - For running containers, they don't seem to care about syncing upper
> layer because if node goes down, they will simply throw away upper
> layer and create a fresh one.
>
> So this patch provides a mount option "volatile" which disables all forms
> of sync. Now it is caller's responsibility to throw away upper if
> system crashes or shuts down and start fresh.
>
> With "volatile", I am seeing roughly a 20% speedup in my VM where I am just
> installing emacs in an image. Installation time drops from 31 seconds to
> 25 seconds when the "volatile" option is used. This is for the case of building on top
> of an image where all packages are already cached. That way I take
> the network operation latency out of the measurement.
>
> Giuseppe is also looking to cut down on the number of iops done on the
> disk. He is complaining that often in the cloud their VMs are throttled
> if they cross the limit. This option can help them reduce the
> number of iops (by cutting down on frequent syncs and writebacks).
>
> Changes from v6:
> - Got rid of logic to check for volatile/dirty file. Now Amir's
> patch checks for presence of the incompat/volatile directory and errors
> out if present. User is now required to remove the volatile
> directory. (Amir).
>
> Changes from v5:
> - Added support to detect that previous overlay was mounted with
> "volatile" option and fail mount. (Miklos and Amir).
>
> Changes from v4:
> - Dropped support for sync=fs (Miklos)
> - Renamed "sync=off" to "volatile". (Miklos)
>
> Changes from v3:
> - Used only enums and dropped bit flags (Amir Goldstein)
> - Dropped error when conflicting sync options provided. (Amir Goldstein)
>
> Changes from v2:
> - Added helper functions (Amir Goldstein)
> - Used enums to keep sync state (Amir Goldstein)
>
> Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> ---
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
See one suggestion below, but you may ignore it...
> +/*
> + * Creates $workdir/work/incompat/volatile/dirty file if it is not
> + * already present.
> + */
> +static int ovl_create_volatile_dirty(struct ovl_fs *ofs)
> +{
> +	struct dentry *parent, *child;
> +	char *name;
> +	int i, len, err;
> +	char *dirty_path[] = {OVL_WORKDIR_NAME, "incompat", "volatile", "dirty"};
Technically, you are calling this right after creating OVL_WORKDIR_NAME, so you
could start from ofs->workdir and drop the first level, but as you wrote it, this
function could also be called after the assignment ovl->workdir = ovl->indexdir,
so it is probably safer to start with ofs->workbasedir as you did.
> +	int nr_elems = ARRAY_SIZE(dirty_path);
> +
> +	err = 0;
> +	parent = ofs->workbasedir;
> +	dget(parent);
> +
> +	for (i = 0; i < nr_elems; i++) {
> +		name = dirty_path[i];
> +		len = strlen(name);
> +		inode_lock_nested(parent->d_inode, I_MUTEX_PARENT);
> +		child = lookup_one_len(name, parent, len);
> +		if (IS_ERR(child)) {
> +			err = PTR_ERR(child);
> +			goto out_unlock;
> +		}
> +
> +		if (!child->d_inode) {
> +			unsigned short ftype;
> +
> +			ftype = (i == (nr_elems - 1)) ? S_IFREG : S_IFDIR;
> +			child = ovl_create_real(parent->d_inode, child,
> +						OVL_CATTR(ftype | 0));
> +			if (IS_ERR(child)) {
> +				err = PTR_ERR(child);
> +				goto out_unlock;
> +			}
> +		}
> +
> +		inode_unlock(parent->d_inode);
> +		dput(parent);
> +		parent = child;
> +		child = NULL;
> +	}
> +
> +	dput(parent);
> +	return err;
> +
> +out_unlock:
> +	inode_unlock(parent->d_inode);
> +	dput(parent);
> +	return err;
> +}
> +
I think a helper ovl_test_create() along the lines of the helper found on
my ovl-features branch could make this code a lot easier to follow.
Note that the helper in that branch is not ready to be cherry-picked
as is - it needs changes, so take it or leave it.
Thanks,
Amir.