linux-unionfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Amir Goldstein <amir73il@gmail.com>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: overlayfs <linux-unionfs@vger.kernel.org>,
	Miklos Szeredi <miklos@szeredi.hu>,
	Giuseppe Scrivano <gscrivan@redhat.com>,
	Daniel J Walsh <dwalsh@redhat.com>,
	Steven Whitehouse <swhiteho@redhat.com>,
	Eric Sandeen <sandeen@redhat.com>,
	pmatilai@redhat.com
Subject: Re: [RFC PATCH v3] overlayfs: Provide mount options sync=off/fs to skip sync
Date: Thu, 2 Jul 2020 10:26:44 +0300	[thread overview]
Message-ID: <CAOQ4uxhLe3ptTcdsyGXRB=w3ub5_is3HcUY0z7OLgLPDg1Mk5w@mail.gmail.com> (raw)
In-Reply-To: <20200701215029.GF369085@redhat.com>

On Thu, Jul 2, 2020 at 12:50 AM Vivek Goyal <vgoyal@redhat.com> wrote:
>
> Container folks are complaining that dnf/yum issues too many sync while
> installing packages and this slows down the image build. Build
> requirement is such that they don't care if a node goes down while
> build was still going on. In that case, they will simply throw away
> unfinished layer and start new build. So they don't care about syncing
> intermediate state to the disk and hence don't want to pay the price
> associated with sync.
>
> So they are asking for mount options where they can disable sync on overlay
> mount point.
>
> They primarily seem to have two use cases.
>
> - For building images, they will mount overlay with nosync and then sync
>   upper layer after unmounting overlay and reuse upper as lower for next
>   layer.
>
> - For running containers, they don't seem to care about syncing upper
>   layer because if node goes down, they will simply throw away upper
>   layer and create a fresh one.
>
> So this patch provides two mount options "sync=off" and "sync=fs".
> First option disables all forms of sync. Now it is caller's responsibility
> to throw away upper if system crashes or shuts down and start fresh.
>
> Option "sync=fs" disables all forms of sync except syncfs/umount/remount.
> This is basically useful for image build where we want to persist upper
> layer only after operation is complete and upper will be renamed and
> reused as lower for next layer build.
>
> With sync=off, I am seeing roughly 20% speed up in my VM where I am just
> installing emacs in an image. Installation time drops from 31 seconds to
> 25 seconds when nosync option is used. This is for the case of building on top
> of an image where all packages are already cached. That way I take
> out the network operations latency out of the measurement.
>
> With sync=fs, I am seeing roughly 12% speed up.
>
> Giuseppe is also looking to cut down on number of iops done on the
> disk. He is complaining that often in cloud their VMs are throttled
> if they cross the limit. This option can help them where they reduce
> number of iops (by cutting down on frequent sync and writebacks).
>
> Changes from v2:
> - Added helper functions (Amir Goldstein)
> - Used enums to keep sync state (Amir Goldstein)

Thank you for doing that.
Sorry, but I failed to mention the practical reasons for suggesting the
enum config and you failed to read my mind :-/
So I will explain my reasoning using more words (see below)

>
> Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>

Anyway, you may add:
Reviewed-by: Amir Goldstein <amir73il@gmail.com>

> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> index b429c80879ee..e7aea92733f5 100644
> --- a/fs/overlayfs/ovl_entry.h
> +++ b/fs/overlayfs/ovl_entry.h
> @@ -5,6 +5,14 @@
>   * Copyright (C) 2016 Red Hat, Inc.
>   */
>
> +enum ovl_sync_type {
> +       __OVL_SYNC_OFF          = (1 << 0),
> +       __OVL_SYNC_FS           = (1 << 1),
> +};
> +
> +#define OVL_SYNC_OFF(type)     ((type) & __OVL_SYNC_OFF)
> +#define OVL_SYNC_FS(type)      ((type) & __OVL_SYNC_FS)
> +

My unspoken intention when I wrote either "enum" or "flags" is -

If you use bitwise flags, they reflect the ovl_should_xxx queries:
__OVL_NOSYNC_FILE and __OVL_NOSYNC_FS
#define OVL_SYNC_FILE(type)     (!((type) & __OVL_NOSYNC_FILE))
#define OVL_SYNC_FS(type)      (!((type) & __OVL_NOSYNC_FS))


If you use enum (not bitwise), the distinct enum values reflect the
mount option:
OVL_SYNC_ON (=0), OVL_SYNC_OFF, OVL_SYNC_FS

I am not commenting on this because of some sort of aesthetic taste.
I am commenting on this because I think it would make parts of the
patch simpler/clearer (see below).

As far as I am concerned, for the three possible config values off/fs/on
the distinct enum values are better.
Of course, that is *my* opinion. You may disagree.

>  struct ovl_config {
>         char *lowerdir;
>         char *upperdir;
> @@ -17,6 +25,7 @@ struct ovl_config {
>         bool nfs_export;
>         int xino;
>         bool metacopy;
> +       enum ovl_sync_type sync;
>  };
>
>  struct ovl_sb {
> @@ -90,6 +99,17 @@ static inline struct ovl_fs *OVL_FS(struct super_block *sb)
>         return (struct ovl_fs *)sb->s_fs_info;
>  }
>
> +static inline bool ovl_should_fsync(struct ovl_fs *ofs)
> +{
> +       return (!OVL_SYNC_OFF(ofs->config.sync) &&
> +               ! OVL_SYNC_FS(ofs->config.sync));

option #1 (bitwise):
return OVL_SYNC_FILE(ofs->config.sync);

option #2 (distinct):
return ofs->config.sync == OVL_SYNC_ON;

> +}
> +
> +static inline bool ovl_should_syncfs(struct ovl_fs *ofs)
> +{
> +       return !OVL_SYNC_OFF(ofs->config.sync);

option #1 (bitwise):
return OVL_SYNC_FS(ofs->config.sync);

option #2 (distinct):
return ofs->config.sync != OVL_SYNC_OFF;

[...]

> @@ -362,6 +364,10 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
>         if (ofs->config.metacopy != ovl_metacopy_def)
>                 seq_printf(m, ",metacopy=%s",
>                            ofs->config.metacopy ? "on" : "off");
> +       if (OVL_SYNC_OFF(ofs->config.sync))
> +               seq_puts(m, ",sync=off");
> +       if (OVL_SYNC_FS(ofs->config.sync))
> +               seq_puts(m, ",sync=fs");

option #1 (bitwise):
       if (!ofs->config.sync)
               seq_puts(m, ",sync=off");
      else if (!OVL_SYNC_FILE(ofs->config.sync))
               seq_puts(m, ",sync=fs");

option #2 (distinct):
Would be better. See ovl_xino_str[].


> @@ -411,6 +419,8 @@ enum {
>         OPT_XINO_AUTO,
>         OPT_METACOPY_ON,
>         OPT_METACOPY_OFF,
> +       OPT_SYNC_OFF,
> +       OPT_SYNC_FS,
>         OPT_ERR,
>  };
>
> @@ -429,6 +439,8 @@ static const match_table_t ovl_tokens = {
>         {OPT_XINO_AUTO,                 "xino=auto"},
>         {OPT_METACOPY_ON,               "metacopy=on"},
>         {OPT_METACOPY_OFF,              "metacopy=off"},
> +       {OPT_SYNC_OFF,                  "sync=off"},
> +       {OPT_SYNC_FS,                   "sync=fs"},
>         {OPT_ERR,                       NULL}
>  };
>
> @@ -573,6 +585,14 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
>                         metacopy_opt = true;
>                         break;
>
> +               case OPT_SYNC_OFF:
> +                       config->sync |= __OVL_SYNC_OFF;

There is really no point in ORing with old config options here.

option #1 (bitwise):
                       config->sync = __OVL_NOSYNC_FILE | __OVL_NOSYNC_FS;

option #2 (distinct):
                       config->sync = OVL_SYNC_OFF;


> +                       break;
> +
> +               case OPT_SYNC_FS:
> +                       config->sync |= __OVL_SYNC_FS;
> +                       break;
> +
>                 default:
>                         pr_err("unrecognized mount option \"%s\" or missing value\n",
>                                         p);
> @@ -588,6 +608,17 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
>                 config->workdir = NULL;
>         }
>
> +       if (OVL_SYNC_OFF(config->sync) && OVL_SYNC_FS(config->sync)) {
> +               pr_err("conflicting options: sync=off,sync=fs\n");
> +               return -EINVAL;
> +       }
> +

We are not warning user that metacopy=off conflicts with metacopy=on,
we just let the last option overwrite previous ones.


> +       if (!config->upperdir &&
> +           (OVL_SYNC_OFF(config->sync) || OVL_SYNC_FS(config->sync))) {
> +               pr_info("option sync=off/fs is meaningless in a non-upper mount, ignoring it.\n");
> +               config->sync = 0;
> +       }
> +

With either bitwise/distinct options this could be simplified:

       if (!config->upperdir && config->sync)
               pr_info("option "sync=off/fs" is meaningless in a
non-upper mount, ignoring it.\n");
               config->sync = 0;

Thanks,
Amir.

  reply	other threads:[~2020-07-02  7:26 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-01 21:50 [RFC PATCH v3] overlayfs: Provide mount options sync=off/fs to skip sync Vivek Goyal
2020-07-02  7:26 ` Amir Goldstein [this message]
2020-07-06 16:06   ` Vivek Goyal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOQ4uxhLe3ptTcdsyGXRB=w3ub5_is3HcUY0z7OLgLPDg1Mk5w@mail.gmail.com' \
    --to=amir73il@gmail.com \
    --cc=dwalsh@redhat.com \
    --cc=gscrivan@redhat.com \
    --cc=linux-unionfs@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=pmatilai@redhat.com \
    --cc=sandeen@redhat.com \
    --cc=swhiteho@redhat.com \
    --cc=vgoyal@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).