All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 00/11][V4] overlayfs: Delayed copy up of data
@ 2017-10-17 21:05 Vivek Goyal
  2017-10-17 21:05 ` [PATCH 01/11] ovl: Create origin xattr on copy up for all files Vivek Goyal
                   ` (10 more replies)
  0 siblings, 11 replies; 48+ messages in thread
From: Vivek Goyal @ 2017-10-17 21:05 UTC (permalink / raw)
  To: linux-unionfs; +Cc: amir73il, miklos, vgoyal

Hi,

This is V4 of the patches. Changes from V3 are as follows.

This patch series applies on top of Amir's patch to not return -EIO and
allows me to create ORIGIN for hardlinks even if index=off.

- Got rid of dependency on index=on/off. (Amir)
- Re-introduced patch to create ORIGIN for hardlink copy up even if
  index=off. 
- Gor rid of helpers around new barriers and open coded these. It will
  atleast help to see what I am trying to achieve. (Amir).
- Got rid of boolean set_size (Amir)
- Reorganized some code (Amir)
- Renamed ovl_copy_up_data_inode() function. (Amir).

Original description
--------------------

In one of the recent converstions, people mentioned that chown/chmod
lead to copy up files as well as data. We could optimize it so that
only metadata is copied up during chown/chmod and data is copied up when
file is opened for WRITE.

This optimization potentially could be useful with containers and user
namespaces. In popular scenario, people end up doing chown() on whole
image directory tree based on container mappings. And this chown copies
up everything, breaking sharing of page cache between containers.

With these patches, only metadat is copied up during chown() and if file
is opened for READ, d_real() returns lower dentry/inode. That way,
different containers can still continue to use page cache. That's the
use case I have in mind.

Basically, I am relying on storing OVL_XATTR_ORIGIN in upper inode
during copy up. I use that information to get to lower inode later and
do data copy up when necessary.

I also store OVL_XATTR_METACOPY in upper inode to mark that only
metadata has been copied up and data copy up still might be required.

Any feedback is helpful.

Thanks
Vivek

Vivek Goyal (11):
  ovl: Create origin xattr on copy up for all files
  ovl: ovl_check_setxattr() get rid of redundant -EOPNOTSUPP check
  ovl: During copy up, first copy up metadata and then data
  ovl: Provide a mount option metacopy=on/off for metadata copyup
  ovl: Copy up only metadata during copy up where it makes sense
  ovl: Set xattr OVL_XATTR_METACOPY on upper file
  ovl: Fix ovl_getattr() to get number of blocks from lower
  ovl: Set OVL_METACOPY flag during ovl_lookup()
  ovl: Return lower dentry if only metadata copy up took place
  ovl: Introduce read/write barriers around metacopy flag update
  ovl: Put barriers to order oi->__upperdentry and OVL_METACOPY update

 fs/overlayfs/Kconfig     |   8 ++++
 fs/overlayfs/copy_up.c   | 113 ++++++++++++++++++++++++++++++++++++-----------
 fs/overlayfs/inode.c     |  17 ++++++-
 fs/overlayfs/namei.c     |  40 +++++++++++++++++
 fs/overlayfs/overlayfs.h |   5 ++-
 fs/overlayfs/ovl_entry.h |   1 +
 fs/overlayfs/super.c     |  48 ++++++++++++++++++--
 fs/overlayfs/util.c      |  46 ++++++++++++++-----
 8 files changed, 235 insertions(+), 43 deletions(-)

-- 
2.13.5

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 01/11] ovl: Create origin xattr on copy up for all files
  2017-10-17 21:05 [RFC PATCH 00/11][V4] overlayfs: Delayed copy up of data Vivek Goyal
@ 2017-10-17 21:05 ` Vivek Goyal
  2017-10-18  4:09   ` Amir Goldstein
  2017-10-17 21:05 ` [PATCH 02/11] ovl: ovl_check_setxattr() get rid of redundant -EOPNOTSUPP check Vivek Goyal
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 48+ messages in thread
From: Vivek Goyal @ 2017-10-17 21:05 UTC (permalink / raw)
  To: linux-unionfs; +Cc: amir73il, miklos, vgoyal

Right now my understanding is that origin xattr is created for all copied
up files if index=on. And if index=off, then we create it for all type
of files except hardlinks (nlink != 1).

With metadata only copy up, I will still require origin xattr to copy up
data later, so create it even for hardlinks even with index=off. I am
hoping it does not break anything because we do OVL_INDEX test before
using inode number of origin.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/copy_up.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index c441f9387a1b..0a6294ea64d5 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -538,9 +538,6 @@ static int ovl_do_copy_up(struct ovl_copy_up_ctx *c)
 	    c->stat.nlink > 1)
 		indexed = true;
 
-	if (S_ISDIR(c->stat.mode) || c->stat.nlink == 1 || indexed)
-		c->origin = true;
-
 	if (indexed) {
 		c->destdir = ovl_indexdir(c->dentry->d_sb);
 		err = ovl_get_index_name(c->lowerpath.dentry, &c->destname);
@@ -596,6 +593,7 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 		.parent = parent,
 		.dentry = dentry,
 		.workdir = ovl_workdir(dentry),
+		.origin = true,
 	};
 
 	if (WARN_ON(!ctx.workdir))
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 02/11] ovl: ovl_check_setxattr() get rid of redundant -EOPNOTSUPP check
  2017-10-17 21:05 [RFC PATCH 00/11][V4] overlayfs: Delayed copy up of data Vivek Goyal
  2017-10-17 21:05 ` [PATCH 01/11] ovl: Create origin xattr on copy up for all files Vivek Goyal
@ 2017-10-17 21:05 ` Vivek Goyal
  2017-10-18  4:11   ` Amir Goldstein
  2017-10-17 21:05 ` [PATCH 03/11] ovl: During copy up, first copy up metadata and then data Vivek Goyal
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 48+ messages in thread
From: Vivek Goyal @ 2017-10-17 21:05 UTC (permalink / raw)
  To: linux-unionfs; +Cc: amir73il, miklos, vgoyal

At mount time we check if upper supports xattr or not and set ofs->noxattr
field accordingly. But in ovl_check_setxattr() still seems to have logic
which expects that ovl_do_setxattr() can return -EOPNOTSUPP. Not sure when
and how can we hit this code. Feels redundant to me. So I am removing
this code.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/util.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index b9b239fa5cfd..a4ce1c9944f0 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -348,21 +348,12 @@ int ovl_check_setxattr(struct dentry *dentry, struct dentry *upperdentry,
 		       const char *name, const void *value, size_t size,
 		       int xerr)
 {
-	int err;
 	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
 
 	if (ofs->noxattr)
 		return xerr;
 
-	err = ovl_do_setxattr(upperdentry, name, value, size, 0);
-
-	if (err == -EOPNOTSUPP) {
-		pr_warn("overlayfs: cannot set %s xattr on upper\n", name);
-		ofs->noxattr = true;
-		return xerr;
-	}
-
-	return err;
+	return ovl_do_setxattr(upperdentry, name, value, size, 0);
 }
 
 int ovl_set_impure(struct dentry *dentry, struct dentry *upperdentry)
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 03/11] ovl: During copy up, first copy up metadata and then data
  2017-10-17 21:05 [RFC PATCH 00/11][V4] overlayfs: Delayed copy up of data Vivek Goyal
  2017-10-17 21:05 ` [PATCH 01/11] ovl: Create origin xattr on copy up for all files Vivek Goyal
  2017-10-17 21:05 ` [PATCH 02/11] ovl: ovl_check_setxattr() get rid of redundant -EOPNOTSUPP check Vivek Goyal
@ 2017-10-17 21:05 ` Vivek Goyal
  2017-10-18  4:13   ` Amir Goldstein
  2017-10-17 21:05 ` [PATCH 04/11] ovl: Provide a mount option metacopy=on/off for metadata copyup Vivek Goyal
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 48+ messages in thread
From: Vivek Goyal @ 2017-10-17 21:05 UTC (permalink / raw)
  To: linux-unionfs; +Cc: amir73il, miklos, vgoyal

This just helps with later patches where after copying up metadata, we
skip data copying step, if needed.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/copy_up.c | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 0a6294ea64d5..4c0acb2a23ea 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -445,6 +445,23 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
 {
 	int err;
 
+	err = ovl_copy_xattr(c->lowerpath.dentry, temp);
+	if (err)
+		return err;
+
+	/*
+	 * Store identifier of lower inode in upper inode xattr to
+	 * allow lookup of the copy up origin inode.
+	 *
+	 * Don't set origin when we are breaking the association with a lower
+	 * hard link.
+	 */
+	if (c->origin) {
+		err = ovl_set_origin(c->dentry, c->lowerpath.dentry, temp);
+		if (err)
+			return err;
+	}
+
 	if (S_ISREG(c->stat.mode)) {
 		struct path upperpath;
 
@@ -457,29 +474,12 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
 			return err;
 	}
 
-	err = ovl_copy_xattr(c->lowerpath.dentry, temp);
-	if (err)
-		return err;
-
 	inode_lock(temp->d_inode);
 	err = ovl_set_attr(temp, &c->stat);
 	inode_unlock(temp->d_inode);
 	if (err)
 		return err;
 
-	/*
-	 * Store identifier of lower inode in upper inode xattr to
-	 * allow lookup of the copy up origin inode.
-	 *
-	 * Don't set origin when we are breaking the association with a lower
-	 * hard link.
-	 */
-	if (c->origin) {
-		err = ovl_set_origin(c->dentry, c->lowerpath.dentry, temp);
-		if (err)
-			return err;
-	}
-
 	return 0;
 }
 
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 04/11] ovl: Provide a mount option metacopy=on/off for metadata copyup
  2017-10-17 21:05 [RFC PATCH 00/11][V4] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (2 preceding siblings ...)
  2017-10-17 21:05 ` [PATCH 03/11] ovl: During copy up, first copy up metadata and then data Vivek Goyal
@ 2017-10-17 21:05 ` Vivek Goyal
  2017-10-18  4:31   ` Amir Goldstein
  2017-10-17 21:05 ` [PATCH 05/11] ovl: Copy up only metadata during copy up where it makes sense Vivek Goyal
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 48+ messages in thread
From: Vivek Goyal @ 2017-10-17 21:05 UTC (permalink / raw)
  To: linux-unionfs; +Cc: amir73il, miklos, vgoyal

By default metadata only copy up is disabled. Provide a mount option so
that users can choose one way or other.

Also provide a kernel config and module option to enable/disable
metacopy feature.

Like index feature, when overlay is mounted, on root upper directory we
set ORIGIN which points to lower. And at later mount time it is verified
again. This hopes to get the configuration right. But this does only so
much as we don't verify all the lowers. So it is possible that a lower is
missing and later data copy up fails.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/Kconfig     |  8 ++++++++
 fs/overlayfs/ovl_entry.h |  1 +
 fs/overlayfs/super.c     | 38 +++++++++++++++++++++++++++++++++++---
 3 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
index cbfc196e5dc5..17a0b17ad14c 100644
--- a/fs/overlayfs/Kconfig
+++ b/fs/overlayfs/Kconfig
@@ -43,3 +43,11 @@ config OVERLAY_FS_INDEX
 	  outcomes.  However, mounting the same overlay with an old kernel
 	  read-write and then mounting it again with a new kernel, will have
 	  unexpected results.
+
+config OVERLAY_FS_METACOPY
+	bool "Overlayfs: turn on metadata only copy up feature by default"
+	depends on OVERLAY_FS
+	help
+	  If this config option is enabled then overlay filesystems will
+	  copy up only metadata where appropriate and data copy up will
+	  happen when a file is opended for WRITE operation.
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 25d9b5adcd42..6806f0b0fbc2 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -15,6 +15,7 @@ struct ovl_config {
 	bool default_permissions;
 	bool redirect_dir;
 	bool index;
+	bool metacopy;
 };
 
 /* private information held for overlayfs's superblock */
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 092d150643c1..32e3d4be1a71 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -39,6 +39,11 @@ module_param_named(index, ovl_index_def, bool, 0644);
 MODULE_PARM_DESC(ovl_index_def,
 		 "Default to on or off for the inodes index feature");
 
+static bool ovl_metacopy_def = IS_ENABLED(CONFIG_OVERLAY_FS_METACOPY);
+module_param_named(metacopy, ovl_metacopy_def, bool, 0644);
+MODULE_PARM_DESC(ovl_metacopy_def,
+		 "Default to on or off for the metadata only copy up feature");
+
 static void ovl_dentry_release(struct dentry *dentry)
 {
 	struct ovl_entry *oe = dentry->d_fsdata;
@@ -303,6 +308,9 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
 	if (ufs->config.index != ovl_index_def)
 		seq_printf(m, ",index=%s",
 			   ufs->config.index ? "on" : "off");
+	if (ufs->config.metacopy != ovl_metacopy_def)
+		seq_printf(m, ",metacopy=%s",
+			   ufs->config.metacopy ? "on" : "off");
 	return 0;
 }
 
@@ -336,6 +344,8 @@ enum {
 	OPT_REDIRECT_DIR_OFF,
 	OPT_INDEX_ON,
 	OPT_INDEX_OFF,
+	OPT_METACOPY_ON,
+	OPT_METACOPY_OFF,
 	OPT_ERR,
 };
 
@@ -348,6 +358,8 @@ static const match_table_t ovl_tokens = {
 	{OPT_REDIRECT_DIR_OFF,		"redirect_dir=off"},
 	{OPT_INDEX_ON,			"index=on"},
 	{OPT_INDEX_OFF,			"index=off"},
+	{OPT_METACOPY_ON,		"metacopy=on"},
+	{OPT_METACOPY_OFF,		"metacopy=off"},
 	{OPT_ERR,			NULL}
 };
 
@@ -428,6 +440,14 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
 			config->index = false;
 			break;
 
+		case OPT_METACOPY_ON:
+			config->metacopy = true;
+			break;
+
+		case OPT_METACOPY_OFF:
+			config->metacopy = false;
+			break;
+
 		default:
 			pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p);
 			return -EINVAL;
@@ -644,9 +664,16 @@ static int ovl_lower_dir(const char *name, struct path *path,
 	 * The inodes index feature needs to encode and decode file
 	 * handles, so it requires that all layers support them.
 	 */
-	if (ofs->config.index && !ovl_can_decode_fh(path->dentry->d_sb)) {
+	if ((ofs->config.index || ofs->config.metacopy) &&
+	     !ovl_can_decode_fh(path->dentry->d_sb)) {
+		if (ofs->config.index)
+			pr_warn("overlayfs: fs on '%s' does not support file handles, falling back to index=off.\n", name);
+
+		if (ofs->config.metacopy)
+			pr_warn("overlayfs: fs on '%s' does not support file handles, falling back to metacopy=off.\n", name);
+
 		ofs->config.index = false;
-		pr_warn("overlayfs: fs on '%s' does not support file handles, falling back to index=off.\n", name);
+		ofs->config.metacopy = false;
 	}
 
 	return 0;
@@ -847,6 +874,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 
 	ufs->config.redirect_dir = ovl_redirect_dir_def;
 	ufs->config.index = ovl_index_def;
+	ufs->config.metacopy = ovl_metacopy_def;
+
 	err = ovl_parse_opt((char *) data, &ufs->config);
 	if (err)
 		goto out_free_config;
@@ -1057,7 +1086,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	else if (ufs->upper_mnt->mnt_sb != ufs->same_sb)
 		ufs->same_sb = NULL;
 
-	if (!(ovl_force_readonly(ufs)) && ufs->config.index) {
+	if (!(ovl_force_readonly(ufs)) &&
+	      (ufs->config.index || ufs->config.metacopy)) {
 		/* Verify lower root is upper root origin */
 		err = ovl_verify_origin(upperpath.dentry, ufs->lower_mnt[0],
 					stack[0].dentry, false, true);
@@ -1065,7 +1095,9 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 			pr_err("overlayfs: failed to verify upper root origin\n");
 			goto out_put_lower_mnt;
 		}
+	}
 
+	if (!(ovl_force_readonly(ufs)) && ufs->config.index) {
 		ufs->indexdir = ovl_workdir_create(sb, ufs, workpath.dentry,
 						   OVL_INDEXDIR_NAME, true);
 		if (ufs->indexdir) {
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 05/11] ovl: Copy up only metadata during copy up where it makes sense
  2017-10-17 21:05 [RFC PATCH 00/11][V4] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (3 preceding siblings ...)
  2017-10-17 21:05 ` [PATCH 04/11] ovl: Provide a mount option metacopy=on/off for metadata copyup Vivek Goyal
@ 2017-10-17 21:05 ` Vivek Goyal
  2017-10-18  4:46   ` Amir Goldstein
  2017-10-17 21:05 ` [PATCH 06/11] ovl: Set xattr OVL_XATTR_METACOPY on upper file Vivek Goyal
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 48+ messages in thread
From: Vivek Goyal @ 2017-10-17 21:05 UTC (permalink / raw)
  To: linux-unionfs; +Cc: amir73il, miklos, vgoyal

If it makes sense to copy up only metadata during copy up, do it. This
is done for regular files which are not opened for WRITE and have origin
being saved.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/copy_up.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 4c0acb2a23ea..cf0b36518a3a 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -306,11 +306,8 @@ static int ovl_set_origin(struct dentry *dentry, struct dentry *lower,
 			return PTR_ERR(fh);
 	}
 
-	/*
-	 * Do not fail when upper doesn't support xattrs.
-	 */
 	err = ovl_check_setxattr(dentry, upper, OVL_XATTR_ORIGIN, fh,
-				 fh ? fh->len : 0, 0);
+				 fh ? fh->len : 0, -EOPNOTSUPP);
 	kfree(fh);
 
 	return err;
@@ -328,6 +325,7 @@ struct ovl_copy_up_ctx {
 	struct dentry *workdir;
 	bool tmpfile;
 	bool origin;
+	bool metadata_only;
 };
 
 static int ovl_link_up(struct ovl_copy_up_ctx *c)
@@ -458,11 +456,15 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
 	 */
 	if (c->origin) {
 		err = ovl_set_origin(c->dentry, c->lowerpath.dentry, temp);
-		if (err)
-			return err;
+		if (err) {
+			if (err != -EOPNOTSUPP)
+				return err;
+			/* Upper does not support xattr. Copy up data as well */
+			c->metadata_only = false;
+		}
 	}
 
-	if (S_ISREG(c->stat.mode)) {
+	if (S_ISREG(c->stat.mode) && !c->metadata_only) {
 		struct path upperpath;
 
 		ovl_path_upper(c->dentry, &upperpath);
@@ -589,11 +591,13 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 	int err;
 	DEFINE_DELAYED_CALL(done);
 	struct path parentpath;
+	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
 	struct ovl_copy_up_ctx ctx = {
 		.parent = parent,
 		.dentry = dentry,
 		.workdir = ovl_workdir(dentry),
 		.origin = true,
+		.metadata_only = ofs->config.metacopy,
 	};
 
 	if (WARN_ON(!ctx.workdir))
@@ -614,9 +618,17 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 	if (err)
 		return err;
 
+	if (!S_ISREG(ctx.stat.mode))
+		ctx.metadata_only = false;
+
 	/* maybe truncate regular file. this has no effect on dirs */
-	if (flags & O_TRUNC)
+	if (flags & O_TRUNC) {
 		ctx.stat.size = 0;
+		ctx.metadata_only = false;
+	}
+
+	if (flags & (OPEN_FMODE(flags) & FMODE_WRITE))
+		ctx.metadata_only = false;
 
 	if (S_ISLNK(ctx.stat.mode)) {
 		ctx.link = vfs_get_link(ctx.lowerpath.dentry, &done);
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 06/11] ovl: Set xattr OVL_XATTR_METACOPY on upper file
  2017-10-17 21:05 [RFC PATCH 00/11][V4] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (4 preceding siblings ...)
  2017-10-17 21:05 ` [PATCH 05/11] ovl: Copy up only metadata during copy up where it makes sense Vivek Goyal
@ 2017-10-17 21:05 ` Vivek Goyal
  2017-10-18  4:57   ` Amir Goldstein
  2017-10-17 21:05 ` [PATCH 07/11] ovl: Fix ovl_getattr() to get number of blocks from lower Vivek Goyal
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 48+ messages in thread
From: Vivek Goyal @ 2017-10-17 21:05 UTC (permalink / raw)
  To: linux-unionfs; +Cc: amir73il, miklos, vgoyal

Presence of OVL_XATTR_METACOPY reflects that file has been copied up
with metadata only and data needs to be copied up later from lower.
So this xattr is set when a metadata copy takes place and cleared when
data copy takes place.

We also use a bit in ovl_inode->flags to cache OVL_METACOPY which reflects
whether ovl inode has only metadata copied up.

ovl_set_size() code has been taken from Amir's patch.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/copy_up.c   | 59 ++++++++++++++++++++++++++++++++++++++++++------
 fs/overlayfs/inode.c     |  3 ++-
 fs/overlayfs/overlayfs.h |  5 +++-
 fs/overlayfs/util.c      | 21 +++++++++++++++--
 4 files changed, 77 insertions(+), 11 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index cf0b36518a3a..99b7be4ff4fc 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -196,6 +196,16 @@ static int ovl_copy_up_data(struct path *old, struct path *new, loff_t len)
 	return error;
 }
 
+static int ovl_set_size(struct dentry *upperdentry, struct kstat *stat)
+{
+	struct iattr attr = {
+		.ia_valid = ATTR_SIZE,
+		.ia_size = stat->size,
+	};
+
+	return notify_change(upperdentry, &attr, NULL);
+}
+
 static int ovl_set_timestamps(struct dentry *upperdentry, struct kstat *stat)
 {
 	struct iattr attr = {
@@ -439,6 +449,27 @@ static int ovl_get_tmpfile(struct ovl_copy_up_ctx *c, struct dentry **tempp)
 	goto out;
 }
 
+/* Copy up data of an inode which was copied up metadata only in the past. */
+static int ovl_copy_up_meta_inode_data(struct ovl_copy_up_ctx *c)
+{
+	struct path upperpath;
+	int err;
+
+	ovl_path_upper(c->dentry, &upperpath);
+	BUG_ON(upperpath.dentry == NULL);
+
+	err = ovl_copy_up_data(&c->lowerpath, &upperpath, c->stat.size);
+	if (err)
+		return err;
+
+	err= vfs_removexattr(upperpath.dentry, OVL_XATTR_METACOPY);
+	if (err)
+		return err;
+
+	ovl_clear_flag(OVL_METACOPY, d_inode(c->dentry));
+	return err;
+}
+
 static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
 {
 	int err;
@@ -476,13 +507,21 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
 			return err;
 	}
 
+	if (c->metadata_only) {
+		err = ovl_check_setxattr(c->dentry, temp, OVL_XATTR_METACOPY,
+					 NULL, 0, -EOPNOTSUPP);
+		if (err)
+			return err;
+	}
+
 	inode_lock(temp->d_inode);
-	err = ovl_set_attr(temp, &c->stat);
-	inode_unlock(temp->d_inode);
-	if (err)
-		return err;
+	if (c->metadata_only)
+		err = ovl_set_size(temp, &c->stat);
 
-	return 0;
+	if (!err)
+		err = ovl_set_attr(temp, &c->stat);
+	inode_unlock(temp->d_inode);
+	return err;
 }
 
 static int ovl_copy_up_locked(struct ovl_copy_up_ctx *c)
@@ -510,6 +549,8 @@ static int ovl_copy_up_locked(struct ovl_copy_up_ctx *c)
 	if (err)
 		goto out_cleanup;
 
+	if (c->metadata_only)
+		ovl_set_flag(OVL_METACOPY, d_inode(c->dentry));
 	ovl_inode_update(d_inode(c->dentry), newdentry);
 out:
 	dput(temp);
@@ -637,7 +678,7 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 	}
 	ovl_do_check_copy_up(ctx.lowerpath.dentry);
 
-	err = ovl_copy_up_start(dentry);
+	err = ovl_copy_up_start(dentry, flags);
 	/* err < 0: interrupted, err > 0: raced with another copy-up */
 	if (unlikely(err)) {
 		if (err > 0)
@@ -647,6 +688,8 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 			err = ovl_do_copy_up(&ctx);
 		if (!err && !ovl_dentry_has_upper_alias(dentry))
 			err = ovl_link_up(&ctx);
+		if (!err && ovl_dentry_needs_data_copy_up(dentry, flags))
+			err = ovl_copy_up_meta_inode_data(&ctx);
 		ovl_copy_up_end(dentry);
 	}
 	do_delayed_call(&done);
@@ -677,8 +720,10 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
 		 *      with rename.
 		 */
 		if (ovl_dentry_upper(dentry) &&
-		    ovl_dentry_has_upper_alias(dentry))
+		    ovl_dentry_has_upper_alias(dentry) &&
+		    !ovl_dentry_needs_data_copy_up(dentry, flags)) {
 			break;
+		}
 
 		next = dget(dentry);
 		/* find the topmost dentry not yet copied up */
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 321511ed8c42..1b4b42c45ed5 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -320,7 +320,8 @@ struct posix_acl *ovl_get_acl(struct inode *inode, int type)
 static bool ovl_open_need_copy_up(struct dentry *dentry, int flags)
 {
 	if (ovl_dentry_upper(dentry) &&
-	    ovl_dentry_has_upper_alias(dentry))
+	    ovl_dentry_has_upper_alias(dentry) &&
+	    !ovl_dentry_needs_data_copy_up(dentry, flags))
 		return false;
 
 	if (special_file(d_inode(dentry)->i_mode))
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index d9a0edd4e57e..dcec9456c654 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -26,10 +26,12 @@ enum ovl_path_type {
 #define OVL_XATTR_ORIGIN OVL_XATTR_PREFIX "origin"
 #define OVL_XATTR_IMPURE OVL_XATTR_PREFIX "impure"
 #define OVL_XATTR_NLINK OVL_XATTR_PREFIX "nlink"
+#define OVL_XATTR_METACOPY OVL_XATTR_PREFIX "metacopy"
 
 enum ovl_flag {
 	OVL_IMPURE,
 	OVL_INDEX,
+	OVL_METACOPY,
 };
 
 /*
@@ -211,6 +213,7 @@ bool ovl_dentry_is_whiteout(struct dentry *dentry);
 void ovl_dentry_set_opaque(struct dentry *dentry);
 bool ovl_dentry_has_upper_alias(struct dentry *dentry);
 void ovl_dentry_set_upper_alias(struct dentry *dentry);
+bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags);
 bool ovl_redirect_dir(struct super_block *sb);
 const char *ovl_dentry_get_redirect(struct dentry *dentry);
 void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
@@ -221,7 +224,7 @@ void ovl_dentry_version_inc(struct dentry *dentry, bool impurity);
 u64 ovl_dentry_version_get(struct dentry *dentry);
 bool ovl_is_whiteout(struct dentry *dentry);
 struct file *ovl_path_open(struct path *path, int flags);
-int ovl_copy_up_start(struct dentry *dentry);
+int ovl_copy_up_start(struct dentry *dentry, int flags);
 void ovl_copy_up_end(struct dentry *dentry);
 bool ovl_check_dir_xattr(struct dentry *dentry, const char *name);
 int ovl_check_setxattr(struct dentry *dentry, struct dentry *upperdentry,
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index a4ce1c9944f0..91c542d1a39d 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -227,6 +227,22 @@ void ovl_dentry_set_upper_alias(struct dentry *dentry)
 	oe->has_upper = true;
 }
 
+bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags) {
+	if (!S_ISREG(d_inode(dentry)->i_mode))
+		return false;
+
+	if (!flags)
+		return false;
+
+	if (!(OPEN_FMODE(flags) & FMODE_WRITE))
+		return false;
+
+	if (!ovl_test_flag(OVL_METACOPY, d_inode(dentry)))
+		return false;
+
+	return true;
+}
+
 bool ovl_redirect_dir(struct super_block *sb)
 {
 	struct ovl_fs *ofs = sb->s_fs_info;
@@ -310,13 +326,14 @@ struct file *ovl_path_open(struct path *path, int flags)
 	return dentry_open(path, flags | O_NOATIME, current_cred());
 }
 
-int ovl_copy_up_start(struct dentry *dentry)
+int ovl_copy_up_start(struct dentry *dentry, int flags)
 {
 	struct ovl_inode *oi = OVL_I(d_inode(dentry));
 	int err;
 
 	err = mutex_lock_interruptible(&oi->lock);
-	if (!err && ovl_dentry_has_upper_alias(dentry)) {
+	if (!err && ovl_dentry_has_upper_alias(dentry) &&
+	    !ovl_dentry_needs_data_copy_up(dentry, flags)) {
 		err = 1; /* Already copied up */
 		mutex_unlock(&oi->lock);
 	}
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 07/11] ovl: Fix ovl_getattr() to get number of blocks from lower
  2017-10-17 21:05 [RFC PATCH 00/11][V4] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (5 preceding siblings ...)
  2017-10-17 21:05 ` [PATCH 06/11] ovl: Set xattr OVL_XATTR_METACOPY on upper file Vivek Goyal
@ 2017-10-17 21:05 ` Vivek Goyal
  2017-10-18  5:01   ` Amir Goldstein
  2017-10-17 21:05 ` [PATCH 08/11] ovl: Set OVL_METACOPY flag during ovl_lookup() Vivek Goyal
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 48+ messages in thread
From: Vivek Goyal @ 2017-10-17 21:05 UTC (permalink / raw)
  To: linux-unionfs; +Cc: amir73il, miklos, vgoyal

If an inode has been copied up metadata only, then we need to query the
number of blocks from lower and fill up the stat->st_blocks.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/inode.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 1b4b42c45ed5..ad30edc0f425 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -140,6 +140,15 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 	if (!is_dir && ovl_test_flag(OVL_INDEX, d_inode(dentry)))
 		stat->nlink = dentry->d_inode->i_nlink;
 
+	if (ovl_test_flag(OVL_METACOPY, d_inode(dentry))) {
+		struct kstat lowerstat;
+
+		ovl_path_lower(dentry, &realpath);
+		err = vfs_getattr(&realpath, &lowerstat, STATX_BLOCKS, flags);
+		if (err)
+			goto out;
+		stat->blocks = lowerstat.blocks;
+	}
 out:
 	revert_creds(old_cred);
 
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 08/11] ovl: Set OVL_METACOPY flag during ovl_lookup()
  2017-10-17 21:05 [RFC PATCH 00/11][V4] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (6 preceding siblings ...)
  2017-10-17 21:05 ` [PATCH 07/11] ovl: Fix ovl_getattr() to get number of blocks from lower Vivek Goyal
@ 2017-10-17 21:05 ` Vivek Goyal
  2017-10-18  5:06   ` Amir Goldstein
  2017-10-17 21:05 ` [PATCH 09/11] ovl: Return lower dentry if only metadata copy up took place Vivek Goyal
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 48+ messages in thread
From: Vivek Goyal @ 2017-10-17 21:05 UTC (permalink / raw)
  To: linux-unionfs; +Cc: amir73il, miklos, vgoyal

During lookup, check for presence of OVL_XATTR_METACOPY and if present,
set OVL_METACOPY bit in flags.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/namei.c | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 88ff1daeb3d7..d32da041f4bf 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -26,6 +26,24 @@ struct ovl_lookup_data {
 	char *redirect;
 };
 
+/* err < 0, 0 if no metacopy xattr, 1 if metacopy xattr found */
+static int ovl_check_metacopy(struct dentry *dentry)
+{
+	int res;
+
+	res = vfs_getxattr(dentry, OVL_XATTR_METACOPY, NULL, 0);
+	if (res < 0) {
+		if (res == -ENODATA || res == -EOPNOTSUPP)
+			return 0;
+		goto out;
+	}
+
+	return 1;
+out:
+	pr_warn_ratelimited("overlayfs: failed to get metacopy (%i)\n", res);
+	return res;
+}
+
 static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
 			      size_t prelen, const char *post)
 {
@@ -594,6 +612,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	struct dentry *this;
 	unsigned int i;
 	int err;
+	bool metacopy = false;
 	struct ovl_lookup_data d = {
 		.name = dentry->d_name,
 		.is_dir = false,
@@ -634,6 +653,12 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 					       roe->numlower, &stack, &ctr);
 			if (err)
 				goto out;
+
+			err = ovl_check_metacopy(upperdentry);
+			if (err < 0)
+				goto out;
+			if (err == 1)
+				metacopy = true;
 		}
 
 		if (d.redirect) {
@@ -719,6 +744,19 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		OVL_I(inode)->redirect = upperredirect;
 		if (index)
 			ovl_set_flag(OVL_INDEX, inode);
+
+		if (metacopy) {
+			/*
+			 * Found an upper with metacopy set but at the same
+			 * time there is no lower dentry. Something is not
+			 * right.
+			 */
+			if (!ctr) {
+				err = -ESTALE;
+				goto out_put_inode;
+			}
+			ovl_set_flag(OVL_METACOPY, inode);
+		}
 	}
 
 	revert_creds(old_cred);
@@ -729,6 +767,8 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 
 	return NULL;
 
+out_put_inode:
+	iput(inode);
 out_free_oe:
 	dentry->d_fsdata = NULL;
 	kfree(oe);
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 09/11] ovl: Return lower dentry if only metadata copy up took place
  2017-10-17 21:05 [RFC PATCH 00/11][V4] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (7 preceding siblings ...)
  2017-10-17 21:05 ` [PATCH 08/11] ovl: Set OVL_METACOPY flag during ovl_lookup() Vivek Goyal
@ 2017-10-17 21:05 ` Vivek Goyal
  2017-10-18  5:07   ` Amir Goldstein
  2017-10-17 21:05 ` [PATCH 10/11] ovl: Introduce read/write barriers around metacopy flag update Vivek Goyal
  2017-10-17 21:05 ` [PATCH 11/11] ovl: Put barriers to order oi->__upperdentry and OVL_METACOPY update Vivek Goyal
  10 siblings, 1 reply; 48+ messages in thread
From: Vivek Goyal @ 2017-10-17 21:05 UTC (permalink / raw)
  To: linux-unionfs; +Cc: amir73il, miklos, vgoyal

Upper dentry inode does not have data. So return lower dentry if upper
is only a metadata copy.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/super.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 32e3d4be1a71..fa802c89235a 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -101,10 +101,14 @@ static struct dentry *ovl_d_real(struct dentry *dentry,
 			err = ovl_check_append_only(d_inode(real), open_flags);
 			if (err)
 				return ERR_PTR(err);
+
+			if (ovl_test_flag(OVL_METACOPY, d_inode(dentry)))
+				goto lower;
 		}
 		return real;
 	}
 
+lower:
 	real = ovl_dentry_lower(dentry);
 	if (!real)
 		goto bug;
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 10/11] ovl: Introduce read/write barriers around metacopy flag update
  2017-10-17 21:05 [RFC PATCH 00/11][V4] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (8 preceding siblings ...)
  2017-10-17 21:05 ` [PATCH 09/11] ovl: Return lower dentry if only metadata copy up took place Vivek Goyal
@ 2017-10-17 21:05 ` Vivek Goyal
  2017-10-18  5:19   ` Amir Goldstein
  2017-10-17 21:05 ` [PATCH 11/11] ovl: Put barriers to order oi->__upperdentry and OVL_METACOPY update Vivek Goyal
  10 siblings, 1 reply; 48+ messages in thread
From: Vivek Goyal @ 2017-10-17 21:05 UTC (permalink / raw)
  To: linux-unionfs; +Cc: amir73il, miklos, vgoyal

If a file is copied up metadata only and later when same file is opened
for WRITE, then data copy up takes place. We copy up data, remove METACOPY
xattr and then clear the METACOPY flag from ovl_entry->flags. While all
these operations happen with oi->lock held, read side of oi->flags is
lockless. That is another thread on another cpu can check if METACOPY
flag is clear or not.

So this gives us an ordering requirement w.r.t METACOPY flag. That is, if
another cpu sees METACOPY flag clear, then it should be guaranteed that
effects of data copy up and remove xattr operations are also visible.

For example.

	CPU1				CPU2
ovl_copy_up_flags()			acquire(oi->lock)
 ovl_dentry_needs_data_copy_up()	  ovl_copy_up_data()
   ovl_test_flag(OVL_METACOPY)		  vfs_removexattr()
					  ovl_clear_metacopy_flag()
					release(oi->lock)

Say CPU2 is copying up data and in the end clears metacopy flag. But if
CPU1 perceives the effects of clearing of metacopy flag but not effects of
preceeding operations, that would be a problem.

Hence this patch introduces smp_wmb() on metacopy flag clear operation
and smp_rmb() on metacopy flag test operation.

May be some other lock or barrier is already covering it. But I am not sure
what that is and is it obvious enough that we will not break it in future.

So hence trying to be safe here and introducing barriers explicitly for
metacopy bit.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/copy_up.c |  6 ++++++
 fs/overlayfs/util.c    | 12 ++++++++++--
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 99b7be4ff4fc..200ee3b7d397 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -466,6 +466,12 @@ static int ovl_copy_up_meta_inode_data(struct ovl_copy_up_ctx *c)
 	if (err)
 		return err;
 
+	/*
+	 * Pairs with smp_rmb() after test_bit(OVL_METACOPY). Make sure
+	 * if OVL_METACOPY flag reset is visible, then all the write
+	 * operations before it are visible as well
+	 */
+	smp_wmb();
 	ovl_clear_flag(OVL_METACOPY, d_inode(c->dentry));
 	return err;
 }
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 91c542d1a39d..6bca9a960c6d 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -228,6 +228,8 @@ void ovl_dentry_set_upper_alias(struct dentry *dentry)
 }
 
 bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags) {
+	bool metacopy;
+
 	if (!S_ISREG(d_inode(dentry)->i_mode))
 		return false;
 
@@ -237,9 +239,15 @@ bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags) {
 	if (!(OPEN_FMODE(flags) & FMODE_WRITE))
 		return false;
 
-	if (!ovl_test_flag(OVL_METACOPY, d_inode(dentry)))
+	metacopy = ovl_test_flag(OVL_METACOPY, d_inode(dentry));
+	/*
+	 * Pairs with smp_wmb() while clearing OVL_METACOPY. Make sure if
+	 * clearing of OVL_METACOPY is visible, then effects of writes
+	 * before that are visible too.
+	 */
+	smp_rmb();
+	if (!metacopy)
 		return false;
-
 	return true;
 }
 
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 11/11] ovl: Put barriers to order oi->__upperdentry and OVL_METACOPY update
  2017-10-17 21:05 [RFC PATCH 00/11][V4] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (9 preceding siblings ...)
  2017-10-17 21:05 ` [PATCH 10/11] ovl: Introduce read/write barriers around metacopy flag update Vivek Goyal
@ 2017-10-17 21:05 ` Vivek Goyal
  2017-10-18  5:40   ` Amir Goldstein
  10 siblings, 1 reply; 48+ messages in thread
From: Vivek Goyal @ 2017-10-17 21:05 UTC (permalink / raw)
  To: linux-unionfs; +Cc: amir73il, miklos, vgoyal

OVL_METACOPY in oi->flags can be accessed in lockless manner. So if a file
is being copied up metadata only, we need to make sure that upperdentry is
visible only after OVL_METACOPY flag has been set. IOW, if oi->__upperdentry
is visble to a cpu, then we also need to make sure any updates to OVL_METACOPY
flags are visible too.

Otherwise it might happen that a file was copied up metadata only but d_real()
might think that this is not a metacopy only dentry and return upper dentry
instead of lower dentry. Something similar might happen on getattr() path. We
might return number of blocks from upper dentry instead of querying lower and
getting number of blocks from there.

There is already a smp_read_barrier_depends(), when we fetch oi->__upperdentry
pointer.

static inline struct dentry *ovl_upperdentry_dereference(struct ovl_inode *oi)
{
        return lockless_dereference(oi->__upperdentry);
}

lockless_deference() has smp_read_barrier_depends().

My understanding is that, this is a data dependency barrier and not a full
read barrier. So it will only make sure that any updates to "dentry" are
visible but there are no guarantees about oi->flags updates. So a data
dependency barrier might not be sufficient.

Hence this patch introduces an extra smp_rmb() explicitly. So following is
the intended pattern.

	CPU1					CPU2

	LOAD oi->__upperdentry			STORE oi->flags
	smp_rmb();				smp_wmb();
	LOAD oi->flags				STORE oi->__upperdentry;

This should make sure that if any oi->__upperdentry is visible to CPU1, then
updates to oi->flags are visible too.

Question1: Should we replace data dependency barrier with full read barrier
in ovl_upperdentry_dereference(), instead?

Question2: Am I overthinking and these barriers are not needed and some
	   other implicity barriers already cover this situation.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/inode.c | 5 +++++
 fs/overlayfs/super.c | 6 ++++++
 fs/overlayfs/util.c  | 6 ++++++
 3 files changed, 17 insertions(+)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index ad30edc0f425..ee623674628e 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -140,6 +140,11 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 	if (!is_dir && ovl_test_flag(OVL_INDEX, d_inode(dentry)))
 		stat->nlink = dentry->d_inode->i_nlink;
 
+	/*
+	 * Pairs with smp_wmb() in ovl_inode_update(). Make sure if upperdenty
+	 * is visible, then any possible udpate to OVL_METACOPY is visible too.
+	 */
+	smp_rmb();
 	if (ovl_test_flag(OVL_METACOPY, d_inode(dentry))) {
 		struct kstat lowerstat;
 
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index fa802c89235a..fc6a78b31739 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -102,6 +102,12 @@ static struct dentry *ovl_d_real(struct dentry *dentry,
 			if (err)
 				return ERR_PTR(err);
 
+			/*
+			 * Pairs with smp_wmb() in ovl_inode_update(). Make
+			 * sure if upper dentry is visible, then any updates
+			 * to OVL_METACOPY flags are visible too.
+			 */
+			smp_rmb();
 			if (ovl_test_flag(OVL_METACOPY, d_inode(dentry)))
 				goto lower;
 		}
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 6bca9a960c6d..8567a9d59230 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -239,6 +239,12 @@ bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags) {
 	if (!(OPEN_FMODE(flags) & FMODE_WRITE))
 		return false;
 
+	/*
+	 * This smp_rmb() pairs with smp_wmb() in ovl_inode_update(). Basically
+	 * we are trying to make sure that if upper dentry is visible, then
+	 * make sure OVL_METACOPY flag is visible too.
+	 */
+	smp_rmb();
 	metacopy = ovl_test_flag(OVL_METACOPY, d_inode(dentry));
 	/*
 	 * Pairs with smp_wmb() while clearing OVL_METACOPY. Make sure if
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 01/11] ovl: Create origin xattr on copy up for all files
  2017-10-17 21:05 ` [PATCH 01/11] ovl: Create origin xattr on copy up for all files Vivek Goyal
@ 2017-10-18  4:09   ` Amir Goldstein
  2017-10-18 12:55     ` Vivek Goyal
  0 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-10-18  4:09 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> Right now my understanding is that origin xattr is created for all copied
> up files if index=on. And if index=off, then we create it for all type
> of files except hardlinks (nlink != 1).
>
> With metadata only copy up, I will still require origin xattr to copy up
> data later, so create it even for hardlinks even with index=off. I am
> hoping it does not break anything because we do OVL_INDEX test before
> using inode number of origin.

I still have a nudging buzz that loosing the information about this upper
being a broken hardlink may come back to bite us some time, but can't
put my finger on how.

To reduce the level of that buzz, I suggest to set origin for broken hardlink
only when doing metacopy and then break the origin association when
removing metacopy xattr.

In any case, this patch needs to update the comment above setting
origin xattr that claims to break the association with broken hardlink.

>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> ---
>  fs/overlayfs/copy_up.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index c441f9387a1b..0a6294ea64d5 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -538,9 +538,6 @@ static int ovl_do_copy_up(struct ovl_copy_up_ctx *c)
>             c->stat.nlink > 1)
>                 indexed = true;
>
> -       if (S_ISDIR(c->stat.mode) || c->stat.nlink == 1 || indexed)
> -               c->origin = true;
> -
>         if (indexed) {
>                 c->destdir = ovl_indexdir(c->dentry->d_sb);
>                 err = ovl_get_index_name(c->lowerpath.dentry, &c->destname);
> @@ -596,6 +593,7 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
>                 .parent = parent,
>                 .dentry = dentry,
>                 .workdir = ovl_workdir(dentry),
> +               .origin = true,
>         };
>
>         if (WARN_ON(!ctx.workdir))
> --
> 2.13.5
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 02/11] ovl: ovl_check_setxattr() get rid of redundant -EOPNOTSUPP check
  2017-10-17 21:05 ` [PATCH 02/11] ovl: ovl_check_setxattr() get rid of redundant -EOPNOTSUPP check Vivek Goyal
@ 2017-10-18  4:11   ` Amir Goldstein
  0 siblings, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-10-18  4:11 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> At mount time we check if upper supports xattr or not and set ofs->noxattr
> field accordingly. But in ovl_check_setxattr() still seems to have logic
> which expects that ovl_do_setxattr() can return -EOPNOTSUPP. Not sure when
> and how can we hit this code. Feels redundant to me. So I am removing
> this code.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>

Looks good

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

> ---
>  fs/overlayfs/util.c | 11 +----------
>  1 file changed, 1 insertion(+), 10 deletions(-)
>
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index b9b239fa5cfd..a4ce1c9944f0 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -348,21 +348,12 @@ int ovl_check_setxattr(struct dentry *dentry, struct dentry *upperdentry,
>                        const char *name, const void *value, size_t size,
>                        int xerr)
>  {
> -       int err;
>         struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
>
>         if (ofs->noxattr)
>                 return xerr;
>
> -       err = ovl_do_setxattr(upperdentry, name, value, size, 0);
> -
> -       if (err == -EOPNOTSUPP) {
> -               pr_warn("overlayfs: cannot set %s xattr on upper\n", name);
> -               ofs->noxattr = true;
> -               return xerr;
> -       }
> -
> -       return err;
> +       return ovl_do_setxattr(upperdentry, name, value, size, 0);
>  }
>
>  int ovl_set_impure(struct dentry *dentry, struct dentry *upperdentry)
> --
> 2.13.5
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 03/11] ovl: During copy up, first copy up metadata and then data
  2017-10-17 21:05 ` [PATCH 03/11] ovl: During copy up, first copy up metadata and then data Vivek Goyal
@ 2017-10-18  4:13   ` Amir Goldstein
  2017-10-18  4:39     ` Amir Goldstein
  0 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-10-18  4:13 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> This just helps with later patches where after copying up metadata, we
> skip data copying step, if needed.

This patch looks fine, but I must be missing how it helps with later patches.
If it doesn't help please remove it.

>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> ---
>  fs/overlayfs/copy_up.c | 34 +++++++++++++++++-----------------
>  1 file changed, 17 insertions(+), 17 deletions(-)
>
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index 0a6294ea64d5..4c0acb2a23ea 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -445,6 +445,23 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
>  {
>         int err;
>
> +       err = ovl_copy_xattr(c->lowerpath.dentry, temp);
> +       if (err)
> +               return err;
> +
> +       /*
> +        * Store identifier of lower inode in upper inode xattr to
> +        * allow lookup of the copy up origin inode.
> +        *
> +        * Don't set origin when we are breaking the association with a lower
> +        * hard link.
> +        */
> +       if (c->origin) {
> +               err = ovl_set_origin(c->dentry, c->lowerpath.dentry, temp);
> +               if (err)
> +                       return err;
> +       }
> +
>         if (S_ISREG(c->stat.mode)) {
>                 struct path upperpath;
>
> @@ -457,29 +474,12 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
>                         return err;
>         }
>
> -       err = ovl_copy_xattr(c->lowerpath.dentry, temp);
> -       if (err)
> -               return err;
> -
>         inode_lock(temp->d_inode);
>         err = ovl_set_attr(temp, &c->stat);
>         inode_unlock(temp->d_inode);
>         if (err)
>                 return err;
>
> -       /*
> -        * Store identifier of lower inode in upper inode xattr to
> -        * allow lookup of the copy up origin inode.
> -        *
> -        * Don't set origin when we are breaking the association with a lower
> -        * hard link.
> -        */
> -       if (c->origin) {
> -               err = ovl_set_origin(c->dentry, c->lowerpath.dentry, temp);
> -               if (err)
> -                       return err;
> -       }
> -
>         return 0;
>  }
>
> --
> 2.13.5
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 04/11] ovl: Provide a mount option metacopy=on/off for metadata copyup
  2017-10-17 21:05 ` [PATCH 04/11] ovl: Provide a mount option metacopy=on/off for metadata copyup Vivek Goyal
@ 2017-10-18  4:31   ` Amir Goldstein
  2017-10-18 13:03     ` Vivek Goyal
  2017-10-18 14:10     ` Vivek Goyal
  0 siblings, 2 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-10-18  4:31 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> By default metadata only copy up is disabled. Provide a mount option so
> that users can choose one way or other.
>
> Also provide a kernel config and module option to enable/disable
> metacopy feature.
>
> Like index feature, when overlay is mounted, on root upper directory we
> set ORIGIN which points to lower. And at later mount time it is verified
> again. This hopes to get the configuration right. But this does only so
> much as we don't verify all the lowers. So it is possible that a lower is
> missing and later data copy up fails.

Like index feature, please error mount if ovl_inuse_trylock fails.
As you know, this error is only conditional because of backward
compatibility, so any new opt-in feature should enforce it.

>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> ---
>  fs/overlayfs/Kconfig     |  8 ++++++++
>  fs/overlayfs/ovl_entry.h |  1 +
>  fs/overlayfs/super.c     | 38 +++++++++++++++++++++++++++++++++++---
>  3 files changed, 44 insertions(+), 3 deletions(-)
>
> diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
> index cbfc196e5dc5..17a0b17ad14c 100644
> --- a/fs/overlayfs/Kconfig
> +++ b/fs/overlayfs/Kconfig
> @@ -43,3 +43,11 @@ config OVERLAY_FS_INDEX
>           outcomes.  However, mounting the same overlay with an old kernel
>           read-write and then mounting it again with a new kernel, will have
>           unexpected results.
> +
> +config OVERLAY_FS_METACOPY
> +       bool "Overlayfs: turn on metadata only copy up feature by default"
> +       depends on OVERLAY_FS
> +       help
> +         If this config option is enabled then overlay filesystems will
> +         copy up only metadata where appropriate and data copy up will
> +         happen when a file is opended for WRITE operation.
> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> index 25d9b5adcd42..6806f0b0fbc2 100644
> --- a/fs/overlayfs/ovl_entry.h
> +++ b/fs/overlayfs/ovl_entry.h
> @@ -15,6 +15,7 @@ struct ovl_config {
>         bool default_permissions;
>         bool redirect_dir;
>         bool index;
> +       bool metacopy;
>  };
>
>  /* private information held for overlayfs's superblock */
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 092d150643c1..32e3d4be1a71 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -39,6 +39,11 @@ module_param_named(index, ovl_index_def, bool, 0644);
>  MODULE_PARM_DESC(ovl_index_def,
>                  "Default to on or off for the inodes index feature");
>
> +static bool ovl_metacopy_def = IS_ENABLED(CONFIG_OVERLAY_FS_METACOPY);
> +module_param_named(metacopy, ovl_metacopy_def, bool, 0644);
> +MODULE_PARM_DESC(ovl_metacopy_def,
> +                "Default to on or off for the metadata only copy up feature");
> +
>  static void ovl_dentry_release(struct dentry *dentry)
>  {
>         struct ovl_entry *oe = dentry->d_fsdata;
> @@ -303,6 +308,9 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
>         if (ufs->config.index != ovl_index_def)
>                 seq_printf(m, ",index=%s",
>                            ufs->config.index ? "on" : "off");
> +       if (ufs->config.metacopy != ovl_metacopy_def)
> +               seq_printf(m, ",metacopy=%s",
> +                          ufs->config.metacopy ? "on" : "off");
>         return 0;
>  }
>
> @@ -336,6 +344,8 @@ enum {
>         OPT_REDIRECT_DIR_OFF,
>         OPT_INDEX_ON,
>         OPT_INDEX_OFF,
> +       OPT_METACOPY_ON,
> +       OPT_METACOPY_OFF,
>         OPT_ERR,
>  };
>
> @@ -348,6 +358,8 @@ static const match_table_t ovl_tokens = {
>         {OPT_REDIRECT_DIR_OFF,          "redirect_dir=off"},
>         {OPT_INDEX_ON,                  "index=on"},
>         {OPT_INDEX_OFF,                 "index=off"},
> +       {OPT_METACOPY_ON,               "metacopy=on"},
> +       {OPT_METACOPY_OFF,              "metacopy=off"},
>         {OPT_ERR,                       NULL}
>  };
>
> @@ -428,6 +440,14 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
>                         config->index = false;
>                         break;
>
> +               case OPT_METACOPY_ON:
> +                       config->metacopy = true;
> +                       break;
> +
> +               case OPT_METACOPY_OFF:
> +                       config->metacopy = false;
> +                       break;
> +
>                 default:
>                         pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p);
>                         return -EINVAL;
> @@ -644,9 +664,16 @@ static int ovl_lower_dir(const char *name, struct path *path,
>          * The inodes index feature needs to encode and decode file
>          * handles, so it requires that all layers support them.
>          */
> -       if (ofs->config.index && !ovl_can_decode_fh(path->dentry->d_sb)) {
> +       if ((ofs->config.index || ofs->config.metacopy) &&
> +            !ovl_can_decode_fh(path->dentry->d_sb)) {
> +               if (ofs->config.index)
> +                       pr_warn("overlayfs: fs on '%s' does not support file handles, falling back to index=off.\n", name);
> +
> +               if (ofs->config.metacopy)
> +                       pr_warn("overlayfs: fs on '%s' does not support file handles, falling back to metacopy=off.\n", name);
> +

In my verify_dir patches I used the following more compact warning
style instead of granular warnings:

    pr_warn("overlayfs: fs on '%s' does not support file handles,
falling back to index=off,metacopy=off.\n", name);

It is a bit less informative, but IMO the result in nicer to look at,
both in the code and in dmesg.
Others may have a different opinion. Please consider.
For example, see how the following warning from my verify_dir patches
has expanded and imagine how
much more messier it would be if we broke it into separate warnings:

    pr_warn("overlayfs: upper fs does not support xattr, falling back
to redirect_dir=off, index=off, no verify_dir and no opaque dir.\n");


>                 ofs->config.index = false;
> -               pr_warn("overlayfs: fs on '%s' does not support file handles, falling back to index=off.\n", name);
> +               ofs->config.metacopy = false;
>         }
>
>         return 0;
> @@ -847,6 +874,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>
>         ufs->config.redirect_dir = ovl_redirect_dir_def;
>         ufs->config.index = ovl_index_def;
> +       ufs->config.metacopy = ovl_metacopy_def;
> +
>         err = ovl_parse_opt((char *) data, &ufs->config);
>         if (err)
>                 goto out_free_config;
> @@ -1057,7 +1086,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>         else if (ufs->upper_mnt->mnt_sb != ufs->same_sb)
>                 ufs->same_sb = NULL;
>
> -       if (!(ovl_force_readonly(ufs)) && ufs->config.index) {
> +       if (!(ovl_force_readonly(ufs)) &&
> +             (ufs->config.index || ufs->config.metacopy)) {
>                 /* Verify lower root is upper root origin */
>                 err = ovl_verify_origin(upperpath.dentry, ufs->lower_mnt[0],
>                                         stack[0].dentry, false, true);
> @@ -1065,7 +1095,9 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>                         pr_err("overlayfs: failed to verify upper root origin\n");
>                         goto out_put_lower_mnt;
>                 }
> +       }
>
> +       if (!(ovl_force_readonly(ufs)) && ufs->config.index) {
>                 ufs->indexdir = ovl_workdir_create(sb, ufs, workpath.dentry,
>                                                    OVL_INDEXDIR_NAME, true);
>                 if (ufs->indexdir) {
> --
> 2.13.5
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 03/11] ovl: During copy up, first copy up metadata and then data
  2017-10-18  4:13   ` Amir Goldstein
@ 2017-10-18  4:39     ` Amir Goldstein
  0 siblings, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-10-18  4:39 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 7:13 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> This just helps with later patches where after copying up metadata, we
>> skip data copying step, if needed.
>
> This patch looks fine, but I must be missing how it helps with later patches.
> If it doesn't help please remove it.
>

Found it in patch 5, so

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

>>
>> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
>> ---
>>  fs/overlayfs/copy_up.c | 34 +++++++++++++++++-----------------
>>  1 file changed, 17 insertions(+), 17 deletions(-)
>>
>> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
>> index 0a6294ea64d5..4c0acb2a23ea 100644
>> --- a/fs/overlayfs/copy_up.c
>> +++ b/fs/overlayfs/copy_up.c
>> @@ -445,6 +445,23 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
>>  {
>>         int err;
>>
>> +       err = ovl_copy_xattr(c->lowerpath.dentry, temp);
>> +       if (err)
>> +               return err;
>> +
>> +       /*
>> +        * Store identifier of lower inode in upper inode xattr to
>> +        * allow lookup of the copy up origin inode.
>> +        *
>> +        * Don't set origin when we are breaking the association with a lower
>> +        * hard link.
>> +        */
>> +       if (c->origin) {
>> +               err = ovl_set_origin(c->dentry, c->lowerpath.dentry, temp);
>> +               if (err)
>> +                       return err;
>> +       }
>> +
>>         if (S_ISREG(c->stat.mode)) {
>>                 struct path upperpath;
>>
>> @@ -457,29 +474,12 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
>>                         return err;
>>         }
>>
>> -       err = ovl_copy_xattr(c->lowerpath.dentry, temp);
>> -       if (err)
>> -               return err;
>> -
>>         inode_lock(temp->d_inode);
>>         err = ovl_set_attr(temp, &c->stat);
>>         inode_unlock(temp->d_inode);
>>         if (err)
>>                 return err;
>>
>> -       /*
>> -        * Store identifier of lower inode in upper inode xattr to
>> -        * allow lookup of the copy up origin inode.
>> -        *
>> -        * Don't set origin when we are breaking the association with a lower
>> -        * hard link.
>> -        */
>> -       if (c->origin) {
>> -               err = ovl_set_origin(c->dentry, c->lowerpath.dentry, temp);
>> -               if (err)
>> -                       return err;
>> -       }
>> -
>>         return 0;
>>  }
>>
>> --
>> 2.13.5
>>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 05/11] ovl: Copy up only metadata during copy up where it makes sense
  2017-10-17 21:05 ` [PATCH 05/11] ovl: Copy up only metadata during copy up where it makes sense Vivek Goyal
@ 2017-10-18  4:46   ` Amir Goldstein
  0 siblings, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-10-18  4:46 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> If it makes sense to copy up only metadata during copy up, do it. This
> is done for regular files which are not opened for WRITE and have origin
> being saved.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

See one nit below

Also, for bisectability, it may make better sense to organize the patch series
so that data copy is skipped after reading a metacopy file is implemented,
but with metacopy being a new opt-in feature, I guess it doesn't matter
that much. up to Mikos.

> ---
>  fs/overlayfs/copy_up.c | 28 ++++++++++++++++++++--------
>  1 file changed, 20 insertions(+), 8 deletions(-)
>
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index 4c0acb2a23ea..cf0b36518a3a 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -306,11 +306,8 @@ static int ovl_set_origin(struct dentry *dentry, struct dentry *lower,
>                         return PTR_ERR(fh);
>         }
>
> -       /*
> -        * Do not fail when upper doesn't support xattrs.
> -        */
>         err = ovl_check_setxattr(dentry, upper, OVL_XATTR_ORIGIN, fh,
> -                                fh ? fh->len : 0, 0);
> +                                fh ? fh->len : 0, -EOPNOTSUPP);
>         kfree(fh);
>
>         return err;
> @@ -328,6 +325,7 @@ struct ovl_copy_up_ctx {
>         struct dentry *workdir;
>         bool tmpfile;
>         bool origin;
> +       bool metadata_only;

Why not metacopy? to be consistent?

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 06/11] ovl: Set xattr OVL_XATTR_METACOPY on upper file
  2017-10-17 21:05 ` [PATCH 06/11] ovl: Set xattr OVL_XATTR_METACOPY on upper file Vivek Goyal
@ 2017-10-18  4:57   ` Amir Goldstein
  2017-10-18 13:30     ` Vivek Goyal
  0 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-10-18  4:57 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> Presence of OVL_XATTR_METACOPY reflects that file has been copied up
> with metadata only and data needs to be copied up later from lower.
> So this xattr is set when a metadata copy takes place and cleared when
> data copy takes place.
>
> We also use a bit in ovl_inode->flags to cache OVL_METACOPY which reflects
> whether ovl inode has only metadata copied up.
>
> ovl_set_size() code has been taken from Amir's patch.

I waiver this attribution.
In git log history context, it is just noise that adds no information.
Please remove.

>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> ---
>  fs/overlayfs/copy_up.c   | 59 ++++++++++++++++++++++++++++++++++++++++++------
>  fs/overlayfs/inode.c     |  3 ++-
>  fs/overlayfs/overlayfs.h |  5 +++-
>  fs/overlayfs/util.c      | 21 +++++++++++++++--
>  4 files changed, 77 insertions(+), 11 deletions(-)
>
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index cf0b36518a3a..99b7be4ff4fc 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -196,6 +196,16 @@ static int ovl_copy_up_data(struct path *old, struct path *new, loff_t len)
>         return error;
>  }
>
> +static int ovl_set_size(struct dentry *upperdentry, struct kstat *stat)
> +{
> +       struct iattr attr = {
> +               .ia_valid = ATTR_SIZE,
> +               .ia_size = stat->size,
> +       };
> +
> +       return notify_change(upperdentry, &attr, NULL);
> +}
> +
>  static int ovl_set_timestamps(struct dentry *upperdentry, struct kstat *stat)
>  {
>         struct iattr attr = {
> @@ -439,6 +449,27 @@ static int ovl_get_tmpfile(struct ovl_copy_up_ctx *c, struct dentry **tempp)
>         goto out;
>  }
>
> +/* Copy up data of an inode which was copied up metadata only in the past. */
> +static int ovl_copy_up_meta_inode_data(struct ovl_copy_up_ctx *c)

I like the name :)

> +{
> +       struct path upperpath;
> +       int err;
> +
> +       ovl_path_upper(c->dentry, &upperpath);
> +       BUG_ON(upperpath.dentry == NULL);

I know this is copied from ovl_copy_up_inode() and Miklos uses BUG_ON often,
but kernel coding style prefers to use

if (WARN_ON())
    return -EIO;

> +
> +       err = ovl_copy_up_data(&c->lowerpath, &upperpath, c->stat.size);
> +       if (err)
> +               return err;
> +
> +       err= vfs_removexattr(upperpath.dentry, OVL_XATTR_METACOPY);
> +       if (err)
> +               return err;
> +
> +       ovl_clear_flag(OVL_METACOPY, d_inode(c->dentry));
> +       return err;
> +}
> +
>  static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
>  {
>         int err;
> @@ -476,13 +507,21 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
>                         return err;
>         }
>
> +       if (c->metadata_only) {
> +               err = ovl_check_setxattr(c->dentry, temp, OVL_XATTR_METACOPY,
> +                                        NULL, 0, -EOPNOTSUPP);
> +               if (err)
> +                       return err;
> +       }
> +
>         inode_lock(temp->d_inode);
> -       err = ovl_set_attr(temp, &c->stat);
> -       inode_unlock(temp->d_inode);
> -       if (err)
> -               return err;
> +       if (c->metadata_only)
> +               err = ovl_set_size(temp, &c->stat);
>
> -       return 0;
> +       if (!err)
> +               err = ovl_set_attr(temp, &c->stat);
> +       inode_unlock(temp->d_inode);
> +       return err;
>  }
>
>  static int ovl_copy_up_locked(struct ovl_copy_up_ctx *c)
> @@ -510,6 +549,8 @@ static int ovl_copy_up_locked(struct ovl_copy_up_ctx *c)
>         if (err)
>                 goto out_cleanup;
>
> +       if (c->metadata_only)
> +               ovl_set_flag(OVL_METACOPY, d_inode(c->dentry));
>         ovl_inode_update(d_inode(c->dentry), newdentry);
>  out:
>         dput(temp);
> @@ -637,7 +678,7 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
>         }
>         ovl_do_check_copy_up(ctx.lowerpath.dentry);
>
> -       err = ovl_copy_up_start(dentry);
> +       err = ovl_copy_up_start(dentry, flags);
>         /* err < 0: interrupted, err > 0: raced with another copy-up */
>         if (unlikely(err)) {
>                 if (err > 0)
> @@ -647,6 +688,8 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
>                         err = ovl_do_copy_up(&ctx);
>                 if (!err && !ovl_dentry_has_upper_alias(dentry))
>                         err = ovl_link_up(&ctx);
> +               if (!err && ovl_dentry_needs_data_copy_up(dentry, flags))
> +                       err = ovl_copy_up_meta_inode_data(&ctx);
>                 ovl_copy_up_end(dentry);
>         }
>         do_delayed_call(&done);
> @@ -677,8 +720,10 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
>                  *      with rename.
>                  */
>                 if (ovl_dentry_upper(dentry) &&
> -                   ovl_dentry_has_upper_alias(dentry))
> +                   ovl_dentry_has_upper_alias(dentry) &&
> +                   !ovl_dentry_needs_data_copy_up(dentry, flags)) {
>                         break;
> +               }
>
>                 next = dget(dentry);
>                 /* find the topmost dentry not yet copied up */
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 321511ed8c42..1b4b42c45ed5 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -320,7 +320,8 @@ struct posix_acl *ovl_get_acl(struct inode *inode, int type)
>  static bool ovl_open_need_copy_up(struct dentry *dentry, int flags)
>  {
>         if (ovl_dentry_upper(dentry) &&
> -           ovl_dentry_has_upper_alias(dentry))
> +           ovl_dentry_has_upper_alias(dentry) &&
> +           !ovl_dentry_needs_data_copy_up(dentry, flags))
>                 return false;
>
>         if (special_file(d_inode(dentry)->i_mode))
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index d9a0edd4e57e..dcec9456c654 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -26,10 +26,12 @@ enum ovl_path_type {
>  #define OVL_XATTR_ORIGIN OVL_XATTR_PREFIX "origin"
>  #define OVL_XATTR_IMPURE OVL_XATTR_PREFIX "impure"
>  #define OVL_XATTR_NLINK OVL_XATTR_PREFIX "nlink"
> +#define OVL_XATTR_METACOPY OVL_XATTR_PREFIX "metacopy"
>
>  enum ovl_flag {
>         OVL_IMPURE,
>         OVL_INDEX,
> +       OVL_METACOPY,
>  };
>
>  /*
> @@ -211,6 +213,7 @@ bool ovl_dentry_is_whiteout(struct dentry *dentry);
>  void ovl_dentry_set_opaque(struct dentry *dentry);
>  bool ovl_dentry_has_upper_alias(struct dentry *dentry);
>  void ovl_dentry_set_upper_alias(struct dentry *dentry);
> +bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags);
>  bool ovl_redirect_dir(struct super_block *sb);
>  const char *ovl_dentry_get_redirect(struct dentry *dentry);
>  void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
> @@ -221,7 +224,7 @@ void ovl_dentry_version_inc(struct dentry *dentry, bool impurity);
>  u64 ovl_dentry_version_get(struct dentry *dentry);
>  bool ovl_is_whiteout(struct dentry *dentry);
>  struct file *ovl_path_open(struct path *path, int flags);
> -int ovl_copy_up_start(struct dentry *dentry);
> +int ovl_copy_up_start(struct dentry *dentry, int flags);
>  void ovl_copy_up_end(struct dentry *dentry);
>  bool ovl_check_dir_xattr(struct dentry *dentry, const char *name);
>  int ovl_check_setxattr(struct dentry *dentry, struct dentry *upperdentry,
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index a4ce1c9944f0..91c542d1a39d 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -227,6 +227,22 @@ void ovl_dentry_set_upper_alias(struct dentry *dentry)
>         oe->has_upper = true;
>  }
>
> +bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags) {

Maybe start off with

  if (likely(!ovl_test_flag(OVL_METACOPY, d_inode(dentry))
    return false;

> +       if (!S_ISREG(d_inode(dentry)->i_mode))
> +               return false;
> +
> +       if (!flags)
> +               return false;
> +
> +       if (!(OPEN_FMODE(flags) & FMODE_WRITE))
> +               return false;
> +
> +       if (!ovl_test_flag(OVL_METACOPY, d_inode(dentry)))
> +               return false;
> +
> +       return true;
> +}
> +
>  bool ovl_redirect_dir(struct super_block *sb)
>  {
>         struct ovl_fs *ofs = sb->s_fs_info;
> @@ -310,13 +326,14 @@ struct file *ovl_path_open(struct path *path, int flags)
>         return dentry_open(path, flags | O_NOATIME, current_cred());
>  }
>
> -int ovl_copy_up_start(struct dentry *dentry)
> +int ovl_copy_up_start(struct dentry *dentry, int flags)
>  {
>         struct ovl_inode *oi = OVL_I(d_inode(dentry));
>         int err;
>
>         err = mutex_lock_interruptible(&oi->lock);
> -       if (!err && ovl_dentry_has_upper_alias(dentry)) {
> +       if (!err && ovl_dentry_has_upper_alias(dentry) &&
> +           !ovl_dentry_needs_data_copy_up(dentry, flags)) {
>                 err = 1; /* Already copied up */
>                 mutex_unlock(&oi->lock);
>         }
> --
> 2.13.5
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 07/11] ovl: Fix ovl_getattr() to get number of blocks from lower
  2017-10-17 21:05 ` [PATCH 07/11] ovl: Fix ovl_getattr() to get number of blocks from lower Vivek Goyal
@ 2017-10-18  5:01   ` Amir Goldstein
  2017-10-18 13:39     ` Vivek Goyal
  0 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-10-18  5:01 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> If an inode has been copied up metadata only, then we need to query the
> number of blocks from lower and fill up the stat->st_blocks.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> ---
>  fs/overlayfs/inode.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 1b4b42c45ed5..ad30edc0f425 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -140,6 +140,15 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
>         if (!is_dir && ovl_test_flag(OVL_INDEX, d_inode(dentry)))
>                 stat->nlink = dentry->d_inode->i_nlink;
>
> +       if (ovl_test_flag(OVL_METACOPY, d_inode(dentry))) {

&& !WARN_ON(!OVL_TYPE_ORIGIN(type))

For now, until this code gets merged into the if (OVL_TYPE_ORIGIN(type))
code above

> +               struct kstat lowerstat;
> +
> +               ovl_path_lower(dentry, &realpath);
> +               err = vfs_getattr(&realpath, &lowerstat, STATX_BLOCKS, flags);
> +               if (err)
> +                       goto out;
> +               stat->blocks = lowerstat.blocks;
> +       }
>  out:
>         revert_creds(old_cred);
>
> --
> 2.13.5
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 08/11] ovl: Set OVL_METACOPY flag during ovl_lookup()
  2017-10-17 21:05 ` [PATCH 08/11] ovl: Set OVL_METACOPY flag during ovl_lookup() Vivek Goyal
@ 2017-10-18  5:06   ` Amir Goldstein
  2017-10-18 13:53     ` Vivek Goyal
  0 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-10-18  5:06 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> During lookup, check for presence of OVL_XATTR_METACOPY and if present,
> set OVL_METACOPY bit in flags.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> ---
>  fs/overlayfs/namei.c | 40 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 40 insertions(+)
>
> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> index 88ff1daeb3d7..d32da041f4bf 100644
> --- a/fs/overlayfs/namei.c
> +++ b/fs/overlayfs/namei.c
> @@ -26,6 +26,24 @@ struct ovl_lookup_data {
>         char *redirect;
>  };
>
> +/* err < 0, 0 if no metacopy xattr, 1 if metacopy xattr found */
> +static int ovl_check_metacopy(struct dentry *dentry)
> +{
> +       int res;
> +
> +       res = vfs_getxattr(dentry, OVL_XATTR_METACOPY, NULL, 0);
> +       if (res < 0) {
> +               if (res == -ENODATA || res == -EOPNOTSUPP)
> +                       return 0;
> +               goto out;
> +       }
> +
> +       return 1;
> +out:
> +       pr_warn_ratelimited("overlayfs: failed to get metacopy (%i)\n", res);
> +       return res;
> +}
> +
>  static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
>                               size_t prelen, const char *post)
>  {
> @@ -594,6 +612,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>         struct dentry *this;
>         unsigned int i;
>         int err;
> +       bool metacopy = false;
>         struct ovl_lookup_data d = {
>                 .name = dentry->d_name,
>                 .is_dir = false,
> @@ -634,6 +653,12 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                                                roe->numlower, &stack, &ctr);
>                         if (err)
>                                 goto out;
> +
> +                       err = ovl_check_metacopy(upperdentry);
> +                       if (err < 0)
> +                               goto out;
> +                       if (err == 1)
> +                               metacopy = true;

Miklos seems to prefer
  metacopy = err;
  if (err < 0)
    goto out;

Not sure if it is just coding style or for better branch prediction code.


>                 }
>
>                 if (d.redirect) {
> @@ -719,6 +744,19 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                 OVL_I(inode)->redirect = upperredirect;
>                 if (index)
>                         ovl_set_flag(OVL_INDEX, inode);
> +
> +               if (metacopy) {
> +                       /*
> +                        * Found an upper with metacopy set but at the same
> +                        * time there is no lower dentry. Something is not
> +                        * right.
> +                        */
> +                       if (!ctr) {

Please move this test up right after ovl_check_metacopy()
There is no reason to wait as ctr is not going to grow for non-dir after
that point.

> +                               err = -ESTALE;
> +                               goto out_put_inode;
> +                       }
> +                       ovl_set_flag(OVL_METACOPY, inode);
> +               }
>         }
>
>         revert_creds(old_cred);
> @@ -729,6 +767,8 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>
>         return NULL;
>
> +out_put_inode:
> +       iput(inode);

And than new goto label becomes obsolete

>  out_free_oe:
>         dentry->d_fsdata = NULL;
>         kfree(oe);
> --
> 2.13.5
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 09/11] ovl: Return lower dentry if only metadata copy up took place
  2017-10-17 21:05 ` [PATCH 09/11] ovl: Return lower dentry if only metadata copy up took place Vivek Goyal
@ 2017-10-18  5:07   ` Amir Goldstein
  0 siblings, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-10-18  5:07 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> Upper dentry inode does not have data. So return lower dentry if upper
> is only a metadata copy.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

> ---
>  fs/overlayfs/super.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 32e3d4be1a71..fa802c89235a 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -101,10 +101,14 @@ static struct dentry *ovl_d_real(struct dentry *dentry,
>                         err = ovl_check_append_only(d_inode(real), open_flags);
>                         if (err)
>                                 return ERR_PTR(err);
> +
> +                       if (ovl_test_flag(OVL_METACOPY, d_inode(dentry)))
> +                               goto lower;
>                 }
>                 return real;
>         }
>
> +lower:
>         real = ovl_dentry_lower(dentry);
>         if (!real)
>                 goto bug;
> --
> 2.13.5
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 10/11] ovl: Introduce read/write barriers around metacopy flag update
  2017-10-17 21:05 ` [PATCH 10/11] ovl: Introduce read/write barriers around metacopy flag update Vivek Goyal
@ 2017-10-18  5:19   ` Amir Goldstein
  2017-10-18 15:32     ` Vivek Goyal
  0 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-10-18  5:19 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> If a file is copied up metadata only and later when same file is opened
> for WRITE, then data copy up takes place. We copy up data, remove METACOPY
> xattr and then clear the METACOPY flag from ovl_entry->flags. While all
> these operations happen with oi->lock held, read side of oi->flags is
> lockless. That is another thread on another cpu can check if METACOPY
> flag is clear or not.
>
> So this gives us an ordering requirement w.r.t METACOPY flag. That is, if
> another cpu sees METACOPY flag clear, then it should be guaranteed that
> effects of data copy up and remove xattr operations are also visible.
>
> For example.
>
>         CPU1                            CPU2
> ovl_copy_up_flags()                     acquire(oi->lock)
>  ovl_dentry_needs_data_copy_up()          ovl_copy_up_data()

The relevant lockless execution path of CPU1 is
ovl_d_real -> ovl_test_flag(OVL_METACOPY) ->  return real
when real is upper, upper needs to be with correct data

The execution path you describe here is going to test the flag
under overlay inode lock again in  ovl_copy_up_one() so it is irrelevant
to the barriers patch


>    ovl_test_flag(OVL_METACOPY)            vfs_removexattr()
>                                           ovl_clear_metacopy_flag()
>                                         release(oi->lock)
>
> Say CPU2 is copying up data and in the end clears metacopy flag. But if
> CPU1 perceives the effects of clearing of metacopy flag but not effects of
> preceeding operations, that would be a problem.
>
> Hence this patch introduces smp_wmb() on metacopy flag clear operation
> and smp_rmb() on metacopy flag test operation.
>
> May be some other lock or barrier is already covering it. But I am not sure
> what that is and is it obvious enough that we will not break it in future.
>
> So hence trying to be safe here and introducing barriers explicitly for
> metacopy bit.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> ---
>  fs/overlayfs/copy_up.c |  6 ++++++
>  fs/overlayfs/util.c    | 12 ++++++++++--
>  2 files changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index 99b7be4ff4fc..200ee3b7d397 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -466,6 +466,12 @@ static int ovl_copy_up_meta_inode_data(struct ovl_copy_up_ctx *c)
>         if (err)
>                 return err;
>
> +       /*
> +        * Pairs with smp_rmb() after test_bit(OVL_METACOPY). Make sure

This comment is meant to refer the reader to the code that this pairs with
i.e. pairs with smp_rmb() in ovl_d_real() ..

> +        * if OVL_METACOPY flag reset is visible, then all the write
> +        * operations before it are visible as well
> +        */
> +       smp_wmb();
>         ovl_clear_flag(OVL_METACOPY, d_inode(c->dentry));
>         return err;
>  }
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 91c542d1a39d..6bca9a960c6d 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -228,6 +228,8 @@ void ovl_dentry_set_upper_alias(struct dentry *dentry)
>  }
>
>  bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags) {
> +       bool metacopy;
> +
>         if (!S_ISREG(d_inode(dentry)->i_mode))
>                 return false;
>
> @@ -237,9 +239,15 @@ bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags) {
>         if (!(OPEN_FMODE(flags) & FMODE_WRITE))
>                 return false;
>
> -       if (!ovl_test_flag(OVL_METACOPY, d_inode(dentry)))
> +       metacopy = ovl_test_flag(OVL_METACOPY, d_inode(dentry));
> +       /*
> +        * Pairs with smp_wmb() while clearing OVL_METACOPY. Make sure if

This comment is meant to refer the reader to the code that this pairs with
i.e. pairs with smp_wmb() in ovl_copy_up_meta_inode_data() ..

> +        * clearing of OVL_METACOPY is visible, then effects of writes
> +        * before that are visible too.
> +        */
> +       smp_rmb();

Please move this rmb to ovl_d_real() after re-testing flag,
which is the only place that matters for inode data access.

> +       if (!metacopy)
>                 return false;
> -
>         return true;
>  }
>
> --
> 2.13.5
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 11/11] ovl: Put barriers to order oi->__upperdentry and OVL_METACOPY update
  2017-10-17 21:05 ` [PATCH 11/11] ovl: Put barriers to order oi->__upperdentry and OVL_METACOPY update Vivek Goyal
@ 2017-10-18  5:40   ` Amir Goldstein
  2017-10-19 13:00     ` Vivek Goyal
  0 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-10-18  5:40 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> OVL_METACOPY in oi->flags can be accessed in lockless manner. So if a file
> is being copied up metadata only, we need to make sure that upperdentry is
> visible only after OVL_METACOPY flag has been set. IOW, if oi->__upperdentry
> is visble to a cpu, then we also need to make sure any updates to OVL_METACOPY
> flags are visible too.
>

You know, I have a feeling that this ordering requirement could be simplified or
completely avoided if you flip the meaning of the flag, i.e.:

bool ovl_dentry_has_upper_data(struct dentry *dentry)
{
        return ovl_test_flag(OVL_UPPER_DATA, d_inode(dentry));
}

Then flag visibility requirements are the same as visibility requirements
for oe->has_upper.
You probably don't need to add any new barriers for setting setting the flag
on normal copy up.
For setting the flag in copy_up_meta_inode_data, and testing the flag
in ovl_d_real() the requirements stay the same as you implemented in patch 10.

Am I wrong?

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 01/11] ovl: Create origin xattr on copy up for all files
  2017-10-18  4:09   ` Amir Goldstein
@ 2017-10-18 12:55     ` Vivek Goyal
  2017-10-18 13:56       ` Amir Goldstein
  0 siblings, 1 reply; 48+ messages in thread
From: Vivek Goyal @ 2017-10-18 12:55 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 07:09:37AM +0300, Amir Goldstein wrote:
> On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > Right now my understanding is that origin xattr is created for all copied
> > up files if index=on. And if index=off, then we create it for all type
> > of files except hardlinks (nlink != 1).
> >
> > With metadata only copy up, I will still require origin xattr to copy up
> > data later, so create it even for hardlinks even with index=off. I am
> > hoping it does not break anything because we do OVL_INDEX test before
> > using inode number of origin.
> 
> I still have a nudging buzz that loosing the information about this upper
> being a broken hardlink may come back to bite us some time, but can't
> put my finger on how.

Hi Amir,

Can't we do stat() on ORIGIN and check nlink to figure out if it is a broken
hard link or not. (If it ever becomes a problem in future). IOW, we still
have a way to figure out if this is broken hard link or not. Just that
it involves extra step of doing stat() and its not the presence/absense
of ORIGIN.

> 
> To reduce the level of that buzz, I suggest to set origin for broken hardlink
> only when doing metacopy and then break the origin association when
> removing metacopy xattr.

While removing metacopy xattr, I will not have any idea whether it is
broken hard link or not. I guess I will have to use above trick and do
stat() on ORIGIN and detect broken hard links.

I feel really odd about special casing this broken hard link case and
setting and clearing ORIGIN with metacopy flags. It makes code
twisted and complicated.

Given stat() should allow us to figure out if something is broken hard
link or not, does it make you feel any better about it.

> 
> In any case, this patch needs to update the comment above setting
> origin xattr that claims to break the association with broken hardlink.

Ok, Will do.

Vivek
> 
> >
> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> > ---
> >  fs/overlayfs/copy_up.c | 4 +---
> >  1 file changed, 1 insertion(+), 3 deletions(-)
> >
> > diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> > index c441f9387a1b..0a6294ea64d5 100644
> > --- a/fs/overlayfs/copy_up.c
> > +++ b/fs/overlayfs/copy_up.c
> > @@ -538,9 +538,6 @@ static int ovl_do_copy_up(struct ovl_copy_up_ctx *c)
> >             c->stat.nlink > 1)
> >                 indexed = true;
> >
> > -       if (S_ISDIR(c->stat.mode) || c->stat.nlink == 1 || indexed)
> > -               c->origin = true;
> > -
> >         if (indexed) {
> >                 c->destdir = ovl_indexdir(c->dentry->d_sb);
> >                 err = ovl_get_index_name(c->lowerpath.dentry, &c->destname);
> > @@ -596,6 +593,7 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
> >                 .parent = parent,
> >                 .dentry = dentry,
> >                 .workdir = ovl_workdir(dentry),
> > +               .origin = true,
> >         };
> >
> >         if (WARN_ON(!ctx.workdir))
> > --
> > 2.13.5
> >

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 04/11] ovl: Provide a mount option metacopy=on/off for metadata copyup
  2017-10-18  4:31   ` Amir Goldstein
@ 2017-10-18 13:03     ` Vivek Goyal
  2017-10-18 14:09       ` Amir Goldstein
  2017-10-18 14:10     ` Vivek Goyal
  1 sibling, 1 reply; 48+ messages in thread
From: Vivek Goyal @ 2017-10-18 13:03 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 07:31:51AM +0300, Amir Goldstein wrote:
> On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > By default metadata only copy up is disabled. Provide a mount option so
> > that users can choose one way or other.
> >
> > Also provide a kernel config and module option to enable/disable
> > metacopy feature.
> >
> > Like index feature, when overlay is mounted, on root upper directory we
> > set ORIGIN which points to lower. And at later mount time it is verified
> > again. This hopes to get the configuration right. But this does only so
> > much as we don't verify all the lowers. So it is possible that a lower is
> > missing and later data copy up fails.
> 
> Like index feature, please error mount if ovl_inuse_trylock fails.
> As you know, this error is only conditional because of backward
> compatibility, so any new opt-in feature should enforce it.

Hi Amir,

I am not so sure about it. Avoiding leaking any mount point is really
really hard. And I don't think current container runtime have been
modified to make it fool proof.

IMHO, if we really want to enforce something like this, then we need
to have some sort of capability to find existing superblock and reuse it.
(Something like what happens with block devices).

I am afraid that if I start enforcing this, then this feature will not
be used at all because software has not been hardended enough to avoid
mount point leaks completely.

Vivek

> 
> >
> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> > ---
> >  fs/overlayfs/Kconfig     |  8 ++++++++
> >  fs/overlayfs/ovl_entry.h |  1 +
> >  fs/overlayfs/super.c     | 38 +++++++++++++++++++++++++++++++++++---
> >  3 files changed, 44 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
> > index cbfc196e5dc5..17a0b17ad14c 100644
> > --- a/fs/overlayfs/Kconfig
> > +++ b/fs/overlayfs/Kconfig
> > @@ -43,3 +43,11 @@ config OVERLAY_FS_INDEX
> >           outcomes.  However, mounting the same overlay with an old kernel
> >           read-write and then mounting it again with a new kernel, will have
> >           unexpected results.
> > +
> > +config OVERLAY_FS_METACOPY
> > +       bool "Overlayfs: turn on metadata only copy up feature by default"
> > +       depends on OVERLAY_FS
> > +       help
> > +         If this config option is enabled then overlay filesystems will
> > +         copy up only metadata where appropriate and data copy up will
> > +         happen when a file is opended for WRITE operation.
> > diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> > index 25d9b5adcd42..6806f0b0fbc2 100644
> > --- a/fs/overlayfs/ovl_entry.h
> > +++ b/fs/overlayfs/ovl_entry.h
> > @@ -15,6 +15,7 @@ struct ovl_config {
> >         bool default_permissions;
> >         bool redirect_dir;
> >         bool index;
> > +       bool metacopy;
> >  };
> >
> >  /* private information held for overlayfs's superblock */
> > diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> > index 092d150643c1..32e3d4be1a71 100644
> > --- a/fs/overlayfs/super.c
> > +++ b/fs/overlayfs/super.c
> > @@ -39,6 +39,11 @@ module_param_named(index, ovl_index_def, bool, 0644);
> >  MODULE_PARM_DESC(ovl_index_def,
> >                  "Default to on or off for the inodes index feature");
> >
> > +static bool ovl_metacopy_def = IS_ENABLED(CONFIG_OVERLAY_FS_METACOPY);
> > +module_param_named(metacopy, ovl_metacopy_def, bool, 0644);
> > +MODULE_PARM_DESC(ovl_metacopy_def,
> > +                "Default to on or off for the metadata only copy up feature");
> > +
> >  static void ovl_dentry_release(struct dentry *dentry)
> >  {
> >         struct ovl_entry *oe = dentry->d_fsdata;
> > @@ -303,6 +308,9 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
> >         if (ufs->config.index != ovl_index_def)
> >                 seq_printf(m, ",index=%s",
> >                            ufs->config.index ? "on" : "off");
> > +       if (ufs->config.metacopy != ovl_metacopy_def)
> > +               seq_printf(m, ",metacopy=%s",
> > +                          ufs->config.metacopy ? "on" : "off");
> >         return 0;
> >  }
> >
> > @@ -336,6 +344,8 @@ enum {
> >         OPT_REDIRECT_DIR_OFF,
> >         OPT_INDEX_ON,
> >         OPT_INDEX_OFF,
> > +       OPT_METACOPY_ON,
> > +       OPT_METACOPY_OFF,
> >         OPT_ERR,
> >  };
> >
> > @@ -348,6 +358,8 @@ static const match_table_t ovl_tokens = {
> >         {OPT_REDIRECT_DIR_OFF,          "redirect_dir=off"},
> >         {OPT_INDEX_ON,                  "index=on"},
> >         {OPT_INDEX_OFF,                 "index=off"},
> > +       {OPT_METACOPY_ON,               "metacopy=on"},
> > +       {OPT_METACOPY_OFF,              "metacopy=off"},
> >         {OPT_ERR,                       NULL}
> >  };
> >
> > @@ -428,6 +440,14 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
> >                         config->index = false;
> >                         break;
> >
> > +               case OPT_METACOPY_ON:
> > +                       config->metacopy = true;
> > +                       break;
> > +
> > +               case OPT_METACOPY_OFF:
> > +                       config->metacopy = false;
> > +                       break;
> > +
> >                 default:
> >                         pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p);
> >                         return -EINVAL;
> > @@ -644,9 +664,16 @@ static int ovl_lower_dir(const char *name, struct path *path,
> >          * The inodes index feature needs to encode and decode file
> >          * handles, so it requires that all layers support them.
> >          */
> > -       if (ofs->config.index && !ovl_can_decode_fh(path->dentry->d_sb)) {
> > +       if ((ofs->config.index || ofs->config.metacopy) &&
> > +            !ovl_can_decode_fh(path->dentry->d_sb)) {
> > +               if (ofs->config.index)
> > +                       pr_warn("overlayfs: fs on '%s' does not support file handles, falling back to index=off.\n", name);
> > +
> > +               if (ofs->config.metacopy)
> > +                       pr_warn("overlayfs: fs on '%s' does not support file handles, falling back to metacopy=off.\n", name);
> > +
> 
> In my verify_dir patches I used the following more compact warning
> style instead of granular warnings:
> 
>     pr_warn("overlayfs: fs on '%s' does not support file handles,
> falling back to index=off,metacopy=off.\n", name);
> 
> It is a bit less informative, but IMO the result in nicer to look at,
> both in the code and in dmesg.
> Others may have a different opinion. Please consider.
> For example, see how the following warning from my verify_dir patches
> has expanded and imagine how
> much more messier it would be if we broke it into separate warnings:
> 
>     pr_warn("overlayfs: upper fs does not support xattr, falling back
> to redirect_dir=off, index=off, no verify_dir and no opaque dir.\n");
> 
> 
> >                 ofs->config.index = false;
> > -               pr_warn("overlayfs: fs on '%s' does not support file handles, falling back to index=off.\n", name);
> > +               ofs->config.metacopy = false;
> >         }
> >
> >         return 0;
> > @@ -847,6 +874,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
> >
> >         ufs->config.redirect_dir = ovl_redirect_dir_def;
> >         ufs->config.index = ovl_index_def;
> > +       ufs->config.metacopy = ovl_metacopy_def;
> > +
> >         err = ovl_parse_opt((char *) data, &ufs->config);
> >         if (err)
> >                 goto out_free_config;
> > @@ -1057,7 +1086,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
> >         else if (ufs->upper_mnt->mnt_sb != ufs->same_sb)
> >                 ufs->same_sb = NULL;
> >
> > -       if (!(ovl_force_readonly(ufs)) && ufs->config.index) {
> > +       if (!(ovl_force_readonly(ufs)) &&
> > +             (ufs->config.index || ufs->config.metacopy)) {
> >                 /* Verify lower root is upper root origin */
> >                 err = ovl_verify_origin(upperpath.dentry, ufs->lower_mnt[0],
> >                                         stack[0].dentry, false, true);
> > @@ -1065,7 +1095,9 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
> >                         pr_err("overlayfs: failed to verify upper root origin\n");
> >                         goto out_put_lower_mnt;
> >                 }
> > +       }
> >
> > +       if (!(ovl_force_readonly(ufs)) && ufs->config.index) {
> >                 ufs->indexdir = ovl_workdir_create(sb, ufs, workpath.dentry,
> >                                                    OVL_INDEXDIR_NAME, true);
> >                 if (ufs->indexdir) {
> > --
> > 2.13.5
> >

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 06/11] ovl: Set xattr OVL_XATTR_METACOPY on upper file
  2017-10-18  4:57   ` Amir Goldstein
@ 2017-10-18 13:30     ` Vivek Goyal
  0 siblings, 0 replies; 48+ messages in thread
From: Vivek Goyal @ 2017-10-18 13:30 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 07:57:16AM +0300, Amir Goldstein wrote:
> On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > Presence of OVL_XATTR_METACOPY reflects that file has been copied up
> > with metadata only and data needs to be copied up later from lower.
> > So this xattr is set when a metadata copy takes place and cleared when
> > data copy takes place.
> >
> > We also use a bit in ovl_inode->flags to cache OVL_METACOPY which reflects
> > whether ovl inode has only metadata copied up.
> >
> > ovl_set_size() code has been taken from Amir's patch.
> 
> I waiver this attribution.
> In git log history context, it is just noise that adds no information.
> Please remove.

Ok, will do.

> 
> >
> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> > ---
> >  fs/overlayfs/copy_up.c   | 59 ++++++++++++++++++++++++++++++++++++++++++------
> >  fs/overlayfs/inode.c     |  3 ++-
> >  fs/overlayfs/overlayfs.h |  5 +++-
> >  fs/overlayfs/util.c      | 21 +++++++++++++++--
> >  4 files changed, 77 insertions(+), 11 deletions(-)
> >
> > diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> > index cf0b36518a3a..99b7be4ff4fc 100644
> > --- a/fs/overlayfs/copy_up.c
> > +++ b/fs/overlayfs/copy_up.c
> > @@ -196,6 +196,16 @@ static int ovl_copy_up_data(struct path *old, struct path *new, loff_t len)
> >         return error;
> >  }
> >
> > +static int ovl_set_size(struct dentry *upperdentry, struct kstat *stat)
> > +{
> > +       struct iattr attr = {
> > +               .ia_valid = ATTR_SIZE,
> > +               .ia_size = stat->size,
> > +       };
> > +
> > +       return notify_change(upperdentry, &attr, NULL);
> > +}
> > +
> >  static int ovl_set_timestamps(struct dentry *upperdentry, struct kstat *stat)
> >  {
> >         struct iattr attr = {
> > @@ -439,6 +449,27 @@ static int ovl_get_tmpfile(struct ovl_copy_up_ctx *c, struct dentry **tempp)
> >         goto out;
> >  }
> >
> > +/* Copy up data of an inode which was copied up metadata only in the past. */
> > +static int ovl_copy_up_meta_inode_data(struct ovl_copy_up_ctx *c)
> 
> I like the name :)

Glad to hear that. :-)

> 
> > +{
> > +       struct path upperpath;
> > +       int err;
> > +
> > +       ovl_path_upper(c->dentry, &upperpath);
> > +       BUG_ON(upperpath.dentry == NULL);
> 
> I know this is copied from ovl_copy_up_inode() and Miklos uses BUG_ON often,
> but kernel coding style prefers to use
> 
> if (WARN_ON())
>     return -EIO;

Ok, I will change it and use above instead.

> 
> > +
> > +       err = ovl_copy_up_data(&c->lowerpath, &upperpath, c->stat.size);
> > +       if (err)
> > +               return err;
> > +
> > +       err= vfs_removexattr(upperpath.dentry, OVL_XATTR_METACOPY);
> > +       if (err)
> > +               return err;
> > +
> > +       ovl_clear_flag(OVL_METACOPY, d_inode(c->dentry));
> > +       return err;
> > +}
> > +
> >  static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
> >  {
> >         int err;
> > @@ -476,13 +507,21 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
> >                         return err;
> >         }
> >
> > +       if (c->metadata_only) {
> > +               err = ovl_check_setxattr(c->dentry, temp, OVL_XATTR_METACOPY,
> > +                                        NULL, 0, -EOPNOTSUPP);
> > +               if (err)
> > +                       return err;
> > +       }
> > +
> >         inode_lock(temp->d_inode);
> > -       err = ovl_set_attr(temp, &c->stat);
> > -       inode_unlock(temp->d_inode);
> > -       if (err)
> > -               return err;
> > +       if (c->metadata_only)
> > +               err = ovl_set_size(temp, &c->stat);
> >
> > -       return 0;
> > +       if (!err)
> > +               err = ovl_set_attr(temp, &c->stat);
> > +       inode_unlock(temp->d_inode);
> > +       return err;
> >  }
> >
> >  static int ovl_copy_up_locked(struct ovl_copy_up_ctx *c)
> > @@ -510,6 +549,8 @@ static int ovl_copy_up_locked(struct ovl_copy_up_ctx *c)
> >         if (err)
> >                 goto out_cleanup;
> >
> > +       if (c->metadata_only)
> > +               ovl_set_flag(OVL_METACOPY, d_inode(c->dentry));
> >         ovl_inode_update(d_inode(c->dentry), newdentry);
> >  out:
> >         dput(temp);
> > @@ -637,7 +678,7 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
> >         }
> >         ovl_do_check_copy_up(ctx.lowerpath.dentry);
> >
> > -       err = ovl_copy_up_start(dentry);
> > +       err = ovl_copy_up_start(dentry, flags);
> >         /* err < 0: interrupted, err > 0: raced with another copy-up */
> >         if (unlikely(err)) {
> >                 if (err > 0)
> > @@ -647,6 +688,8 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
> >                         err = ovl_do_copy_up(&ctx);
> >                 if (!err && !ovl_dentry_has_upper_alias(dentry))
> >                         err = ovl_link_up(&ctx);
> > +               if (!err && ovl_dentry_needs_data_copy_up(dentry, flags))
> > +                       err = ovl_copy_up_meta_inode_data(&ctx);
> >                 ovl_copy_up_end(dentry);
> >         }
> >         do_delayed_call(&done);
> > @@ -677,8 +720,10 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
> >                  *      with rename.
> >                  */
> >                 if (ovl_dentry_upper(dentry) &&
> > -                   ovl_dentry_has_upper_alias(dentry))
> > +                   ovl_dentry_has_upper_alias(dentry) &&
> > +                   !ovl_dentry_needs_data_copy_up(dentry, flags)) {
> >                         break;
> > +               }
> >
> >                 next = dget(dentry);
> >                 /* find the topmost dentry not yet copied up */
> > diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> > index 321511ed8c42..1b4b42c45ed5 100644
> > --- a/fs/overlayfs/inode.c
> > +++ b/fs/overlayfs/inode.c
> > @@ -320,7 +320,8 @@ struct posix_acl *ovl_get_acl(struct inode *inode, int type)
> >  static bool ovl_open_need_copy_up(struct dentry *dentry, int flags)
> >  {
> >         if (ovl_dentry_upper(dentry) &&
> > -           ovl_dentry_has_upper_alias(dentry))
> > +           ovl_dentry_has_upper_alias(dentry) &&
> > +           !ovl_dentry_needs_data_copy_up(dentry, flags))
> >                 return false;
> >
> >         if (special_file(d_inode(dentry)->i_mode))
> > diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> > index d9a0edd4e57e..dcec9456c654 100644
> > --- a/fs/overlayfs/overlayfs.h
> > +++ b/fs/overlayfs/overlayfs.h
> > @@ -26,10 +26,12 @@ enum ovl_path_type {
> >  #define OVL_XATTR_ORIGIN OVL_XATTR_PREFIX "origin"
> >  #define OVL_XATTR_IMPURE OVL_XATTR_PREFIX "impure"
> >  #define OVL_XATTR_NLINK OVL_XATTR_PREFIX "nlink"
> > +#define OVL_XATTR_METACOPY OVL_XATTR_PREFIX "metacopy"
> >
> >  enum ovl_flag {
> >         OVL_IMPURE,
> >         OVL_INDEX,
> > +       OVL_METACOPY,
> >  };
> >
> >  /*
> > @@ -211,6 +213,7 @@ bool ovl_dentry_is_whiteout(struct dentry *dentry);
> >  void ovl_dentry_set_opaque(struct dentry *dentry);
> >  bool ovl_dentry_has_upper_alias(struct dentry *dentry);
> >  void ovl_dentry_set_upper_alias(struct dentry *dentry);
> > +bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags);
> >  bool ovl_redirect_dir(struct super_block *sb);
> >  const char *ovl_dentry_get_redirect(struct dentry *dentry);
> >  void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
> > @@ -221,7 +224,7 @@ void ovl_dentry_version_inc(struct dentry *dentry, bool impurity);
> >  u64 ovl_dentry_version_get(struct dentry *dentry);
> >  bool ovl_is_whiteout(struct dentry *dentry);
> >  struct file *ovl_path_open(struct path *path, int flags);
> > -int ovl_copy_up_start(struct dentry *dentry);
> > +int ovl_copy_up_start(struct dentry *dentry, int flags);
> >  void ovl_copy_up_end(struct dentry *dentry);
> >  bool ovl_check_dir_xattr(struct dentry *dentry, const char *name);
> >  int ovl_check_setxattr(struct dentry *dentry, struct dentry *upperdentry,
> > diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> > index a4ce1c9944f0..91c542d1a39d 100644
> > --- a/fs/overlayfs/util.c
> > +++ b/fs/overlayfs/util.c
> > @@ -227,6 +227,22 @@ void ovl_dentry_set_upper_alias(struct dentry *dentry)
> >         oe->has_upper = true;
> >  }
> >
> > +bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags) {
> 
> Maybe start off with
> 
>   if (likely(!ovl_test_flag(OVL_METACOPY, d_inode(dentry))
>     return false;

This is lockless access and last two patches are introducing two smp_rmb()
around this access. So it might turn out to be more expensive to check
this first.

But if we managed to remove those barriers, then it makes sense to check
this first.

Vivek

> 
> > +       if (!S_ISREG(d_inode(dentry)->i_mode))
> > +               return false;
> > +
> > +       if (!flags)
> > +               return false;
> > +
> > +       if (!(OPEN_FMODE(flags) & FMODE_WRITE))
> > +               return false;
> > +
> > +       if (!ovl_test_flag(OVL_METACOPY, d_inode(dentry)))
> > +               return false;
> > +
> > +       return true;
> > +}
> > +
> >  bool ovl_redirect_dir(struct super_block *sb)
> >  {
> >         struct ovl_fs *ofs = sb->s_fs_info;
> > @@ -310,13 +326,14 @@ struct file *ovl_path_open(struct path *path, int flags)
> >         return dentry_open(path, flags | O_NOATIME, current_cred());
> >  }
> >
> > -int ovl_copy_up_start(struct dentry *dentry)
> > +int ovl_copy_up_start(struct dentry *dentry, int flags)
> >  {
> >         struct ovl_inode *oi = OVL_I(d_inode(dentry));
> >         int err;
> >
> >         err = mutex_lock_interruptible(&oi->lock);
> > -       if (!err && ovl_dentry_has_upper_alias(dentry)) {
> > +       if (!err && ovl_dentry_has_upper_alias(dentry) &&
> > +           !ovl_dentry_needs_data_copy_up(dentry, flags)) {
> >                 err = 1; /* Already copied up */
> >                 mutex_unlock(&oi->lock);
> >         }
> > --
> > 2.13.5
> >

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 07/11] ovl: Fix ovl_getattr() to get number of blocks from lower
  2017-10-18  5:01   ` Amir Goldstein
@ 2017-10-18 13:39     ` Vivek Goyal
  0 siblings, 0 replies; 48+ messages in thread
From: Vivek Goyal @ 2017-10-18 13:39 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 08:01:59AM +0300, Amir Goldstein wrote:
> On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > If an inode has been copied up metadata only, then we need to query the
> > number of blocks from lower and fill up the stat->st_blocks.
> >
> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> > ---
> >  fs/overlayfs/inode.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> > index 1b4b42c45ed5..ad30edc0f425 100644
> > --- a/fs/overlayfs/inode.c
> > +++ b/fs/overlayfs/inode.c
> > @@ -140,6 +140,15 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
> >         if (!is_dir && ovl_test_flag(OVL_INDEX, d_inode(dentry)))
> >                 stat->nlink = dentry->d_inode->i_nlink;
> >
> > +       if (ovl_test_flag(OVL_METACOPY, d_inode(dentry))) {
> 
> && !WARN_ON(!OVL_TYPE_ORIGIN(type))
> 
> For now, until this code gets merged into the if (OVL_TYPE_ORIGIN(type))
> code above

Ok, will add this.

Vivek

> 
> > +               struct kstat lowerstat;
> > +
> > +               ovl_path_lower(dentry, &realpath);
> > +               err = vfs_getattr(&realpath, &lowerstat, STATX_BLOCKS, flags);
> > +               if (err)
> > +                       goto out;
> > +               stat->blocks = lowerstat.blocks;
> > +       }
> >  out:
> >         revert_creds(old_cred);
> >
> > --
> > 2.13.5
> >

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 08/11] ovl: Set OVL_METACOPY flag during ovl_lookup()
  2017-10-18  5:06   ` Amir Goldstein
@ 2017-10-18 13:53     ` Vivek Goyal
  0 siblings, 0 replies; 48+ messages in thread
From: Vivek Goyal @ 2017-10-18 13:53 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 08:06:19AM +0300, Amir Goldstein wrote:
> On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > During lookup, check for presence of OVL_XATTR_METACOPY and if present,
> > set OVL_METACOPY bit in flags.
> >
> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> > ---
> >  fs/overlayfs/namei.c | 40 ++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 40 insertions(+)
> >
> > diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> > index 88ff1daeb3d7..d32da041f4bf 100644
> > --- a/fs/overlayfs/namei.c
> > +++ b/fs/overlayfs/namei.c
> > @@ -26,6 +26,24 @@ struct ovl_lookup_data {
> >         char *redirect;
> >  };
> >
> > +/* err < 0, 0 if no metacopy xattr, 1 if metacopy xattr found */
> > +static int ovl_check_metacopy(struct dentry *dentry)
> > +{
> > +       int res;
> > +
> > +       res = vfs_getxattr(dentry, OVL_XATTR_METACOPY, NULL, 0);
> > +       if (res < 0) {
> > +               if (res == -ENODATA || res == -EOPNOTSUPP)
> > +                       return 0;
> > +               goto out;
> > +       }
> > +
> > +       return 1;
> > +out:
> > +       pr_warn_ratelimited("overlayfs: failed to get metacopy (%i)\n", res);
> > +       return res;
> > +}
> > +
> >  static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
> >                               size_t prelen, const char *post)
> >  {
> > @@ -594,6 +612,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
> >         struct dentry *this;
> >         unsigned int i;
> >         int err;
> > +       bool metacopy = false;
> >         struct ovl_lookup_data d = {
> >                 .name = dentry->d_name,
> >                 .is_dir = false,
> > @@ -634,6 +653,12 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
> >                                                roe->numlower, &stack, &ctr);
> >                         if (err)
> >                                 goto out;
> > +
> > +                       err = ovl_check_metacopy(upperdentry);
> > +                       if (err < 0)
> > +                               goto out;
> > +                       if (err == 1)
> > +                               metacopy = true;
> 
> Miklos seems to prefer
>   metacopy = err;
>   if (err < 0)
>     goto out;
> 
> Not sure if it is just coding style or for better branch prediction code.

Ok, will change.

> 
> 
> >                 }
> >
> >                 if (d.redirect) {
> > @@ -719,6 +744,19 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
> >                 OVL_I(inode)->redirect = upperredirect;
> >                 if (index)
> >                         ovl_set_flag(OVL_INDEX, inode);
> > +
> > +               if (metacopy) {
> > +                       /*
> > +                        * Found an upper with metacopy set but at the same
> > +                        * time there is no lower dentry. Something is not
> > +                        * right.
> > +                        */
> > +                       if (!ctr) {
> 
> Please move this test up right after ovl_check_metacopy()
> There is no reason to wait as ctr is not going to grow for non-dir after
> that point.

Makes sense. Will do.

> 
> > +                               err = -ESTALE;
> > +                               goto out_put_inode;
> > +                       }
> > +                       ovl_set_flag(OVL_METACOPY, inode);
> > +               }
> >         }
> >
> >         revert_creds(old_cred);
> > @@ -729,6 +767,8 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
> >
> >         return NULL;
> >
> > +out_put_inode:
> > +       iput(inode);
> 
> And than new goto label becomes obsolete

Sounds good.

Vivek

> 
> >  out_free_oe:
> >         dentry->d_fsdata = NULL;
> >         kfree(oe);
> > --
> > 2.13.5
> >

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 01/11] ovl: Create origin xattr on copy up for all files
  2017-10-18 12:55     ` Vivek Goyal
@ 2017-10-18 13:56       ` Amir Goldstein
  0 siblings, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-10-18 13:56 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 3:55 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Wed, Oct 18, 2017 at 07:09:37AM +0300, Amir Goldstein wrote:
>> On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> > Right now my understanding is that origin xattr is created for all copied
>> > up files if index=on. And if index=off, then we create it for all type
>> > of files except hardlinks (nlink != 1).
>> >
>> > With metadata only copy up, I will still require origin xattr to copy up
>> > data later, so create it even for hardlinks even with index=off. I am
>> > hoping it does not break anything because we do OVL_INDEX test before
>> > using inode number of origin.
>>
>> I still have a nudging buzz that loosing the information about this upper
>> being a broken hardlink may come back to bite us some time, but can't
>> put my finger on how.
>
> Hi Amir,
>
> Can't we do stat() on ORIGIN and check nlink to figure out if it is a broken
> hard link or not. (If it ever becomes a problem in future). IOW, we still
> have a way to figure out if this is broken hard link or not. Just that
> it involves extra step of doing stat() and its not the presence/absense
> of ORIGIN.
>
>>
>> To reduce the level of that buzz, I suggest to set origin for broken hardlink
>> only when doing metacopy and then break the origin association when
>> removing metacopy xattr.
>
> While removing metacopy xattr, I will not have any idea whether it is
> broken hard link or not. I guess I will have to use above trick and do
> stat() on ORIGIN and detect broken hard links.
>
> I feel really odd about special casing this broken hard link case and
> setting and clearing ORIGIN with metacopy flags. It makes code
> twisted and complicated.
>
> Given stat() should allow us to figure out if something is broken hard
> link or not, does it make you feel any better about it.

I understand your position about this.
Since I cannot think of a concrete use case where keeping ORIGIN
would hurt us, I'll withdraw my comment.

>
>>
>> In any case, this patch needs to update the comment above setting
>> origin xattr that claims to break the association with broken hardlink.
>
> Ok, Will do.
>
> Vivek
>>
>> >
>> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
>> > ---
>> >  fs/overlayfs/copy_up.c | 4 +---
>> >  1 file changed, 1 insertion(+), 3 deletions(-)
>> >
>> > diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
>> > index c441f9387a1b..0a6294ea64d5 100644
>> > --- a/fs/overlayfs/copy_up.c
>> > +++ b/fs/overlayfs/copy_up.c
>> > @@ -538,9 +538,6 @@ static int ovl_do_copy_up(struct ovl_copy_up_ctx *c)
>> >             c->stat.nlink > 1)
>> >                 indexed = true;
>> >
>> > -       if (S_ISDIR(c->stat.mode) || c->stat.nlink == 1 || indexed)
>> > -               c->origin = true;
>> > -
>> >         if (indexed) {
>> >                 c->destdir = ovl_indexdir(c->dentry->d_sb);
>> >                 err = ovl_get_index_name(c->lowerpath.dentry, &c->destname);
>> > @@ -596,6 +593,7 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
>> >                 .parent = parent,
>> >                 .dentry = dentry,
>> >                 .workdir = ovl_workdir(dentry),
>> > +               .origin = true,
>> >         };
>> >
>> >         if (WARN_ON(!ctx.workdir))
>> > --
>> > 2.13.5
>> >

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 04/11] ovl: Provide a mount option metacopy=on/off for metadata copyup
  2017-10-18 13:03     ` Vivek Goyal
@ 2017-10-18 14:09       ` Amir Goldstein
  2017-10-18 14:26         ` Vivek Goyal
  0 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-10-18 14:09 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 4:03 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Wed, Oct 18, 2017 at 07:31:51AM +0300, Amir Goldstein wrote:
>> On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> > By default metadata only copy up is disabled. Provide a mount option so
>> > that users can choose one way or other.
>> >
>> > Also provide a kernel config and module option to enable/disable
>> > metacopy feature.
>> >
>> > Like index feature, when overlay is mounted, on root upper directory we
>> > set ORIGIN which points to lower. And at later mount time it is verified
>> > again. This hopes to get the configuration right. But this does only so
>> > much as we don't verify all the lowers. So it is possible that a lower is
>> > missing and later data copy up fails.
>>
>> Like index feature, please error mount if ovl_inuse_trylock fails.
>> As you know, this error is only conditional because of backward
>> compatibility, so any new opt-in feature should enforce it.
>
> Hi Amir,
>
> I am not so sure about it. Avoiding leaking any mount point is really
> really hard. And I don't think current container runtime have been
> modified to make it fool proof.
>
> IMHO, if we really want to enforce something like this, then we need
> to have some sort of capability to find existing superblock and reuse it.
> (Something like what happens with block devices).
>

That sounds like a good idea. Any chance you can make it happen?
Keep in mind that would be a change of behavior, so users will have to
opt-in for it as well.

> I am afraid that if I start enforcing this, then this feature will not
> be used at all because software has not been hardended enough to avoid
> mount point leaks completely.
>

I find that approach a bit dodgy.
If not for enjoying new overlayfs feature, why would container runtime
developers ever bother to fix mount leaks.

Anyway, that is up to Miklos to decide.

Amir.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 04/11] ovl: Provide a mount option metacopy=on/off for metadata copyup
  2017-10-18  4:31   ` Amir Goldstein
  2017-10-18 13:03     ` Vivek Goyal
@ 2017-10-18 14:10     ` Vivek Goyal
  2017-10-18 14:26       ` Amir Goldstein
  1 sibling, 1 reply; 48+ messages in thread
From: Vivek Goyal @ 2017-10-18 14:10 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 07:31:51AM +0300, Amir Goldstein wrote:

[..]
> > @@ -644,9 +664,16 @@ static int ovl_lower_dir(const char *name, struct path *path,
> >          * The inodes index feature needs to encode and decode file
> >          * handles, so it requires that all layers support them.
> >          */
> > -       if (ofs->config.index && !ovl_can_decode_fh(path->dentry->d_sb)) {
> > +       if ((ofs->config.index || ofs->config.metacopy) &&
> > +            !ovl_can_decode_fh(path->dentry->d_sb)) {
> > +               if (ofs->config.index)
> > +                       pr_warn("overlayfs: fs on '%s' does not support file handles, falling back to index=off.\n", name);
> > +
> > +               if (ofs->config.metacopy)
> > +                       pr_warn("overlayfs: fs on '%s' does not support file handles, falling back to metacopy=off.\n", name);
> > +
> 
> In my verify_dir patches I used the following more compact warning
> style instead of granular warnings:
> 
>     pr_warn("overlayfs: fs on '%s' does not support file handles,
> falling back to index=off,metacopy=off.\n", name);

Ok, I was not sure about it. I am fine with above style too.

BTW, I have a question. Current code seems to ensure that upper/work
supports file handles.

	/* Check if upper/work fs supports file handles */
	if (ufs->config.index &&
		!ovl_can_decode_fh(ufs->workdir->d_sb)) {
		ufs->config.index = false;
		pr_warn("overlayfs: upper fs does not support file handles, falling back to index=off.\n");
	}

I am wondering why that's the case and do I need it for metacopy feature
also. I thought that file handle support requirement was only on lower
dirs because they are the one who will export the origin file handle. All
upper has to do it that save this file handles in an xattr.

What am I missing?

Vivek

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 04/11] ovl: Provide a mount option metacopy=on/off for metadata copyup
  2017-10-18 14:09       ` Amir Goldstein
@ 2017-10-18 14:26         ` Vivek Goyal
  2017-10-18 14:38           ` Amir Goldstein
  0 siblings, 1 reply; 48+ messages in thread
From: Vivek Goyal @ 2017-10-18 14:26 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 05:09:08PM +0300, Amir Goldstein wrote:
> On Wed, Oct 18, 2017 at 4:03 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > On Wed, Oct 18, 2017 at 07:31:51AM +0300, Amir Goldstein wrote:
> >> On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> >> > By default metadata only copy up is disabled. Provide a mount option so
> >> > that users can choose one way or other.
> >> >
> >> > Also provide a kernel config and module option to enable/disable
> >> > metacopy feature.
> >> >
> >> > Like index feature, when overlay is mounted, on root upper directory we
> >> > set ORIGIN which points to lower. And at later mount time it is verified
> >> > again. This hopes to get the configuration right. But this does only so
> >> > much as we don't verify all the lowers. So it is possible that a lower is
> >> > missing and later data copy up fails.
> >>
> >> Like index feature, please error mount if ovl_inuse_trylock fails.
> >> As you know, this error is only conditional because of backward
> >> compatibility, so any new opt-in feature should enforce it.
> >
> > Hi Amir,
> >
> > I am not so sure about it. Avoiding leaking any mount point is really
> > really hard. And I don't think current container runtime have been
> > modified to make it fool proof.
> >
> > IMHO, if we really want to enforce something like this, then we need
> > to have some sort of capability to find existing superblock and reuse it.
> > (Something like what happens with block devices).
> >
> 
> That sounds like a good idea. Any chance you can make it happen?
> Keep in mind that would be a change of behavior, so users will have to
> opt-in for it as well.

I will look into it. I have no idea if it is doable and what is needed
at this point of time.

> 
> > I am afraid that if I start enforcing this, then this feature will not
> > be used at all because software has not been hardended enough to avoid
> > mount point leaks completely.
> >
> 
> I find that approach a bit dodgy.

It is. But at the same time I am having hard time understanding what's
wrong with having mount point in two separate mount namespaces. And
why overlay is putting this additional restriction. 

How can one realiably avoid mount point leaks. For example, say a dameon
foo is running in init mount namespace and mounts an overlay mount. Now
systemd starts another service bar and say that service starts with
MountFlags=private. Now overlay mount point will leak. Now daemon foo 
stop (it will unmount overlay), and restarts, it will fail to restart
because it can't mount that overlay anymore (upper/work are busy in 
mount namespace of bar).

I mean what's wrong with above programming model and how would programmers
avoid mount point leaks in other mount namespaces. 

Vivek


> If not for enjoying new overlayfs feature, why would container runtime
> developers ever bother to fix mount leaks.
> 
> Anyway, that is up to Miklos to decide.
> 
> Amir.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 04/11] ovl: Provide a mount option metacopy=on/off for metadata copyup
  2017-10-18 14:10     ` Vivek Goyal
@ 2017-10-18 14:26       ` Amir Goldstein
  0 siblings, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-10-18 14:26 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 5:10 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Wed, Oct 18, 2017 at 07:31:51AM +0300, Amir Goldstein wrote:
>
> [..]
>> > @@ -644,9 +664,16 @@ static int ovl_lower_dir(const char *name, struct path *path,
>> >          * The inodes index feature needs to encode and decode file
>> >          * handles, so it requires that all layers support them.
>> >          */
>> > -       if (ofs->config.index && !ovl_can_decode_fh(path->dentry->d_sb)) {
>> > +       if ((ofs->config.index || ofs->config.metacopy) &&
>> > +            !ovl_can_decode_fh(path->dentry->d_sb)) {
>> > +               if (ofs->config.index)
>> > +                       pr_warn("overlayfs: fs on '%s' does not support file handles, falling back to index=off.\n", name);
>> > +
>> > +               if (ofs->config.metacopy)
>> > +                       pr_warn("overlayfs: fs on '%s' does not support file handles, falling back to metacopy=off.\n", name);
>> > +
>>
>> In my verify_dir patches I used the following more compact warning
>> style instead of granular warnings:
>>
>>     pr_warn("overlayfs: fs on '%s' does not support file handles,
>> falling back to index=off,metacopy=off.\n", name);
>
> Ok, I was not sure about it. I am fine with above style too.
>
> BTW, I have a question. Current code seems to ensure that upper/work
> supports file handles.
>
>         /* Check if upper/work fs supports file handles */
>         if (ufs->config.index &&
>                 !ovl_can_decode_fh(ufs->workdir->d_sb)) {
>                 ufs->config.index = false;
>                 pr_warn("overlayfs: upper fs does not support file handles, falling back to index=off.\n");
>         }
>
> I am wondering why that's the case and do I need it for metacopy feature
> also. I thought that file handle support requirement was only on lower
> dirs because they are the one who will export the origin file handle. All
> upper has to do it that save this file handles in an xattr.
>
> What am I missing?
>

There is a concept of an "upper origin" (see is_upper param to
ovl_verify_origin())
With current code it is only used to verify upper root is index dir origin
(i.e. index dir is not reused with a different upper dir)
This is currently the only reason that upper is required to have file
handle support.
But index is meant to serve NFS export and directory index entries (implemented
with index=all patches) also contain an ORIGIN xattr pointing to upper dir,
instead of the non-dir hardlink to upper.

Amir.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 04/11] ovl: Provide a mount option metacopy=on/off for metadata copyup
  2017-10-18 14:26         ` Vivek Goyal
@ 2017-10-18 14:38           ` Amir Goldstein
  0 siblings, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-10-18 14:38 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 5:26 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Wed, Oct 18, 2017 at 05:09:08PM +0300, Amir Goldstein wrote:
>> On Wed, Oct 18, 2017 at 4:03 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> > On Wed, Oct 18, 2017 at 07:31:51AM +0300, Amir Goldstein wrote:
>> >> On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> >> > By default metadata only copy up is disabled. Provide a mount option so
>> >> > that users can choose one way or other.
>> >> >
>> >> > Also provide a kernel config and module option to enable/disable
>> >> > metacopy feature.
>> >> >
>> >> > Like index feature, when overlay is mounted, on root upper directory we
>> >> > set ORIGIN which points to lower. And at later mount time it is verified
>> >> > again. This hopes to get the configuration right. But this does only so
>> >> > much as we don't verify all the lowers. So it is possible that a lower is
>> >> > missing and later data copy up fails.
>> >>
>> >> Like index feature, please error mount if ovl_inuse_trylock fails.
>> >> As you know, this error is only conditional because of backward
>> >> compatibility, so any new opt-in feature should enforce it.
>> >
>> > Hi Amir,
>> >
>> > I am not so sure about it. Avoiding leaking any mount point is really
>> > really hard. And I don't think current container runtime have been
>> > modified to make it fool proof.
>> >
>> > IMHO, if we really want to enforce something like this, then we need
>> > to have some sort of capability to find existing superblock and reuse it.
>> > (Something like what happens with block devices).
>> >
>>
>> That sounds like a good idea. Any chance you can make it happen?
>> Keep in mind that would be a change of behavior, so users will have to
>> opt-in for it as well.
>
> I will look into it. I have no idea if it is doable and what is needed
> at this point of time.
>
>>
>> > I am afraid that if I start enforcing this, then this feature will not
>> > be used at all because software has not been hardended enough to avoid
>> > mount point leaks completely.
>> >
>>
>> I find that approach a bit dodgy.
>
> It is. But at the same time I am having hard time understanding what's
> wrong with having mount point in two separate mount namespaces. And
> why overlay is putting this additional restriction.
>
> How can one realiably avoid mount point leaks. For example, say a dameon
> foo is running in init mount namespace and mounts an overlay mount. Now
> systemd starts another service bar and say that service starts with
> MountFlags=private. Now overlay mount point will leak. Now daemon foo
> stop (it will unmount overlay), and restarts, it will fail to restart
> because it can't mount that overlay anymore (upper/work are busy in
> mount namespace of bar).
>
> I mean what's wrong with above programming model and how would programmers
> avoid mount point leaks in other mount namespaces.
>

You are absolutely right. There is nothing wrong with having overlay mount
replicated in other namespaces. The only problem is that we need to guard from
mounting different super block with same upper/work.
As I wrote in the warning in the exclusive lock fix patch, concurrent
use of upper/work,
can only do damage if both mounts are written to concurrently,
but there is absolutely no way for the kernel to know that the leaked mounts are
not going to be used by anyone.

The correct solution seems to be what you suggested - to bind the new mount
to existing super block when device name and all relevant overlay
config matches.
Until then, we can only enforce the poor man's I_OVL_INUSE lock.

Amir.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 10/11] ovl: Introduce read/write barriers around metacopy flag update
  2017-10-18  5:19   ` Amir Goldstein
@ 2017-10-18 15:32     ` Vivek Goyal
  2017-10-18 16:05       ` Amir Goldstein
  0 siblings, 1 reply; 48+ messages in thread
From: Vivek Goyal @ 2017-10-18 15:32 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 08:19:50AM +0300, Amir Goldstein wrote:
> On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > If a file is copied up metadata only and later when same file is opened
> > for WRITE, then data copy up takes place. We copy up data, remove METACOPY
> > xattr and then clear the METACOPY flag from ovl_entry->flags. While all
> > these operations happen with oi->lock held, read side of oi->flags is
> > lockless. That is another thread on another cpu can check if METACOPY
> > flag is clear or not.
> >
> > So this gives us an ordering requirement w.r.t METACOPY flag. That is, if
> > another cpu sees METACOPY flag clear, then it should be guaranteed that
> > effects of data copy up and remove xattr operations are also visible.
> >
> > For example.
> >
> >         CPU1                            CPU2
> > ovl_copy_up_flags()                     acquire(oi->lock)
> >  ovl_dentry_needs_data_copy_up()          ovl_copy_up_data()
> 
> The relevant lockless execution path of CPU1 is
> ovl_d_real -> ovl_test_flag(OVL_METACOPY) ->  return real
> when real is upper, upper needs to be with correct data
> 

That path is relevant for ordering requirements w.r.t oi->__upperdentry
and next patch does put an smp_rmb() there to cover it.

> The execution path you describe here is going to test the flag
> under overlay inode lock again in  ovl_copy_up_one() so it is irrelevant
> to the barriers patch

It will go into ovl_copy_up_one() only if this cpu saw OVL_METACOPY=1. But
what about the case where OVL_METACOPY is being cleared on CPU2, and
it became visible on CPU1. That means cpu1 will assume file copy up is
complete and it will continue and never both to take oi->lock.

> 
> 
> >    ovl_test_flag(OVL_METACOPY)            vfs_removexattr()
> >                                           ovl_clear_metacopy_flag()
> >                                         release(oi->lock)
> >
> > Say CPU2 is copying up data and in the end clears metacopy flag. But if
> > CPU1 perceives the effects of clearing of metacopy flag but not effects of
> > preceeding operations, that would be a problem.
> >
> > Hence this patch introduces smp_wmb() on metacopy flag clear operation
> > and smp_rmb() on metacopy flag test operation.
> >
> > May be some other lock or barrier is already covering it. But I am not sure
> > what that is and is it obvious enough that we will not break it in future.
> >
> > So hence trying to be safe here and introducing barriers explicitly for
> > metacopy bit.
> >
> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> > ---
> >  fs/overlayfs/copy_up.c |  6 ++++++
> >  fs/overlayfs/util.c    | 12 ++++++++++--
> >  2 files changed, 16 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> > index 99b7be4ff4fc..200ee3b7d397 100644
> > --- a/fs/overlayfs/copy_up.c
> > +++ b/fs/overlayfs/copy_up.c
> > @@ -466,6 +466,12 @@ static int ovl_copy_up_meta_inode_data(struct ovl_copy_up_ctx *c)
> >         if (err)
> >                 return err;
> >
> > +       /*
> > +        * Pairs with smp_rmb() after test_bit(OVL_METACOPY). Make sure
> 
> This comment is meant to refer the reader to the code that this pairs with
> i.e. pairs with smp_rmb() in ovl_d_real() ..

This pairs with smp_rmb() in ovl_dentry_needs_data_copy_up(). Next patch
has smp_rmb() in ovl_d_real() and I will explain that more.

> 
> > +        * if OVL_METACOPY flag reset is visible, then all the write
> > +        * operations before it are visible as well
> > +        */
> > +       smp_wmb();
> >         ovl_clear_flag(OVL_METACOPY, d_inode(c->dentry));
> >         return err;
> >  }
> > diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> > index 91c542d1a39d..6bca9a960c6d 100644
> > --- a/fs/overlayfs/util.c
> > +++ b/fs/overlayfs/util.c
> > @@ -228,6 +228,8 @@ void ovl_dentry_set_upper_alias(struct dentry *dentry)
> >  }
> >
> >  bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags) {
> > +       bool metacopy;
> > +
> >         if (!S_ISREG(d_inode(dentry)->i_mode))
> >                 return false;
> >
> > @@ -237,9 +239,15 @@ bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags) {
> >         if (!(OPEN_FMODE(flags) & FMODE_WRITE))
> >                 return false;
> >
> > -       if (!ovl_test_flag(OVL_METACOPY, d_inode(dentry)))
> > +       metacopy = ovl_test_flag(OVL_METACOPY, d_inode(dentry));
> > +       /*
> > +        * Pairs with smp_wmb() while clearing OVL_METACOPY. Make sure if
> 
> This comment is meant to refer the reader to the code that this pairs with
> i.e. pairs with smp_wmb() in ovl_copy_up_meta_inode_data() ..

Will do. I guess I will retain this which explains what's the intent
and also add funciton name where OVL_METACOPY is being cleared.

> 
> > +        * clearing of OVL_METACOPY is visible, then effects of writes
> > +        * before that are visible too.
> > +        */
> > +       smp_rmb();
> 
> Please move this rmb to ovl_d_real() after re-testing flag,
> which is the only place that matters for inode data access.

But we are testing it atleast at two places.
(ovl_dentry_needs_data_copy_up()).

First call is in.

ovl_d_real()
  ovl_open_need_copy_up()
     ovl_dentry_needs_data_copy_up()

And second call is in ovl_copy_up_flags() further down.

ovl_d_real()
  ovl_open_need_copy_up()
     ovl_dentry_needs_data_copy_up()  <---- First Call
          ovl_copy_up_flags()
	      ovl_dentry_needs_data_copy_up() <--- Second Call

Given we are checking/loading oi->flags at both the places, IIUC, we
need smp_rmb() at both the places. And if that's the case, its better
to keep it inside ovl_dentry_needs_data_copy_up() instead of open
coding it in ovl_open_need_copy_up() and ovl_copy_up_flags().

Thanks
Vivek


> 
> > +       if (!metacopy)
> >                 return false;
> > -
> >         return true;
> >  }
> >
> > --
> > 2.13.5
> >

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 10/11] ovl: Introduce read/write barriers around metacopy flag update
  2017-10-18 15:32     ` Vivek Goyal
@ 2017-10-18 16:05       ` Amir Goldstein
  0 siblings, 0 replies; 48+ messages in thread
From: Amir Goldstein @ 2017-10-18 16:05 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 6:32 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Wed, Oct 18, 2017 at 08:19:50AM +0300, Amir Goldstein wrote:
>> On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> > If a file is copied up metadata only and later when same file is opened
>> > for WRITE, then data copy up takes place. We copy up data, remove METACOPY
>> > xattr and then clear the METACOPY flag from ovl_entry->flags. While all
>> > these operations happen with oi->lock held, read side of oi->flags is
>> > lockless. That is another thread on another cpu can check if METACOPY
>> > flag is clear or not.
>> >
>> > So this gives us an ordering requirement w.r.t METACOPY flag. That is, if
>> > another cpu sees METACOPY flag clear, then it should be guaranteed that
>> > effects of data copy up and remove xattr operations are also visible.
>> >
>> > For example.
>> >
>> >         CPU1                            CPU2
>> > ovl_copy_up_flags()                     acquire(oi->lock)
>> >  ovl_dentry_needs_data_copy_up()          ovl_copy_up_data()
>>
>> The relevant lockless execution path of CPU1 is
>> ovl_d_real -> ovl_test_flag(OVL_METACOPY) ->  return real
>> when real is upper, upper needs to be with correct data
>>
>
> That path is relevant for ordering requirements w.r.t oi->__upperdentry
> and next patch does put an smp_rmb() there to cover it.
>
>> The execution path you describe here is going to test the flag
>> under overlay inode lock again in  ovl_copy_up_one() so it is irrelevant
>> to the barriers patch
>
> It will go into ovl_copy_up_one() only if this cpu saw OVL_METACOPY=1. But
> what about the case where OVL_METACOPY is being cleared on CPU2, and
> it became visible on CPU1. That means cpu1 will assume file copy up is
> complete and it will continue and never both to take oi->lock.

Yes, it will continue only to meet the next smp_rmb in d_real before testing
the flag again.
Nevermind that. I think the scheme you are trying to implement over complicates
ordering requirement because you need to toggle the bit twice.
If you start with a cleat bit (OVL_UPPER_DATA) and only set it when data copy
is finalized you have one less memory synchronization to worry about and
life should be simpler.

>
>>
>>
>> >    ovl_test_flag(OVL_METACOPY)            vfs_removexattr()
>> >                                           ovl_clear_metacopy_flag()
>> >                                         release(oi->lock)
>> >
>> > Say CPU2 is copying up data and in the end clears metacopy flag. But if
>> > CPU1 perceives the effects of clearing of metacopy flag but not effects of
>> > preceeding operations, that would be a problem.
>> >
>> > Hence this patch introduces smp_wmb() on metacopy flag clear operation
>> > and smp_rmb() on metacopy flag test operation.
>> >
>> > May be some other lock or barrier is already covering it. But I am not sure
>> > what that is and is it obvious enough that we will not break it in future.
>> >
>> > So hence trying to be safe here and introducing barriers explicitly for
>> > metacopy bit.
>> >
>> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
>> > ---
>> >  fs/overlayfs/copy_up.c |  6 ++++++
>> >  fs/overlayfs/util.c    | 12 ++++++++++--
>> >  2 files changed, 16 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
>> > index 99b7be4ff4fc..200ee3b7d397 100644
>> > --- a/fs/overlayfs/copy_up.c
>> > +++ b/fs/overlayfs/copy_up.c
>> > @@ -466,6 +466,12 @@ static int ovl_copy_up_meta_inode_data(struct ovl_copy_up_ctx *c)
>> >         if (err)
>> >                 return err;
>> >
>> > +       /*
>> > +        * Pairs with smp_rmb() after test_bit(OVL_METACOPY). Make sure
>>
>> This comment is meant to refer the reader to the code that this pairs with
>> i.e. pairs with smp_rmb() in ovl_d_real() ..
>
> This pairs with smp_rmb() in ovl_dentry_needs_data_copy_up(). Next patch
> has smp_rmb() in ovl_d_real() and I will explain that more.
>
>>
>> > +        * if OVL_METACOPY flag reset is visible, then all the write
>> > +        * operations before it are visible as well
>> > +        */
>> > +       smp_wmb();
>> >         ovl_clear_flag(OVL_METACOPY, d_inode(c->dentry));
>> >         return err;
>> >  }
>> > diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
>> > index 91c542d1a39d..6bca9a960c6d 100644
>> > --- a/fs/overlayfs/util.c
>> > +++ b/fs/overlayfs/util.c
>> > @@ -228,6 +228,8 @@ void ovl_dentry_set_upper_alias(struct dentry *dentry)
>> >  }
>> >
>> >  bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags) {
>> > +       bool metacopy;
>> > +
>> >         if (!S_ISREG(d_inode(dentry)->i_mode))
>> >                 return false;
>> >
>> > @@ -237,9 +239,15 @@ bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags) {
>> >         if (!(OPEN_FMODE(flags) & FMODE_WRITE))
>> >                 return false;
>> >
>> > -       if (!ovl_test_flag(OVL_METACOPY, d_inode(dentry)))
>> > +       metacopy = ovl_test_flag(OVL_METACOPY, d_inode(dentry));
>> > +       /*
>> > +        * Pairs with smp_wmb() while clearing OVL_METACOPY. Make sure if
>>
>> This comment is meant to refer the reader to the code that this pairs with
>> i.e. pairs with smp_wmb() in ovl_copy_up_meta_inode_data() ..
>
> Will do. I guess I will retain this which explains what's the intent
> and also add funciton name where OVL_METACOPY is being cleared.
>
>>
>> > +        * clearing of OVL_METACOPY is visible, then effects of writes
>> > +        * before that are visible too.
>> > +        */
>> > +       smp_rmb();
>>
>> Please move this rmb to ovl_d_real() after re-testing flag,
>> which is the only place that matters for inode data access.
>
> But we are testing it atleast at two places.
> (ovl_dentry_needs_data_copy_up()).
>
> First call is in.
>
> ovl_d_real()
>   ovl_open_need_copy_up()
>      ovl_dentry_needs_data_copy_up()
>
> And second call is in ovl_copy_up_flags() further down.
>
> ovl_d_real()
>   ovl_open_need_copy_up()
>      ovl_dentry_needs_data_copy_up()  <---- First Call
>           ovl_copy_up_flags()
>               ovl_dentry_needs_data_copy_up() <--- Second Call
>
> Given we are checking/loading oi->flags at both the places, IIUC, we
> need smp_rmb() at both the places. And if that's the case, its better
> to keep it inside ovl_dentry_needs_data_copy_up() instead of open
> coding it in ovl_open_need_copy_up() and ovl_copy_up_flags().
>

This seems over complicated.
Let's try the single bit flip scheme and see where that gets us.

Amir.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 11/11] ovl: Put barriers to order oi->__upperdentry and OVL_METACOPY update
  2017-10-18  5:40   ` Amir Goldstein
@ 2017-10-19 13:00     ` Vivek Goyal
  2017-10-19 13:21       ` Amir Goldstein
  0 siblings, 1 reply; 48+ messages in thread
From: Vivek Goyal @ 2017-10-19 13:00 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Wed, Oct 18, 2017 at 08:40:59AM +0300, Amir Goldstein wrote:
> On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > OVL_METACOPY in oi->flags can be accessed in lockless manner. So if a file
> > is being copied up metadata only, we need to make sure that upperdentry is
> > visible only after OVL_METACOPY flag has been set. IOW, if oi->__upperdentry
> > is visble to a cpu, then we also need to make sure any updates to OVL_METACOPY
> > flags are visible too.
> >
> 
> You know, I have a feeling that this ordering requirement could be simplified or
> completely avoided if you flip the meaning of the flag, i.e.:
> 
> bool ovl_dentry_has_upper_data(struct dentry *dentry)
> {
>         return ovl_test_flag(OVL_UPPER_DATA, d_inode(dentry));
> }
> 
> Then flag visibility requirements are the same as visibility requirements
> for oe->has_upper.
> You probably don't need to add any new barriers for setting setting the flag
> on normal copy up.
> For setting the flag in copy_up_meta_inode_data, and testing the flag
> in ovl_d_real() the requirements stay the same as you implemented in patch 10.

Hi Amir,

I thought about it and IIUC, flipping the bit does not do away with the
ordering requirement w.r.t ovl_inode_update(). For example.

Say on CPU1 a file is being copied up (both data and metadata copy up).

ovl_copy_up_inode()
install inode;
ovl_set_flag(OVL_UPPER_DATA);
ovl_inode_update();

Assume, Say another CPU2 is doing d_real() with flags=0.

ovl_d_real()
  real = ovl_dentry_upper(dentry);
  if (real) {
	if (!ovl_test_flag(OVL_UPPER_DATA, d_inode(dentry)))
		goto lower;
  } 
 
Now assume that CPU2 has not seen the update of OVL_UPPER_DATA yet. So it
will end up returning a "lower" dentry, while it should have returned
an upper dentry. So ordering requirement is very much still there.

What do you think?

To simplify ordering requirements w.r.t ovl_inode_updat(), can we replace
data dependency barrier in ovl_upperdentry_dereference() with a read
barrier instead (smp_rmb()). That way we will not have to introduce
this additional smp_rmb() everything and code will be simpler.

Thoughts?

Vivek

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 11/11] ovl: Put barriers to order oi->__upperdentry and OVL_METACOPY update
  2017-10-19 13:00     ` Vivek Goyal
@ 2017-10-19 13:21       ` Amir Goldstein
  2017-10-19 14:58         ` Vivek Goyal
  0 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-10-19 13:21 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Oct 19, 2017 at 4:00 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Wed, Oct 18, 2017 at 08:40:59AM +0300, Amir Goldstein wrote:
>> On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> > OVL_METACOPY in oi->flags can be accessed in lockless manner. So if a file
>> > is being copied up metadata only, we need to make sure that upperdentry is
>> > visible only after OVL_METACOPY flag has been set. IOW, if oi->__upperdentry
>> > is visble to a cpu, then we also need to make sure any updates to OVL_METACOPY
>> > flags are visible too.
>> >
>>
>> You know, I have a feeling that this ordering requirement could be simplified or
>> completely avoided if you flip the meaning of the flag, i.e.:
>>
>> bool ovl_dentry_has_upper_data(struct dentry *dentry)
>> {
>>         return ovl_test_flag(OVL_UPPER_DATA, d_inode(dentry));
>> }
>>
>> Then flag visibility requirements are the same as visibility requirements
>> for oe->has_upper.
>> You probably don't need to add any new barriers for setting setting the flag
>> on normal copy up.
>> For setting the flag in copy_up_meta_inode_data, and testing the flag
>> in ovl_d_real() the requirements stay the same as you implemented in patch 10.
>
> Hi Amir,
>
> I thought about it and IIUC, flipping the bit does not do away with the
> ordering requirement w.r.t ovl_inode_update(). For example.
>
> Say on CPU1 a file is being copied up (both data and metadata copy up).
>
> ovl_copy_up_inode()
> install inode;
> ovl_set_flag(OVL_UPPER_DATA);
> ovl_inode_update();
>
> Assume, Say another CPU2 is doing d_real() with flags=0.
>
> ovl_d_real()
>   real = ovl_dentry_upper(dentry);
>   if (real) {
>         if (!ovl_test_flag(OVL_UPPER_DATA, d_inode(dentry)))
>                 goto lower;
>   }
>
> Now assume that CPU2 has not seen the update of OVL_UPPER_DATA yet. So it
> will end up returning a "lower" dentry, while it should have returned
> an upper dentry. So ordering requirement is very much still there.
>
> What do you think?

Process 2 will get lower dentry on open for read at 8AM
Process 1 will copy up file at 9AM (on CPU1)
Process 2 will open same file for read at 9AM (on CPU2)
Does it matter if process 2 gets lower or upper dentry? No.
It only matter that IF process 2 gets an upper dentry, that
this dentry is consistent, so it only matters that IF __upperdentry
is visible to CPU2 AND OVL_UPPER_DATA flag is visible to
CPU2 then dentry and its inode are consistent.

So IMO you may only need to add smp_rmb() before
ovl_test_flag(OVL_UPPER_DATA in ovl_d_real() and the smp_wmb()
in ovl_inode_update() should be sufficient.
Change the comment in ovl_inode_update() to mention that wmb also
matches rmb in ovl_d_real() w.r.t OVL_UPPER_DATA flag.


>
> To simplify ordering requirements w.r.t ovl_inode_updat(), can we replace
> data dependency barrier in ovl_upperdentry_dereference() with a read
> barrier instead (smp_rmb()). That way we will not have to introduce
> this additional smp_rmb() everything and code will be simpler.
>
> Thoughts?
>

I am am right with my claims above, this can be avoided.

Amir.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 11/11] ovl: Put barriers to order oi->__upperdentry and OVL_METACOPY update
  2017-10-19 13:21       ` Amir Goldstein
@ 2017-10-19 14:58         ` Vivek Goyal
  2017-10-19 15:08           ` Amir Goldstein
  0 siblings, 1 reply; 48+ messages in thread
From: Vivek Goyal @ 2017-10-19 14:58 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Thu, Oct 19, 2017 at 04:21:46PM +0300, Amir Goldstein wrote:
> On Thu, Oct 19, 2017 at 4:00 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > On Wed, Oct 18, 2017 at 08:40:59AM +0300, Amir Goldstein wrote:
> >> On Wed, Oct 18, 2017 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> >> > OVL_METACOPY in oi->flags can be accessed in lockless manner. So if a file
> >> > is being copied up metadata only, we need to make sure that upperdentry is
> >> > visible only after OVL_METACOPY flag has been set. IOW, if oi->__upperdentry
> >> > is visble to a cpu, then we also need to make sure any updates to OVL_METACOPY
> >> > flags are visible too.
> >> >
> >>
> >> You know, I have a feeling that this ordering requirement could be simplified or
> >> completely avoided if you flip the meaning of the flag, i.e.:
> >>
> >> bool ovl_dentry_has_upper_data(struct dentry *dentry)
> >> {
> >>         return ovl_test_flag(OVL_UPPER_DATA, d_inode(dentry));
> >> }
> >>
> >> Then flag visibility requirements are the same as visibility requirements
> >> for oe->has_upper.
> >> You probably don't need to add any new barriers for setting setting the flag
> >> on normal copy up.
> >> For setting the flag in copy_up_meta_inode_data, and testing the flag
> >> in ovl_d_real() the requirements stay the same as you implemented in patch 10.
> >
> > Hi Amir,
> >
> > I thought about it and IIUC, flipping the bit does not do away with the
> > ordering requirement w.r.t ovl_inode_update(). For example.
> >
> > Say on CPU1 a file is being copied up (both data and metadata copy up).
> >
> > ovl_copy_up_inode()
> > install inode;
> > ovl_set_flag(OVL_UPPER_DATA);
> > ovl_inode_update();
> >
> > Assume, Say another CPU2 is doing d_real() with flags=0.
> >
> > ovl_d_real()
> >   real = ovl_dentry_upper(dentry);
> >   if (real) {
> >         if (!ovl_test_flag(OVL_UPPER_DATA, d_inode(dentry)))
> >                 goto lower;
> >   }
> >
> > Now assume that CPU2 has not seen the update of OVL_UPPER_DATA yet. So it
> > will end up returning a "lower" dentry, while it should have returned
> > an upper dentry. So ordering requirement is very much still there.
> >
> > What do you think?
> 
> Process 2 will get lower dentry on open for read at 8AM
> Process 1 will copy up file at 9AM (on CPU1)
> Process 2 will open same file for read at 9AM (on CPU2)
> Does it matter if process 2 gets lower or upper dentry? No.
> It only matter that IF process 2 gets an upper dentry, that
> this dentry is consistent, so it only matters that IF __upperdentry
> is visible to CPU2 AND OVL_UPPER_DATA flag is visible to
> CPU2 then dentry and its inode are consistent.

That's a good point. So if OVL_UPPER_DATA update is not visible on CPU2
yet, then CPU1 will use lower dentry. And this is equivalent to as if file
copy up has not taken place yet.

And if CPU1 needed to do use upper dentry only, then it will do flags=WRITE
and that will take oi->lock and make sure OVL_UPPER_DATA is set.

So only *additional* smp_rmb()/smp_wmb() we require for the case when 
data is copied up later and we need to make sure OVL_UPPER_DATA is
visible only after the full data copy up is done and stable.

Sounds good.

> 
> So IMO you may only need to add smp_rmb() before
> ovl_test_flag(OVL_UPPER_DATA in ovl_d_real() and the smp_wmb()
> in ovl_inode_update() should be sufficient.
> Change the comment in ovl_inode_update() to mention that wmb also
> matches rmb in ovl_d_real() w.r.t OVL_UPPER_DATA flag.

Hmm..., I agree that we require smp_rmb() here but it will pair with
smp_wmb() in ovl_copy_meta_data_inode() and not the one in
ovl_inode_update(), right? Something like.

ovl_d_real() {
	bool has_upper_data;

	has_upper_data = ovl_test_flag(OVL_UPPER_DATA, d_inode(dentry));
	/* Pairs with smp_wmb() in ovl_copy_up_meta_inode_data() */
	smp_rmb();
	if (!has_upper_data)
		goto lower;

	...
	...
	return real;
}

Note that now smp_rmb() will be placed after loading OVL_UPPER_DATA and
not before it. Because we are ensuring ordering w.r.t smp_wmb() in
ovl_copy_up_meta_inode_data().

In fact I think my current patches are buggy because they should have had
this smp_rmb() to begin with.

> 
> 
> >
> > To simplify ordering requirements w.r.t ovl_inode_updat(), can we replace
> > data dependency barrier in ovl_upperdentry_dereference() with a read
> > barrier instead (smp_rmb()). That way we will not have to introduce
> > this additional smp_rmb() everything and code will be simpler.
> >
> > Thoughts?
> >
> 
> I am am right with my claims above, this can be avoided.

Agreed. By flipping the bit, instead of two ordering requirements we 
should have only one ordering requirement and place barrier only for
that requirement (That's patch 10).

Vivek

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 11/11] ovl: Put barriers to order oi->__upperdentry and OVL_METACOPY update
  2017-10-19 14:58         ` Vivek Goyal
@ 2017-10-19 15:08           ` Amir Goldstein
  2017-10-19 15:22             ` Vivek Goyal
  0 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-10-19 15:08 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Oct 19, 2017 at 5:58 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Thu, Oct 19, 2017 at 04:21:46PM +0300, Amir Goldstein wrote:
...
>>
>> Process 2 will get lower dentry on open for read at 8AM
>> Process 1 will copy up file at 9AM (on CPU1)
>> Process 2 will open same file for read at 9AM (on CPU2)
>> Does it matter if process 2 gets lower or upper dentry? No.
>> It only matter that IF process 2 gets an upper dentry, that
>> this dentry is consistent, so it only matters that IF __upperdentry
>> is visible to CPU2 AND OVL_UPPER_DATA flag is visible to
>> CPU2 then dentry and its inode are consistent.
>
> That's a good point. So if OVL_UPPER_DATA update is not visible on CPU2
> yet, then CPU1 will use lower dentry. And this is equivalent to as if file
> copy up has not taken place yet.
>
> And if CPU1 needed to do use upper dentry only, then it will do flags=WRITE
> and that will take oi->lock and make sure OVL_UPPER_DATA is set.
>
> So only *additional* smp_rmb()/smp_wmb() we require for the case when
> data is copied up later and we need to make sure OVL_UPPER_DATA is
> visible only after the full data copy up is done and stable.
>
>

Right. forgot about that wmb.

>>
>> So IMO you may only need to add smp_rmb() before
>> ovl_test_flag(OVL_UPPER_DATA in ovl_d_real() and the smp_wmb()
>> in ovl_inode_update() should be sufficient.
>> Change the comment in ovl_inode_update() to mention that wmb also
>> matches rmb in ovl_d_real() w.r.t OVL_UPPER_DATA flag.
>
> Hmm..., I agree that we require smp_rmb() here but it will pair with
> smp_wmb() in ovl_copy_meta_data_inode() and not the one in
> ovl_inode_update(), right? Something like.

Right. my bad.

>
> ovl_d_real() {
>         bool has_upper_data;
>
>         has_upper_data = ovl_test_flag(OVL_UPPER_DATA, d_inode(dentry));
>         /* Pairs with smp_wmb() in ovl_copy_up_meta_inode_data() */
>         smp_rmb();
>         if (!has_upper_data)
>                 goto lower;

Just put smp_rmb() here. no need for the bool variable.
rmb does matter if you goto lower...

>
>         ...
>         ...
>         return real;
> }
>
> Note that now smp_rmb() will be placed after loading OVL_UPPER_DATA and
> not before it. Because we are ensuring ordering w.r.t smp_wmb() in
> ovl_copy_up_meta_inode_data().
>
> In fact I think my current patches are buggy because they should have had
> this smp_rmb() to begin with.
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 11/11] ovl: Put barriers to order oi->__upperdentry and OVL_METACOPY update
  2017-10-19 15:08           ` Amir Goldstein
@ 2017-10-19 15:22             ` Vivek Goyal
  2017-10-19 15:39               ` Amir Goldstein
  0 siblings, 1 reply; 48+ messages in thread
From: Vivek Goyal @ 2017-10-19 15:22 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Thu, Oct 19, 2017 at 06:08:32PM +0300, Amir Goldstein wrote:
> On Thu, Oct 19, 2017 at 5:58 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > On Thu, Oct 19, 2017 at 04:21:46PM +0300, Amir Goldstein wrote:
> ...
> >>
> >> Process 2 will get lower dentry on open for read at 8AM
> >> Process 1 will copy up file at 9AM (on CPU1)
> >> Process 2 will open same file for read at 9AM (on CPU2)
> >> Does it matter if process 2 gets lower or upper dentry? No.
> >> It only matter that IF process 2 gets an upper dentry, that
> >> this dentry is consistent, so it only matters that IF __upperdentry
> >> is visible to CPU2 AND OVL_UPPER_DATA flag is visible to
> >> CPU2 then dentry and its inode are consistent.
> >
> > That's a good point. So if OVL_UPPER_DATA update is not visible on CPU2
> > yet, then CPU1 will use lower dentry. And this is equivalent to as if file
> > copy up has not taken place yet.
> >
> > And if CPU1 needed to do use upper dentry only, then it will do flags=WRITE
> > and that will take oi->lock and make sure OVL_UPPER_DATA is set.
> >
> > So only *additional* smp_rmb()/smp_wmb() we require for the case when
> > data is copied up later and we need to make sure OVL_UPPER_DATA is
> > visible only after the full data copy up is done and stable.
> >
> >
> 
> Right. forgot about that wmb.
> 
> >>
> >> So IMO you may only need to add smp_rmb() before
> >> ovl_test_flag(OVL_UPPER_DATA in ovl_d_real() and the smp_wmb()
> >> in ovl_inode_update() should be sufficient.
> >> Change the comment in ovl_inode_update() to mention that wmb also
> >> matches rmb in ovl_d_real() w.r.t OVL_UPPER_DATA flag.
> >
> > Hmm..., I agree that we require smp_rmb() here but it will pair with
> > smp_wmb() in ovl_copy_meta_data_inode() and not the one in
> > ovl_inode_update(), right? Something like.
> 
> Right. my bad.
> 
> >
> > ovl_d_real() {
> >         bool has_upper_data;
> >
> >         has_upper_data = ovl_test_flag(OVL_UPPER_DATA, d_inode(dentry));
> >         /* Pairs with smp_wmb() in ovl_copy_up_meta_inode_data() */
> >         smp_rmb();
> >         if (!has_upper_data)
> >                 goto lower;
> 
> Just put smp_rmb() here. no need for the bool variable.
> rmb does matter if you goto lower...

I thought smp_rmb() has to be put *only* after LOAD of oi->flags.
Something like.

LOAD oi->flags
smp_rmb()
Look at results of oi->flags and take action.

So that means I need to store results of oi->flags load in variable
temporarily so that I can analyze it after smp_rmb(). IOW, I am not
sure how would I get rid of boolean here. I need some kind of temp
variable.

Vivek
> 
> >
> >         ...
> >         ...
> >         return real;
> > }
> >
> > Note that now smp_rmb() will be placed after loading OVL_UPPER_DATA and
> > not before it. Because we are ensuring ordering w.r.t smp_wmb() in
> > ovl_copy_up_meta_inode_data().
> >
> > In fact I think my current patches are buggy because they should have had
> > this smp_rmb() to begin with.
> >

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 11/11] ovl: Put barriers to order oi->__upperdentry and OVL_METACOPY update
  2017-10-19 15:22             ` Vivek Goyal
@ 2017-10-19 15:39               ` Amir Goldstein
  2017-10-19 15:59                 ` Vivek Goyal
  0 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-10-19 15:39 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Oct 19, 2017 at 6:22 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Thu, Oct 19, 2017 at 06:08:32PM +0300, Amir Goldstein wrote:
>> On Thu, Oct 19, 2017 at 5:58 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> > On Thu, Oct 19, 2017 at 04:21:46PM +0300, Amir Goldstein wrote:
>> ...
>> >>
>> >> Process 2 will get lower dentry on open for read at 8AM
>> >> Process 1 will copy up file at 9AM (on CPU1)
>> >> Process 2 will open same file for read at 9AM (on CPU2)
>> >> Does it matter if process 2 gets lower or upper dentry? No.
>> >> It only matter that IF process 2 gets an upper dentry, that
>> >> this dentry is consistent, so it only matters that IF __upperdentry
>> >> is visible to CPU2 AND OVL_UPPER_DATA flag is visible to
>> >> CPU2 then dentry and its inode are consistent.
>> >
>> > That's a good point. So if OVL_UPPER_DATA update is not visible on CPU2
>> > yet, then CPU1 will use lower dentry. And this is equivalent to as if file
>> > copy up has not taken place yet.
>> >
>> > And if CPU1 needed to do use upper dentry only, then it will do flags=WRITE
>> > and that will take oi->lock and make sure OVL_UPPER_DATA is set.
>> >
>> > So only *additional* smp_rmb()/smp_wmb() we require for the case when
>> > data is copied up later and we need to make sure OVL_UPPER_DATA is
>> > visible only after the full data copy up is done and stable.
>> >
>> >
>>
>> Right. forgot about that wmb.
>>
>> >>
>> >> So IMO you may only need to add smp_rmb() before
>> >> ovl_test_flag(OVL_UPPER_DATA in ovl_d_real() and the smp_wmb()
>> >> in ovl_inode_update() should be sufficient.
>> >> Change the comment in ovl_inode_update() to mention that wmb also
>> >> matches rmb in ovl_d_real() w.r.t OVL_UPPER_DATA flag.
>> >
>> > Hmm..., I agree that we require smp_rmb() here but it will pair with
>> > smp_wmb() in ovl_copy_meta_data_inode() and not the one in
>> > ovl_inode_update(), right? Something like.
>>
>> Right. my bad.
>>
>> >
>> > ovl_d_real() {
>> >         bool has_upper_data;
>> >
>> >         has_upper_data = ovl_test_flag(OVL_UPPER_DATA, d_inode(dentry));
>> >         /* Pairs with smp_wmb() in ovl_copy_up_meta_inode_data() */
>> >         smp_rmb();
>> >         if (!has_upper_data)
>> >                 goto lower;
>>
>> Just put smp_rmb() here. no need for the bool variable.
>> rmb does matter if you goto lower...
>
> I thought smp_rmb() has to be put *only* after LOAD of oi->flags.
> Something like.
>
> LOAD oi->flags
> smp_rmb()
> Look at results of oi->flags and take action.
>
> So that means I need to store results of oi->flags load in variable
> temporarily so that I can analyze it after smp_rmb(). IOW, I am not
> sure how would I get rid of boolean here. I need some kind of temp
> variable.
>

One of us is very confused.

Remember you are not synchronizing the value of OVL_UPPER_DATA between CPUs
You don't care if user gets lower or upper dentry.
You only care about the upper case so you can put smb_rmb() after goto
lower line
which will make sure CPU cannot read inconsistent upper inode state
from before smp_wmb()
in ovl_copy_up_meta_inode_data() after CPU read positive
OVL_UPPER_DATA before smp_rmb().
That's the way I understand it.

Amir.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 11/11] ovl: Put barriers to order oi->__upperdentry and OVL_METACOPY update
  2017-10-19 15:39               ` Amir Goldstein
@ 2017-10-19 15:59                 ` Vivek Goyal
  2017-10-19 16:33                   ` Amir Goldstein
  0 siblings, 1 reply; 48+ messages in thread
From: Vivek Goyal @ 2017-10-19 15:59 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Thu, Oct 19, 2017 at 06:39:57PM +0300, Amir Goldstein wrote:
> On Thu, Oct 19, 2017 at 6:22 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > On Thu, Oct 19, 2017 at 06:08:32PM +0300, Amir Goldstein wrote:
> >> On Thu, Oct 19, 2017 at 5:58 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> >> > On Thu, Oct 19, 2017 at 04:21:46PM +0300, Amir Goldstein wrote:
> >> ...
> >> >>
> >> >> Process 2 will get lower dentry on open for read at 8AM
> >> >> Process 1 will copy up file at 9AM (on CPU1)
> >> >> Process 2 will open same file for read at 9AM (on CPU2)
> >> >> Does it matter if process 2 gets lower or upper dentry? No.
> >> >> It only matter that IF process 2 gets an upper dentry, that
> >> >> this dentry is consistent, so it only matters that IF __upperdentry
> >> >> is visible to CPU2 AND OVL_UPPER_DATA flag is visible to
> >> >> CPU2 then dentry and its inode are consistent.
> >> >
> >> > That's a good point. So if OVL_UPPER_DATA update is not visible on CPU2
> >> > yet, then CPU1 will use lower dentry. And this is equivalent to as if file
> >> > copy up has not taken place yet.
> >> >
> >> > And if CPU1 needed to do use upper dentry only, then it will do flags=WRITE
> >> > and that will take oi->lock and make sure OVL_UPPER_DATA is set.
> >> >
> >> > So only *additional* smp_rmb()/smp_wmb() we require for the case when
> >> > data is copied up later and we need to make sure OVL_UPPER_DATA is
> >> > visible only after the full data copy up is done and stable.
> >> >
> >> >
> >>
> >> Right. forgot about that wmb.
> >>
> >> >>
> >> >> So IMO you may only need to add smp_rmb() before
> >> >> ovl_test_flag(OVL_UPPER_DATA in ovl_d_real() and the smp_wmb()
> >> >> in ovl_inode_update() should be sufficient.
> >> >> Change the comment in ovl_inode_update() to mention that wmb also
> >> >> matches rmb in ovl_d_real() w.r.t OVL_UPPER_DATA flag.
> >> >
> >> > Hmm..., I agree that we require smp_rmb() here but it will pair with
> >> > smp_wmb() in ovl_copy_meta_data_inode() and not the one in
> >> > ovl_inode_update(), right? Something like.
> >>
> >> Right. my bad.
> >>
> >> >
> >> > ovl_d_real() {
> >> >         bool has_upper_data;
> >> >
> >> >         has_upper_data = ovl_test_flag(OVL_UPPER_DATA, d_inode(dentry));
> >> >         /* Pairs with smp_wmb() in ovl_copy_up_meta_inode_data() */
> >> >         smp_rmb();
> >> >         if (!has_upper_data)
> >> >                 goto lower;
> >>
> >> Just put smp_rmb() here. no need for the bool variable.
> >> rmb does matter if you goto lower...
> >
> > I thought smp_rmb() has to be put *only* after LOAD of oi->flags.
> > Something like.
> >
> > LOAD oi->flags
> > smp_rmb()
> > Look at results of oi->flags and take action.
> >
> > So that means I need to store results of oi->flags load in variable
> > temporarily so that I can analyze it after smp_rmb(). IOW, I am not
> > sure how would I get rid of boolean here. I need some kind of temp
> > variable.
> >
> 
> One of us is very confused.
> 
> Remember you are not synchronizing the value of OVL_UPPER_DATA between CPUs
> You don't care if user gets lower or upper dentry.
> You only care about the upper case so you can put smb_rmb() after goto
> lower line
> which will make sure CPU cannot read inconsistent upper inode state
> from before smp_wmb()
> in ovl_copy_up_meta_inode_data() after CPU read positive
> OVL_UPPER_DATA before smp_rmb().
> That's the way I understand it.

ok, I think I get it now. You are suggesting following structure.

	if (!ovl_test_flag(OVL_UPPER_DATA, d_inode(dentry)))
		goto lower;
	smp_rmb();
	return real;

So if we are returning lower, we don't have to do smp_rmb(). But if we
saw OVL_UPPER_DATA, set, then we need to do smp_rmb() to make sure upper
is consistent (just in case it was data copied up just now).

In fact, I should probably put is outside if condition block. That is.

        real = ovl_dentry_upper(dentry);
        if (real && (!inode || inode == d_inode(real))) {
                if (!inode) { 
                        err = ovl_check_append_only(d_inode(real), open_flags);
                        if (err)
                                return ERR_PTR(err);
                        
                        if (!ovl_test_flag(OVL_UPPER_DATA, d_inode(dentry)))
                                goto lower;
                }
                smp_rmb();
                return real;
	}

Vivek

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 11/11] ovl: Put barriers to order oi->__upperdentry and OVL_METACOPY update
  2017-10-19 15:59                 ` Vivek Goyal
@ 2017-10-19 16:33                   ` Amir Goldstein
  2017-10-19 20:33                     ` Vivek Goyal
  0 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-10-19 16:33 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Oct 19, 2017 at 6:59 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Thu, Oct 19, 2017 at 06:39:57PM +0300, Amir Goldstein wrote:
>> On Thu, Oct 19, 2017 at 6:22 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> > On Thu, Oct 19, 2017 at 06:08:32PM +0300, Amir Goldstein wrote:
>> >> On Thu, Oct 19, 2017 at 5:58 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> >> > On Thu, Oct 19, 2017 at 04:21:46PM +0300, Amir Goldstein wrote:
>> >> ...
>> >> >>
>> >> >> Process 2 will get lower dentry on open for read at 8AM
>> >> >> Process 1 will copy up file at 9AM (on CPU1)
>> >> >> Process 2 will open same file for read at 9AM (on CPU2)
>> >> >> Does it matter if process 2 gets lower or upper dentry? No.
>> >> >> It only matter that IF process 2 gets an upper dentry, that
>> >> >> this dentry is consistent, so it only matters that IF __upperdentry
>> >> >> is visible to CPU2 AND OVL_UPPER_DATA flag is visible to
>> >> >> CPU2 then dentry and its inode are consistent.
>> >> >
>> >> > That's a good point. So if OVL_UPPER_DATA update is not visible on CPU2
>> >> > yet, then CPU1 will use lower dentry. And this is equivalent to as if file
>> >> > copy up has not taken place yet.
>> >> >
>> >> > And if CPU1 needed to do use upper dentry only, then it will do flags=WRITE
>> >> > and that will take oi->lock and make sure OVL_UPPER_DATA is set.
>> >> >
>> >> > So only *additional* smp_rmb()/smp_wmb() we require for the case when
>> >> > data is copied up later and we need to make sure OVL_UPPER_DATA is
>> >> > visible only after the full data copy up is done and stable.
>> >> >
>> >> >
>> >>
>> >> Right. forgot about that wmb.
>> >>
>> >> >>
>> >> >> So IMO you may only need to add smp_rmb() before
>> >> >> ovl_test_flag(OVL_UPPER_DATA in ovl_d_real() and the smp_wmb()
>> >> >> in ovl_inode_update() should be sufficient.
>> >> >> Change the comment in ovl_inode_update() to mention that wmb also
>> >> >> matches rmb in ovl_d_real() w.r.t OVL_UPPER_DATA flag.
>> >> >
>> >> > Hmm..., I agree that we require smp_rmb() here but it will pair with
>> >> > smp_wmb() in ovl_copy_meta_data_inode() and not the one in
>> >> > ovl_inode_update(), right? Something like.
>> >>
>> >> Right. my bad.
>> >>
>> >> >
>> >> > ovl_d_real() {
>> >> >         bool has_upper_data;
>> >> >
>> >> >         has_upper_data = ovl_test_flag(OVL_UPPER_DATA, d_inode(dentry));
>> >> >         /* Pairs with smp_wmb() in ovl_copy_up_meta_inode_data() */
>> >> >         smp_rmb();
>> >> >         if (!has_upper_data)
>> >> >                 goto lower;
>> >>
>> >> Just put smp_rmb() here. no need for the bool variable.
>> >> rmb does matter if you goto lower...
>> >
>> > I thought smp_rmb() has to be put *only* after LOAD of oi->flags.
>> > Something like.
>> >
>> > LOAD oi->flags
>> > smp_rmb()
>> > Look at results of oi->flags and take action.
>> >
>> > So that means I need to store results of oi->flags load in variable
>> > temporarily so that I can analyze it after smp_rmb(). IOW, I am not
>> > sure how would I get rid of boolean here. I need some kind of temp
>> > variable.
>> >
>>
>> One of us is very confused.
>>
>> Remember you are not synchronizing the value of OVL_UPPER_DATA between CPUs
>> You don't care if user gets lower or upper dentry.
>> You only care about the upper case so you can put smb_rmb() after goto
>> lower line
>> which will make sure CPU cannot read inconsistent upper inode state
>> from before smp_wmb()
>> in ovl_copy_up_meta_inode_data() after CPU read positive
>> OVL_UPPER_DATA before smp_rmb().
>> That's the way I understand it.
>
> ok, I think I get it now. You are suggesting following structure.
>
>         if (!ovl_test_flag(OVL_UPPER_DATA, d_inode(dentry)))
>                 goto lower;
>         smp_rmb();
>         return real;
>
> So if we are returning lower, we don't have to do smp_rmb(). But if we
> saw OVL_UPPER_DATA, set, then we need to do smp_rmb() to make sure upper
> is consistent (just in case it was data copied up just now).
>
> In fact, I should probably put is outside if condition block. That is.

If that was true then you should have tested OVL_UPPER_DATA outside
the if condition. The reason it is not needed if inode is not NULL and equals
to the upper inode then we have already made it visible to someone, then
we can make it visible to evevryone.

This is very not clear from this code, so worth some fat comments.
Also at top of ovl_d_real(),  D_REAL_UPPER just returns upper dentry
without testing flag :-/ and this is not good for may_write_real()
not sure about update_ovl_inode_times()...

You should run another pass on all ovl_dentry_upper()
ovl_dentry_real(), ovl_path_real() and ovl_path_upper()
ovl_inode_upper() ovl_i_dentry_upper()
I have a feeling there other issues lurking...


>
>         real = ovl_dentry_upper(dentry);
>         if (real && (!inode || inode == d_inode(real))) {
>                 if (!inode) {
>                         err = ovl_check_append_only(d_inode(real), open_flags);
>                         if (err)
>                                 return ERR_PTR(err);
>
>                         if (!ovl_test_flag(OVL_UPPER_DATA, d_inode(dentry)))
>                                 goto lower;
>                 }
>                 smp_rmb();
>                 return real;
>         }
>
> Vivek

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 11/11] ovl: Put barriers to order oi->__upperdentry and OVL_METACOPY update
  2017-10-19 16:33                   ` Amir Goldstein
@ 2017-10-19 20:33                     ` Vivek Goyal
  2017-10-20  4:09                       ` Amir Goldstein
  0 siblings, 1 reply; 48+ messages in thread
From: Vivek Goyal @ 2017-10-19 20:33 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Thu, Oct 19, 2017 at 07:33:37PM +0300, Amir Goldstein wrote:
> On Thu, Oct 19, 2017 at 6:59 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > On Thu, Oct 19, 2017 at 06:39:57PM +0300, Amir Goldstein wrote:
> >> On Thu, Oct 19, 2017 at 6:22 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> >> > On Thu, Oct 19, 2017 at 06:08:32PM +0300, Amir Goldstein wrote:
> >> >> On Thu, Oct 19, 2017 at 5:58 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> >> >> > On Thu, Oct 19, 2017 at 04:21:46PM +0300, Amir Goldstein wrote:
> >> >> ...
> >> >> >>
> >> >> >> Process 2 will get lower dentry on open for read at 8AM
> >> >> >> Process 1 will copy up file at 9AM (on CPU1)
> >> >> >> Process 2 will open same file for read at 9AM (on CPU2)
> >> >> >> Does it matter if process 2 gets lower or upper dentry? No.
> >> >> >> It only matter that IF process 2 gets an upper dentry, that
> >> >> >> this dentry is consistent, so it only matters that IF __upperdentry
> >> >> >> is visible to CPU2 AND OVL_UPPER_DATA flag is visible to
> >> >> >> CPU2 then dentry and its inode are consistent.
> >> >> >
> >> >> > That's a good point. So if OVL_UPPER_DATA update is not visible on CPU2
> >> >> > yet, then CPU1 will use lower dentry. And this is equivalent to as if file
> >> >> > copy up has not taken place yet.
> >> >> >
> >> >> > And if CPU1 needed to do use upper dentry only, then it will do flags=WRITE
> >> >> > and that will take oi->lock and make sure OVL_UPPER_DATA is set.
> >> >> >
> >> >> > So only *additional* smp_rmb()/smp_wmb() we require for the case when
> >> >> > data is copied up later and we need to make sure OVL_UPPER_DATA is
> >> >> > visible only after the full data copy up is done and stable.
> >> >> >
> >> >> >
> >> >>
> >> >> Right. forgot about that wmb.
> >> >>
> >> >> >>
> >> >> >> So IMO you may only need to add smp_rmb() before
> >> >> >> ovl_test_flag(OVL_UPPER_DATA in ovl_d_real() and the smp_wmb()
> >> >> >> in ovl_inode_update() should be sufficient.
> >> >> >> Change the comment in ovl_inode_update() to mention that wmb also
> >> >> >> matches rmb in ovl_d_real() w.r.t OVL_UPPER_DATA flag.
> >> >> >
> >> >> > Hmm..., I agree that we require smp_rmb() here but it will pair with
> >> >> > smp_wmb() in ovl_copy_meta_data_inode() and not the one in
> >> >> > ovl_inode_update(), right? Something like.
> >> >>
> >> >> Right. my bad.
> >> >>
> >> >> >
> >> >> > ovl_d_real() {
> >> >> >         bool has_upper_data;
> >> >> >
> >> >> >         has_upper_data = ovl_test_flag(OVL_UPPER_DATA, d_inode(dentry));
> >> >> >         /* Pairs with smp_wmb() in ovl_copy_up_meta_inode_data() */
> >> >> >         smp_rmb();
> >> >> >         if (!has_upper_data)
> >> >> >                 goto lower;
> >> >>
> >> >> Just put smp_rmb() here. no need for the bool variable.
> >> >> rmb does matter if you goto lower...
> >> >
> >> > I thought smp_rmb() has to be put *only* after LOAD of oi->flags.
> >> > Something like.
> >> >
> >> > LOAD oi->flags
> >> > smp_rmb()
> >> > Look at results of oi->flags and take action.
> >> >
> >> > So that means I need to store results of oi->flags load in variable
> >> > temporarily so that I can analyze it after smp_rmb(). IOW, I am not
> >> > sure how would I get rid of boolean here. I need some kind of temp
> >> > variable.
> >> >
> >>
> >> One of us is very confused.
> >>
> >> Remember you are not synchronizing the value of OVL_UPPER_DATA between CPUs
> >> You don't care if user gets lower or upper dentry.
> >> You only care about the upper case so you can put smb_rmb() after goto
> >> lower line
> >> which will make sure CPU cannot read inconsistent upper inode state
> >> from before smp_wmb()
> >> in ovl_copy_up_meta_inode_data() after CPU read positive
> >> OVL_UPPER_DATA before smp_rmb().
> >> That's the way I understand it.
> >
> > ok, I think I get it now. You are suggesting following structure.
> >
> >         if (!ovl_test_flag(OVL_UPPER_DATA, d_inode(dentry)))
> >                 goto lower;
> >         smp_rmb();
> >         return real;
> >
> > So if we are returning lower, we don't have to do smp_rmb(). But if we
> > saw OVL_UPPER_DATA, set, then we need to do smp_rmb() to make sure upper
> > is consistent (just in case it was data copied up just now).
> >
> > In fact, I should probably put is outside if condition block. That is.
> 
> If that was true then you should have tested OVL_UPPER_DATA outside
> the if condition. The reason it is not needed if inode is not NULL and equals
> to the upper inode then we have already made it visible to someone, then
> we can make it visible to evevryone.

That's a good point. I think there is more to it. More below.
> 
> This is very not clear from this code, so worth some fat comments.
> Also at top of ovl_d_real(),  D_REAL_UPPER just returns upper dentry
> without testing flag :-/

When you say "flag" you mean testing for OVL_UPPER_DATA?

> and this is not good for may_write_real()
> not sure about update_ovl_inode_times()...

Hmm..., thinking aloud about the whole issue. I think I have not
thought through the issue of d_real() returning a metadata only
dentry and what does that mean for rest of the system.

I think atleast current code is not broken w.r.t d_real(D_REAL_UPPER).
There are two users of D_REAL_UPPER currently. may_write_real() and
update_ovl_inode_times().

If we have a upper dentry which has only metadata, then may_write_real()
should return -EPERM because "file_inode(file) != d_inode(dentry)". IIUC,
these will be equal only if file had been opened with WRITE and in that
case returned dentry will not be metadata only.

And update_ovl_inode_times() just seems to retrieve metadata information
from upper inode and which should be just fine for metacopy inode. 

But D_REAL_UPPER is only one path. There are other ways one can get
pointer to upper metadata only dentry.

d_real(inode=X), can return upper dentry with metacopy only.

d_real(inode=NULL), will *not* return upper dentry with metacopy only. If
upper is metacopy only, it return lower dentry instead. I think this is
primarily the case of open(O_RDONLY).

So in terms of semantics of d_real() I am thinking of this now.

- d_real(inode=NULL, flags=0), will never return METACOPY dentry. Either
  it will copy up lower (if open_flags & WRITE) or will return lower.

If caller forces d_real() to return a specific dentry (either by
specifying D_REAL_UPPER or by specifying an inode, then one can
get back a METACOPY dentry. And one should not do any of data
operations on that dentry/inode.

So d_real(D_REAL_UPPER) and d_real(inode=X) can return METACOPY only
dentry.

Does that sound reasonable, or it is too fragile and broken?

I will try to audit the callers of d_real() now and see if something
else can be broken due to METACOPY only dentry being returned.

> 
> You should run another pass on all ovl_dentry_upper()
> ovl_dentry_real(), ovl_path_real() and ovl_path_upper()
> ovl_inode_upper() ovl_i_dentry_upper()
> I have a feeling there other issues lurking...

Will do.

I am also wondering what happens to various timestamps when data is
copied up later on a metadata only inode. I am guessing that I will
have to atleast copy mtime from lower and apply on upper. 

Vivek

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 11/11] ovl: Put barriers to order oi->__upperdentry and OVL_METACOPY update
  2017-10-19 20:33                     ` Vivek Goyal
@ 2017-10-20  4:09                       ` Amir Goldstein
  2017-10-20 15:41                         ` Vivek Goyal
  0 siblings, 1 reply; 48+ messages in thread
From: Amir Goldstein @ 2017-10-20  4:09 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Oct 19, 2017 at 11:33 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Thu, Oct 19, 2017 at 07:33:37PM +0300, Amir Goldstein wrote:
...
>>
>> If that was true then you should have tested OVL_UPPER_DATA outside
>> the if condition. The reason it is not needed if inode is not NULL and equals
>> to the upper inode then we have already made it visible to someone, then
>> we can make it visible to evevryone.
>
> That's a good point. I think there is more to it. More below.
>>
>> This is very not clear from this code, so worth some fat comments.
>> Also at top of ovl_d_real(),  D_REAL_UPPER just returns upper dentry
>> without testing flag :-/
>
> When you say "flag" you mean testing for OVL_UPPER_DATA?

That's right.

>
>> and this is not good for may_write_real()
>> not sure about update_ovl_inode_times()...
>
> Hmm..., thinking aloud about the whole issue. I think I have not
> thought through the issue of d_real() returning a metadata only
> dentry and what does that mean for rest of the system.

It's not only d_real(), there are more entry points from vfs.
For example, ovl_permission,ovl_xattr_set/get will get to upper inode
from overlay inode (which is good for metadata copy case).

>
> I think atleast current code is not broken w.r.t d_real(D_REAL_UPPER).
> There are two users of D_REAL_UPPER currently. may_write_real() and
> update_ovl_inode_times().
>
> If we have a upper dentry which has only metadata, then may_write_real()
> should return -EPERM because "file_inode(file) != d_inode(dentry)". IIUC,
> these will be equal only if file had been opened with WRITE and in that
> case returned dentry will not be metadata only.

Right.

>
> And update_ovl_inode_times() just seems to retrieve metadata information
> from upper inode and which should be just fine for metacopy inode.

I guess it ok, but if that information leads to atime update and atime update
is done on wrong inode that its not good. didn't check.

>
> But D_REAL_UPPER is only one path. There are other ways one can get
> pointer to upper metadata only dentry.
>
> d_real(inode=X), can return upper dentry with metacopy only.
>
> d_real(inode=NULL), will *not* return upper dentry with metacopy only. If
> upper is metacopy only, it return lower dentry instead. I think this is
> primarily the case of open(O_RDONLY).
>
> So in terms of semantics of d_real() I am thinking of this now.
>
> - d_real(inode=NULL, flags=0), will never return METACOPY dentry. Either
>   it will copy up lower (if open_flags & WRITE) or will return lower.
>
> If caller forces d_real() to return a specific dentry (either by
> specifying D_REAL_UPPER or by specifying an inode, then one can
> get back a METACOPY dentry. And one should not do any of data
> operations on that dentry/inode.
>
> So d_real(D_REAL_UPPER) and d_real(inode=X) can return METACOPY only
> dentry.
>
> Does that sound reasonable, or it is too fragile and broken?

Fragile - yes! broken - probably ;)

>
> I will try to audit the callers of d_real() now and see if something
> else can be broken due to METACOPY only dentry being returned.
>
>>
>> You should run another pass on all ovl_dentry_upper()
>> ovl_dentry_real(), ovl_path_real() and ovl_path_upper()
>> ovl_inode_upper() ovl_i_dentry_upper()
>> I have a feeling there other issues lurking...
>
> Will do.

Hmm maybe we need to take yet a different approach.
Store __metaupperdentry when upper is metacopy
and store __upperdentry only after upper is blessed with data.

So only code that is aware of metacopy like setattr/getattr/permission
will even know there is a metacopy to read/write information from
rest of vfs cannot be exposed.
And it's even simpler than metacopy flag w.r.t memory barriers.

I've actually already implemented this scheme, but since then
abandoned this method. You may be able to use some code from:
https://github.com/amir73il/linux/commits/ovl-rocopyup-v1

Mainly ovl_dentry_ro_upper() and the code changes in util.c
and ovl_entry.h.

>
> I am also wondering what happens to various timestamps when data is
> copied up later on a metadata only inode. I am guessing that I will
> have to atleast copy mtime from lower and apply on upper.
>

Hmm this could be tricky. I think you need to "merge" mtime with ctime/atime
of metacopy inode. There are some relations to maintain among them,
which may not be trivial.
The simplest rule I think is mtime := ctime on data copy up,
ignoring the case of lower being modified after metadata was copied.
In most cases, mtime is about to be updated by write shortly after anyway.

Amir.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 11/11] ovl: Put barriers to order oi->__upperdentry and OVL_METACOPY update
  2017-10-20  4:09                       ` Amir Goldstein
@ 2017-10-20 15:41                         ` Vivek Goyal
  0 siblings, 0 replies; 48+ messages in thread
From: Vivek Goyal @ 2017-10-20 15:41 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Fri, Oct 20, 2017 at 07:09:52AM +0300, Amir Goldstein wrote:

[..]
> Hmm maybe we need to take yet a different approach.
> Store __metaupperdentry when upper is metacopy
> and store __upperdentry only after upper is blessed with data.
> 
> So only code that is aware of metacopy like setattr/getattr/permission
> will even know there is a metacopy to read/write information from
> rest of vfs cannot be exposed.
> And it's even simpler than metacopy flag w.r.t memory barriers.

Hmmm..., so basically metacopy dentry/inode will not be visible outside
overlayfs and only overlayfs knows about it. This defnitely sounds a
safer option.

So d_real(D_REAL_UPPER) will return nil for metacopy only dentry.

What about d_real(inode=X) for metacopy inode? That probably is a bug. If
we have never exposed metacopy only dentry/inode, then we should never be
called to return us dentry belonging to metacopy only inode.

One downside of this approach is that if there is any metadata operation
which happens outside overlayfs, it can't be done. For example, ideally
chattr. 

update_ovl_inode_times() may be a problem too. It will return NULL dentry
for metadata inode and atime *should* be broken. May be I can update ovl
inode ctime when metadata inode is copied up. That should make sure
that for regular files, update_ovl_inode_times() is probably not required.

But in general, not having to expose metacopy inode/dentry to vfs
directly makes me feel little better.

I will give this a try and see how does it look.

> 
> I've actually already implemented this scheme, but since then
> abandoned this method. You may be able to use some code from:
> https://github.com/amir73il/linux/commits/ovl-rocopyup-v1
> 
> Mainly ovl_dentry_ro_upper() and the code changes in util.c
> and ovl_entry.h.

Thanks. I will have a look at it.

Thanks
Vivek

> 
> >
> > I am also wondering what happens to various timestamps when data is
> > copied up later on a metadata only inode. I am guessing that I will
> > have to atleast copy mtime from lower and apply on upper.
> >
> 
> Hmm this could be tricky. I think you need to "merge" mtime with ctime/atime
> of metacopy inode. There are some relations to maintain among them,
> which may not be trivial.
> The simplest rule I think is mtime := ctime on data copy up,
> ignoring the case of lower being modified after metadata was copied.
> In most cases, mtime is about to be updated by write shortly after anyway.
> 
> Amir.

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2017-10-20 15:41 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-17 21:05 [RFC PATCH 00/11][V4] overlayfs: Delayed copy up of data Vivek Goyal
2017-10-17 21:05 ` [PATCH 01/11] ovl: Create origin xattr on copy up for all files Vivek Goyal
2017-10-18  4:09   ` Amir Goldstein
2017-10-18 12:55     ` Vivek Goyal
2017-10-18 13:56       ` Amir Goldstein
2017-10-17 21:05 ` [PATCH 02/11] ovl: ovl_check_setxattr() get rid of redundant -EOPNOTSUPP check Vivek Goyal
2017-10-18  4:11   ` Amir Goldstein
2017-10-17 21:05 ` [PATCH 03/11] ovl: During copy up, first copy up metadata and then data Vivek Goyal
2017-10-18  4:13   ` Amir Goldstein
2017-10-18  4:39     ` Amir Goldstein
2017-10-17 21:05 ` [PATCH 04/11] ovl: Provide a mount option metacopy=on/off for metadata copyup Vivek Goyal
2017-10-18  4:31   ` Amir Goldstein
2017-10-18 13:03     ` Vivek Goyal
2017-10-18 14:09       ` Amir Goldstein
2017-10-18 14:26         ` Vivek Goyal
2017-10-18 14:38           ` Amir Goldstein
2017-10-18 14:10     ` Vivek Goyal
2017-10-18 14:26       ` Amir Goldstein
2017-10-17 21:05 ` [PATCH 05/11] ovl: Copy up only metadata during copy up where it makes sense Vivek Goyal
2017-10-18  4:46   ` Amir Goldstein
2017-10-17 21:05 ` [PATCH 06/11] ovl: Set xattr OVL_XATTR_METACOPY on upper file Vivek Goyal
2017-10-18  4:57   ` Amir Goldstein
2017-10-18 13:30     ` Vivek Goyal
2017-10-17 21:05 ` [PATCH 07/11] ovl: Fix ovl_getattr() to get number of blocks from lower Vivek Goyal
2017-10-18  5:01   ` Amir Goldstein
2017-10-18 13:39     ` Vivek Goyal
2017-10-17 21:05 ` [PATCH 08/11] ovl: Set OVL_METACOPY flag during ovl_lookup() Vivek Goyal
2017-10-18  5:06   ` Amir Goldstein
2017-10-18 13:53     ` Vivek Goyal
2017-10-17 21:05 ` [PATCH 09/11] ovl: Return lower dentry if only metadata copy up took place Vivek Goyal
2017-10-18  5:07   ` Amir Goldstein
2017-10-17 21:05 ` [PATCH 10/11] ovl: Introduce read/write barriers around metacopy flag update Vivek Goyal
2017-10-18  5:19   ` Amir Goldstein
2017-10-18 15:32     ` Vivek Goyal
2017-10-18 16:05       ` Amir Goldstein
2017-10-17 21:05 ` [PATCH 11/11] ovl: Put barriers to order oi->__upperdentry and OVL_METACOPY update Vivek Goyal
2017-10-18  5:40   ` Amir Goldstein
2017-10-19 13:00     ` Vivek Goyal
2017-10-19 13:21       ` Amir Goldstein
2017-10-19 14:58         ` Vivek Goyal
2017-10-19 15:08           ` Amir Goldstein
2017-10-19 15:22             ` Vivek Goyal
2017-10-19 15:39               ` Amir Goldstein
2017-10-19 15:59                 ` Vivek Goyal
2017-10-19 16:33                   ` Amir Goldstein
2017-10-19 20:33                     ` Vivek Goyal
2017-10-20  4:09                       ` Amir Goldstein
2017-10-20 15:41                         ` Vivek Goyal

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.