linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/28] overlayfs: Delayed copy up of data
@ 2018-05-29 14:45 Miklos Szeredi
  2018-05-29 14:45 ` [PATCH 01/28] ovl: Initialize ovl_inode->redirect in ovl_get_inode() Miklos Szeredi
                   ` (27 more replies)
  0 siblings, 28 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:45 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

In one of the recent converstions, people mentioned that chown/chmod
lead to copy up files as well as data. We could optimize it so that
only metadata is copied up during chown/chmod and data is copied up when
file is opened for WRITE.

This optimization potentially could be useful with containers and user
namespaces. In popular scenario, people end up doing chown() on whole
image directory tree based on container mappings. And this chown copies
up everything, breaking sharing of page cache between containers.

With these patches, only metadat is copied up during chown() and if file
is opened for READ, d_real() returns lower dentry/inode. That way,
different containers can still continue to use page cache. That's the
use case I have in mind.

---
Vivek Goyal (28):
  ovl: Initialize ovl_inode->redirect in ovl_get_inode()
  ovl: Move the copy up helpers to copy_up.c
  ovl: Provide a mount option metacopy=on/off for metadata copyup
  ovl: During copy up, first copy up metadata and then data
  ovl: Copy up only metadata during copy up where it makes sense
  ovl: Add helper ovl_already_copied_up()
  ovl: A new xattr OVL_XATTR_METACOPY for file on upper
  ovl: Use out_err instead of out_nomem
  ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  ovl: Copy up meta inode data from lowest data inode
  ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry
  ovl: Fix ovl_getattr() to get number of blocks from lower
  ovl: Store lower data inode in ovl_inode
  ovl: Add helper ovl_inode_realdata()
  ovl: Open file with data except for the case of fsync
  ovl: Do not expose metacopy only dentry from d_real()
  ovl: Move some dir related ovl_lookup_single() code in else block
  ovl: Check redirects for metacopy files
  ovl: Treat metacopy dentries as type OVL_PATH_MERGE
  ovl: Add an inode flag OVL_CONST_INO
  ovl: Do not set dentry type ORIGIN for broken hardlinks
  ovl: Set redirect on metacopy files upon rename
  ovl: Set redirect on upper inode when it is linked
  ovl: Check redirect on index as well
  ovl: Disbale metacopy for MAP_SHARED mmap()
  ovl: Do not do metadata only copy-up for truncate operation
  ovl: Do not do metacopy only for ioctl modifying file attr
  ovl: Enable metadata only feature

 Documentation/filesystems/overlayfs.txt |  30 +++-
 fs/overlayfs/Kconfig                    |  19 +++
 fs/overlayfs/copy_up.c                  | 160 ++++++++++++++++-----
 fs/overlayfs/dir.c                      |  74 +++++++---
 fs/overlayfs/export.c                   |   3 +
 fs/overlayfs/file.c                     |  43 ++++--
 fs/overlayfs/inode.c                    | 112 +++++++++------
 fs/overlayfs/namei.c                    | 195 ++++++++++++++++----------
 fs/overlayfs/overlayfs.h                |  33 ++++-
 fs/overlayfs/ovl_entry.h                |   6 +-
 fs/overlayfs/super.c                    |  62 +++++++-
 fs/overlayfs/util.c                     | 241 +++++++++++++++++++++++++++++++-
 12 files changed, 780 insertions(+), 198 deletions(-)

-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 01/28] ovl: Initialize ovl_inode->redirect in ovl_get_inode()
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
@ 2018-05-29 14:45 ` Miklos Szeredi
  2018-05-29 14:45 ` [PATCH 02/28] ovl: Move the copy up helpers to copy_up.c Miklos Szeredi
                   ` (26 subsequent siblings)
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:45 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

ovl_inode->redirect is an inode property and should be initialized in
ovl_get_inode() only when we are adding a new inode to cache.  If inode is
already in cache, it is already initialized and we should not be touching
ovl_inode->redirect field.

As of now this is not a problem as redirects are used only for directories
which don't share inode.  But soon I want to use redirects for regular
files also and there it can become an issue.

Hence, move ->redirect initialization in ovl_get_inode().

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/inode.c     | 3 +++
 fs/overlayfs/namei.c     | 8 +-------
 fs/overlayfs/overlayfs.h | 1 +
 3 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index cd46dd8e7e54..c365b94fa158 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -844,6 +844,7 @@ struct inode *ovl_get_inode(struct super_block *sb,
 			}
 
 			dput(upperdentry);
+			kfree(oip->redirect);
 			goto out;
 		}
 
@@ -867,6 +868,8 @@ struct inode *ovl_get_inode(struct super_block *sb,
 	if (oip->index)
 		ovl_set_flag(OVL_INDEX, inode);
 
+	OVL_I(inode)->redirect = oip->redirect;
+
 	/* Check for non-merge dir that may have whiteouts */
 	if (is_dir) {
 		if (((upperdentry && lowerdentry) || oip->numlower > 1) ||
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 08801b45df00..54f206485ab8 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -1009,19 +1009,13 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 			.lowerpath = stack,
 			.index = index,
 			.numlower = ctr,
+			.redirect = upperredirect,
 		};
 
 		inode = ovl_get_inode(dentry->d_sb, &oip);
 		err = PTR_ERR(inode);
 		if (IS_ERR(inode))
 			goto out_free_oe;
-
-		/*
-		 * NB: handle redirected hard links when non-dir redirects
-		 * become possible
-		 */
-		WARN_ON(OVL_I(inode)->redirect);
-		OVL_I(inode)->redirect = upperredirect;
 	}
 
 	revert_creds(old_cred);
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index be4f1664f662..1948e38c3c95 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -342,6 +342,7 @@ struct ovl_inode_params {
 	struct ovl_path *lowerpath;
 	struct dentry *index;
 	unsigned int numlower;
+	char *redirect;
 };
 struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev);
 struct inode *ovl_lookup_inode(struct super_block *sb, struct dentry *real,
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 02/28] ovl: Move the copy up helpers to copy_up.c
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
  2018-05-29 14:45 ` [PATCH 01/28] ovl: Initialize ovl_inode->redirect in ovl_get_inode() Miklos Szeredi
@ 2018-05-29 14:45 ` Miklos Szeredi
  2018-05-29 14:45 ` [PATCH 03/28] ovl: Provide a mount option metacopy=on/off for metadata copyup Miklos Szeredi
                   ` (25 subsequent siblings)
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:45 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

Right now two copy up helpers are in inode.c.  Amir suggested it might be
better to move these to copy_up.c.

There will one more related function which will come in later patch.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/copy_up.c   | 32 ++++++++++++++++++++++++++++++++
 fs/overlayfs/inode.c     | 32 --------------------------------
 fs/overlayfs/overlayfs.h |  2 +-
 3 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index e675e8349e71..91b3b668482b 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -779,6 +779,38 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
 	return err;
 }
 
+static bool ovl_open_need_copy_up(struct dentry *dentry, int flags)
+{
+	/* Copy up of disconnected dentry does not set upper alias */
+	if (ovl_dentry_upper(dentry) &&
+	    (ovl_dentry_has_upper_alias(dentry) ||
+	     (dentry->d_flags & DCACHE_DISCONNECTED)))
+		return false;
+
+	if (special_file(d_inode(dentry)->i_mode))
+		return false;
+
+	if (!(OPEN_FMODE(flags) & FMODE_WRITE) && !(flags & O_TRUNC))
+		return false;
+
+	return true;
+}
+
+int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags)
+{
+	int err = 0;
+
+	if (ovl_open_need_copy_up(dentry, file_flags)) {
+		err = ovl_want_write(dentry);
+		if (!err) {
+			err = ovl_copy_up_flags(dentry, file_flags);
+			ovl_drop_write(dentry);
+		}
+	}
+
+	return err;
+}
+
 int ovl_copy_up(struct dentry *dentry)
 {
 	return ovl_copy_up_flags(dentry, 0);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index c365b94fa158..60c88a09a3f3 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -399,38 +399,6 @@ struct posix_acl *ovl_get_acl(struct inode *inode, int type)
 	return acl;
 }
 
-static bool ovl_open_need_copy_up(struct dentry *dentry, int flags)
-{
-	/* Copy up of disconnected dentry does not set upper alias */
-	if (ovl_dentry_upper(dentry) &&
-	    (ovl_dentry_has_upper_alias(dentry) ||
-	     (dentry->d_flags & DCACHE_DISCONNECTED)))
-		return false;
-
-	if (special_file(d_inode(dentry)->i_mode))
-		return false;
-
-	if (!(OPEN_FMODE(flags) & FMODE_WRITE) && !(flags & O_TRUNC))
-		return false;
-
-	return true;
-}
-
-int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags)
-{
-	int err = 0;
-
-	if (ovl_open_need_copy_up(dentry, file_flags)) {
-		err = ovl_want_write(dentry);
-		if (!err) {
-			err = ovl_copy_up_flags(dentry, file_flags);
-			ovl_drop_write(dentry);
-		}
-	}
-
-	return err;
-}
-
 int ovl_update_time(struct inode *inode, struct timespec *ts, int flags)
 {
 	if (flags & S_ATIME) {
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 1948e38c3c95..9172d3d5d870 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -332,7 +332,6 @@ int ovl_xattr_get(struct dentry *dentry, struct inode *inode, const char *name,
 		  void *value, size_t size);
 ssize_t ovl_listxattr(struct dentry *dentry, char *list, size_t size);
 struct posix_acl *ovl_get_acl(struct inode *inode, int type);
-int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags);
 int ovl_update_time(struct inode *inode, struct timespec *ts, int flags);
 bool ovl_is_private_xattr(const char *name);
 
@@ -391,6 +390,7 @@ extern const struct file_operations ovl_file_operations;
 /* copy_up.c */
 int ovl_copy_up(struct dentry *dentry);
 int ovl_copy_up_flags(struct dentry *dentry, int flags);
+int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags);
 int ovl_copy_xattr(struct dentry *old, struct dentry *new);
 int ovl_set_attr(struct dentry *upper, struct kstat *stat);
 struct ovl_fh *ovl_encode_real_fh(struct dentry *real, bool is_upper);
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 03/28] ovl: Provide a mount option metacopy=on/off for metadata copyup
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
  2018-05-29 14:45 ` [PATCH 01/28] ovl: Initialize ovl_inode->redirect in ovl_get_inode() Miklos Szeredi
  2018-05-29 14:45 ` [PATCH 02/28] ovl: Move the copy up helpers to copy_up.c Miklos Szeredi
@ 2018-05-29 14:45 ` Miklos Szeredi
  2018-05-29 20:44   ` Randy Dunlap
  2018-05-29 14:45 ` [PATCH 04/28] ovl: During copy up, first copy up metadata and then data Miklos Szeredi
                   ` (24 subsequent siblings)
  27 siblings, 1 reply; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:45 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

By default metadata only copy up is disabled.  Provide a mount option so
that users can choose one way or other.

Also provide a kernel config and module option to enable/disable metacopy
feature.

metacopy feature requires redirect_dir=on when upper is present.
Otherwise, it requires redirect_dir=follow atleast.

As of now, metacopy does not work with nfs_export=on.  So if both
metacopy=on and nfs_export=on then nfs_export is disabled.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 Documentation/filesystems/overlayfs.txt | 30 ++++++++++++++++++++-
 fs/overlayfs/Kconfig                    | 19 ++++++++++++++
 fs/overlayfs/ovl_entry.h                |  1 +
 fs/overlayfs/super.c                    | 46 ++++++++++++++++++++++++++++++---
 4 files changed, 92 insertions(+), 4 deletions(-)

diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
index f087bc40c6a5..79be4a77ca08 100644
--- a/Documentation/filesystems/overlayfs.txt
+++ b/Documentation/filesystems/overlayfs.txt
@@ -262,6 +262,30 @@ rightmost one and going left.  In the above example lower1 will be the
 top, lower2 the middle and lower3 the bottom layer.
 
 
+Metadata only copy up
+--------------------
+
+When metadata only copy up feature is enabled, overlayfs will only copy
+up metadata (as opposed to whole file), when a metadata specific operation
+like chown/chmod is performed. Full file will be copied up later when
+file is opened for WRITE operation.
+
+In other words, this is delayed data copy up operation and data is copied
+up when there is a need to actually modify data.
+
+There are multiple ways to enable/disable this feature. A config option
+CONFIG_OVERLAY_FS_METACOPY can be set/unset to enable/disable this feature
+by default. Or one can enable/disable it at module load time with module
+parameter metacopy=on/off. Lastly, there is also a per mount option
+metacopy=on/off to enable/disable this feature per mount.
+
+Do not use metacopy=on with untrusted upper/lower directories. Otherwise
+it is possible that an attacker can create an handcrafted file with
+appropriate REDIRECT and METACOPY xattrs, and gain access to file on lower
+pointed by REDIRECT. This should not be possible on local system as setting
+"trusted." xattrs will require CAP_SYS_ADMIN. But it should be possible
+for untrusted layers like from a pen drive.
+
 Sharing and copying layers
 --------------------------
 
@@ -280,7 +304,7 @@ though it will not result in a crash or deadlock.
 Mounting an overlay using an upper layer path, where the upper layer path
 was previously used by another mounted overlay in combination with a
 different lower layer path, is allowed, unless the "inodes index" feature
-is enabled.
+or "metadata only copy up" feature is enabled.
 
 With the "inodes index" feature, on the first time mount, an NFS file
 handle of the lower layer root directory, along with the UUID of the lower
@@ -293,6 +317,10 @@ lower root origin, mount will fail with ESTALE.  An overlayfs mount with
 does not support NFS export, lower filesystem does not have a valid UUID or
 if the upper filesystem does not support extended attributes.
 
+For "metadata only copy up" feature there is no verification mechanism at
+mount time. So if same upper is mouted with different set of lower, mount
+probably will succeed but expect the unexpected later on. So don't do it.
+
 It is quite a common practice to copy overlay layers to a different
 directory tree on the same or different underlying filesystem, and even
 to a different machine.  With the "inodes index" feature, trying to mount
diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
index 5d1d40d745c5..e0a090eca65e 100644
--- a/fs/overlayfs/Kconfig
+++ b/fs/overlayfs/Kconfig
@@ -64,6 +64,7 @@ config OVERLAY_FS_NFS_EXPORT
 	bool "Overlayfs: turn on NFS export feature by default"
 	depends on OVERLAY_FS
 	depends on OVERLAY_FS_INDEX
+	depends on !OVERLAY_FS_METACOPY
 	help
 	  If this config option is enabled then overlay filesystems will use
 	  the inodes index dir to decode overlay NFS file handles by default.
@@ -124,3 +125,21 @@ config OVERLAY_FS_COPY_UP_SHARED
 	  To get a maximally backward compatible kernel, disable this option.
 
 	  If unsure, say N.
+
+config OVERLAY_FS_METACOPY
+	bool "Overlayfs: turn on metadata only copy up feature by default"
+	depends on OVERLAY_FS
+	select OVERLAY_FS_REDIRECT_DIR
+	help
+	  If this config option is enabled then overlay filesystems will
+	  copy up only metadata where appropriate and data copy up will
+	  happen when a file is opended for WRITE operation. It is still
+	  possible to turn off this feature globally with the "metacopy=off"
+	  module option or on a filesystem instance basis with the
+	  "metacopy=off" mount option.
+
+	  Note, that this feature is not backward compatible.  That is,
+	  mounting an overlay which has metacopy only inodes on a kernel
+	  that doesn't support this feature will have unexpected results.
+
+	  If unsure, say N.
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 3bea47c63fd9..da65118a0567 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -20,6 +20,7 @@ struct ovl_config {
 	bool nfs_export;
 	bool copy_up_shared;
 	int xino;
+	bool metacopy;
 };
 
 struct ovl_sb {
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index cd5c82f105d6..17873b804c04 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -70,6 +70,11 @@ static void ovl_entry_stack_free(struct ovl_entry *oe)
 		dput(oe->lowerstack[i].dentry);
 }
 
+static bool ovl_metacopy_def = IS_ENABLED(CONFIG_OVERLAY_FS_METACOPY);
+module_param_named(metacopy, ovl_metacopy_def, bool, 0644);
+MODULE_PARM_DESC(ovl_metacopy_def,
+		 "Default to on or off for the metadata only copy up feature");
+
 static void ovl_dentry_release(struct dentry *dentry)
 {
 	struct ovl_entry *oe = dentry->d_fsdata;
@@ -356,6 +361,9 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
 	if (ofs->config.copy_up_shared != ovl_copy_up_shared_def)
 		seq_printf(m, ",copy_up_shared=%s",
 			   ofs->config.copy_up_shared ? "on" : "off");
+	if (ofs->config.metacopy != ovl_metacopy_def)
+		seq_printf(m, ",metacopy=%s",
+			   ofs->config.metacopy ? "on" : "off");
 	return 0;
 }
 
@@ -395,6 +403,8 @@ enum {
 	OPT_XINO_AUTO,
 	OPT_COPY_UP_SHARED_ON,
 	OPT_COPY_UP_SHARED_OFF,
+	OPT_METACOPY_ON,
+	OPT_METACOPY_OFF,
 	OPT_ERR,
 };
 
@@ -413,6 +423,8 @@ static const match_table_t ovl_tokens = {
 	{OPT_XINO_AUTO,			"xino=auto"},
 	{OPT_COPY_UP_SHARED_ON,		"copy_up_shared=on"},
 	{OPT_COPY_UP_SHARED_OFF,	"copy_up_shared=off"},
+	{OPT_METACOPY_ON,		"metacopy=on"},
+	{OPT_METACOPY_OFF,		"metacopy=off"},
 	{OPT_ERR,			NULL}
 };
 
@@ -465,6 +477,7 @@ static int ovl_parse_redirect_mode(struct ovl_config *config, const char *mode)
 static int ovl_parse_opt(char *opt, struct ovl_config *config)
 {
 	char *p;
+	int err;
 
 	config->redirect_mode = kstrdup(ovl_redirect_mode_def(), GFP_KERNEL);
 	if (!config->redirect_mode)
@@ -547,6 +560,14 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
 			config->copy_up_shared = false;
 			break;
 
+		case OPT_METACOPY_ON:
+			config->metacopy = true;
+			break;
+
+		case OPT_METACOPY_OFF:
+			config->metacopy = false;
+			break;
+
 		default:
 			pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p);
 			return -EINVAL;
@@ -561,7 +582,20 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
 		config->workdir = NULL;
 	}
 
-	return ovl_parse_redirect_mode(config, config->redirect_mode);
+	err = ovl_parse_redirect_mode(config, config->redirect_mode);
+	if (err)
+		return err;
+
+	/* metacopy feature with upper requires redirect_dir=on */
+	if (config->upperdir && config->metacopy && !config->redirect_dir) {
+		pr_warn("overlayfs: metadata only copy up requires \"redirect_dir=on\", falling back to metacopy=off.\n");
+		config->metacopy = false;
+	} else if (config->metacopy && !config->redirect_follow) {
+		pr_warn("overlayfs: metadata only copy up requires \"redirect_dir=follow\" on non-upper mount, falling back to metacopy=off.\n");
+		config->metacopy = false;
+	}
+
+	return 0;
 }
 
 #define OVL_WORKDIR_NAME "work"
@@ -1034,7 +1068,8 @@ static int ovl_make_workdir(struct ovl_fs *ofs, struct path *workpath)
 	if (err) {
 		ofs->noxattr = true;
 		ofs->config.index = false;
-		pr_warn("overlayfs: upper fs does not support xattr, falling back to index=off.\n");
+		ofs->config.metacopy = false;
+		pr_warn("overlayfs: upper fs does not support xattr, falling back to index=off and metacopy=off.\n");
 		err = 0;
 	} else {
 		vfs_removexattr(ofs->workdir, OVL_XATTR_OPAQUE);
@@ -1056,7 +1091,6 @@ static int ovl_make_workdir(struct ovl_fs *ofs, struct path *workpath)
 		pr_warn("overlayfs: NFS export requires \"index=on\", falling back to nfs_export=off.\n");
 		ofs->config.nfs_export = false;
 	}
-
 out:
 	mnt_drop_write(mnt);
 	return err;
@@ -1368,6 +1402,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	ofs->config.nfs_export = ovl_nfs_export_def;
 	ofs->config.xino = ovl_xino_def();
 	ofs->config.copy_up_shared = ovl_copy_up_shared_def;
+	ofs->config.metacopy = ovl_metacopy_def;
 	err = ovl_parse_opt((char *) data, &ofs->config);
 	if (err)
 		goto out_err;
@@ -1438,6 +1473,11 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 		}
 	}
 
+	if (ofs->config.metacopy && ofs->config.nfs_export) {
+		pr_warn("overlayfs: NFS export is not supported with metadata only copy up, falling back to nfs_export=off.\n");
+		ofs->config.nfs_export = false;
+	}
+
 	if (ofs->config.nfs_export)
 		sb->s_export_op = &ovl_export_operations;
 
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 04/28] ovl: During copy up, first copy up metadata and then data
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (2 preceding siblings ...)
  2018-05-29 14:45 ` [PATCH 03/28] ovl: Provide a mount option metacopy=on/off for metadata copyup Miklos Szeredi
@ 2018-05-29 14:45 ` Miklos Szeredi
  2018-05-29 14:45 ` [PATCH 05/28] ovl: Copy up only metadata during copy up where it makes sense Miklos Szeredi
                   ` (23 subsequent siblings)
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:45 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

Just a little re-ordering of code.  This helps with next patch where after
copying up metadata, we skip data copying step, if needed.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/copy_up.c | 36 +++++++++++++++++-------------------
 1 file changed, 17 insertions(+), 19 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 91b3b668482b..c072a4c0cf08 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -490,28 +490,10 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
 {
 	int err;
 
-	if (S_ISREG(c->stat.mode)) {
-		struct path upperpath;
-
-		ovl_path_upper(c->dentry, &upperpath);
-		BUG_ON(upperpath.dentry != NULL);
-		upperpath.dentry = temp;
-
-		err = ovl_copy_up_data(&c->lowerpath, &upperpath, c->stat.size);
-		if (err)
-			return err;
-	}
-
 	err = ovl_copy_xattr(c->lowerpath.dentry, temp);
 	if (err)
 		return err;
 
-	inode_lock(temp->d_inode);
-	err = ovl_set_attr(temp, &c->stat);
-	inode_unlock(temp->d_inode);
-	if (err)
-		return err;
-
 	/*
 	 * Store identifier of lower inode in upper inode xattr to
 	 * allow lookup of the copy up origin inode.
@@ -525,7 +507,23 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
 			return err;
 	}
 
-	return 0;
+	if (S_ISREG(c->stat.mode)) {
+		struct path upperpath;
+
+		ovl_path_upper(c->dentry, &upperpath);
+		BUG_ON(upperpath.dentry != NULL);
+		upperpath.dentry = temp;
+
+		err = ovl_copy_up_data(&c->lowerpath, &upperpath, c->stat.size);
+		if (err)
+			return err;
+	}
+
+	inode_lock(temp->d_inode);
+	err = ovl_set_attr(temp, &c->stat);
+	inode_unlock(temp->d_inode);
+
+	return err;
 }
 
 static int ovl_copy_up_locked(struct ovl_copy_up_ctx *c)
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 05/28] ovl: Copy up only metadata during copy up where it makes sense
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (3 preceding siblings ...)
  2018-05-29 14:45 ` [PATCH 04/28] ovl: During copy up, first copy up metadata and then data Miklos Szeredi
@ 2018-05-29 14:45 ` Miklos Szeredi
  2018-05-29 14:45 ` [PATCH 06/28] ovl: Add helper ovl_already_copied_up() Miklos Szeredi
                   ` (22 subsequent siblings)
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:45 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

If it makes sense to copy up only metadata during copy up, do it.  This is
done for regular files which are not opened for WRITE.

Right now ->metacopy is set to 0 always.  Last patch in the series will
remove the hard coded statement and enable metacopy feature.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/copy_up.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index c072a4c0cf08..38cfd1acc196 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -388,6 +388,7 @@ struct ovl_copy_up_ctx {
 	bool tmpfile;
 	bool origin;
 	bool indexed;
+	bool metacopy;
 };
 
 static int ovl_link_up(struct ovl_copy_up_ctx *c)
@@ -507,7 +508,7 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
 			return err;
 	}
 
-	if (S_ISREG(c->stat.mode)) {
+	if (S_ISREG(c->stat.mode) && !c->metacopy) {
 		struct path upperpath;
 
 		ovl_path_upper(c->dentry, &upperpath);
@@ -660,6 +661,26 @@ static int ovl_do_copy_up(struct ovl_copy_up_ctx *c)
 	return err;
 }
 
+static bool ovl_need_meta_copy_up(struct dentry *dentry, umode_t mode,
+				  int flags)
+{
+	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
+
+	/* TODO: Will enable metacopy in last patch of series */
+	return false;
+
+	if (!ofs->config.metacopy)
+		return false;
+
+	if (!S_ISREG(mode))
+		return false;
+
+	if (flags && ((OPEN_FMODE(flags) & FMODE_WRITE) || (flags & O_TRUNC)))
+		return false;
+
+	return true;
+}
+
 static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 			   int flags)
 {
@@ -681,6 +702,8 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 	if (err)
 		return err;
 
+	ctx.metacopy = ovl_need_meta_copy_up(dentry, ctx.stat.mode, flags);
+
 	if (parent) {
 		ovl_path_upper(parent, &parentpath);
 		ctx.destdir = parentpath.dentry;
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 06/28] ovl: Add helper ovl_already_copied_up()
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (4 preceding siblings ...)
  2018-05-29 14:45 ` [PATCH 05/28] ovl: Copy up only metadata during copy up where it makes sense Miklos Szeredi
@ 2018-05-29 14:45 ` Miklos Szeredi
  2018-05-29 14:45 ` [PATCH 07/28] ovl: A new xattr OVL_XATTR_METACOPY for file on upper Miklos Szeredi
                   ` (21 subsequent siblings)
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:45 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

There are couple of places where we need to know if file is already copied
up (in lockless manner).  Right now its open coded and there are only two
conditions to check.  Soon this patch series will introduce another
condition to check and Amir wants to introduce one more.  So introduce a
helper instead to check this so that code is easier to read.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/copy_up.c   | 20 ++------------------
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/util.c      | 26 +++++++++++++++++++++++++-
 3 files changed, 28 insertions(+), 19 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 38cfd1acc196..6247617fea0b 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -761,21 +761,7 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
 		struct dentry *next;
 		struct dentry *parent = NULL;
 
-		/*
-		 * Check if copy-up has happened as well as for upper alias (in
-		 * case of hard links) is there.
-		 *
-		 * Both checks are lockless:
-		 *  - false negatives: will recheck under oi->lock
-		 *  - false positives:
-		 *    + ovl_dentry_upper() uses memory barriers to ensure the
-		 *      upper dentry is up-to-date
-		 *    + ovl_dentry_has_upper_alias() relies on locking of
-		 *      upper parent i_rwsem to prevent reordering copy-up
-		 *      with rename.
-		 */
-		if (ovl_dentry_upper(dentry) &&
-		    (ovl_dentry_has_upper_alias(dentry) || disconnected))
+		if (ovl_already_copied_up(dentry))
 			break;
 
 		next = dget(dentry);
@@ -803,9 +789,7 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
 static bool ovl_open_need_copy_up(struct dentry *dentry, int flags)
 {
 	/* Copy up of disconnected dentry does not set upper alias */
-	if (ovl_dentry_upper(dentry) &&
-	    (ovl_dentry_has_upper_alias(dentry) ||
-	     (dentry->d_flags & DCACHE_DISCONNECTED)))
+	if (ovl_already_copied_up(dentry))
 		return false;
 
 	if (special_file(d_inode(dentry)->i_mode))
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 9172d3d5d870..2a9c5a80ae48 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -238,6 +238,7 @@ bool ovl_is_whiteout(struct dentry *dentry);
 struct file *ovl_path_open(struct path *path, int flags);
 int ovl_copy_up_start(struct dentry *dentry);
 void ovl_copy_up_end(struct dentry *dentry);
+bool ovl_already_copied_up(struct dentry *dentry);
 bool ovl_check_origin_xattr(struct dentry *dentry);
 bool ovl_check_dir_xattr(struct dentry *dentry, const char *name);
 int ovl_check_setxattr(struct dentry *dentry, struct dentry *upperdentry,
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 25d202b47326..43235294e77b 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -377,13 +377,37 @@ struct file *ovl_path_open(struct path *path, int flags)
 	return dentry_open(path, flags | O_NOATIME, current_cred());
 }
 
+bool ovl_already_copied_up(struct dentry *dentry)
+{
+	bool disconnected = dentry->d_flags & DCACHE_DISCONNECTED;
+
+	/*
+	 * Check if copy-up has happened as well as for upper alias (in
+	 * case of hard links) is there.
+	 *
+	 * Both checks are lockless:
+	 *  - false negatives: will recheck under oi->lock
+	 *  - false positives:
+	 *    + ovl_dentry_upper() uses memory barriers to ensure the
+	 *      upper dentry is up-to-date
+	 *    + ovl_dentry_has_upper_alias() relies on locking of
+	 *      upper parent i_rwsem to prevent reordering copy-up
+	 *      with rename.
+	 */
+	if (ovl_dentry_upper(dentry) &&
+	    (ovl_dentry_has_upper_alias(dentry) || disconnected))
+		return true;
+
+	return false;
+}
+
 int ovl_copy_up_start(struct dentry *dentry)
 {
 	struct ovl_inode *oi = OVL_I(d_inode(dentry));
 	int err;
 
 	err = mutex_lock_interruptible(&oi->lock);
-	if (!err && ovl_dentry_has_upper_alias(dentry)) {
+	if (!err && ovl_already_copied_up(dentry)) {
 		err = 1; /* Already copied up */
 		mutex_unlock(&oi->lock);
 	}
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 07/28] ovl: A new xattr OVL_XATTR_METACOPY for file on upper
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (5 preceding siblings ...)
  2018-05-29 14:45 ` [PATCH 06/28] ovl: Add helper ovl_already_copied_up() Miklos Szeredi
@ 2018-05-29 14:45 ` Miklos Szeredi
  2018-05-29 14:45 ` [PATCH 08/28] ovl: Use out_err instead of out_nomem Miklos Szeredi
                   ` (20 subsequent siblings)
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:45 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

Now we will have the capability to have upper inodes which might be only
metadata copy up and data is still on lower inode.  So add a new xattr
OVL_XATTR_METACOPY to distinguish between two cases.

Presence of OVL_XATTR_METACOPY reflects that file has been copied up
metadata only and and data will be copied up later from lower origin.  So
this xattr is set when a metadata copy takes place and cleared when data
copy takes place.

We also use a bit in ovl_inode->flags to cache OVL_UPPERDATA which reflects
whether ovl inode has data or not (as opposed to metadata only copy up).

If a file is copied up metadata only and later when same file is opened for
WRITE, then data copy up takes place.  We copy up data, remove METACOPY
xattr and then set the UPPERDATA flag in ovl_inode->flags.  While all these
operations happen with oi->lock held, read side of oi->flags can be
lockless.  That is another thread on another cpu can check if UPPERDATA
flag is set or not.

So this gives us an ordering requirement w.r.t UPPERDATA flag.  That is, if
another cpu sees UPPERDATA flag set, then it should be guaranteed that
effects of data copy up and remove xattr operations are also visible.

For example.

	CPU1				CPU2
ovl_open()				acquire(oi->lock)
 ovl_open_maybe_copy_up()                ovl_copy_up_data()
  open_open_need_copy_up()		 vfs_removexattr()
   ovl_already_copied_up()
    ovl_dentry_needs_data_copy_up()	 ovl_set_flag(OVL_UPPERDATA)
     ovl_test_flag(OVL_UPPERDATA)       release(oi->lock)

Say CPU2 is copying up data and in the end sets UPPERDATA flag.  But if
CPU1 perceives the effects of setting UPPERDATA flag but not the effects of
preceding operations (ex. upper that is not fully copied up), it will be a
problem.

Hence this patch introduces smp_wmb() on setting UPPERDATA flag operation
and smp_rmb() on UPPERDATA flag test operation.

May be some other lock or barrier is already covering it. But I am not sure
what that is and is it obvious enough that we will not break it in future.

So hence trying to be safe here and introducing barriers explicitly for
UPPERDATA flag/bit.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/copy_up.c   | 56 ++++++++++++++++++++++++++++++----
 fs/overlayfs/overlayfs.h | 18 +++++++++--
 fs/overlayfs/super.c     |  1 +
 fs/overlayfs/util.c      | 78 +++++++++++++++++++++++++++++++++++++++++++++---
 4 files changed, 142 insertions(+), 11 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 6247617fea0b..7d01afd345e9 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -180,6 +180,16 @@ static int ovl_copy_up_data(struct path *old, struct path *new, loff_t len)
 	return error;
 }
 
+static int ovl_set_size(struct dentry *upperdentry, struct kstat *stat)
+{
+	struct iattr attr = {
+		.ia_valid = ATTR_SIZE,
+		.ia_size = stat->size,
+	};
+
+	return notify_change(upperdentry, &attr, NULL);
+}
+
 static int ovl_set_timestamps(struct dentry *upperdentry, struct kstat *stat)
 {
 	struct iattr attr = {
@@ -520,8 +530,18 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
 			return err;
 	}
 
+	if (c->metacopy) {
+		err = ovl_check_setxattr(c->dentry, temp, OVL_XATTR_METACOPY,
+					 NULL, 0, -EOPNOTSUPP);
+		if (err)
+			return err;
+	}
+
 	inode_lock(temp->d_inode);
-	err = ovl_set_attr(temp, &c->stat);
+	if (c->metacopy)
+		err = ovl_set_size(temp, &c->stat);
+	if (!err)
+		err = ovl_set_attr(temp, &c->stat);
 	inode_unlock(temp->d_inode);
 
 	return err;
@@ -559,6 +579,8 @@ static int ovl_copy_up_locked(struct ovl_copy_up_ctx *c)
 	if (err)
 		goto out;
 
+	if (!c->metacopy)
+		ovl_set_upperdata(d_inode(c->dentry));
 	inode = d_inode(c->dentry);
 	ovl_inode_update(inode, newdentry);
 	if (S_ISDIR(inode->i_mode))
@@ -681,6 +703,28 @@ static bool ovl_need_meta_copy_up(struct dentry *dentry, umode_t mode,
 	return true;
 }
 
+/* Copy up data of an inode which was copied up metadata only in the past. */
+static int ovl_copy_up_meta_inode_data(struct ovl_copy_up_ctx *c)
+{
+	struct path upperpath;
+	int err;
+
+	ovl_path_upper(c->dentry, &upperpath);
+	if (WARN_ON(upperpath.dentry == NULL))
+		return -EIO;
+
+	err = ovl_copy_up_data(&c->lowerpath, &upperpath, c->stat.size);
+	if (err)
+		return err;
+
+	err = vfs_removexattr(upperpath.dentry, OVL_XATTR_METACOPY);
+	if (err)
+		return err;
+
+	ovl_set_upperdata(d_inode(c->dentry));
+	return err;
+}
+
 static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 			   int flags)
 {
@@ -726,7 +770,7 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 			return PTR_ERR(ctx.link);
 	}
 
-	err = ovl_copy_up_start(dentry);
+	err = ovl_copy_up_start(dentry, flags);
 	/* err < 0: interrupted, err > 0: raced with another copy-up */
 	if (unlikely(err)) {
 		if (err > 0)
@@ -736,6 +780,8 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 			err = ovl_do_copy_up(&ctx);
 		if (!err && parent && !ovl_dentry_has_upper_alias(dentry))
 			err = ovl_link_up(&ctx);
+		if (!err && ovl_dentry_needs_data_copy_up_locked(dentry, flags))
+			err = ovl_copy_up_meta_inode_data(&ctx);
 		ovl_copy_up_end(dentry);
 	}
 	do_delayed_call(&done);
@@ -761,7 +807,7 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
 		struct dentry *next;
 		struct dentry *parent = NULL;
 
-		if (ovl_already_copied_up(dentry))
+		if (ovl_already_copied_up(dentry, flags))
 			break;
 
 		next = dget(dentry);
@@ -789,13 +835,13 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
 static bool ovl_open_need_copy_up(struct dentry *dentry, int flags)
 {
 	/* Copy up of disconnected dentry does not set upper alias */
-	if (ovl_already_copied_up(dentry))
+	if (ovl_already_copied_up(dentry, flags))
 		return false;
 
 	if (special_file(d_inode(dentry)->i_mode))
 		return false;
 
-	if (!(OPEN_FMODE(flags) & FMODE_WRITE) && !(flags & O_TRUNC))
+	if (!ovl_open_flags_need_copy_up(flags))
 		return false;
 
 	return true;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 2a9c5a80ae48..45e4a581a9fe 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -29,6 +29,7 @@ enum ovl_path_type {
 #define OVL_XATTR_IMPURE OVL_XATTR_PREFIX "impure"
 #define OVL_XATTR_NLINK OVL_XATTR_PREFIX "nlink"
 #define OVL_XATTR_UPPER OVL_XATTR_PREFIX "upper"
+#define OVL_XATTR_METACOPY OVL_XATTR_PREFIX "metacopy"
 
 enum ovl_inode_flag {
 	/* Pure upper dir that may contain non pure upper entries */
@@ -36,6 +37,7 @@ enum ovl_inode_flag {
 	/* Non-merge dir that may contain whiteout entries */
 	OVL_WHITEOUTS,
 	OVL_INDEX,
+	OVL_UPPERDATA,
 };
 
 enum ovl_entry_flag {
@@ -191,6 +193,14 @@ static inline struct dentry *ovl_do_tmpfile(struct dentry *dentry, umode_t mode)
 	return ret;
 }
 
+static inline bool ovl_open_flags_need_copy_up(int flags)
+{
+	if (!flags)
+		return false;
+
+	return ((OPEN_FMODE(flags) & FMODE_WRITE) || (flags & O_TRUNC));
+}
+
 /* util.c */
 int ovl_want_write(struct dentry *dentry);
 void ovl_drop_write(struct dentry *dentry);
@@ -226,6 +236,10 @@ bool ovl_dentry_is_whiteout(struct dentry *dentry);
 void ovl_dentry_set_opaque(struct dentry *dentry);
 bool ovl_dentry_has_upper_alias(struct dentry *dentry);
 void ovl_dentry_set_upper_alias(struct dentry *dentry);
+bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags);
+bool ovl_dentry_needs_data_copy_up_locked(struct dentry *dentry, int flags);
+bool ovl_has_upperdata(struct inode *inode);
+void ovl_set_upperdata(struct inode *inode);
 bool ovl_redirect_dir(struct super_block *sb);
 const char *ovl_dentry_get_redirect(struct dentry *dentry);
 void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
@@ -236,9 +250,9 @@ void ovl_dir_modified(struct dentry *dentry, bool impurity);
 u64 ovl_dentry_version_get(struct dentry *dentry);
 bool ovl_is_whiteout(struct dentry *dentry);
 struct file *ovl_path_open(struct path *path, int flags);
-int ovl_copy_up_start(struct dentry *dentry);
+int ovl_copy_up_start(struct dentry *dentry, int flags);
 void ovl_copy_up_end(struct dentry *dentry);
-bool ovl_already_copied_up(struct dentry *dentry);
+bool ovl_already_copied_up(struct dentry *dentry, int flags);
 bool ovl_check_origin_xattr(struct dentry *dentry);
 bool ovl_check_dir_xattr(struct dentry *dentry, const char *name);
 int ovl_check_setxattr(struct dentry *dentry, struct dentry *upperdentry,
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 17873b804c04..d96530717702 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1507,6 +1507,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	/* Root is always merge -> can have whiteouts */
 	ovl_set_flag(OVL_WHITEOUTS, d_inode(root_dentry));
 	ovl_dentry_set_flag(OVL_E_CONNECTED, root_dentry);
+	ovl_set_upperdata(d_inode(root_dentry));
 	ovl_inode_init(d_inode(root_dentry), upperpath.dentry,
 		       ovl_dentry_lower(root_dentry));
 
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 43235294e77b..f8e3c95711b8 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -279,6 +279,62 @@ void ovl_dentry_set_upper_alias(struct dentry *dentry)
 	ovl_dentry_set_flag(OVL_E_UPPER_ALIAS, dentry);
 }
 
+static bool ovl_should_check_upperdata(struct inode *inode)
+{
+	if (!S_ISREG(inode->i_mode))
+		return false;
+
+	if (!ovl_inode_lower(inode))
+		return false;
+
+	return true;
+}
+
+bool ovl_has_upperdata(struct inode *inode)
+{
+	if (!ovl_should_check_upperdata(inode))
+		return true;
+
+	if (!ovl_test_flag(OVL_UPPERDATA, inode))
+		return false;
+	/*
+	 * Pairs with smp_wmb() in ovl_set_upperdata(). Main user of
+	 * ovl_has_upperdata() is ovl_copy_up_meta_inode_data(). Make sure
+	 * if setting of OVL_UPPERDATA is visible, then effects of writes
+	 * before that are visible too.
+	 */
+	smp_rmb();
+	return true;
+}
+
+void ovl_set_upperdata(struct inode *inode)
+{
+	/*
+	 * Pairs with smp_rmb() in ovl_has_upperdata(). Make sure
+	 * if OVL_UPPERDATA flag is visible, then effects of write operations
+	 * before it are visible as well.
+	 */
+	smp_wmb();
+	ovl_set_flag(OVL_UPPERDATA, inode);
+}
+
+/* Caller should hold ovl_inode->lock */
+bool ovl_dentry_needs_data_copy_up_locked(struct dentry *dentry, int flags)
+{
+	if (!ovl_open_flags_need_copy_up(flags))
+		return false;
+
+	return !ovl_test_flag(OVL_UPPERDATA, d_inode(dentry));
+}
+
+bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags)
+{
+	if (!ovl_open_flags_need_copy_up(flags))
+		return false;
+
+	return !ovl_has_upperdata(d_inode(dentry));
+}
+
 bool ovl_redirect_dir(struct super_block *sb)
 {
 	struct ovl_fs *ofs = sb->s_fs_info;
@@ -377,7 +433,20 @@ struct file *ovl_path_open(struct path *path, int flags)
 	return dentry_open(path, flags | O_NOATIME, current_cred());
 }
 
-bool ovl_already_copied_up(struct dentry *dentry)
+/* Caller should hold ovl_inode->lock */
+static bool ovl_already_copied_up_locked(struct dentry *dentry, int flags)
+{
+	bool disconnected = dentry->d_flags & DCACHE_DISCONNECTED;
+
+	if (ovl_dentry_upper(dentry) &&
+	    (ovl_dentry_has_upper_alias(dentry) || disconnected) &&
+	    !ovl_dentry_needs_data_copy_up_locked(dentry, flags))
+		return true;
+
+	return false;
+}
+
+bool ovl_already_copied_up(struct dentry *dentry, int flags)
 {
 	bool disconnected = dentry->d_flags & DCACHE_DISCONNECTED;
 
@@ -395,19 +464,20 @@ bool ovl_already_copied_up(struct dentry *dentry)
 	 *      with rename.
 	 */
 	if (ovl_dentry_upper(dentry) &&
-	    (ovl_dentry_has_upper_alias(dentry) || disconnected))
+	    (ovl_dentry_has_upper_alias(dentry) || disconnected) &&
+	    !ovl_dentry_needs_data_copy_up(dentry, flags))
 		return true;
 
 	return false;
 }
 
-int ovl_copy_up_start(struct dentry *dentry)
+int ovl_copy_up_start(struct dentry *dentry, int flags)
 {
 	struct ovl_inode *oi = OVL_I(d_inode(dentry));
 	int err;
 
 	err = mutex_lock_interruptible(&oi->lock);
-	if (!err && ovl_already_copied_up(dentry)) {
+	if (!err && ovl_already_copied_up_locked(dentry, flags)) {
 		err = 1; /* Already copied up */
 		mutex_unlock(&oi->lock);
 	}
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 08/28] ovl: Use out_err instead of out_nomem
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (6 preceding siblings ...)
  2018-05-29 14:45 ` [PATCH 07/28] ovl: A new xattr OVL_XATTR_METACOPY for file on upper Miklos Szeredi
@ 2018-05-29 14:45 ` Miklos Szeredi
  2018-05-29 14:45 ` [PATCH 09/28] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry Miklos Szeredi
                   ` (19 subsequent siblings)
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:45 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

Right now we use goto out_nomem which assumes error code is -ENOMEM.  But
there are other errors returned like -ESTALE as well.  So instead of
out_nomem, use out_err which will do ERR_PTR(err).  That way one can put
error code in err and jump to out_err.

This just code reorganization and no change of functionality.

I am about to add more code and this organization helps laying more code
and error paths on top of it.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/inode.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 60c88a09a3f3..0d5615b80422 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -782,6 +782,7 @@ struct inode *ovl_get_inode(struct super_block *sb,
 	int fsid = bylower ? oip->lowerpath->layer->fsid : 0;
 	bool is_dir;
 	unsigned long ino = 0;
+	int err = -ENOMEM;
 
 	if (!realinode)
 		realinode = d_inode(lowerdentry);
@@ -798,7 +799,7 @@ struct inode *ovl_get_inode(struct super_block *sb,
 
 		inode = ovl_iget5(sb, oip->newinode, key);
 		if (!inode)
-			goto out_nomem;
+			goto out_err;
 		if (!(inode->i_state & I_NEW)) {
 			/*
 			 * Verify that the underlying files stored in the inode
@@ -807,8 +808,8 @@ struct inode *ovl_get_inode(struct super_block *sb,
 			if (!ovl_verify_inode(inode, lowerdentry, upperdentry,
 					      true)) {
 				iput(inode);
-				inode = ERR_PTR(-ESTALE);
-				goto out;
+				err = -ESTALE;
+				goto out_err;
 			}
 
 			dput(upperdentry);
@@ -824,8 +825,10 @@ struct inode *ovl_get_inode(struct super_block *sb,
 	} else {
 		/* Lower hardlink that will be broken on copy up */
 		inode = new_inode(sb);
-		if (!inode)
-			goto out_nomem;
+		if (!inode) {
+			err = -ENOMEM;
+			goto out_err;
+		}
 	}
 	ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev, ino, fsid);
 	ovl_inode_init(inode, upperdentry, lowerdentry);
@@ -851,7 +854,7 @@ struct inode *ovl_get_inode(struct super_block *sb,
 out:
 	return inode;
 
-out_nomem:
-	inode = ERR_PTR(-ENOMEM);
+out_err:
+	inode = ERR_PTR(err);
 	goto out;
 }
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 09/28] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (7 preceding siblings ...)
  2018-05-29 14:45 ` [PATCH 08/28] ovl: Use out_err instead of out_nomem Miklos Szeredi
@ 2018-05-29 14:45 ` Miklos Szeredi
  2018-05-29 14:45 ` [PATCH 10/28] ovl: Copy up meta inode data from lowest data inode Miklos Szeredi
                   ` (18 subsequent siblings)
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:45 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

This patch modifies ovl_lookup() and friends to lookup metacopy dentries.
It also allows for presence of metacopy dentries in lower layer.

During lookup, check for presence of OVL_XATTR_METACOPY and if not present,
set OVL_UPPERDATA bit in flags.

We don't support metacopy feature with nfs_export.  So in nfs_export code,
we set OVL_UPPERDATA flag set unconditionally if upper inode exists.

Do not follow metacopy origin if we find a metacopy only inode and metacopy
feature is not enabled for that mount.  Like redirect, this can have
security implications where an attacker could hand craft upper and try to
gain access to file on lower which it should not have to begin with.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/export.c    |   3 ++
 fs/overlayfs/inode.c     |  11 ++++-
 fs/overlayfs/namei.c     | 112 ++++++++++++++++++++++++++++++++++++++++-------
 fs/overlayfs/overlayfs.h |   1 +
 fs/overlayfs/util.c      |  22 ++++++++++
 5 files changed, 131 insertions(+), 18 deletions(-)

diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
index 9941ece61a14..8fa37cd7818a 100644
--- a/fs/overlayfs/export.c
+++ b/fs/overlayfs/export.c
@@ -317,6 +317,9 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
 		return ERR_CAST(inode);
 	}
 
+	if (upper)
+		ovl_set_flag(OVL_UPPERDATA, inode);
+
 	dentry = d_find_any_alias(inode);
 	if (!dentry) {
 		dentry = d_alloc_anon(inode->i_sb);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 0d5615b80422..350e38c5ce7d 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -780,7 +780,7 @@ struct inode *ovl_get_inode(struct super_block *sb,
 	bool bylower = ovl_hash_bylower(sb, upperdentry, lowerdentry,
 					oip->index);
 	int fsid = bylower ? oip->lowerpath->layer->fsid : 0;
-	bool is_dir;
+	bool is_dir, metacopy = false;
 	unsigned long ino = 0;
 	int err = -ENOMEM;
 
@@ -839,6 +839,15 @@ struct inode *ovl_get_inode(struct super_block *sb,
 	if (oip->index)
 		ovl_set_flag(OVL_INDEX, inode);
 
+	if (upperdentry) {
+		err = ovl_check_metacopy_xattr(upperdentry);
+		if (err < 0)
+			goto out_err;
+		metacopy = err;
+		if (!metacopy)
+			ovl_set_flag(OVL_UPPERDATA, inode);
+	}
+
 	OVL_I(inode)->redirect = oip->redirect;
 
 	/* Check for non-merge dir that may have whiteouts */
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 54f206485ab8..fcf0eb40d94b 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -24,6 +24,7 @@ struct ovl_lookup_data {
 	bool stop;
 	bool last;
 	char *redirect;
+	bool metacopy;
 };
 
 static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
@@ -252,16 +253,25 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
 		d->stop = d->opaque = true;
 		goto put_and_out;
 	}
-	if (!d_can_lookup(this)) {
+	/*
+	 * This dentry should be a regular file if previous layer lookup
+	 * found a metacopy dentry.
+	 */
+	if (last_element && d->metacopy && !d_is_reg(this)) {
 		d->stop = true;
-		if (d->is_dir)
+		goto put_and_out;
+	}
+	if (!d_can_lookup(this)) {
+		if (d->is_dir || !last_element) {
+			d->stop = true;
 			goto put_and_out;
+		}
+		err = ovl_check_metacopy_xattr(this);
+		if (err < 0)
+			goto out_err;
 
-		/*
-		 * NB: handle failure to lookup non-last element when non-dir
-		 * redirects become possible
-		 */
-		WARN_ON(!last_element);
+		d->metacopy = err;
+		d->stop = !d->metacopy;
 		goto out;
 	}
 	if (last_element)
@@ -823,7 +833,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
 	struct ovl_entry *poe = dentry->d_parent->d_fsdata;
 	struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata;
-	struct ovl_path *stack = NULL;
+	struct ovl_path *stack = NULL, *origin_path = NULL;
 	struct dentry *upperdir, *upperdentry = NULL;
 	struct dentry *origin = NULL;
 	struct dentry *index = NULL;
@@ -834,6 +844,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	struct dentry *this;
 	unsigned int i;
 	int err;
+	bool metacopy = false;
 	struct ovl_lookup_data d = {
 		.name = dentry->d_name,
 		.is_dir = false,
@@ -841,6 +852,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		.stop = false,
 		.last = ofs->config.redirect_follow ? false : !poe->numlower,
 		.redirect = NULL,
+		.metacopy = false,
 	};
 
 	if (dentry->d_name.len > ofs->namelen)
@@ -859,7 +871,9 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 			goto out;
 		}
 		if (upperdentry && !d.is_dir) {
-			BUG_ON(!d.stop || d.redirect);
+			unsigned int origin_ctr = 0;
+
+			BUG_ON(d.redirect);
 			/*
 			 * Lookup copy up origin by decoding origin file handle.
 			 * We may get a disconnected dentry, which is fine,
@@ -870,9 +884,13 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 			 * number - it's the same as if we held a reference
 			 * to a dentry in lower layer that was moved under us.
 			 */
-			err = ovl_check_origin(ofs, upperdentry, &stack, &ctr);
+			err = ovl_check_origin(ofs, upperdentry, &origin_path,
+					       &origin_ctr);
 			if (err)
 				goto out_put_upper;
+
+			if (d.metacopy)
+				metacopy = true;
 		}
 
 		if (d.redirect) {
@@ -913,7 +931,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		 * If no origin fh is stored in upper of a merge dir, store fh
 		 * of lower dir and set upper parent "impure".
 		 */
-		if (upperdentry && !ctr && !ofs->noxattr) {
+		if (upperdentry && !ctr && !ofs->noxattr && d.is_dir) {
 			err = ovl_fix_origin(dentry, this, upperdentry);
 			if (err) {
 				dput(this);
@@ -925,18 +943,35 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		 * When "verify_lower" feature is enabled, do not merge with a
 		 * lower dir that does not match a stored origin xattr. In any
 		 * case, only verified origin is used for index lookup.
+		 *
+		 * For non-dir dentry, if index=on, then ensure origin
+		 * matches the dentry found using path based lookup,
+		 * otherwise error out.
 		 */
-		if (upperdentry && !ctr && ovl_verify_lower(dentry->d_sb)) {
+		if (upperdentry && !ctr &&
+		    ((d.is_dir && ovl_verify_lower(dentry->d_sb)) ||
+		     (!d.is_dir && ofs->config.index && origin_path))) {
 			err = ovl_verify_origin(upperdentry, this, false);
 			if (err) {
 				dput(this);
-				break;
+				if (d.is_dir)
+					break;
+				goto out_put;
 			}
-
-			/* Bless lower dir as verified origin */
 			origin = this;
 		}
 
+		if (d.metacopy)
+			metacopy = true;
+		/*
+		 * Do not store intermediate metacopy dentries in chain,
+		 * except top most lower metacopy dentry
+		 */
+		if (d.metacopy && ctr) {
+			dput(this);
+			continue;
+		}
+
 		stack[ctr].dentry = this;
 		stack[ctr].layer = lower.layer;
 		ctr++;
@@ -968,13 +1003,48 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		}
 	}
 
+	if (metacopy) {
+		/*
+		 * Found a metacopy dentry but did not find corresponding
+		 * data dentry
+		 */
+		if (d.metacopy) {
+			err = -EIO;
+			goto out_put;
+		}
+
+		err = -EPERM;
+		if (!ofs->config.metacopy) {
+			pr_warn_ratelimited("overlay: refusing to follow metacopy origin for (%pd2)\n",
+					    dentry);
+			goto out_put;
+		}
+	} else if (!d.is_dir && upperdentry && !ctr && origin_path) {
+		if (WARN_ON(stack != NULL)) {
+			err = -EIO;
+			goto out_put;
+		}
+		stack = origin_path;
+		ctr = 1;
+		origin_path = NULL;
+	}
+
 	/*
 	 * Lookup index by lower inode and verify it matches upper inode.
 	 * We only trust dir index if we verified that lower dir matches
 	 * origin, otherwise dir index entries may be inconsistent and we
-	 * ignore them. Always lookup index of non-dir and non-upper.
+	 * ignore them.
+	 *
+	 * For non-dir upper metacopy dentry, we already set "origin" if we
+	 * verified that lower matched upper origin. If upper origin was
+	 * not present (because lower layer did not support fh encode/decode),
+	 * or indexing is not enabled, do not set "origin" and skip looking up
+	 * index. This case should be handled in same way as a non-dir upper
+	 * without ORIGIN is handled.
+	 *
+	 * Always lookup index of non-dir non-metacopy and non-upper.
 	 */
-	if (ctr && (!upperdentry || !d.is_dir))
+	if (ctr && (!upperdentry || (!d.is_dir && !metacopy)))
 		origin = stack[0].dentry;
 
 	if (origin && ovl_indexdir(dentry->d_sb) &&
@@ -1019,6 +1089,10 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	}
 
 	revert_creds(old_cred);
+	if (origin_path) {
+		dput(origin_path->dentry);
+		kfree(origin_path);
+	}
 	dput(index);
 	kfree(stack);
 	kfree(d.redirect);
@@ -1033,6 +1107,10 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		dput(stack[i].dentry);
 	kfree(stack);
 out_put_upper:
+	if (origin_path) {
+		dput(origin_path->dentry);
+		kfree(origin_path);
+	}
 	dput(upperdentry);
 	kfree(upperredirect);
 out:
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 45e4a581a9fe..b1f6591b9af4 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -268,6 +268,7 @@ bool ovl_need_index(struct dentry *dentry);
 int ovl_nlink_start(struct dentry *dentry, bool *locked);
 void ovl_nlink_end(struct dentry *dentry, bool locked);
 int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir);
+int ovl_check_metacopy_xattr(struct dentry *dentry);
 
 static inline bool ovl_is_impuredir(struct dentry *dentry)
 {
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index f8e3c95711b8..ab9a8fae0f99 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -778,3 +778,25 @@ int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir)
 	pr_err("overlayfs: failed to lock workdir+upperdir\n");
 	return -EIO;
 }
+
+/* err < 0, 0 if no metacopy xattr, 1 if metacopy xattr found */
+int ovl_check_metacopy_xattr(struct dentry *dentry)
+{
+	int res;
+
+	/* Only regular files can have metacopy xattr */
+	if (!S_ISREG(d_inode(dentry)->i_mode))
+		return 0;
+
+	res = vfs_getxattr(dentry, OVL_XATTR_METACOPY, NULL, 0);
+	if (res < 0) {
+		if (res == -ENODATA || res == -EOPNOTSUPP)
+			return 0;
+		goto out;
+	}
+
+	return 1;
+out:
+	pr_warn_ratelimited("overlayfs: failed to get metacopy (%i)\n", res);
+	return res;
+}
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 10/28] ovl: Copy up meta inode data from lowest data inode
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (8 preceding siblings ...)
  2018-05-29 14:45 ` [PATCH 09/28] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry Miklos Szeredi
@ 2018-05-29 14:45 ` Miklos Szeredi
  2018-05-29 14:45 ` [PATCH 11/28] ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry Miklos Szeredi
                   ` (17 subsequent siblings)
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:45 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

So far lower could not be a meta inode.  So whenever it was time to copy up
data of a meta inode, we could copy it up from top most lower dentry.

But now lower itself can be a metacopy inode.  That means data copy up
needs to take place from a data inode in metacopy inode chain.  Find lower
data inode in the chain and use that for data copy up.

Introduced a helper called ovl_path_lowerdata() to find the lower data
inode chain.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/copy_up.c   | 13 +++++++++----
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/util.c      | 12 ++++++++++++
 3 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 7d01afd345e9..8c05e9ad9782 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -519,13 +519,14 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
 	}
 
 	if (S_ISREG(c->stat.mode) && !c->metacopy) {
-		struct path upperpath;
+		struct path upperpath, datapath;
 
 		ovl_path_upper(c->dentry, &upperpath);
 		BUG_ON(upperpath.dentry != NULL);
 		upperpath.dentry = temp;
 
-		err = ovl_copy_up_data(&c->lowerpath, &upperpath, c->stat.size);
+		ovl_path_lowerdata(c->dentry, &datapath);
+		err = ovl_copy_up_data(&datapath, &upperpath, c->stat.size);
 		if (err)
 			return err;
 	}
@@ -706,14 +707,18 @@ static bool ovl_need_meta_copy_up(struct dentry *dentry, umode_t mode,
 /* Copy up data of an inode which was copied up metadata only in the past. */
 static int ovl_copy_up_meta_inode_data(struct ovl_copy_up_ctx *c)
 {
-	struct path upperpath;
+	struct path upperpath, datapath;
 	int err;
 
 	ovl_path_upper(c->dentry, &upperpath);
 	if (WARN_ON(upperpath.dentry == NULL))
 		return -EIO;
 
-	err = ovl_copy_up_data(&c->lowerpath, &upperpath, c->stat.size);
+	ovl_path_lowerdata(c->dentry, &datapath);
+	if (WARN_ON(datapath.dentry == NULL))
+		return -EIO;
+
+	err = ovl_copy_up_data(&datapath, &upperpath, c->stat.size);
 	if (err)
 		return err;
 
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index b1f6591b9af4..31795c041d61 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -217,6 +217,7 @@ bool ovl_dentry_weird(struct dentry *dentry);
 enum ovl_path_type ovl_path_type(struct dentry *dentry);
 void ovl_path_upper(struct dentry *dentry, struct path *path);
 void ovl_path_lower(struct dentry *dentry, struct path *path);
+void ovl_path_lowerdata(struct dentry *dentry, struct path *path);
 enum ovl_path_type ovl_path_real(struct dentry *dentry, struct path *path);
 struct dentry *ovl_dentry_upper(struct dentry *dentry);
 struct dentry *ovl_dentry_lower(struct dentry *dentry);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index ab9a8fae0f99..32ff67fa0bfb 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -164,6 +164,18 @@ void ovl_path_lower(struct dentry *dentry, struct path *path)
 	}
 }
 
+void ovl_path_lowerdata(struct dentry *dentry, struct path *path)
+{
+	struct ovl_entry *oe = dentry->d_fsdata;
+
+	if (oe->numlower) {
+		path->mnt = oe->lowerstack[oe->numlower - 1].layer->mnt;
+		path->dentry = oe->lowerstack[oe->numlower - 1].dentry;
+	} else {
+		*path = (struct path) { };
+	}
+}
+
 enum ovl_path_type ovl_path_real(struct dentry *dentry, struct path *path)
 {
 	enum ovl_path_type type = ovl_path_type(dentry);
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 11/28] ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (9 preceding siblings ...)
  2018-05-29 14:45 ` [PATCH 10/28] ovl: Copy up meta inode data from lowest data inode Miklos Szeredi
@ 2018-05-29 14:45 ` Miklos Szeredi
  2018-05-29 14:45 ` [PATCH 12/28] ovl: Fix ovl_getattr() to get number of blocks from lower Miklos Szeredi
                   ` (16 subsequent siblings)
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:45 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

Now we have the notion of data dentry and metacopy dentry.
ovl_dentry_lower() will return uppermost lower dentry, but it could be
either data or metacopy dentry.  Now we support metacopy dentries in lower
layers so it is possible that lowerstack[0] is metacopy dentry while
lowerstack[1] is actual data dentry.

So add an helper which returns lowest most dentry which is supposed to be
data dentry.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/util.c      | 13 +++++++++++++
 2 files changed, 14 insertions(+)

diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 31795c041d61..13b2151cf4df 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -221,6 +221,7 @@ void ovl_path_lowerdata(struct dentry *dentry, struct path *path);
 enum ovl_path_type ovl_path_real(struct dentry *dentry, struct path *path);
 struct dentry *ovl_dentry_upper(struct dentry *dentry);
 struct dentry *ovl_dentry_lower(struct dentry *dentry);
+struct dentry *ovl_dentry_lowerdata(struct dentry *dentry);
 struct ovl_layer *ovl_layer_lower(struct dentry *dentry);
 struct dentry *ovl_dentry_real(struct dentry *dentry);
 struct dentry *ovl_i_dentry_upper(struct inode *inode);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 32ff67fa0bfb..7c7b95d5da1f 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -207,6 +207,19 @@ struct ovl_layer *ovl_layer_lower(struct dentry *dentry)
 	return oe->numlower ? oe->lowerstack[0].layer : NULL;
 }
 
+/*
+ * ovl_dentry_lower() could return either a data dentry or metacopy dentry
+ * dependig on what is stored in lowerstack[0]. At times we need to find
+ * lower dentry which has data (and not metacopy dentry). This helper
+ * returns the lower data dentry.
+ */
+struct dentry *ovl_dentry_lowerdata(struct dentry *dentry)
+{
+	struct ovl_entry *oe = dentry->d_fsdata;
+
+	return oe->numlower ? oe->lowerstack[oe->numlower - 1].dentry : NULL;
+}
+
 struct dentry *ovl_dentry_real(struct dentry *dentry)
 {
 	return ovl_dentry_upper(dentry) ?: ovl_dentry_lower(dentry);
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 12/28] ovl: Fix ovl_getattr() to get number of blocks from lower
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (10 preceding siblings ...)
  2018-05-29 14:45 ` [PATCH 11/28] ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry Miklos Szeredi
@ 2018-05-29 14:45 ` Miklos Szeredi
  2018-05-29 14:45 ` [PATCH 13/28] ovl: Store lower data inode in ovl_inode Miklos Szeredi
                   ` (15 subsequent siblings)
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:45 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

If an inode has been copied up metadata only, then we need to query the
number of blocks from lower and fill up the stat->st_blocks.

We need to be careful about races where we are doing stat on one cpu and
data copy up is taking place on other cpu.  We want to return
stat->st_blocks either from lower or stable upper and not something in
between.  Hence, ovl_has_upperdata() is called first to figure out whether
block reporting will take place from lower or upper.

We now support metacopy dentries in middle layer.  That means number of
blocks reporting needs to come from lowest data dentry and this could be
different from lower dentry.  Hence we end up making a separate
vfs_getxattr() call for metacopy dentries to get number of blocks.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/inode.c     | 35 ++++++++++++++++++++++++++++++++++-
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/util.c      | 16 ++++++++++++++++
 3 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 350e38c5ce7d..6a6693150c37 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -145,6 +145,9 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 	bool samefs = ovl_same_sb(dentry->d_sb);
 	struct ovl_layer *lower_layer = NULL;
 	int err;
+	bool metacopy_blocks = false;
+
+	metacopy_blocks = ovl_is_metacopy_dentry(dentry);
 
 	type = ovl_path_real(dentry, &realpath);
 	old_cred = ovl_override_creds(dentry->d_sb);
@@ -166,7 +169,8 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 			lower_layer = ovl_layer_lower(dentry);
 		} else if (OVL_TYPE_ORIGIN(type)) {
 			struct kstat lowerstat;
-			u32 lowermask = STATX_INO | (!is_dir ? STATX_NLINK : 0);
+			u32 lowermask = STATX_INO | STATX_BLOCKS |
+					(!is_dir ? STATX_NLINK : 0);
 
 			ovl_path_lower(dentry, &realpath);
 			err = vfs_getattr(&realpath, &lowerstat,
@@ -195,6 +199,35 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 				stat->ino = lowerstat.ino;
 				lower_layer = ovl_layer_lower(dentry);
 			}
+
+			/*
+			 * If we are querying a metacopy dentry and lower
+			 * dentry is data dentry, then use the blocks we
+			 * queried just now. We don't have to do additional
+			 * vfs_getattr(). If lower itself is metacopy, then
+			 * additional vfs_getattr() is unavoidable.
+			 */
+			if (metacopy_blocks &&
+			    realpath.dentry == ovl_dentry_lowerdata(dentry)) {
+				stat->blocks = lowerstat.blocks;
+				metacopy_blocks = false;
+			}
+		}
+
+		if (metacopy_blocks) {
+			/*
+			 * If lower is not same as lowerdata or if there was
+			 * no origin on upper, we can end up here.
+			 */
+			struct kstat lowerdatastat;
+			u32 lowermask = STATX_BLOCKS;
+
+			ovl_path_lowerdata(dentry, &realpath);
+			err = vfs_getattr(&realpath, &lowerdatastat,
+					  lowermask, flags);
+			if (err)
+				goto out;
+			stat->blocks = lowerdatastat.blocks;
 		}
 	}
 
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 13b2151cf4df..ae06fd531b75 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -271,6 +271,7 @@ int ovl_nlink_start(struct dentry *dentry, bool *locked);
 void ovl_nlink_end(struct dentry *dentry, bool locked);
 int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir);
 int ovl_check_metacopy_xattr(struct dentry *dentry);
+bool ovl_is_metacopy_dentry(struct dentry *dentry);
 
 static inline bool ovl_is_impuredir(struct dentry *dentry)
 {
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 7c7b95d5da1f..4f9c2ecee74c 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -825,3 +825,19 @@ int ovl_check_metacopy_xattr(struct dentry *dentry)
 	pr_warn_ratelimited("overlayfs: failed to get metacopy (%i)\n", res);
 	return res;
 }
+
+bool ovl_is_metacopy_dentry(struct dentry *dentry)
+{
+	struct ovl_entry *oe = dentry->d_fsdata;
+
+	if (!d_is_reg(dentry))
+		return false;
+
+	if (ovl_dentry_upper(dentry)) {
+		if (!ovl_has_upperdata(d_inode(dentry)))
+			return true;
+		return false;
+	}
+
+	return (oe->numlower > 1);
+}
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 13/28] ovl: Store lower data inode in ovl_inode
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (11 preceding siblings ...)
  2018-05-29 14:45 ` [PATCH 12/28] ovl: Fix ovl_getattr() to get number of blocks from lower Miklos Szeredi
@ 2018-05-29 14:45 ` Miklos Szeredi
  2018-05-29 14:45 ` [PATCH 14/28] ovl: Add helper ovl_inode_realdata() Miklos Szeredi
                   ` (14 subsequent siblings)
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:45 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

Right now ovl_inode stores inode pointer for lower inode.  This helps with
quickly getting lower inode given overlay inode (ovl_inode_lower()).

Now with metadata only copy-up, we can have metacopy inode in middle layer
as well and inode containing data can be different from ->lower.  I need to
be able to open the real file in ovl_open_realfile() and for that I need to
quickly find the lower data inode.

Hence store lower data inode also in ovl_inode.  Also provide an helper
ovl_inode_lowerdata() to access this field.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/inode.c     |  2 +-
 fs/overlayfs/namei.c     |  2 ++
 fs/overlayfs/overlayfs.h |  4 +++-
 fs/overlayfs/ovl_entry.h |  5 ++++-
 fs/overlayfs/super.c     |  8 ++++++--
 fs/overlayfs/util.c      | 12 +++++++++++-
 6 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 6a6693150c37..d4e643e9098b 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -864,7 +864,7 @@ struct inode *ovl_get_inode(struct super_block *sb,
 		}
 	}
 	ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev, ino, fsid);
-	ovl_inode_init(inode, upperdentry, lowerdentry);
+	ovl_inode_init(inode, upperdentry, lowerdentry, oip->lowerdata);
 
 	if (upperdentry && ovl_is_impuredir(upperdentry))
 		ovl_set_flag(OVL_IMPURE, inode);
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index fcf0eb40d94b..b28b255ea8e2 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -1080,6 +1080,8 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 			.index = index,
 			.numlower = ctr,
 			.redirect = upperredirect,
+			.lowerdata = (ctr > 1 && !d.is_dir) ?
+				      stack[ctr - 1].dentry : NULL,
 		};
 
 		inode = ovl_get_inode(dentry->d_sb, &oip);
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index ae06fd531b75..558a1b444286 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -227,6 +227,7 @@ struct dentry *ovl_dentry_real(struct dentry *dentry);
 struct dentry *ovl_i_dentry_upper(struct inode *inode);
 struct inode *ovl_inode_upper(struct inode *inode);
 struct inode *ovl_inode_lower(struct inode *inode);
+struct inode *ovl_inode_lowerdata(struct inode *inode);
 struct inode *ovl_inode_real(struct inode *inode);
 struct ovl_dir_cache *ovl_dir_cache(struct inode *inode);
 void ovl_set_dir_cache(struct inode *inode, struct ovl_dir_cache *cache);
@@ -246,7 +247,7 @@ bool ovl_redirect_dir(struct super_block *sb);
 const char *ovl_dentry_get_redirect(struct dentry *dentry);
 void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
 void ovl_inode_init(struct inode *inode, struct dentry *upperdentry,
-		    struct dentry *lowerdentry);
+		    struct dentry *lowerdentry, struct dentry *lowerdata);
 void ovl_inode_update(struct inode *inode, struct dentry *upperdentry);
 void ovl_dir_modified(struct dentry *dentry, bool impurity);
 u64 ovl_dentry_version_get(struct dentry *dentry);
@@ -361,6 +362,7 @@ struct ovl_inode_params {
 	struct dentry *index;
 	unsigned int numlower;
 	char *redirect;
+	struct dentry *lowerdata;
 };
 struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev);
 struct inode *ovl_lookup_inode(struct super_block *sb, struct dentry *real,
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index da65118a0567..6d02ab1f8811 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -90,7 +90,10 @@ static inline struct ovl_entry *OVL_E(struct dentry *dentry)
 }
 
 struct ovl_inode {
-	struct ovl_dir_cache *cache;
+	union {
+		struct ovl_dir_cache *cache;	/* directory */
+		struct inode *lowerdata;	/* regular file */
+	};
 	const char *redirect;
 	u64 version;
 	unsigned long flags;
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index d96530717702..e0326a686f45 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -187,6 +187,7 @@ static struct inode *ovl_alloc_inode(struct super_block *sb)
 	oi->flags = 0;
 	oi->__upperdentry = NULL;
 	oi->lower = NULL;
+	oi->lowerdata = NULL;
 	mutex_init(&oi->lock);
 
 	return &oi->vfs_inode;
@@ -205,8 +206,11 @@ static void ovl_destroy_inode(struct inode *inode)
 
 	dput(oi->__upperdentry);
 	iput(oi->lower);
+	if (S_ISDIR(inode->i_mode))
+		ovl_dir_cache_free(inode);
+	else
+		iput(oi->lowerdata);
 	kfree(oi->redirect);
-	ovl_dir_cache_free(inode);
 	mutex_destroy(&oi->lock);
 
 	call_rcu(&inode->i_rcu, ovl_i_callback);
@@ -1509,7 +1513,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	ovl_dentry_set_flag(OVL_E_CONNECTED, root_dentry);
 	ovl_set_upperdata(d_inode(root_dentry));
 	ovl_inode_init(d_inode(root_dentry), upperpath.dentry,
-		       ovl_dentry_lower(root_dentry));
+		       ovl_dentry_lower(root_dentry), NULL);
 
 	sb->s_root = root_dentry;
 
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 4f9c2ecee74c..63311c536216 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -247,6 +247,14 @@ struct inode *ovl_inode_real(struct inode *inode)
 	return ovl_inode_upper(inode) ?: ovl_inode_lower(inode);
 }
 
+/* Return inode which contains lower data. Do not return metacopy */
+struct inode *ovl_inode_lowerdata(struct inode *inode)
+{
+	if (WARN_ON(!S_ISREG(inode->i_mode)))
+		return NULL;
+
+	return OVL_I(inode)->lowerdata ?: ovl_inode_lower(inode);
+}
 
 struct ovl_dir_cache *ovl_dir_cache(struct inode *inode)
 {
@@ -381,7 +389,7 @@ void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect)
 }
 
 void ovl_inode_init(struct inode *inode, struct dentry *upperdentry,
-		    struct dentry *lowerdentry)
+		    struct dentry *lowerdentry, struct dentry *lowerdata)
 {
 	struct inode *realinode = d_inode(upperdentry ?: lowerdentry);
 
@@ -389,6 +397,8 @@ void ovl_inode_init(struct inode *inode, struct dentry *upperdentry,
 		OVL_I(inode)->__upperdentry = upperdentry;
 	if (lowerdentry)
 		OVL_I(inode)->lower = igrab(d_inode(lowerdentry));
+	if (lowerdata)
+		OVL_I(inode)->lowerdata = igrab(d_inode(lowerdata));
 
 	ovl_copyattr(realinode, inode);
 	ovl_copyflags(realinode, inode);
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 14/28] ovl: Add helper ovl_inode_realdata()
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (12 preceding siblings ...)
  2018-05-29 14:45 ` [PATCH 13/28] ovl: Store lower data inode in ovl_inode Miklos Szeredi
@ 2018-05-29 14:45 ` Miklos Szeredi
  2018-05-29 14:45 ` [PATCH 15/28] ovl: Open file with data except for the case of fsync Miklos Szeredi
                   ` (13 subsequent siblings)
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:45 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

Add an helper to retrieve real data inode associated with overlay inode.
This helper will ignore all metacopy inodes and will return only the real
inode which has data.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/util.c      | 12 ++++++++++++
 2 files changed, 13 insertions(+)

diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 558a1b444286..dcd600201ec8 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -229,6 +229,7 @@ struct inode *ovl_inode_upper(struct inode *inode);
 struct inode *ovl_inode_lower(struct inode *inode);
 struct inode *ovl_inode_lowerdata(struct inode *inode);
 struct inode *ovl_inode_real(struct inode *inode);
+struct inode *ovl_inode_realdata(struct inode *inode);
 struct ovl_dir_cache *ovl_dir_cache(struct inode *inode);
 void ovl_set_dir_cache(struct inode *inode, struct ovl_dir_cache *cache);
 void ovl_dentry_set_flag(unsigned long flag, struct dentry *dentry);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 63311c536216..73939e08d8bf 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -256,6 +256,18 @@ struct inode *ovl_inode_lowerdata(struct inode *inode)
 	return OVL_I(inode)->lowerdata ?: ovl_inode_lower(inode);
 }
 
+/* Return real inode which contains data. Does not return metacopy inode */
+struct inode *ovl_inode_realdata(struct inode *inode)
+{
+	struct inode *upperinode;
+
+	upperinode = ovl_inode_upper(inode);
+	if (upperinode && ovl_has_upperdata(inode))
+		return upperinode;
+
+	return ovl_inode_lowerdata(inode);
+}
+
 struct ovl_dir_cache *ovl_dir_cache(struct inode *inode)
 {
 	return OVL_I(inode)->cache;
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 15/28] ovl: Open file with data except for the case of fsync
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (13 preceding siblings ...)
  2018-05-29 14:45 ` [PATCH 14/28] ovl: Add helper ovl_inode_realdata() Miklos Szeredi
@ 2018-05-29 14:45 ` Miklos Szeredi
  2018-05-30 14:30   ` Vivek Goyal
  2018-05-29 14:46 ` [PATCH 16/28] ovl: Do not expose metacopy only dentry from d_real() Miklos Szeredi
                   ` (12 subsequent siblings)
  27 siblings, 1 reply; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:45 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

ovl_open() should open file which contains data and not open metacopy
inode.  With the introduction of metacopy inodes, with current
implementaion we will end up opening metacopy inode as well.

But there can be certain circumstances like ovl_fsync() where we want to
allow opening a metacopy inode instead.

Hence, change ovl_open_realfile() and add _ovl_open_real() and add extra
parameter which specifies whether to allow opening metacopy inode or not.
If this parameter is false, we look for data inode and open that.

This should allow covering both the cases.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/file.c | 39 ++++++++++++++++++++++++++++++---------
 1 file changed, 30 insertions(+), 9 deletions(-)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 266692ce9a9a..c7738ef492c8 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -14,11 +14,20 @@
 #include <linux/uio.h>
 #include "overlayfs.h"
 
-static struct file *ovl_open_realfile(const struct file *file)
+static char ovl_whatisit(struct inode *inode, struct inode *realinode)
+{
+	if (realinode != ovl_inode_upper(inode))
+		return 'l';
+	if (ovl_has_upperdata(inode))
+		return 'u';
+	else
+		return 'm';
+}
+
+static struct file *ovl_open_realfile(const struct file *file,
+				      struct inode *realinode)
 {
 	struct inode *inode = file_inode(file);
-	struct inode *upperinode = ovl_inode_upper(inode);
-	struct inode *realinode = upperinode ?: ovl_inode_lower(inode);
 	struct file *realfile;
 	const struct cred *old_cred;
 
@@ -28,7 +37,7 @@ static struct file *ovl_open_realfile(const struct file *file)
 	revert_creds(old_cred);
 
 	pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n",
-		 file, file, upperinode ? 'u' : 'l', file->f_flags,
+		 file, file, ovl_whatisit(inode, realinode), file->f_flags,
 		 realfile, IS_ERR(realfile) ? 0 : realfile->f_flags);
 
 	return realfile;
@@ -72,17 +81,24 @@ static int ovl_change_flags(struct file *file, unsigned int flags)
 	return 0;
 }
 
-static int ovl_real_fdget(const struct file *file, struct fd *real)
+static int ovl_real_fdget_meta(const struct file *file, struct fd *real,
+			       bool allow_meta)
 {
 	struct inode *inode = file_inode(file);
+	struct inode *realinode;
 
 	real->flags = 0;
 	real->file = file->private_data;
 
+	if (allow_meta)
+		realinode = ovl_inode_real(inode);
+	else
+		realinode = ovl_inode_realdata(inode);
+
 	/* Has it been copied up since we'd opened it? */
-	if (unlikely(file_inode(real->file) != ovl_inode_real(inode))) {
+	if (unlikely(file_inode(real->file) != realinode)) {
 		real->flags = FDPUT_FPUT;
-		real->file = ovl_open_realfile(file);
+		real->file = ovl_open_realfile(file, realinode);
 
 		return PTR_ERR_OR_ZERO(real->file);
 	}
@@ -94,6 +110,11 @@ static int ovl_real_fdget(const struct file *file, struct fd *real)
 	return 0;
 }
 
+static int ovl_real_fdget(const struct file *file, struct fd *real)
+{
+	return ovl_real_fdget_meta(file, real, false);
+}
+
 static int ovl_open(struct inode *inode, struct file *file)
 {
 	struct dentry *dentry = file_dentry(file);
@@ -107,7 +128,7 @@ static int ovl_open(struct inode *inode, struct file *file)
 	/* No longer need these flags, so don't pass them on to underlying fs */
 	file->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);
 
-	realfile = ovl_open_realfile(file);
+	realfile = ovl_open_realfile(file, ovl_inode_real(file_inode(file)));
 	if (IS_ERR(realfile))
 		return PTR_ERR(realfile);
 
@@ -244,7 +265,7 @@ static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync)
 	const struct cred *old_cred;
 	int ret;
 
-	ret = ovl_real_fdget(file, &real);
+	ret = ovl_real_fdget_meta(file, &real, !datasync);
 	if (ret)
 		return ret;
 
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 16/28] ovl: Do not expose metacopy only dentry from d_real()
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (14 preceding siblings ...)
  2018-05-29 14:45 ` [PATCH 15/28] ovl: Open file with data except for the case of fsync Miklos Szeredi
@ 2018-05-29 14:46 ` Miklos Szeredi
  2018-05-30 21:05   ` Vivek Goyal
  2018-05-29 14:46 ` [PATCH 17/28] ovl: Move some dir related ovl_lookup_single() code in else block Miklos Szeredi
                   ` (11 subsequent siblings)
  27 siblings, 1 reply; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:46 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

Metacopy dentry/inode is internal to overlay and is never exposed outside
of it.  Exception is metacopy upper file used for fsync().  Modify d_real()
to look for dentries/inode which have data, but also allow matching upper
inode without data for the fsync case.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/super.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index e0326a686f45..6687d547ec6b 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -101,10 +101,13 @@ static struct dentry *ovl_d_real(struct dentry *dentry,
 	}
 
 	real = ovl_dentry_upper(dentry);
-	if (real && (!inode || inode == d_inode(real)))
+	if (real && (inode == d_inode(real)))
 		return real;
 
-	real = ovl_dentry_lower(dentry);
+	if (real && !inode && ovl_has_upperdata(d_inode(dentry)))
+		return real;
+
+	real = ovl_dentry_lowerdata(dentry);
 	if (!real)
 		goto bug;
 
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 17/28] ovl: Move some dir related ovl_lookup_single() code in else block
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (15 preceding siblings ...)
  2018-05-29 14:46 ` [PATCH 16/28] ovl: Do not expose metacopy only dentry from d_real() Miklos Szeredi
@ 2018-05-29 14:46 ` Miklos Szeredi
  2018-05-29 14:46 ` [PATCH 18/28] ovl: Check redirects for metacopy files Miklos Szeredi
                   ` (10 subsequent siblings)
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:46 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

Move some directory related code in else block.  This is pure code
reorganization and no functionality change.

Next patch enables redirect processing on metacopy files and needs this
change.  By keeping non-functional changes in a separate patch, next patch
looks much smaller and cleaner.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/namei.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index b28b255ea8e2..451cd6effcf6 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -273,17 +273,18 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
 		d->metacopy = err;
 		d->stop = !d->metacopy;
 		goto out;
-	}
-	if (last_element)
-		d->is_dir = true;
-	if (d->last)
-		goto out;
-
-	if (ovl_is_opaquedir(this)) {
-		d->stop = true;
+	} else {
 		if (last_element)
-			d->opaque = true;
-		goto out;
+			d->is_dir = true;
+		if (d->last)
+			goto out;
+
+		if (ovl_is_opaquedir(this)) {
+			d->stop = true;
+			if (last_element)
+				d->opaque = true;
+			goto out;
+		}
 	}
 	err = ovl_check_redirect(this, d, prelen, post);
 	if (err)
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 18/28] ovl: Check redirects for metacopy files
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (16 preceding siblings ...)
  2018-05-29 14:46 ` [PATCH 17/28] ovl: Move some dir related ovl_lookup_single() code in else block Miklos Szeredi
@ 2018-05-29 14:46 ` Miklos Szeredi
  2018-05-29 14:46 ` [PATCH 19/28] ovl: Treat metacopy dentries as type OVL_PATH_MERGE Miklos Szeredi
                   ` (9 subsequent siblings)
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:46 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

Right now we rely on path based lookup for data origin of metacopy upper.
This will work only if upper has not been renamed.  We solved this problem
already for merged directories using redirect.  Use same logic for metacopy
files.

This patch just goes on to check redirects for metacopy files.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/namei.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 451cd6effcf6..e38fa61e08df 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -272,7 +272,8 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
 
 		d->metacopy = err;
 		d->stop = !d->metacopy;
-		goto out;
+		if (!d->metacopy || d->last)
+			goto out;
 	} else {
 		if (last_element)
 			d->is_dir = true;
@@ -874,7 +875,6 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		if (upperdentry && !d.is_dir) {
 			unsigned int origin_ctr = 0;
 
-			BUG_ON(d.redirect);
 			/*
 			 * Lookup copy up origin by decoding origin file handle.
 			 * We may get a disconnected dentry, which is fine,
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 19/28] ovl: Treat metacopy dentries as type OVL_PATH_MERGE
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (17 preceding siblings ...)
  2018-05-29 14:46 ` [PATCH 18/28] ovl: Check redirects for metacopy files Miklos Szeredi
@ 2018-05-29 14:46 ` Miklos Szeredi
  2018-05-29 14:46 ` [PATCH 20/28] ovl: Add an inode flag OVL_CONST_INO Miklos Szeredi
                   ` (8 subsequent siblings)
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:46 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

Right now OVL_PATH_MERGE is used only for merged directories.  But
conceptually, a metacopy dentry (backed by a lower data dentry) is a merged
entity as well.

So mark metacopy dentries as OVL_PATH_MERGE and ovl_rename() makes use of
this property later to set redirect on a metacopy file.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/util.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 73939e08d8bf..61ace2de3019 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -134,7 +134,8 @@ enum ovl_path_type ovl_path_type(struct dentry *dentry)
 		 */
 		if (oe->numlower) {
 			type |= __OVL_PATH_ORIGIN;
-			if (d_is_dir(dentry))
+			if (d_is_dir(dentry) ||
+			    !ovl_has_upperdata(d_inode(dentry)))
 				type |= __OVL_PATH_MERGE;
 		}
 	} else {
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 20/28] ovl: Add an inode flag OVL_CONST_INO
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (18 preceding siblings ...)
  2018-05-29 14:46 ` [PATCH 19/28] ovl: Treat metacopy dentries as type OVL_PATH_MERGE Miklos Szeredi
@ 2018-05-29 14:46 ` Miklos Szeredi
  2018-05-29 14:46 ` [PATCH 21/28] ovl: Do not set dentry type ORIGIN for broken hardlinks Miklos Szeredi
                   ` (7 subsequent siblings)
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:46 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

Add an ovl_inode flag OVL_CONST_INO.  This flag signifies if inode number
will remain constant over copy up or not.  This flag does not get updated
over copy up and remains unmodifed after setting once.

Next patch in the series will make use of this flag.  It will basically
figure out if dentry is of type ORIGIN or not.  And this can be derived by
this flag.

ORIGIN = (upperdentry && ovl_test_flag(OVL_CONST_INO, inode)).

Suggested-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/inode.c     | 3 +++
 fs/overlayfs/overlayfs.h | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index d4e643e9098b..ac1d2a581a6e 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -883,6 +883,9 @@ struct inode *ovl_get_inode(struct super_block *sb,
 
 	OVL_I(inode)->redirect = oip->redirect;
 
+	if (bylower)
+		ovl_set_flag(OVL_CONST_INO, inode);
+
 	/* Check for non-merge dir that may have whiteouts */
 	if (is_dir) {
 		if (((upperdentry && lowerdentry) || oip->numlower > 1) ||
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index dcd600201ec8..bde352e414e7 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -38,6 +38,8 @@ enum ovl_inode_flag {
 	OVL_WHITEOUTS,
 	OVL_INDEX,
 	OVL_UPPERDATA,
+	/* Inode number will remain constant over copy up. */
+	OVL_CONST_INO,
 };
 
 enum ovl_entry_flag {
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 21/28] ovl: Do not set dentry type ORIGIN for broken hardlinks
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (19 preceding siblings ...)
  2018-05-29 14:46 ` [PATCH 20/28] ovl: Add an inode flag OVL_CONST_INO Miklos Szeredi
@ 2018-05-29 14:46 ` Miklos Szeredi
  2018-05-29 14:46 ` [PATCH 22/28] ovl: Set redirect on metacopy files upon rename Miklos Szeredi
                   ` (6 subsequent siblings)
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:46 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

If a dentry has copy up origin, we set flag OVL_PATH_ORIGIN.  So far this
decision was easy that we had to check only for oe->numlower and if it is
non-zero, we knew there is copy up origin.  (For non-dir we installed
origin dentry in lowerstack[0]).

But we don't create ORGIN xattr for broken hardlinks (index=off).  And with
metacopy feature it is possible that we will install lowerstack[0] but
ORIGIN xattr is not there.  It is data dentry of upper metacopy dentry
which has been found using regular name based lookup or using REDIRECT.  So
with addition of this new case, just presence of oe->numlower is not
sufficient to guarantee that ORIGIN xattr is present.

So to differentiate between two cases, look at OVL_CONST_INO flag.  If this
flag is set and upperdentry is there, that means it can be marked as type
ORIGIN.  OVL_CONST_INO is not set if lower hardlink is broken or will be
broken over copy up.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/util.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 61ace2de3019..1aa9e0c5a327 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -133,7 +133,8 @@ enum ovl_path_type ovl_path_type(struct dentry *dentry)
 		 * Non-dir dentry can hold lower dentry of its copy up origin.
 		 */
 		if (oe->numlower) {
-			type |= __OVL_PATH_ORIGIN;
+			if (ovl_test_flag(OVL_CONST_INO, d_inode(dentry)))
+				type |= __OVL_PATH_ORIGIN;
 			if (d_is_dir(dentry) ||
 			    !ovl_has_upperdata(d_inode(dentry)))
 				type |= __OVL_PATH_MERGE;
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 22/28] ovl: Set redirect on metacopy files upon rename
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (20 preceding siblings ...)
  2018-05-29 14:46 ` [PATCH 21/28] ovl: Do not set dentry type ORIGIN for broken hardlinks Miklos Szeredi
@ 2018-05-29 14:46 ` Miklos Szeredi
  2018-05-29 14:46 ` [PATCH 23/28] ovl: Set redirect on upper inode when it is linked Miklos Szeredi
                   ` (5 subsequent siblings)
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:46 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

Set redirect on metacopy files upon rename.  This will help find data
dentry in lower dirs.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/dir.c | 66 +++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 46 insertions(+), 20 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 8d8e063e4706..1658961a9762 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -873,13 +873,13 @@ static bool ovl_can_move(struct dentry *dentry)
 		!d_is_dir(dentry) || !ovl_type_merge_or_lower(dentry);
 }
 
-static char *ovl_get_redirect(struct dentry *dentry, bool samedir)
+static char *ovl_get_redirect(struct dentry *dentry, bool abs_redirect)
 {
 	char *buf, *ret;
 	struct dentry *d, *tmp;
 	int buflen = ovl_redirect_max + 1;
 
-	if (samedir) {
+	if (!abs_redirect) {
 		ret = kstrndup(dentry->d_name.name, dentry->d_name.len,
 			       GFP_KERNEL);
 		goto out;
@@ -933,15 +933,43 @@ static char *ovl_get_redirect(struct dentry *dentry, bool samedir)
 	return ret ? ret : ERR_PTR(-ENOMEM);
 }
 
+static bool ovl_need_absolute_redirect(struct dentry *dentry, bool samedir)
+{
+	struct dentry *lowerdentry;
+
+	if (!samedir)
+		return true;
+
+	if (d_is_dir(dentry))
+		return false;
+
+	/*
+	 * For non-dir hardlinked files, we need absolute redirects
+	 * in general as two upper hardlinks could be in different
+	 * dirs. We could put a relative redirect now and convert
+	 * it to absolute redirect later. But when nlink > 1 and
+	 * indexing is on, that means relative redirect needs to be
+	 * converted to absolute during copy up of another lower
+	 * hardllink as well.
+	 *
+	 * So without optimizing too much, just check if lower is
+	 * a hard link or not. If lower is hard link, put absolute
+	 * redirect.
+	 */
+	lowerdentry = ovl_dentry_lower(dentry);
+	return (d_inode(lowerdentry)->i_nlink > 1);
+}
+
 static int ovl_set_redirect(struct dentry *dentry, bool samedir)
 {
 	int err;
 	const char *redirect = ovl_dentry_get_redirect(dentry);
+	bool absolute_redirect = ovl_need_absolute_redirect(dentry, samedir);
 
-	if (redirect && (samedir || redirect[0] == '/'))
+	if (redirect && (!absolute_redirect || redirect[0] == '/'))
 		return 0;
 
-	redirect = ovl_get_redirect(dentry, samedir);
+	redirect = ovl_get_redirect(dentry, absolute_redirect);
 	if (IS_ERR(redirect))
 		return PTR_ERR(redirect);
 
@@ -1117,22 +1145,20 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
 		goto out_dput;
 
 	err = 0;
-	if (is_dir) {
-		if (ovl_type_merge_or_lower(old))
-			err = ovl_set_redirect(old, samedir);
-		else if (!old_opaque && ovl_type_merge(new->d_parent))
-			err = ovl_set_opaque_xerr(old, olddentry, -EXDEV);
-		if (err)
-			goto out_dput;
-	}
-	if (!overwrite && new_is_dir) {
-		if (ovl_type_merge_or_lower(new))
-			err = ovl_set_redirect(new, samedir);
-		else if (!new_opaque && ovl_type_merge(old->d_parent))
-			err = ovl_set_opaque_xerr(new, newdentry, -EXDEV);
-		if (err)
-			goto out_dput;
-	}
+	if (ovl_type_merge_or_lower(old))
+		err = ovl_set_redirect(old, samedir);
+	else if (is_dir && !old_opaque && ovl_type_merge(new->d_parent))
+		err = ovl_set_opaque_xerr(old, olddentry, -EXDEV);
+	if (err)
+		goto out_dput;
+
+	if (!overwrite && ovl_type_merge_or_lower(new))
+		err = ovl_set_redirect(new, samedir);
+	else if (!overwrite && new_is_dir && !new_opaque &&
+		 ovl_type_merge(old->d_parent))
+		err = ovl_set_opaque_xerr(new, newdentry, -EXDEV);
+	if (err)
+		goto out_dput;
 
 	err = ovl_do_rename(old_upperdir->d_inode, olddentry,
 			    new_upperdir->d_inode, newdentry, flags);
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 23/28] ovl: Set redirect on upper inode when it is linked
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (21 preceding siblings ...)
  2018-05-29 14:46 ` [PATCH 22/28] ovl: Set redirect on metacopy files upon rename Miklos Szeredi
@ 2018-05-29 14:46 ` Miklos Szeredi
  2018-05-29 14:46 ` [PATCH 24/28] ovl: Check redirect on index as well Miklos Szeredi
                   ` (4 subsequent siblings)
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:46 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

When we create a hardlink to a metacopy upper file, first the redirect on
that inode.  Path based lookup will not work with newly created link and
redirect will solve that issue.

Also use absolute redirect as two hardlinks could be in different
directores and relative redirect will not work.

I have not put any additional locking around setting redirects while
introducing redirects for non-dir files.  For now it feels like existing
locking is sufficient.  If that's not the case, we will have add more
locking.  Following is my rationale about why do I think current locking
seems ok.

Basic problem for non-dir files is that more than on dentry could be
pointing to same inode and in theory only relying on dentry based locks
(d->d_lock) did not seem sufficient.

We set redirect upon rename and upon link creation.  In both the paths for
non-dir file, VFS locks both source and target inodes (->i_rwsem).  That
means vfs rename and link operations on same source and target can't he
happening in parallel (Even if there are multiple dentries pointing to same
inode).  So that probably means that at a time on an inode, only one call
of ovl_set_redirect() could be working and we don't need additional locking
in ovl_set_redirect().

ovl_inode->redirect is initialized only when inode is created new.  That
means it should not race with any other path and setting
ovl_inode->redirect should be fine.

Reading of ovl_inode->redirect happens in ovl_get_redirect() path.  And
this called only in ovl_set_redirect().  And ovl_set_redirect() already
seemed to be protected using ->i_rwsem.  That means ovl_set_redirect() and
ovl_get_redirect() on source/target inode should not make progress in
parallel and is mutually exclusive.  Hence no additional locking required.

Now, only case where ovl_set_redirect() and ovl_get_redirect() could race
seems to be case of absolute redirects where ovl_get_redirect() has to
travel up the tree.  In that case we already take d->d_lock and that should
be sufficient as directories will not have multiple dentries pointing to
same inode.

So given VFS locking and current usage of redirect, current locking around
redirect seems to be ok for non-dir as well.  Once we have the logic to
remove redirect when metacopy file gets copied up, then we probably will
need additional locking.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/dir.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 1658961a9762..7063e0f588cc 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -24,6 +24,8 @@ module_param_named(redirect_max, ovl_redirect_max, ushort, 0644);
 MODULE_PARM_DESC(ovl_redirect_max,
 		 "Maximum length of absolute redirect xattr value");
 
+static int ovl_set_redirect(struct dentry *dentry, bool samedir);
+
 int ovl_cleanup(struct inode *wdir, struct dentry *wdentry)
 {
 	int err;
@@ -656,6 +658,12 @@ static int ovl_link(struct dentry *old, struct inode *newdir,
 	if (err)
 		goto out_drop_write;
 
+	if (ovl_is_metacopy_dentry(old)) {
+		err = ovl_set_redirect(old, false);
+		if (err)
+			goto out_drop_write;
+	}
+
 	err = ovl_nlink_start(old, &locked);
 	if (err)
 		goto out_drop_write;
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 24/28] ovl: Check redirect on index as well
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (22 preceding siblings ...)
  2018-05-29 14:46 ` [PATCH 23/28] ovl: Set redirect on upper inode when it is linked Miklos Szeredi
@ 2018-05-29 14:46 ` Miklos Szeredi
  2018-05-29 14:46 ` [PATCH 25/28] ovl: Disbale metacopy for MAP_SHARED mmap() Miklos Szeredi
                   ` (3 subsequent siblings)
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:46 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

Right now we seem to check redirect only if upperdentry is found.  But it
is possible that there is no upperdentry but later we found an index.

We need to check redirect on index as well and set it in
ovl_inode->redirect.  Otherwise link code can assume that dentry does not
have redirect and place a new one which breaks things.  In my testing
overlay/033 test started failing in xfstests.  Following are the details.

For example do following.

$ mkdir lower upper work merged

 - Make lower dir with 4 links.
  $ echo "foo" > lower/l0.txt
  $ ln  lower/l0.txt lower/l1.txt
  $ ln  lower/l0.txt lower/l2.txt
  $ ln  lower/l0.txt lower/l3.txt

 - Mount with index on and metacopy on.

  $ mount -t overlay -o lowerdir=lower,upperdir=upper,workdir=work,\
                        index=on,metacopy=on none merged

 - Link lower

  $ ln merged/l0.txt merged/l4.txt
    (This will metadata copy up of l0.txt and put an absolute redirect
     /l0.txt)

  $ echo 2 > /proc/sys/vm/drop/caches

  $ ls merged/l1.txt
  (Now l1.txt will be looked up.  There is no upper dentry but there is
   lower dentry and index will be found.  We don't check for redirect on
   index, hence ovl_inode->redirect will be NULL.)

 - Link Upper

  $ ln merged/l4.txt merged/l5.txt
  (Lookup of l4.txt will use inode from l1.txt lookup which is still in
   cache.  It has ovl_inode->redirect NULL, hence link will put a new
   redirect and replace /l0.txt with /l4.txt

 - Drop caches.
  echo 2 > /proc/sys/vm/drop_caches

 - List l1.txt and it returns -ESTALE

  $ ls merged/l0.txt

  (It returns stale because, we found a metacopy of l0.txt in upper and it
   has redirect l4.txt but there is no file named l4.txt in lower layer.
   So lower data copy is not found and -ESTALE is returned.)

So problem here is that we did not process redirect on index.  Check
redirect on index as well and then problem is fixed.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/namei.c     | 50 +++++++++++++-----------------------------------
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/util.c      | 50 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+), 37 deletions(-)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index e38fa61e08df..2acd494ea9f0 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -31,32 +31,13 @@ static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
 			      size_t prelen, const char *post)
 {
 	int res;
-	char *s, *next, *buf = NULL;
+	char *buf;
 
-	res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, NULL, 0);
-	if (res < 0) {
-		if (res == -ENODATA || res == -EOPNOTSUPP)
-			return 0;
-		goto fail;
-	}
-	buf = kzalloc(prelen + res + strlen(post) + 1, GFP_KERNEL);
-	if (!buf)
-		return -ENOMEM;
+	buf = ovl_get_redirect_xattr(dentry, prelen + strlen(post));
+	if (IS_ERR_OR_NULL(buf))
+		return PTR_ERR(buf);
 
-	if (res == 0)
-		goto invalid;
-
-	res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, buf, res);
-	if (res < 0)
-		goto fail;
-	if (res == 0)
-		goto invalid;
 	if (buf[0] == '/') {
-		for (s = buf; *s++ == '/'; s = next) {
-			next = strchrnul(s, '/');
-			if (s == next)
-				goto invalid;
-		}
 		/*
 		 * One of the ancestor path elements in an absolute path
 		 * lookup in ovl_lookup_layer() could have been opaque and
@@ -67,9 +48,7 @@ static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
 		 */
 		d->stop = false;
 	} else {
-		if (strchr(buf, '/') != NULL)
-			goto invalid;
-
+		res = strlen(buf) + 1;
 		memmove(buf + prelen, buf, res);
 		memcpy(buf, d->name.name, prelen);
 	}
@@ -81,16 +60,6 @@ static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
 	d->name.len = strlen(d->redirect);
 
 	return 0;
-
-err_free:
-	kfree(buf);
-	return 0;
-fail:
-	pr_warn_ratelimited("overlayfs: failed to get redirect (%i)\n", res);
-	goto err_free;
-invalid:
-	pr_warn_ratelimited("overlayfs: invalid redirect (%s)\n", buf);
-	goto err_free;
 }
 
 static int ovl_acceptable(void *ctx, struct dentry *dentry)
@@ -1071,8 +1040,15 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 
 	if (upperdentry)
 		ovl_dentry_set_upper_alias(dentry);
-	else if (index)
+	else if (index) {
 		upperdentry = dget(index);
+		upperredirect = ovl_get_redirect_xattr(upperdentry, 0);
+		if (IS_ERR(upperredirect)) {
+			err = PTR_ERR(upperredirect);
+			upperredirect = NULL;
+			goto out_free_oe;
+		}
+	}
 
 	if (upperdentry || ctr) {
 		struct ovl_inode_params oip = {
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index bde352e414e7..24f1d0e8a178 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -276,6 +276,7 @@ void ovl_nlink_end(struct dentry *dentry, bool locked);
 int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir);
 int ovl_check_metacopy_xattr(struct dentry *dentry);
 bool ovl_is_metacopy_dentry(struct dentry *dentry);
+char *ovl_get_redirect_xattr(struct dentry *dentry, int padding);
 
 static inline bool ovl_is_impuredir(struct dentry *dentry)
 {
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 1aa9e0c5a327..8cfb62cc8672 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -865,3 +865,53 @@ bool ovl_is_metacopy_dentry(struct dentry *dentry)
 
 	return (oe->numlower > 1);
 }
+
+char *ovl_get_redirect_xattr(struct dentry *dentry, int padding)
+{
+	int res;
+	char *s, *next, *buf = NULL;
+
+	res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, NULL, 0);
+	if (res < 0) {
+		if (res == -ENODATA || res == -EOPNOTSUPP)
+			return NULL;
+		goto fail;
+	}
+
+	buf = kzalloc(res + padding + 1, GFP_KERNEL);
+	if (!buf)
+		return ERR_PTR(-ENOMEM);
+
+	if (res == 0)
+		goto invalid;
+
+	res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, buf, res);
+	if (res < 0)
+		goto fail;
+	if (res == 0)
+		goto invalid;
+
+	if (buf[0] == '/') {
+		for (s = buf; *s++ == '/'; s = next) {
+			next = strchrnul(s, '/');
+			if (s == next)
+				goto invalid;
+		}
+	} else {
+		if (strchr(buf, '/') != NULL)
+			goto invalid;
+	}
+
+	return buf;
+
+err_free:
+	kfree(buf);
+	return ERR_PTR(res);
+fail:
+	pr_warn_ratelimited("overlayfs: failed to get redirect (%i)\n", res);
+	goto err_free;
+invalid:
+	pr_warn_ratelimited("overlayfs: invalid redirect (%s)\n", buf);
+	res = -EINVAL;
+	goto err_free;
+}
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 25/28] ovl: Disbale metacopy for MAP_SHARED mmap()
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (23 preceding siblings ...)
  2018-05-29 14:46 ` [PATCH 24/28] ovl: Check redirect on index as well Miklos Szeredi
@ 2018-05-29 14:46 ` Miklos Szeredi
  2018-05-29 14:46 ` [PATCH 26/28] ovl: Do not do metadata only copy-up for truncate operation Miklos Szeredi
                   ` (2 subsequent siblings)
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:46 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

When user chose the option of copying up a file when mmap(MAP_SHARED)
happens, then do full copy up and not just metacopy.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/copy_up.c   | 5 +++++
 fs/overlayfs/file.c      | 2 +-
 fs/overlayfs/overlayfs.h | 1 +
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 8c05e9ad9782..c31f198bed99 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -867,6 +867,11 @@ int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags)
 	return err;
 }
 
+int ovl_copy_up_with_data(struct dentry *dentry)
+{
+	return ovl_copy_up_flags(dentry, O_WRONLY);
+}
+
 int ovl_copy_up(struct dentry *dentry)
 {
 	return ovl_copy_up_flags(dentry, 0);
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index c7738ef492c8..953295774471 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -296,7 +296,7 @@ static int ovl_pre_mmap(struct file *file, unsigned long prot,
 	 * later.
 	 */
 	if ((flag & MAP_SHARED) && ovl_copy_up_shared(file_inode(file)->i_sb))
-		err = ovl_copy_up(file_dentry(file));
+		err = ovl_copy_up_with_data(file_dentry(file));
 
 	return err;
 }
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 24f1d0e8a178..93c84929d422 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -414,6 +414,7 @@ extern const struct file_operations ovl_file_operations;
 
 /* copy_up.c */
 int ovl_copy_up(struct dentry *dentry);
+int ovl_copy_up_with_data(struct dentry *dentry);
 int ovl_copy_up_flags(struct dentry *dentry, int flags);
 int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags);
 int ovl_copy_xattr(struct dentry *old, struct dentry *new);
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 26/28] ovl: Do not do metadata only copy-up for truncate operation
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (24 preceding siblings ...)
  2018-05-29 14:46 ` [PATCH 25/28] ovl: Disbale metacopy for MAP_SHARED mmap() Miklos Szeredi
@ 2018-05-29 14:46 ` Miklos Szeredi
  2018-05-29 14:46 ` [PATCH 27/28] ovl: Do not do metacopy only for ioctl modifying file attr Miklos Szeredi
  2018-05-29 14:46 ` [PATCH 28/28] ovl: Enable metadata only feature Miklos Szeredi
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:46 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

truncate should copy up full file (and not do metacopy only), otherwise it
will be broken.  For example, use truncate to increase size of a file so
that any read beyong existing size will return null bytes.  If we don't
copy up full file, then we end up opening lower file and read from it only
reads upto the old size (and not new size after truncate).  Hence to avoid
such situations, copy up data as well when file size changes.

So far it was being done by d_real(O_WRONLY) call in truncate() path.  Now
that patch has been reverted.  So force full copy up in ovl_setattr() if
size of file is changing.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/inode.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index ac1d2a581a6e..e31d64206a01 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -19,6 +19,7 @@
 int ovl_setattr(struct dentry *dentry, struct iattr *attr)
 {
 	int err;
+	bool full_copy_up = false;
 	struct dentry *upperdentry;
 	const struct cred *old_cred;
 
@@ -36,9 +37,15 @@ int ovl_setattr(struct dentry *dentry, struct iattr *attr)
 		err = -ETXTBSY;
 		if (atomic_read(&realinode->i_writecount) < 0)
 			goto out_drop_write;
+
+		/* Truncate should trigger data copy up as well */
+		full_copy_up = true;
 	}
 
-	err = ovl_copy_up(dentry);
+	if (!full_copy_up)
+		err = ovl_copy_up(dentry);
+	else
+		err = ovl_copy_up_with_data(dentry);
 	if (!err) {
 		struct inode *winode = NULL;
 
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 27/28] ovl: Do not do metacopy only for ioctl modifying file attr
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (25 preceding siblings ...)
  2018-05-29 14:46 ` [PATCH 26/28] ovl: Do not do metadata only copy-up for truncate operation Miklos Szeredi
@ 2018-05-29 14:46 ` Miklos Szeredi
  2018-05-29 14:46 ` [PATCH 28/28] ovl: Enable metadata only feature Miklos Szeredi
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:46 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

ovl_copy_up() by default will only do metadata only copy up (if enabled).
That means when ovl_real_ioctl() calls ovl_real_file(), it will still get
the lower file (as ovl_real_file() opens data file and not metacopy).  And
that means "chattr +i" will end up modifying lower inode.

There seem to be two ways to solve this.
A. Open metacopy file in ovl_real_ioctl() and do operations on that
B. Force full copy up when FS_IOC_SETFLAGS is called.

I am resorting to option B for now as it feels little safer option.  If
there are performance issues due to this, we can revisit it.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 953295774471..31f32fc1004b 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -389,7 +389,7 @@ static long ovl_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		if (ret)
 			return ret;
 
-		ret = ovl_copy_up(file_dentry(file));
+		ret = ovl_copy_up_with_data(file_dentry(file));
 		if (!ret) {
 			ret = ovl_real_ioctl(file, cmd, arg);
 
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 28/28] ovl: Enable metadata only feature
  2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
                   ` (26 preceding siblings ...)
  2018-05-29 14:46 ` [PATCH 27/28] ovl: Do not do metacopy only for ioctl modifying file attr Miklos Szeredi
@ 2018-05-29 14:46 ` Miklos Szeredi
  27 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-29 14:46 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

From: Vivek Goyal <vgoyal@redhat.com>

All the bits are in patches before this.  So it is time to enable the
metadata only copy up feature.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/copy_up.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index c31f198bed99..bdadedf73e51 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -689,9 +689,6 @@ static bool ovl_need_meta_copy_up(struct dentry *dentry, umode_t mode,
 {
 	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
 
-	/* TODO: Will enable metacopy in last patch of series */
-	return false;
-
 	if (!ofs->config.metacopy)
 		return false;
 
-- 
2.14.3

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 03/28] ovl: Provide a mount option metacopy=on/off for metadata copyup
  2018-05-29 14:45 ` [PATCH 03/28] ovl: Provide a mount option metacopy=on/off for metadata copyup Miklos Szeredi
@ 2018-05-29 20:44   ` Randy Dunlap
  2018-05-30  8:27     ` Miklos Szeredi
  0 siblings, 1 reply; 36+ messages in thread
From: Randy Dunlap @ 2018-05-29 20:44 UTC (permalink / raw)
  To: Miklos Szeredi, linux-unionfs; +Cc: linux-fsdevel, linux-kernel

On 05/29/2018 07:45 AM, Miklos Szeredi wrote:
> From: Vivek Goyal <vgoyal@redhat.com>
> 
> By default metadata only copy up is disabled.  Provide a mount option so
> that users can choose one way or other.
> 
> Also provide a kernel config and module option to enable/disable metacopy
> feature.
> 
> metacopy feature requires redirect_dir=on when upper is present.
> Otherwise, it requires redirect_dir=follow atleast.
> 
> As of now, metacopy does not work with nfs_export=on.  So if both
> metacopy=on and nfs_export=on then nfs_export is disabled.
> 
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> ---
>  Documentation/filesystems/overlayfs.txt | 30 ++++++++++++++++++++-
>  fs/overlayfs/Kconfig                    | 19 ++++++++++++++
>  fs/overlayfs/ovl_entry.h                |  1 +
>  fs/overlayfs/super.c                    | 46 ++++++++++++++++++++++++++++++---
>  4 files changed, 92 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
> index f087bc40c6a5..79be4a77ca08 100644
> --- a/Documentation/filesystems/overlayfs.txt
> +++ b/Documentation/filesystems/overlayfs.txt
> @@ -262,6 +262,30 @@ rightmost one and going left.  In the above example lower1 will be the
>  top, lower2 the middle and lower3 the bottom layer.
>  
>  
> +Metadata only copy up
> +--------------------
> +
> +When metadata only copy up feature is enabled, overlayfs will only copy
> +up metadata (as opposed to whole file), when a metadata specific operation
> +like chown/chmod is performed. Full file will be copied up later when
> +file is opened for WRITE operation.
> +
> +In other words, this is delayed data copy up operation and data is copied
> +up when there is a need to actually modify data.
> +
> +There are multiple ways to enable/disable this feature. A config option
> +CONFIG_OVERLAY_FS_METACOPY can be set/unset to enable/disable this feature
> +by default. Or one can enable/disable it at module load time with module
> +parameter metacopy=on/off. Lastly, there is also a per mount option
> +metacopy=on/off to enable/disable this feature per mount.
> +
> +Do not use metacopy=on with untrusted upper/lower directories. Otherwise
> +it is possible that an attacker can create an handcrafted file with

                                              a handcrafted

> +appropriate REDIRECT and METACOPY xattrs, and gain access to file on lower
> +pointed by REDIRECT. This should not be possible on local system as setting
> +"trusted." xattrs will require CAP_SYS_ADMIN. But it should be possible
> +for untrusted layers like from a pen drive.
> +
>  Sharing and copying layers
>  --------------------------
>  
> @@ -280,7 +304,7 @@ though it will not result in a crash or deadlock.
>  Mounting an overlay using an upper layer path, where the upper layer path
>  was previously used by another mounted overlay in combination with a
>  different lower layer path, is allowed, unless the "inodes index" feature
> -is enabled.
> +or "metadata only copy up" feature is enabled.
>  
>  With the "inodes index" feature, on the first time mount, an NFS file
>  handle of the lower layer root directory, along with the UUID of the lower
> @@ -293,6 +317,10 @@ lower root origin, mount will fail with ESTALE.  An overlayfs mount with
>  does not support NFS export, lower filesystem does not have a valid UUID or
>  if the upper filesystem does not support extended attributes.
>  
> +For "metadata only copy up" feature there is no verification mechanism at
> +mount time. So if same upper is mouted with different set of lower, mount

                                   mounted

> +probably will succeed but expect the unexpected later on. So don't do it.
> +
>  It is quite a common practice to copy overlay layers to a different
>  directory tree on the same or different underlying filesystem, and even
>  to a different machine.  With the "inodes index" feature, trying to mount
> diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
> index 5d1d40d745c5..e0a090eca65e 100644
> --- a/fs/overlayfs/Kconfig
> +++ b/fs/overlayfs/Kconfig
> @@ -64,6 +64,7 @@ config OVERLAY_FS_NFS_EXPORT
>  	bool "Overlayfs: turn on NFS export feature by default"
>  	depends on OVERLAY_FS
>  	depends on OVERLAY_FS_INDEX
> +	depends on !OVERLAY_FS_METACOPY
>  	help
>  	  If this config option is enabled then overlay filesystems will use
>  	  the inodes index dir to decode overlay NFS file handles by default.

	  Is that ... dir  ^^^ == directory ?  Please spell it out.

> @@ -124,3 +125,21 @@ config OVERLAY_FS_COPY_UP_SHARED
>  	  To get a maximally backward compatible kernel, disable this option.
>  
>  	  If unsure, say N.
> +
> +config OVERLAY_FS_METACOPY
> +	bool "Overlayfs: turn on metadata only copy up feature by default"
> +	depends on OVERLAY_FS
> +	select OVERLAY_FS_REDIRECT_DIR
> +	help
> +	  If this config option is enabled then overlay filesystems will
> +	  copy up only metadata where appropriate and data copy up will
> +	  happen when a file is opended for WRITE operation. It is still

	                        opened

> +	  possible to turn off this feature globally with the "metacopy=off"
> +	  module option or on a filesystem instance basis with the
> +	  "metacopy=off" mount option.
> +
> +	  Note, that this feature is not backward compatible.  That is,
> +	  mounting an overlay which has metacopy only inodes on a kernel
> +	  that doesn't support this feature will have unexpected results.
> +
> +	  If unsure, say N.


-- 
~Randy

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 03/28] ovl: Provide a mount option metacopy=on/off for metadata copyup
  2018-05-29 20:44   ` Randy Dunlap
@ 2018-05-30  8:27     ` Miklos Szeredi
  0 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-30  8:27 UTC (permalink / raw)
  To: Randy Dunlap; +Cc: Miklos Szeredi, overlayfs, linux-fsdevel, linux-kernel

On Tue, May 29, 2018 at 10:44 PM, Randy Dunlap <rdunlap@infradead.org> wrote:
> On 05/29/2018 07:45 AM, Miklos Szeredi wrote:

>> +       If this config option is enabled then overlay filesystems will
>> +       copy up only metadata where appropriate and data copy up will
>> +       happen when a file is opended for WRITE operation. It is still
>
>                                 opened

Thanks for the feedback.  Fixed.

Miklos

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 15/28] ovl: Open file with data except for the case of fsync
  2018-05-29 14:45 ` [PATCH 15/28] ovl: Open file with data except for the case of fsync Miklos Szeredi
@ 2018-05-30 14:30   ` Vivek Goyal
  2018-05-30 15:12     ` Miklos Szeredi
  0 siblings, 1 reply; 36+ messages in thread
From: Vivek Goyal @ 2018-05-30 14:30 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs, linux-fsdevel, linux-kernel

On Tue, May 29, 2018 at 04:45:59PM +0200, Miklos Szeredi wrote:
> From: Vivek Goyal <vgoyal@redhat.com>
> 
> ovl_open() should open file which contains data and not open metacopy
> inode.  With the introduction of metacopy inodes, with current
> implementaion we will end up opening metacopy inode as well.
> 
> But there can be certain circumstances like ovl_fsync() where we want to
> allow opening a metacopy inode instead.
> 
> Hence, change ovl_open_realfile() and add _ovl_open_real() and add extra
> parameter which specifies whether to allow opening metacopy inode or not.
> If this parameter is false, we look for data inode and open that.
> 
> This should allow covering both the cases.
> 
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> ---
>  fs/overlayfs/file.c | 39 ++++++++++++++++++++++++++++++---------
>  1 file changed, 30 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> index 266692ce9a9a..c7738ef492c8 100644
> --- a/fs/overlayfs/file.c
> +++ b/fs/overlayfs/file.c
> @@ -14,11 +14,20 @@
>  #include <linux/uio.h>
>  #include "overlayfs.h"
>  
> -static struct file *ovl_open_realfile(const struct file *file)
> +static char ovl_whatisit(struct inode *inode, struct inode *realinode)
> +{
> +	if (realinode != ovl_inode_upper(inode))
> +		return 'l';
> +	if (ovl_has_upperdata(inode))
> +		return 'u';
> +	else
> +		return 'm';
> +}
> +
> +static struct file *ovl_open_realfile(const struct file *file,
> +				      struct inode *realinode)
>  {
>  	struct inode *inode = file_inode(file);
> -	struct inode *upperinode = ovl_inode_upper(inode);
> -	struct inode *realinode = upperinode ?: ovl_inode_lower(inode);
>  	struct file *realfile;
>  	const struct cred *old_cred;
>  
> @@ -28,7 +37,7 @@ static struct file *ovl_open_realfile(const struct file *file)
>  	revert_creds(old_cred);
>  
>  	pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n",
> -		 file, file, upperinode ? 'u' : 'l', file->f_flags,
> +		 file, file, ovl_whatisit(inode, realinode), file->f_flags,
>  		 realfile, IS_ERR(realfile) ? 0 : realfile->f_flags);
>  
>  	return realfile;
> @@ -72,17 +81,24 @@ static int ovl_change_flags(struct file *file, unsigned int flags)
>  	return 0;
>  }
>  
> -static int ovl_real_fdget(const struct file *file, struct fd *real)
> +static int ovl_real_fdget_meta(const struct file *file, struct fd *real,
> +			       bool allow_meta)
>  {
>  	struct inode *inode = file_inode(file);
> +	struct inode *realinode;
>  
>  	real->flags = 0;
>  	real->file = file->private_data;
>  
> +	if (allow_meta)
> +		realinode = ovl_inode_real(inode);
> +	else
> +		realinode = ovl_inode_realdata(inode);
> +
>  	/* Has it been copied up since we'd opened it? */
> -	if (unlikely(file_inode(real->file) != ovl_inode_real(inode))) {
> +	if (unlikely(file_inode(real->file) != realinode)) {
>  		real->flags = FDPUT_FPUT;
> -		real->file = ovl_open_realfile(file);
> +		real->file = ovl_open_realfile(file, realinode);
>  
>  		return PTR_ERR_OR_ZERO(real->file);
>  	}
> @@ -94,6 +110,11 @@ static int ovl_real_fdget(const struct file *file, struct fd *real)
>  	return 0;
>  }
>  
> +static int ovl_real_fdget(const struct file *file, struct fd *real)
> +{
> +	return ovl_real_fdget_meta(file, real, false);
> +}
> +
>  static int ovl_open(struct inode *inode, struct file *file)
>  {
>  	struct dentry *dentry = file_dentry(file);
> @@ -107,7 +128,7 @@ static int ovl_open(struct inode *inode, struct file *file)
>  	/* No longer need these flags, so don't pass them on to underlying fs */
>  	file->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);
>  
> -	realfile = ovl_open_realfile(file);
> +	realfile = ovl_open_realfile(file, ovl_inode_real(file_inode(file)));

Hi Miklos,

Hmm...so there have been some changes in this patch. My original intention
was that to always open data inode (lower/upper) in ovl_open(). So if upper
inode is a metacopy only, I will open lower inode instead.

But new logic seems to be to always open real inode (that means upper
metacopy inode as well). And that means that later when read happens
on the file we will end up opening lower file, read data and close
lower file.

I am concerned a bit if there are performance implications to this.
This will be hot path for containers.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 15/28] ovl: Open file with data except for the case of fsync
  2018-05-30 14:30   ` Vivek Goyal
@ 2018-05-30 15:12     ` Miklos Szeredi
  2018-05-30 15:48       ` Vivek Goyal
  0 siblings, 1 reply; 36+ messages in thread
From: Miklos Szeredi @ 2018-05-30 15:12 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Miklos Szeredi, overlayfs, linux-fsdevel, linux-kernel

On Wed, May 30, 2018 at 4:30 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Tue, May 29, 2018 at 04:45:59PM +0200, Miklos Szeredi wrote:
>> From: Vivek Goyal <vgoyal@redhat.com>
>>
>> ovl_open() should open file which contains data and not open metacopy
>> inode.  With the introduction of metacopy inodes, with current
>> implementaion we will end up opening metacopy inode as well.
>>
>> But there can be certain circumstances like ovl_fsync() where we want to
>> allow opening a metacopy inode instead.
>>
>> Hence, change ovl_open_realfile() and add _ovl_open_real() and add extra
>> parameter which specifies whether to allow opening metacopy inode or not.
>> If this parameter is false, we look for data inode and open that.
>>
>> This should allow covering both the cases.
>>
>> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
>> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
>> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
>> ---
>>  fs/overlayfs/file.c | 39 ++++++++++++++++++++++++++++++---------
>>  1 file changed, 30 insertions(+), 9 deletions(-)
>>
>> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
>> index 266692ce9a9a..c7738ef492c8 100644
>> --- a/fs/overlayfs/file.c
>> +++ b/fs/overlayfs/file.c
>> @@ -14,11 +14,20 @@
>>  #include <linux/uio.h>
>>  #include "overlayfs.h"
>>
>> -static struct file *ovl_open_realfile(const struct file *file)
>> +static char ovl_whatisit(struct inode *inode, struct inode *realinode)
>> +{
>> +     if (realinode != ovl_inode_upper(inode))
>> +             return 'l';
>> +     if (ovl_has_upperdata(inode))
>> +             return 'u';
>> +     else
>> +             return 'm';
>> +}
>> +
>> +static struct file *ovl_open_realfile(const struct file *file,
>> +                                   struct inode *realinode)
>>  {
>>       struct inode *inode = file_inode(file);
>> -     struct inode *upperinode = ovl_inode_upper(inode);
>> -     struct inode *realinode = upperinode ?: ovl_inode_lower(inode);
>>       struct file *realfile;
>>       const struct cred *old_cred;
>>
>> @@ -28,7 +37,7 @@ static struct file *ovl_open_realfile(const struct file *file)
>>       revert_creds(old_cred);
>>
>>       pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n",
>> -              file, file, upperinode ? 'u' : 'l', file->f_flags,
>> +              file, file, ovl_whatisit(inode, realinode), file->f_flags,
>>                realfile, IS_ERR(realfile) ? 0 : realfile->f_flags);
>>
>>       return realfile;
>> @@ -72,17 +81,24 @@ static int ovl_change_flags(struct file *file, unsigned int flags)
>>       return 0;
>>  }
>>
>> -static int ovl_real_fdget(const struct file *file, struct fd *real)
>> +static int ovl_real_fdget_meta(const struct file *file, struct fd *real,
>> +                            bool allow_meta)
>>  {
>>       struct inode *inode = file_inode(file);
>> +     struct inode *realinode;
>>
>>       real->flags = 0;
>>       real->file = file->private_data;
>>
>> +     if (allow_meta)
>> +             realinode = ovl_inode_real(inode);
>> +     else
>> +             realinode = ovl_inode_realdata(inode);
>> +
>>       /* Has it been copied up since we'd opened it? */
>> -     if (unlikely(file_inode(real->file) != ovl_inode_real(inode))) {
>> +     if (unlikely(file_inode(real->file) != realinode)) {
>>               real->flags = FDPUT_FPUT;
>> -             real->file = ovl_open_realfile(file);
>> +             real->file = ovl_open_realfile(file, realinode);
>>
>>               return PTR_ERR_OR_ZERO(real->file);
>>       }
>> @@ -94,6 +110,11 @@ static int ovl_real_fdget(const struct file *file, struct fd *real)
>>       return 0;
>>  }
>>
>> +static int ovl_real_fdget(const struct file *file, struct fd *real)
>> +{
>> +     return ovl_real_fdget_meta(file, real, false);
>> +}
>> +
>>  static int ovl_open(struct inode *inode, struct file *file)
>>  {
>>       struct dentry *dentry = file_dentry(file);
>> @@ -107,7 +128,7 @@ static int ovl_open(struct inode *inode, struct file *file)
>>       /* No longer need these flags, so don't pass them on to underlying fs */
>>       file->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);
>>
>> -     realfile = ovl_open_realfile(file);
>> +     realfile = ovl_open_realfile(file, ovl_inode_real(file_inode(file)));

That was meant to be

+     realfile = ovl_open_realfile(file, ovl_inode_realdata(file_inode(file)));

Is that correct?

> Hmm...so there have been some changes in this patch. My original intention
> was that to always open data inode (lower/upper) in ovl_open(). So if upper
> inode is a metacopy only, I will open lower inode instead.
>
> But new logic seems to be to always open real inode (that means upper
> metacopy inode as well). And that means that later when read happens
> on the file we will end up opening lower file, read data and close
> lower file.
>
> I am concerned a bit if there are performance implications to this.
> This will be hot path for containers.

Right.   Unfortunately not detected with automatic testing...

Thanks for spotting!

Miklos

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 15/28] ovl: Open file with data except for the case of fsync
  2018-05-30 15:12     ` Miklos Szeredi
@ 2018-05-30 15:48       ` Vivek Goyal
  0 siblings, 0 replies; 36+ messages in thread
From: Vivek Goyal @ 2018-05-30 15:48 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Miklos Szeredi, overlayfs, linux-fsdevel, linux-kernel

On Wed, May 30, 2018 at 05:12:16PM +0200, Miklos Szeredi wrote:
> >> From: Vivek Goyal <vgoyal@redhat.com>
> >>
> >> ovl_open() should open file which contains data and not open metacopy
> >> inode.  With the introduction of metacopy inodes, with current
> >> implementaion we will end up opening metacopy inode as well.
> >>
> >> But there can be certain circumstances like ovl_fsync() where we want to
> >> allow opening a metacopy inode instead.
> >>
> >> Hence, change ovl_open_realfile() and add _ovl_open_real() and add extra
> >> parameter which specifies whether to allow opening metacopy inode or not.
> >> If this parameter is false, we look for data inode and open that.
> >>
> >> This should allow covering both the cases.
> >>
> >> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> >> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> >> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> >> ---
> >>  fs/overlayfs/file.c | 39 ++++++++++++++++++++++++++++++---------
> >>  1 file changed, 30 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> >> index 266692ce9a9a..c7738ef492c8 100644
> >> --- a/fs/overlayfs/file.c
> >> +++ b/fs/overlayfs/file.c
> >> @@ -14,11 +14,20 @@
> >>  #include <linux/uio.h>
> >>  #include "overlayfs.h"
> >>
> >> -static struct file *ovl_open_realfile(const struct file *file)
> >> +static char ovl_whatisit(struct inode *inode, struct inode *realinode)
> >> +{
> >> +     if (realinode != ovl_inode_upper(inode))
> >> +             return 'l';
> >> +     if (ovl_has_upperdata(inode))
> >> +             return 'u';
> >> +     else
> >> +             return 'm';
> >> +}
> >> +
> >> +static struct file *ovl_open_realfile(const struct file *file,
> >> +                                   struct inode *realinode)
> >>  {
> >>       struct inode *inode = file_inode(file);
> >> -     struct inode *upperinode = ovl_inode_upper(inode);
> >> -     struct inode *realinode = upperinode ?: ovl_inode_lower(inode);
> >>       struct file *realfile;
> >>       const struct cred *old_cred;
> >>
> >> @@ -28,7 +37,7 @@ static struct file *ovl_open_realfile(const struct file *file)
> >>       revert_creds(old_cred);
> >>
> >>       pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n",
> >> -              file, file, upperinode ? 'u' : 'l', file->f_flags,
> >> +              file, file, ovl_whatisit(inode, realinode), file->f_flags,
> >>                realfile, IS_ERR(realfile) ? 0 : realfile->f_flags);
> >>
> >>       return realfile;
> >> @@ -72,17 +81,24 @@ static int ovl_change_flags(struct file *file, unsigned int flags)
> >>       return 0;
> >>  }
> >>
> >> -static int ovl_real_fdget(const struct file *file, struct fd *real)
> >> +static int ovl_real_fdget_meta(const struct file *file, struct fd *real,
> >> +                            bool allow_meta)
> >>  {
> >>       struct inode *inode = file_inode(file);
> >> +     struct inode *realinode;
> >>
> >>       real->flags = 0;
> >>       real->file = file->private_data;
> >>
> >> +     if (allow_meta)
> >> +             realinode = ovl_inode_real(inode);
> >> +     else
> >> +             realinode = ovl_inode_realdata(inode);
> >> +
> >>       /* Has it been copied up since we'd opened it? */
> >> -     if (unlikely(file_inode(real->file) != ovl_inode_real(inode))) {
> >> +     if (unlikely(file_inode(real->file) != realinode)) {
> >>               real->flags = FDPUT_FPUT;
> >> -             real->file = ovl_open_realfile(file);
> >> +             real->file = ovl_open_realfile(file, realinode);
> >>
> >>               return PTR_ERR_OR_ZERO(real->file);
> >>       }
> >> @@ -94,6 +110,11 @@ static int ovl_real_fdget(const struct file *file, struct fd *real)
> >>       return 0;
> >>  }
> >>
> >> +static int ovl_real_fdget(const struct file *file, struct fd *real)
> >> +{
> >> +     return ovl_real_fdget_meta(file, real, false);
> >> +}
> >> +
> >>  static int ovl_open(struct inode *inode, struct file *file)
> >>  {
> >>       struct dentry *dentry = file_dentry(file);
> >> @@ -107,7 +128,7 @@ static int ovl_open(struct inode *inode, struct file *file)
> >>       /* No longer need these flags, so don't pass them on to underlying fs */
> >>       file->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);
> >>
> >> -     realfile = ovl_open_realfile(file);
> >> +     realfile = ovl_open_realfile(file, ovl_inode_real(file_inode(file)));
> 
> That was meant to be
> 
> +     realfile = ovl_open_realfile(file, ovl_inode_realdata(file_inode(file)));
> 
> Is that correct?

Yes, that's correct.

While we are at it, can we also update patch changelog. I see that
_ovl_open_real() is not there anymore.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 16/28] ovl: Do not expose metacopy only dentry from d_real()
  2018-05-29 14:46 ` [PATCH 16/28] ovl: Do not expose metacopy only dentry from d_real() Miklos Szeredi
@ 2018-05-30 21:05   ` Vivek Goyal
  2018-05-31  4:30     ` Amir Goldstein
  0 siblings, 1 reply; 36+ messages in thread
From: Vivek Goyal @ 2018-05-30 21:05 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs, linux-fsdevel, linux-kernel

On Tue, May 29, 2018 at 04:46:00PM +0200, Miklos Szeredi wrote:
> From: Vivek Goyal <vgoyal@redhat.com>
> 
> Metacopy dentry/inode is internal to overlay and is never exposed outside
> of it.  Exception is metacopy upper file used for fsync().  Modify d_real()
> to look for dentries/inode which have data, but also allow matching upper
> inode without data for the fsync case.
> 

Hi Miklos,

I am not able to see how in fsync() path d_real() gets called. If we
decide to do fsync() on upper metacopy, then opening upper will not
go through d_real(). And we never issue fsync() on lower.

So I am scratching my head while trying to understand the d_real() and
fsync() connection.

Thanks
Vivek

> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> ---
>  fs/overlayfs/super.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index e0326a686f45..6687d547ec6b 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -101,10 +101,13 @@ static struct dentry *ovl_d_real(struct dentry *dentry,
>  	}
>  
>  	real = ovl_dentry_upper(dentry);
> -	if (real && (!inode || inode == d_inode(real)))
> +	if (real && (inode == d_inode(real)))
>  		return real;
>  
> -	real = ovl_dentry_lower(dentry);
> +	if (real && !inode && ovl_has_upperdata(d_inode(dentry)))
> +		return real;
> +
> +	real = ovl_dentry_lowerdata(dentry);
>  	if (!real)
>  		goto bug;
>  
> -- 
> 2.14.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 16/28] ovl: Do not expose metacopy only dentry from d_real()
  2018-05-30 21:05   ` Vivek Goyal
@ 2018-05-31  4:30     ` Amir Goldstein
  0 siblings, 0 replies; 36+ messages in thread
From: Amir Goldstein @ 2018-05-31  4:30 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Miklos Szeredi, overlayfs, linux-fsdevel, linux-kernel

On Thu, May 31, 2018 at 12:05 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Tue, May 29, 2018 at 04:46:00PM +0200, Miklos Szeredi wrote:
>> From: Vivek Goyal <vgoyal@redhat.com>
>>
>> Metacopy dentry/inode is internal to overlay and is never exposed outside
>> of it.  Exception is metacopy upper file used for fsync().  Modify d_real()
>> to look for dentries/inode which have data, but also allow matching upper
>> inode without data for the fsync case.
>>
>
> Hi Miklos,
>
> I am not able to see how in fsync() path d_real() gets called. If we
> decide to do fsync() on upper metacopy, then opening upper will not
> go through d_real(). And we never issue fsync() on lower.
>
> So I am scratching my head while trying to understand the d_real() and
> fsync() connection.
>

fsync() exposes real meta inode to vfs, so file_dentry() needs to
be able to find the real meta upper dentry.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2018-05-31  4:30 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-29 14:45 [PATCH 00/28] overlayfs: Delayed copy up of data Miklos Szeredi
2018-05-29 14:45 ` [PATCH 01/28] ovl: Initialize ovl_inode->redirect in ovl_get_inode() Miklos Szeredi
2018-05-29 14:45 ` [PATCH 02/28] ovl: Move the copy up helpers to copy_up.c Miklos Szeredi
2018-05-29 14:45 ` [PATCH 03/28] ovl: Provide a mount option metacopy=on/off for metadata copyup Miklos Szeredi
2018-05-29 20:44   ` Randy Dunlap
2018-05-30  8:27     ` Miklos Szeredi
2018-05-29 14:45 ` [PATCH 04/28] ovl: During copy up, first copy up metadata and then data Miklos Szeredi
2018-05-29 14:45 ` [PATCH 05/28] ovl: Copy up only metadata during copy up where it makes sense Miklos Szeredi
2018-05-29 14:45 ` [PATCH 06/28] ovl: Add helper ovl_already_copied_up() Miklos Szeredi
2018-05-29 14:45 ` [PATCH 07/28] ovl: A new xattr OVL_XATTR_METACOPY for file on upper Miklos Szeredi
2018-05-29 14:45 ` [PATCH 08/28] ovl: Use out_err instead of out_nomem Miklos Szeredi
2018-05-29 14:45 ` [PATCH 09/28] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry Miklos Szeredi
2018-05-29 14:45 ` [PATCH 10/28] ovl: Copy up meta inode data from lowest data inode Miklos Szeredi
2018-05-29 14:45 ` [PATCH 11/28] ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry Miklos Szeredi
2018-05-29 14:45 ` [PATCH 12/28] ovl: Fix ovl_getattr() to get number of blocks from lower Miklos Szeredi
2018-05-29 14:45 ` [PATCH 13/28] ovl: Store lower data inode in ovl_inode Miklos Szeredi
2018-05-29 14:45 ` [PATCH 14/28] ovl: Add helper ovl_inode_realdata() Miklos Szeredi
2018-05-29 14:45 ` [PATCH 15/28] ovl: Open file with data except for the case of fsync Miklos Szeredi
2018-05-30 14:30   ` Vivek Goyal
2018-05-30 15:12     ` Miklos Szeredi
2018-05-30 15:48       ` Vivek Goyal
2018-05-29 14:46 ` [PATCH 16/28] ovl: Do not expose metacopy only dentry from d_real() Miklos Szeredi
2018-05-30 21:05   ` Vivek Goyal
2018-05-31  4:30     ` Amir Goldstein
2018-05-29 14:46 ` [PATCH 17/28] ovl: Move some dir related ovl_lookup_single() code in else block Miklos Szeredi
2018-05-29 14:46 ` [PATCH 18/28] ovl: Check redirects for metacopy files Miklos Szeredi
2018-05-29 14:46 ` [PATCH 19/28] ovl: Treat metacopy dentries as type OVL_PATH_MERGE Miklos Szeredi
2018-05-29 14:46 ` [PATCH 20/28] ovl: Add an inode flag OVL_CONST_INO Miklos Szeredi
2018-05-29 14:46 ` [PATCH 21/28] ovl: Do not set dentry type ORIGIN for broken hardlinks Miklos Szeredi
2018-05-29 14:46 ` [PATCH 22/28] ovl: Set redirect on metacopy files upon rename Miklos Szeredi
2018-05-29 14:46 ` [PATCH 23/28] ovl: Set redirect on upper inode when it is linked Miklos Szeredi
2018-05-29 14:46 ` [PATCH 24/28] ovl: Check redirect on index as well Miklos Szeredi
2018-05-29 14:46 ` [PATCH 25/28] ovl: Disbale metacopy for MAP_SHARED mmap() Miklos Szeredi
2018-05-29 14:46 ` [PATCH 26/28] ovl: Do not do metadata only copy-up for truncate operation Miklos Szeredi
2018-05-29 14:46 ` [PATCH 27/28] ovl: Do not do metacopy only for ioctl modifying file attr Miklos Szeredi
2018-05-29 14:46 ` [PATCH 28/28] ovl: Enable metadata only feature Miklos Szeredi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).