All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v14 00/31] overlayfs: Delayed copy up of data
@ 2018-04-26 19:09 Vivek Goyal
  2018-04-26 19:09 ` [PATCH v14 01/31] ovl: Initialize ovl_inode->redirect in ovl_get_inode() Vivek Goyal
                   ` (30 more replies)
  0 siblings, 31 replies; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:09 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Hi,

This is V14 of overlayfs metadata only copy-up feature. These patches I
have rebased on top of Miklos overlayfs-next tree's branch overlayfs-rorw.

git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git overlayfs-rorw

Patches are also available here.

https://github.com/rhvgoyal/linux/commits/metacopy-v14

I think it made sense to rebase these patches on top of stacking file
operations. i know Miklos is making some changes to the patches. I will
rebase again once Miklos publishes his changes.

I have run unionmount-testsuite and "./check -overlay -g quick" and that
works. Only 4 overlay tests fail, which fail on vanilla kernel too.

Changes from V13.
- Rebased on top of overlayfs-rorw branch and made changes to deal with
  stacking file operations.

- Primarily took care of Amir's comments from V13.

- Dropped support for removing REDIRCT when metacopy file is data copied
  up.

- Dropped all the extra inode locking around redirects. I don't think
  any extra locking is required as of now. Put a rationale for my
  thinking in commit message of patch with following subject.

  "ovl: Set redirect on upper inode when it is linked" 

- Finally did not move setting redirect before lock_rename(), as there
  did not seem to be a need for extra locking so no issues with lock
  ordering, hence no need to move setting redirect before lock_rename.

Any feedback is welcome.

Overview
--------
Metadata only copy up allows for copying up of only metadata and data 
is copied up later when file is opened for WRITE operation. This seems
to have couple of use cases.

- Provide reflink like functionality for overlay filesystem.
- Make full container image chown operations faster.

I am especially interested in second use case at this point of time
where I want to chown container images fast and shift them for the user
namespace container is going to run in.

Thanks
Vivek

Vivek Goyal (31):
  ovl: Initialize ovl_inode->redirect in ovl_get_inode()
  ovl: Move the copy up helpers to copy_up.c
  ovl: Provide a mount option metacopy=on/off for metadata copyup
  ovl: During copy up, first copy up metadata and then data
  ovl: Copy up only metadata during copy up where it makes sense
  ovl: Add helper ovl_already_copied_up()
  ovl: A new xattr OVL_XATTR_METACOPY for file on upper
  ovl: Use out_err instead of out_nomem
  ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  ovl: Copy up meta inode data from lowest data inode
  ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry
  ovl: Add an helper to get real data dentry
  ovl: Fix ovl_getattr() to get number of blocks from lower
  ovl: Store lower data inode in ovl_inode
  ovl: Add helper ovl_inode_real_data()
  ovl: Always open file/inode which contains data and not metacopy
  ovl: Do not expose metacopy only dentry from d_real()
  ovl: Move some dir related ovl_lookup_single() code in else block
  ovl: Check redirects for metacopy files
  ovl: Treat metacopy dentries as type OVL_PATH_MERGE
  ovl: Add an inode flag OVL_CONST_INO
  ovl: Do not set dentry type ORIGIN for broken hardlinks
  ovl: Set redirect on metacopy files upon rename
  ovl: Set redirect on upper inode when it is linked
  ovl: Check redirect on index as well
  ovl: Allow ovl_open_realfile() to open metacopy inode
  ovl: Issue fsync on upper metacopy inode as well
  ovl: Disbale metacopy for MAP_SHARED mmap()
  ovl: Do not do metadata only copy-up for truncate operation
  ovl: Do not do metacopy only for ioctl modifying file attr
  ovl: Enable metadata only feature

 Documentation/filesystems/overlayfs.txt |  30 +++-
 fs/overlayfs/Kconfig                    |  19 +++
 fs/overlayfs/copy_up.c                  | 156 +++++++++++++++-----
 fs/overlayfs/dir.c                      |  74 +++++++---
 fs/overlayfs/export.c                   |   6 +-
 fs/overlayfs/file.c                     |  65 +++++++--
 fs/overlayfs/inode.c                    | 113 +++++++++------
 fs/overlayfs/namei.c                    | 144 ++++++++++++++-----
 fs/overlayfs/overlayfs.h                |  34 ++++-
 fs/overlayfs/ovl_entry.h                |   2 +
 fs/overlayfs/super.c                    |  53 ++++++-
 fs/overlayfs/util.c                     | 246 +++++++++++++++++++++++++++++++-
 12 files changed, 782 insertions(+), 160 deletions(-)

-- 
2.13.6


^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH v14 01/31] ovl: Initialize ovl_inode->redirect in ovl_get_inode()
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
@ 2018-04-26 19:09 ` Vivek Goyal
  2018-04-26 19:09 ` [PATCH v14 02/31] ovl: Move the copy up helpers to copy_up.c Vivek Goyal
                   ` (29 subsequent siblings)
  30 siblings, 0 replies; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:09 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

ovl_inode->redirect is an inode property and should be initialized
in ovl_get_inode() only when we are adding a new inode to cache. If
inode is already in cache, it is already initialized and we should
not be touching ovl_inode->redirect field.

As of now this is not a problem as redirects are used only for directories
which don't share inode. But soon I want to use redirects for regular files
also and there it can become an issue.

Hence, move ->redirect initialization in ovl_get_inode().

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/export.c    | 2 +-
 fs/overlayfs/inode.c     | 5 ++++-
 fs/overlayfs/namei.c     | 9 +--------
 fs/overlayfs/overlayfs.h | 2 +-
 4 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
index 425a94672300..5b7fee1a81ec 100644
--- a/fs/overlayfs/export.c
+++ b/fs/overlayfs/export.c
@@ -305,7 +305,7 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
 	if (d_is_dir(upper ?: lower))
 		return ERR_PTR(-EIO);
 
-	inode = ovl_get_inode(sb, dget(upper), lowerpath, index, !!lower);
+	inode = ovl_get_inode(sb, dget(upper), lowerpath, index, !!lower, NULL);
 	if (IS_ERR(inode)) {
 		dput(upper);
 		return ERR_CAST(inode);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index b723327501f7..a3b9124f95d6 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -801,7 +801,7 @@ static bool ovl_hash_bylower(struct super_block *sb, struct dentry *upper,
 
 struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
 			    struct ovl_path *lowerpath, struct dentry *index,
-			    unsigned int numlower)
+			    unsigned int numlower, char *redirect)
 {
 	struct inode *realinode = upperdentry ? d_inode(upperdentry) : NULL;
 	struct inode *inode;
@@ -841,6 +841,7 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
 			}
 
 			dput(upperdentry);
+			kfree(redirect);
 			goto out;
 		}
 
@@ -864,6 +865,8 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
 	if (index)
 		ovl_set_flag(OVL_INDEX, inode);
 
+	OVL_I(inode)->redirect = redirect;
+
 	/* Check for non-merge dir that may have whiteouts */
 	if (is_dir) {
 		if (((upperdentry && lowerdentry) || numlower > 1) ||
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 2dba29eadde6..ac81e2c2c976 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -1005,17 +1005,10 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 
 	if (upperdentry || ctr) {
 		inode = ovl_get_inode(dentry->d_sb, upperdentry, stack, index,
-				      ctr);
+				      ctr, upperredirect);
 		err = PTR_ERR(inode);
 		if (IS_ERR(inode))
 			goto out_free_oe;
-
-		/*
-		 * NB: handle redirected hard links when non-dir redirects
-		 * become possible
-		 */
-		WARN_ON(OVL_I(inode)->redirect);
-		OVL_I(inode)->redirect = upperredirect;
 	}
 
 	revert_creds(old_cred);
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index dad54bc8de7d..2b68175f7400 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -354,7 +354,7 @@ struct inode *ovl_lookup_inode(struct super_block *sb, struct dentry *real,
 			       bool is_upper);
 struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
 			    struct ovl_path *lowerpath, struct dentry *index,
-			    unsigned int numlower);
+			    unsigned int numlower, char *redirect);
 static inline void ovl_copyattr(struct inode *from, struct inode *to)
 {
 	to->i_uid = from->i_uid;
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 02/31] ovl: Move the copy up helpers to copy_up.c
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
  2018-04-26 19:09 ` [PATCH v14 01/31] ovl: Initialize ovl_inode->redirect in ovl_get_inode() Vivek Goyal
@ 2018-04-26 19:09 ` Vivek Goyal
  2018-04-26 19:09 ` [PATCH v14 03/31] ovl: Provide a mount option metacopy=on/off for metadata copyup Vivek Goyal
                   ` (28 subsequent siblings)
  30 siblings, 0 replies; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:09 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Right now two copy up helpers are in inode.c. Amir suggested it might
be better to move these to copy_up.c.

There will one more related function which will come in later patch.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/copy_up.c   | 32 ++++++++++++++++++++++++++++++++
 fs/overlayfs/inode.c     | 32 --------------------------------
 fs/overlayfs/overlayfs.h |  2 +-
 3 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 8bede0742619..4afc97872c2d 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -828,6 +828,38 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
 	return err;
 }
 
+static bool ovl_open_need_copy_up(struct dentry *dentry, int flags)
+{
+	/* Copy up of disconnected dentry does not set upper alias */
+	if (ovl_dentry_upper(dentry) &&
+	    (ovl_dentry_has_upper_alias(dentry) ||
+	     (dentry->d_flags & DCACHE_DISCONNECTED)))
+		return false;
+
+	if (special_file(d_inode(dentry)->i_mode))
+		return false;
+
+	if (!(OPEN_FMODE(flags) & FMODE_WRITE) && !(flags & O_TRUNC))
+		return false;
+
+	return true;
+}
+
+int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags)
+{
+	int err = 0;
+
+	if (ovl_open_need_copy_up(dentry, file_flags)) {
+		err = ovl_want_write(dentry);
+		if (!err) {
+			err = ovl_copy_up_flags(dentry, file_flags);
+			ovl_drop_write(dentry);
+		}
+	}
+
+	return err;
+}
+
 int ovl_copy_up(struct dentry *dentry)
 {
 	return ovl_copy_up_flags(dentry, 0);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index a3b9124f95d6..8a25bb569bdd 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -406,38 +406,6 @@ struct posix_acl *ovl_get_acl(struct inode *inode, int type)
 	return acl;
 }
 
-static bool ovl_open_need_copy_up(struct dentry *dentry, int flags)
-{
-	/* Copy up of disconnected dentry does not set upper alias */
-	if (ovl_dentry_upper(dentry) &&
-	    (ovl_dentry_has_upper_alias(dentry) ||
-	     (dentry->d_flags & DCACHE_DISCONNECTED)))
-		return false;
-
-	if (special_file(d_inode(dentry)->i_mode))
-		return false;
-
-	if (!(OPEN_FMODE(flags) & FMODE_WRITE) && !(flags & O_TRUNC))
-		return false;
-
-	return true;
-}
-
-int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags)
-{
-	int err = 0;
-
-	if (ovl_open_need_copy_up(dentry, file_flags)) {
-		err = ovl_want_write(dentry);
-		if (!err) {
-			err = ovl_copy_up_flags(dentry, file_flags);
-			ovl_drop_write(dentry);
-		}
-	}
-
-	return err;
-}
-
 int ovl_update_time(struct inode *inode, struct timespec *ts, int flags)
 {
 	if (flags & S_ATIME) {
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 2b68175f7400..199a39762d1d 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -345,7 +345,6 @@ int ovl_xattr_get(struct dentry *dentry, struct inode *inode, const char *name,
 		  void *value, size_t size);
 ssize_t ovl_listxattr(struct dentry *dentry, char *list, size_t size);
 struct posix_acl *ovl_get_acl(struct inode *inode, int type);
-int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags);
 int ovl_update_time(struct inode *inode, struct timespec *ts, int flags);
 bool ovl_is_private_xattr(const char *name);
 
@@ -394,6 +393,7 @@ extern const struct file_operations ovl_file_operations;
 /* copy_up.c */
 int ovl_copy_up(struct dentry *dentry);
 int ovl_copy_up_flags(struct dentry *dentry, int flags);
+int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags);
 int ovl_copy_xattr(struct dentry *old, struct dentry *new);
 int ovl_set_attr(struct dentry *upper, struct kstat *stat);
 struct ovl_fh *ovl_encode_real_fh(struct dentry *real, bool is_upper);
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 03/31] ovl: Provide a mount option metacopy=on/off for metadata copyup
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
  2018-04-26 19:09 ` [PATCH v14 01/31] ovl: Initialize ovl_inode->redirect in ovl_get_inode() Vivek Goyal
  2018-04-26 19:09 ` [PATCH v14 02/31] ovl: Move the copy up helpers to copy_up.c Vivek Goyal
@ 2018-04-26 19:09 ` Vivek Goyal
  2018-04-26 19:09 ` [PATCH v14 04/31] ovl: During copy up, first copy up metadata and then data Vivek Goyal
                   ` (27 subsequent siblings)
  30 siblings, 0 replies; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:09 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

By default metadata only copy up is disabled. Provide a mount option so
that users can choose one way or other.

Also provide a kernel config and module option to enable/disable
metacopy feature.

metacopy feature requires redirect_dir=on when upper is present. Otherwise,
it requires redirect_dir=follow atleast.

As of now, metacopy does not work with nfs_export=on. So if both metacopy=on
and nfs_export=on then nfs_export is disabled.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 Documentation/filesystems/overlayfs.txt | 30 ++++++++++++++++++++-
 fs/overlayfs/Kconfig                    | 19 ++++++++++++++
 fs/overlayfs/ovl_entry.h                |  1 +
 fs/overlayfs/super.c                    | 46 ++++++++++++++++++++++++++++++---
 4 files changed, 92 insertions(+), 4 deletions(-)

diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
index 095186080d23..d6f393b0c76c 100644
--- a/Documentation/filesystems/overlayfs.txt
+++ b/Documentation/filesystems/overlayfs.txt
@@ -266,6 +266,30 @@ rightmost one and going left.  In the above example lower1 will be the
 top, lower2 the middle and lower3 the bottom layer.
 
 
+Metadata only copy up
+--------------------
+
+When metadata only copy up feature is enabled, overlayfs will only copy
+up metadata (as opposed to whole file), when a metadata specific operation
+like chown/chmod is performed. Full file will be copied up later when
+file is opened for WRITE operation.
+
+In other words, this is delayed data copy up operation and data is copied
+up when there is a need to actually modify data.
+
+There are multiple ways to enable/disable this feature. A config option
+CONFIG_OVERLAY_FS_METACOPY can be set/unset to enable/disable this feature
+by default. Or one can enable/disable it at module load time with module
+parameter metacopy=on/off. Lastly, there is also a per mount option
+metacopy=on/off to enable/disable this feature per mount.
+
+Do not use metacopy=on with untrusted upper/lower directories. Otherwise
+it is possible that an attacker can create an handcrafted file with
+appropriate REDIRECT and METACOPY xattrs, and gain access to file on lower
+pointed by REDIRECT. This should not be possible on local system as setting
+"trusted." xattrs will require CAP_SYS_ADMIN. But it should be possible
+for untrusted layers like from a pen drive.
+
 Sharing and copying layers
 --------------------------
 
@@ -284,7 +308,7 @@ though it will not result in a crash or deadlock.
 Mounting an overlay using an upper layer path, where the upper layer path
 was previously used by another mounted overlay in combination with a
 different lower layer path, is allowed, unless the "inodes index" feature
-is enabled.
+or "metadata only copy up" feature is enabled.
 
 With the "inodes index" feature, on the first time mount, an NFS file
 handle of the lower layer root directory, along with the UUID of the lower
@@ -297,6 +321,10 @@ lower root origin, mount will fail with ESTALE.  An overlayfs mount with
 does not support NFS export, lower filesystem does not have a valid UUID or
 if the upper filesystem does not support extended attributes.
 
+For "metadata only copy up" feature there is no verification mechanism at
+mount time. So if same upper is mouted with different set of lower, mount
+probably will succeed but expect the unexpected later on. So don't do it.
+
 It is quite a common practice to copy overlay layers to a different
 directory tree on the same or different underlying filesystem, and even
 to a different machine.  With the "inodes index" feature, trying to mount
diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
index 991c0a5a0e00..2e75a35e2726 100644
--- a/fs/overlayfs/Kconfig
+++ b/fs/overlayfs/Kconfig
@@ -64,6 +64,7 @@ config OVERLAY_FS_NFS_EXPORT
 	bool "Overlayfs: turn on NFS export feature by default"
 	depends on OVERLAY_FS
 	depends on OVERLAY_FS_INDEX
+	depends on !OVERLAY_FS_METACOPY
 	help
 	  If this config option is enabled then overlay filesystems will use
 	  the inodes index dir to decode overlay NFS file handles by default.
@@ -124,3 +125,21 @@ config OVERLAY_FS_COPY_UP_SHARED
 	 To get a maximally backward compatible kernel, disable this option.
 
 	 If unsure, say N.
+
+config OVERLAY_FS_METACOPY
+	bool "Overlayfs: turn on metadata only copy up feature by default"
+	depends on OVERLAY_FS
+	select OVERLAY_FS_REDIRECT_DIR
+	help
+	  If this config option is enabled then overlay filesystems will
+	  copy up only metadata where appropriate and data copy up will
+	  happen when a file is opended for WRITE operation. It is still
+	  possible to turn off this feature globally with the "metacopy=off"
+	  module option or on a filesystem instance basis with the
+	  "metacopy=off" mount option.
+
+	  Note, that this feature is not backward compatible.  That is,
+	  mounting an overlay which has metacopy only inodes on a kernel
+	  that doesn't support this feature will have unexpected results.
+
+	  If unsure, say N.
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 3bea47c63fd9..da65118a0567 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -20,6 +20,7 @@ struct ovl_config {
 	bool nfs_export;
 	bool copy_up_shared;
 	int xino;
+	bool metacopy;
 };
 
 struct ovl_sb {
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index ab1642fcece3..f26b79b3fae1 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -70,6 +70,11 @@ static void ovl_entry_stack_free(struct ovl_entry *oe)
 		dput(oe->lowerstack[i].dentry);
 }
 
+static bool ovl_metacopy_def = IS_ENABLED(CONFIG_OVERLAY_FS_METACOPY);
+module_param_named(metacopy, ovl_metacopy_def, bool, 0644);
+MODULE_PARM_DESC(ovl_metacopy_def,
+		 "Default to on or off for the metadata only copy up feature");
+
 static void ovl_dentry_release(struct dentry *dentry)
 {
 	struct ovl_entry *oe = dentry->d_fsdata;
@@ -351,6 +356,9 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
 	if (ofs->config.copy_up_shared != ovl_copy_up_shared_def)
 		seq_printf(m, ",copy_up_shared=%s",
 			   ofs->config.copy_up_shared ? "on" : "off");
+	if (ofs->config.metacopy != ovl_metacopy_def)
+		seq_printf(m, ",metacopy=%s",
+			   ofs->config.metacopy ? "on" : "off");
 	return 0;
 }
 
@@ -390,6 +398,8 @@ enum {
 	OPT_XINO_AUTO,
 	OPT_COPY_UP_SHARED_ON,
 	OPT_COPY_UP_SHARED_OFF,
+	OPT_METACOPY_ON,
+	OPT_METACOPY_OFF,
 	OPT_ERR,
 };
 
@@ -408,6 +418,8 @@ static const match_table_t ovl_tokens = {
 	{OPT_XINO_AUTO,			"xino=auto"},
 	{OPT_COPY_UP_SHARED_ON,		"copy_up_shared=on"},
 	{OPT_COPY_UP_SHARED_OFF,	"copy_up_shared=off"},
+	{OPT_METACOPY_ON,		"metacopy=on"},
+	{OPT_METACOPY_OFF,		"metacopy=off"},
 	{OPT_ERR,			NULL}
 };
 
@@ -460,6 +472,7 @@ static int ovl_parse_redirect_mode(struct ovl_config *config, const char *mode)
 static int ovl_parse_opt(char *opt, struct ovl_config *config)
 {
 	char *p;
+	int err;
 
 	config->redirect_mode = kstrdup(ovl_redirect_mode_def(), GFP_KERNEL);
 	if (!config->redirect_mode)
@@ -542,6 +555,14 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
 			config->copy_up_shared = false;
 			break;
 
+		case OPT_METACOPY_ON:
+			config->metacopy = true;
+			break;
+
+		case OPT_METACOPY_OFF:
+			config->metacopy = false;
+			break;
+
 		default:
 			pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p);
 			return -EINVAL;
@@ -556,7 +577,20 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
 		config->workdir = NULL;
 	}
 
-	return ovl_parse_redirect_mode(config, config->redirect_mode);
+	err = ovl_parse_redirect_mode(config, config->redirect_mode);
+	if (err)
+		return err;
+
+	/* metacopy feature with upper requires redirect_dir=on */
+	if (config->upperdir && config->metacopy && !config->redirect_dir) {
+		pr_warn("overlayfs: metadata only copy up requires \"redirect_dir=on\", falling back to metacopy=off.\n");
+		config->metacopy = false;
+	} else if (config->metacopy && !config->redirect_follow) {
+		pr_warn("overlayfs: metadata only copy up requires \"redirect_dir=follow\" on non-upper mount, falling back to metacopy=off.\n");
+		config->metacopy = false;
+	}
+
+	return 0;
 }
 
 #define OVL_WORKDIR_NAME "work"
@@ -1030,7 +1064,8 @@ static int ovl_make_workdir(struct ovl_fs *ofs, struct path *workpath)
 	if (err) {
 		ofs->noxattr = true;
 		ofs->config.index = false;
-		pr_warn("overlayfs: upper fs does not support xattr, falling back to index=off.\n");
+		ofs->config.metacopy = false;
+		pr_warn("overlayfs: upper fs does not support xattr, falling back to index=off and metacopy=off.\n");
 		err = 0;
 	} else {
 		vfs_removexattr(ofs->workdir, OVL_XATTR_OPAQUE);
@@ -1052,7 +1087,6 @@ static int ovl_make_workdir(struct ovl_fs *ofs, struct path *workpath)
 		pr_warn("overlayfs: NFS export requires \"index=on\", falling back to nfs_export=off.\n");
 		ofs->config.nfs_export = false;
 	}
-
 out:
 	mnt_drop_write(mnt);
 	return err;
@@ -1364,6 +1398,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	ofs->config.nfs_export = ovl_nfs_export_def;
 	ofs->config.xino = ovl_xino_def();
 	ofs->config.copy_up_shared = ovl_copy_up_shared_def;
+	ofs->config.metacopy = ovl_metacopy_def;
 	err = ovl_parse_opt((char *) data, &ofs->config);
 	if (err)
 		goto out_err;
@@ -1434,6 +1469,11 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 		}
 	}
 
+	if (ofs->config.metacopy && ofs->config.nfs_export) {
+		pr_warn("overlayfs: NFS export is not supported with metadata only copy up, falling back to nfs_export=off.\n");
+		ofs->config.nfs_export = false;
+	}
+
 	if (ofs->config.nfs_export)
 		sb->s_export_op = &ovl_export_operations;
 
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 04/31] ovl: During copy up, first copy up metadata and then data
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (2 preceding siblings ...)
  2018-04-26 19:09 ` [PATCH v14 03/31] ovl: Provide a mount option metacopy=on/off for metadata copyup Vivek Goyal
@ 2018-04-26 19:09 ` Vivek Goyal
  2018-04-26 19:09 ` [PATCH v14 05/31] ovl: Copy up only metadata during copy up where it makes sense Vivek Goyal
                   ` (26 subsequent siblings)
  30 siblings, 0 replies; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:09 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Just a little re-ordering of code. This helps with next patch where after
copying up metadata, we skip data copying step, if needed.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/copy_up.c | 36 +++++++++++++++++-------------------
 1 file changed, 17 insertions(+), 19 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 4afc97872c2d..7e2dc41ad693 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -536,28 +536,10 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
 {
 	int err;
 
-	if (S_ISREG(c->stat.mode)) {
-		struct path upperpath;
-
-		ovl_path_upper(c->dentry, &upperpath);
-		BUG_ON(upperpath.dentry != NULL);
-		upperpath.dentry = temp;
-
-		err = ovl_copy_up_data(&c->lowerpath, &upperpath, c->stat.size);
-		if (err)
-			return err;
-	}
-
 	err = ovl_copy_xattr(c->lowerpath.dentry, temp);
 	if (err)
 		return err;
 
-	inode_lock(temp->d_inode);
-	err = ovl_set_attr(temp, &c->stat);
-	inode_unlock(temp->d_inode);
-	if (err)
-		return err;
-
 	/*
 	 * Store identifier of lower inode in upper inode xattr to
 	 * allow lookup of the copy up origin inode.
@@ -571,7 +553,23 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
 			return err;
 	}
 
-	return 0;
+	if (S_ISREG(c->stat.mode)) {
+		struct path upperpath;
+
+		ovl_path_upper(c->dentry, &upperpath);
+		BUG_ON(upperpath.dentry != NULL);
+		upperpath.dentry = temp;
+
+		err = ovl_copy_up_data(&c->lowerpath, &upperpath, c->stat.size);
+		if (err)
+			return err;
+	}
+
+	inode_lock(temp->d_inode);
+	err = ovl_set_attr(temp, &c->stat);
+	inode_unlock(temp->d_inode);
+
+	return err;
 }
 
 static int ovl_copy_up_locked(struct ovl_copy_up_ctx *c)
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 05/31] ovl: Copy up only metadata during copy up where it makes sense
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (3 preceding siblings ...)
  2018-04-26 19:09 ` [PATCH v14 04/31] ovl: During copy up, first copy up metadata and then data Vivek Goyal
@ 2018-04-26 19:09 ` Vivek Goyal
  2018-04-26 19:09 ` [PATCH v14 06/31] ovl: Add helper ovl_already_copied_up() Vivek Goyal
                   ` (25 subsequent siblings)
  30 siblings, 0 replies; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:09 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

If it makes sense to copy up only metadata during copy up, do it. This
is done for regular files which are not opened for WRITE.

Right now ->metacopy is set to 0 always. Last patch in the series will
remove the hard coded statement and enable metacopy feature.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/copy_up.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 7e2dc41ad693..38b332d7c7e4 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -416,6 +416,7 @@ struct ovl_copy_up_ctx {
 	bool tmpfile;
 	bool origin;
 	bool indexed;
+	bool metacopy;
 };
 
 static int ovl_link_up(struct ovl_copy_up_ctx *c)
@@ -553,7 +554,7 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
 			return err;
 	}
 
-	if (S_ISREG(c->stat.mode)) {
+	if (S_ISREG(c->stat.mode) && !c->metacopy) {
 		struct path upperpath;
 
 		ovl_path_upper(c->dentry, &upperpath);
@@ -708,6 +709,26 @@ static int ovl_do_copy_up(struct ovl_copy_up_ctx *c)
 	return err;
 }
 
+static bool ovl_need_meta_copy_up(struct dentry *dentry, umode_t mode,
+				  int flags)
+{
+	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
+
+	/* TODO: Will enable metacopy in last patch of series */
+	return false;
+
+	if (!ofs->config.metacopy)
+		return false;
+
+	if (!S_ISREG(mode))
+		return false;
+
+	if (flags && ((OPEN_FMODE(flags) & FMODE_WRITE) || (flags & O_TRUNC)))
+		return false;
+
+	return true;
+}
+
 static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 			   int flags)
 {
@@ -729,6 +750,8 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 	if (err)
 		return err;
 
+	ctx.metacopy = ovl_need_meta_copy_up(dentry, ctx.stat.mode, flags);
+
 	if (parent) {
 		ovl_path_upper(parent, &parentpath);
 		ctx.destdir = parentpath.dentry;
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 06/31] ovl: Add helper ovl_already_copied_up()
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (4 preceding siblings ...)
  2018-04-26 19:09 ` [PATCH v14 05/31] ovl: Copy up only metadata during copy up where it makes sense Vivek Goyal
@ 2018-04-26 19:09 ` Vivek Goyal
  2018-04-26 19:09 ` [PATCH v14 07/31] ovl: A new xattr OVL_XATTR_METACOPY for file on upper Vivek Goyal
                   ` (24 subsequent siblings)
  30 siblings, 0 replies; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:09 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

There are couple of places where we need to know if file is already copied
up (in lockless manner). Right now its open coded and there are only
two conditions to check. Soon this patch series will introduce another
condition to check and Amir wants to introduce one more. So introduce
a helper instead to check this so that code is easier to read.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c   | 20 ++------------------
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/util.c      | 26 +++++++++++++++++++++++++-
 3 files changed, 28 insertions(+), 19 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 38b332d7c7e4..6ac3220b0834 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -810,21 +810,7 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
 		struct dentry *next;
 		struct dentry *parent = NULL;
 
-		/*
-		 * Check if copy-up has happened as well as for upper alias (in
-		 * case of hard links) is there.
-		 *
-		 * Both checks are lockless:
-		 *  - false negatives: will recheck under oi->lock
-		 *  - false positives:
-		 *    + ovl_dentry_upper() uses memory barriers to ensure the
-		 *      upper dentry is up-to-date
-		 *    + ovl_dentry_has_upper_alias() relies on locking of
-		 *      upper parent i_rwsem to prevent reordering copy-up
-		 *      with rename.
-		 */
-		if (ovl_dentry_upper(dentry) &&
-		    (ovl_dentry_has_upper_alias(dentry) || disconnected))
+		if (ovl_already_copied_up(dentry))
 			break;
 
 		next = dget(dentry);
@@ -852,9 +838,7 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
 static bool ovl_open_need_copy_up(struct dentry *dentry, int flags)
 {
 	/* Copy up of disconnected dentry does not set upper alias */
-	if (ovl_dentry_upper(dentry) &&
-	    (ovl_dentry_has_upper_alias(dentry) ||
-	     (dentry->d_flags & DCACHE_DISCONNECTED)))
+	if (ovl_already_copied_up(dentry))
 		return false;
 
 	if (special_file(d_inode(dentry)->i_mode))
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 199a39762d1d..390881752e70 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -244,6 +244,7 @@ bool ovl_is_whiteout(struct dentry *dentry);
 struct file *ovl_path_open(struct path *path, int flags);
 int ovl_copy_up_start(struct dentry *dentry);
 void ovl_copy_up_end(struct dentry *dentry);
+bool ovl_already_copied_up(struct dentry *dentry);
 bool ovl_check_origin_xattr(struct dentry *dentry);
 bool ovl_check_dir_xattr(struct dentry *dentry, const char *name);
 int ovl_check_setxattr(struct dentry *dentry, struct dentry *upperdentry,
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 586f8d7440bb..ce15e211a33b 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -369,13 +369,37 @@ struct file *ovl_path_open(struct path *path, int flags)
 	return dentry_open(path, flags | O_NOATIME, current_cred());
 }
 
+bool ovl_already_copied_up(struct dentry *dentry)
+{
+	bool disconnected = dentry->d_flags & DCACHE_DISCONNECTED;
+
+	/*
+	 * Check if copy-up has happened as well as for upper alias (in
+	 * case of hard links) is there.
+	 *
+	 * Both checks are lockless:
+	 *  - false negatives: will recheck under oi->lock
+	 *  - false positives:
+	 *    + ovl_dentry_upper() uses memory barriers to ensure the
+	 *      upper dentry is up-to-date
+	 *    + ovl_dentry_has_upper_alias() relies on locking of
+	 *      upper parent i_rwsem to prevent reordering copy-up
+	 *      with rename.
+	 */
+	if (ovl_dentry_upper(dentry) &&
+	    (ovl_dentry_has_upper_alias(dentry) || disconnected))
+		return true;
+
+	return false;
+}
+
 int ovl_copy_up_start(struct dentry *dentry)
 {
 	struct ovl_inode *oi = OVL_I(d_inode(dentry));
 	int err;
 
 	err = mutex_lock_interruptible(&oi->lock);
-	if (!err && ovl_dentry_has_upper_alias(dentry)) {
+	if (!err && ovl_already_copied_up(dentry)) {
 		err = 1; /* Already copied up */
 		mutex_unlock(&oi->lock);
 	}
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 07/31] ovl: A new xattr OVL_XATTR_METACOPY for file on upper
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (5 preceding siblings ...)
  2018-04-26 19:09 ` [PATCH v14 06/31] ovl: Add helper ovl_already_copied_up() Vivek Goyal
@ 2018-04-26 19:09 ` Vivek Goyal
  2018-04-26 19:09 ` [PATCH v14 08/31] ovl: Use out_err instead of out_nomem Vivek Goyal
                   ` (23 subsequent siblings)
  30 siblings, 0 replies; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:09 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Now we will have the capability to have upper inodes which might be only
metadata copy up and data is still on lower inode. So add a new xattr
OVL_XATTR_METACOPY to distinguish between two cases.

Presence of OVL_XATTR_METACOPY reflects that file has been copied up
metadata only and and data will be copied up later from lower origin.
So this xattr is set when a metadata copy takes place and cleared when
data copy takes place.

We also use a bit in ovl_inode->flags to cache OVL_UPPERDATA which reflects
whether ovl inode has data or not (as opposed to metadata only copy up).

If a file is copied up metadata only and later when same file is opened
for WRITE, then data copy up takes place. We copy up data, remove METACOPY
xattr and then set the UPPERDATA flag in ovl_inode->flags. While all
these operations happen with oi->lock held, read side of oi->flags can be
lockless. That is another thread on another cpu can check if UPPERDATA
flag is set or not.

So this gives us an ordering requirement w.r.t UPPERDATA flag. That is, if
another cpu sees UPPERDATA flag set, then it should be guaranteed that
effects of data copy up and remove xattr operations are also visible.

For example.

	CPU1				CPU2
ovl_open()				acquire(oi->lock)
 ovl_open_maybe_copy_up()                ovl_copy_up_data()
  open_open_need_copy_up()		 vfs_removexattr()
   ovl_already_copied_up()
    ovl_dentry_needs_data_copy_up()	 ovl_set_flag(OVL_UPPERDATA)
     ovl_test_flag(OVL_UPPERDATA)       release(oi->lock)

Say CPU2 is copying up data and in the end sets UPPERDATA flag. But if
CPU1 perceives the effects of setting UPPERDATA flag but not the effects
of preceeding operations (ex. upper that is not fully copied up), it will be
a problem.

Hence this patch introduces smp_wmb() on setting UPPERDATA flag operation
and smp_rmb() on UPPERDATA flag test operation.

May be some other lock or barrier is already covering it. But I am not sure
what that is and is it obvious enough that we will not break it in future.

So hence trying to be safe here and introducing barriers explicitly for
UPPERDATA flag/bit.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/copy_up.c   | 56 ++++++++++++++++++++++++++++++----
 fs/overlayfs/dir.c       |  1 +
 fs/overlayfs/overlayfs.h | 18 +++++++++--
 fs/overlayfs/super.c     |  1 +
 fs/overlayfs/util.c      | 78 +++++++++++++++++++++++++++++++++++++++++++++---
 5 files changed, 143 insertions(+), 11 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 6ac3220b0834..7453913ca61b 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -195,6 +195,16 @@ static int ovl_copy_up_data(struct path *old, struct path *new, loff_t len)
 	return error;
 }
 
+static int ovl_set_size(struct dentry *upperdentry, struct kstat *stat)
+{
+	struct iattr attr = {
+		.ia_valid = ATTR_SIZE,
+		.ia_size = stat->size,
+	};
+
+	return notify_change(upperdentry, &attr, NULL);
+}
+
 static int ovl_set_timestamps(struct dentry *upperdentry, struct kstat *stat)
 {
 	struct iattr attr = {
@@ -566,8 +576,18 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
 			return err;
 	}
 
+	if (c->metacopy) {
+		err = ovl_check_setxattr(c->dentry, temp, OVL_XATTR_METACOPY,
+					 NULL, 0, -EOPNOTSUPP);
+		if (err)
+			return err;
+	}
+
 	inode_lock(temp->d_inode);
-	err = ovl_set_attr(temp, &c->stat);
+	if (c->metacopy)
+		err = ovl_set_size(temp, &c->stat);
+	if (!err)
+		err = ovl_set_attr(temp, &c->stat);
 	inode_unlock(temp->d_inode);
 
 	return err;
@@ -605,6 +625,8 @@ static int ovl_copy_up_locked(struct ovl_copy_up_ctx *c)
 	if (err)
 		goto out_cleanup;
 
+	if (!c->metacopy)
+		ovl_set_upperdata(d_inode(c->dentry));
 	inode = d_inode(c->dentry);
 	ovl_inode_update(inode, newdentry);
 	if (S_ISDIR(inode->i_mode))
@@ -729,6 +751,28 @@ static bool ovl_need_meta_copy_up(struct dentry *dentry, umode_t mode,
 	return true;
 }
 
+/* Copy up data of an inode which was copied up metadata only in the past. */
+static int ovl_copy_up_meta_inode_data(struct ovl_copy_up_ctx *c)
+{
+	struct path upperpath;
+	int err;
+
+	ovl_path_upper(c->dentry, &upperpath);
+	if (WARN_ON(upperpath.dentry == NULL))
+		return -EIO;
+
+	err = ovl_copy_up_data(&c->lowerpath, &upperpath, c->stat.size);
+	if (err)
+		return err;
+
+	err = vfs_removexattr(upperpath.dentry, OVL_XATTR_METACOPY);
+	if (err)
+		return err;
+
+	ovl_set_upperdata(d_inode(c->dentry));
+	return err;
+}
+
 static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 			   int flags)
 {
@@ -775,7 +819,7 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 	}
 	ovl_do_check_copy_up(ctx.lowerpath.dentry);
 
-	err = ovl_copy_up_start(dentry);
+	err = ovl_copy_up_start(dentry, flags);
 	/* err < 0: interrupted, err > 0: raced with another copy-up */
 	if (unlikely(err)) {
 		if (err > 0)
@@ -785,6 +829,8 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 			err = ovl_do_copy_up(&ctx);
 		if (!err && parent && !ovl_dentry_has_upper_alias(dentry))
 			err = ovl_link_up(&ctx);
+		if (!err && ovl_dentry_needs_data_copy_up_locked(dentry, flags))
+			err = ovl_copy_up_meta_inode_data(&ctx);
 		ovl_copy_up_end(dentry);
 	}
 	do_delayed_call(&done);
@@ -810,7 +856,7 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
 		struct dentry *next;
 		struct dentry *parent = NULL;
 
-		if (ovl_already_copied_up(dentry))
+		if (ovl_already_copied_up(dentry, flags))
 			break;
 
 		next = dget(dentry);
@@ -838,13 +884,13 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
 static bool ovl_open_need_copy_up(struct dentry *dentry, int flags)
 {
 	/* Copy up of disconnected dentry does not set upper alias */
-	if (ovl_already_copied_up(dentry))
+	if (ovl_already_copied_up(dentry, flags))
 		return false;
 
 	if (special_file(d_inode(dentry)->i_mode))
 		return false;
 
-	if (!(OPEN_FMODE(flags) & FMODE_WRITE) && !(flags & O_TRUNC))
+	if (!ovl_open_flags_need_copy_up(flags))
 		return false;
 
 	return true;
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index cd0fa2363723..34745ccea682 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -189,6 +189,7 @@ static void ovl_instantiate(struct dentry *dentry, struct inode *inode,
 	ovl_dentry_version_inc(dentry->d_parent, false);
 	ovl_dentry_set_upper_alias(dentry);
 	if (!hardlink) {
+		ovl_set_upperdata(inode);
 		ovl_inode_update(inode, newdentry);
 		ovl_copyattr(newdentry->d_inode, inode);
 	} else {
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 390881752e70..e21156425361 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -29,6 +29,7 @@ enum ovl_path_type {
 #define OVL_XATTR_IMPURE OVL_XATTR_PREFIX "impure"
 #define OVL_XATTR_NLINK OVL_XATTR_PREFIX "nlink"
 #define OVL_XATTR_UPPER OVL_XATTR_PREFIX "upper"
+#define OVL_XATTR_METACOPY OVL_XATTR_PREFIX "metacopy"
 
 enum ovl_inode_flag {
 	/* Pure upper dir that may contain non pure upper entries */
@@ -36,6 +37,7 @@ enum ovl_inode_flag {
 	/* Non-merge dir that may contain whiteout entries */
 	OVL_WHITEOUTS,
 	OVL_INDEX,
+	OVL_UPPERDATA,
 };
 
 enum ovl_entry_flag {
@@ -197,6 +199,14 @@ static inline struct dentry *ovl_do_tmpfile(struct dentry *dentry, umode_t mode)
 	return ret;
 }
 
+static inline bool ovl_open_flags_need_copy_up(int flags)
+{
+	if (!flags)
+		return false;
+
+	return ((OPEN_FMODE(flags) & FMODE_WRITE) || (flags & O_TRUNC));
+}
+
 /* util.c */
 int ovl_want_write(struct dentry *dentry);
 void ovl_drop_write(struct dentry *dentry);
@@ -232,6 +242,10 @@ bool ovl_dentry_is_whiteout(struct dentry *dentry);
 void ovl_dentry_set_opaque(struct dentry *dentry);
 bool ovl_dentry_has_upper_alias(struct dentry *dentry);
 void ovl_dentry_set_upper_alias(struct dentry *dentry);
+bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags);
+bool ovl_dentry_needs_data_copy_up_locked(struct dentry *dentry, int flags);
+bool ovl_has_upperdata(struct inode *inode);
+void ovl_set_upperdata(struct inode *inode);
 bool ovl_redirect_dir(struct super_block *sb);
 const char *ovl_dentry_get_redirect(struct dentry *dentry);
 void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
@@ -242,9 +256,9 @@ void ovl_dentry_version_inc(struct dentry *dentry, bool impurity);
 u64 ovl_dentry_version_get(struct dentry *dentry);
 bool ovl_is_whiteout(struct dentry *dentry);
 struct file *ovl_path_open(struct path *path, int flags);
-int ovl_copy_up_start(struct dentry *dentry);
+int ovl_copy_up_start(struct dentry *dentry, int flags);
 void ovl_copy_up_end(struct dentry *dentry);
-bool ovl_already_copied_up(struct dentry *dentry);
+bool ovl_already_copied_up(struct dentry *dentry, int flags);
 bool ovl_check_origin_xattr(struct dentry *dentry);
 bool ovl_check_dir_xattr(struct dentry *dentry, const char *name);
 int ovl_check_setxattr(struct dentry *dentry, struct dentry *upperdentry,
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index f26b79b3fae1..6f5e7ccc3d8f 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1503,6 +1503,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	/* Root is always merge -> can have whiteouts */
 	ovl_set_flag(OVL_WHITEOUTS, d_inode(root_dentry));
 	ovl_dentry_set_flag(OVL_E_CONNECTED, root_dentry);
+	ovl_set_upperdata(d_inode(root_dentry));
 	ovl_inode_init(d_inode(root_dentry), upperpath.dentry,
 		       ovl_dentry_lower(root_dentry));
 
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index ce15e211a33b..3b64ecf5cb9b 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -279,6 +279,62 @@ void ovl_dentry_set_upper_alias(struct dentry *dentry)
 	ovl_dentry_set_flag(OVL_E_UPPER_ALIAS, dentry);
 }
 
+static bool ovl_should_check_upperdata(struct inode *inode)
+{
+	if (!S_ISREG(inode->i_mode))
+		return false;
+
+	if (!ovl_inode_lower(inode))
+		return false;
+
+	return true;
+}
+
+bool ovl_has_upperdata(struct inode *inode)
+{
+	if (!ovl_should_check_upperdata(inode))
+		return true;
+
+	if (!ovl_test_flag(OVL_UPPERDATA, inode))
+		return false;
+	/*
+	 * Pairs with smp_wmb() in ovl_set_upperdata(). Main user of
+	 * ovl_has_upperdata() is ovl_copy_up_meta_inode_data(). Make sure
+	 * if setting of OVL_UPPERDATA is visible, then effects of writes
+	 * before that are visible too.
+	 */
+	smp_rmb();
+	return true;
+}
+
+void ovl_set_upperdata(struct inode *inode)
+{
+	/*
+	 * Pairs with smp_rmb() in ovl_has_upperdata(). Make sure
+	 * if OVL_UPPERDATA flag is visible, then effects of write operations
+	 * before it are visible as well.
+	 */
+	smp_wmb();
+	ovl_set_flag(OVL_UPPERDATA, inode);
+}
+
+/* Caller should hold ovl_inode->lock */
+bool ovl_dentry_needs_data_copy_up_locked(struct dentry *dentry, int flags)
+{
+	if (!ovl_open_flags_need_copy_up(flags))
+		return false;
+
+	return !ovl_test_flag(OVL_UPPERDATA, d_inode(dentry));
+}
+
+bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags)
+{
+	if (!ovl_open_flags_need_copy_up(flags))
+		return false;
+
+	return !ovl_has_upperdata(d_inode(dentry));
+}
+
 bool ovl_redirect_dir(struct super_block *sb)
 {
 	struct ovl_fs *ofs = sb->s_fs_info;
@@ -369,7 +425,20 @@ struct file *ovl_path_open(struct path *path, int flags)
 	return dentry_open(path, flags | O_NOATIME, current_cred());
 }
 
-bool ovl_already_copied_up(struct dentry *dentry)
+/* Caller should hold ovl_inode->lock */
+static bool ovl_already_copied_up_locked(struct dentry *dentry, int flags)
+{
+	bool disconnected = dentry->d_flags & DCACHE_DISCONNECTED;
+
+	if (ovl_dentry_upper(dentry) &&
+	    (ovl_dentry_has_upper_alias(dentry) || disconnected) &&
+	    !ovl_dentry_needs_data_copy_up_locked(dentry, flags))
+		return true;
+
+	return false;
+}
+
+bool ovl_already_copied_up(struct dentry *dentry, int flags)
 {
 	bool disconnected = dentry->d_flags & DCACHE_DISCONNECTED;
 
@@ -387,19 +456,20 @@ bool ovl_already_copied_up(struct dentry *dentry)
 	 *      with rename.
 	 */
 	if (ovl_dentry_upper(dentry) &&
-	    (ovl_dentry_has_upper_alias(dentry) || disconnected))
+	    (ovl_dentry_has_upper_alias(dentry) || disconnected) &&
+	    !ovl_dentry_needs_data_copy_up(dentry, flags))
 		return true;
 
 	return false;
 }
 
-int ovl_copy_up_start(struct dentry *dentry)
+int ovl_copy_up_start(struct dentry *dentry, int flags)
 {
 	struct ovl_inode *oi = OVL_I(d_inode(dentry));
 	int err;
 
 	err = mutex_lock_interruptible(&oi->lock);
-	if (!err && ovl_already_copied_up(dentry)) {
+	if (!err && ovl_already_copied_up_locked(dentry, flags)) {
 		err = 1; /* Already copied up */
 		mutex_unlock(&oi->lock);
 	}
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 08/31] ovl: Use out_err instead of out_nomem
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (6 preceding siblings ...)
  2018-04-26 19:09 ` [PATCH v14 07/31] ovl: A new xattr OVL_XATTR_METACOPY for file on upper Vivek Goyal
@ 2018-04-26 19:09 ` Vivek Goyal
  2018-04-26 19:09 ` [PATCH v14 09/31] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry Vivek Goyal
                   ` (22 subsequent siblings)
  30 siblings, 0 replies; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:09 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Right now we use goto out_nomem which assumes error code is -ENOMEM. But
there are other errors returned like -ESTALE as well. So instead of out_nomem,
use out_err which will do ERR_PTR(err). That way one can putt error code
in err and jump to out_err.

This just code reorganization and no change of functionality.

I am about to add more code and this organization helps laying more code
and error paths on top of it.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/inode.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 8a25bb569bdd..59b1c84076ec 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -778,6 +778,7 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
 	int fsid = bylower ? lowerpath->layer->fsid : 0;
 	bool is_dir;
 	unsigned long ino = 0;
+	int err = -ENOMEM;
 
 	if (!realinode)
 		realinode = d_inode(lowerdentry);
@@ -795,7 +796,7 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
 		inode = iget5_locked(sb, (unsigned long) key,
 				     ovl_inode_test, ovl_inode_set, key);
 		if (!inode)
-			goto out_nomem;
+			goto out_err;
 		if (!(inode->i_state & I_NEW)) {
 			/*
 			 * Verify that the underlying files stored in the inode
@@ -804,8 +805,8 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
 			if (!ovl_verify_inode(inode, lowerdentry, upperdentry,
 					      true)) {
 				iput(inode);
-				inode = ERR_PTR(-ESTALE);
-				goto out;
+				err = -ESTALE;
+				goto out_err;
 			}
 
 			dput(upperdentry);
@@ -821,8 +822,10 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
 	} else {
 		/* Lower hardlink that will be broken on copy up */
 		inode = new_inode(sb);
-		if (!inode)
-			goto out_nomem;
+		if (!inode) {
+			err = -ENOMEM;
+			goto out_err;
+		}
 	}
 	ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev, ino, fsid);
 	ovl_inode_init(inode, upperdentry, lowerdentry);
@@ -848,7 +851,7 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
 out:
 	return inode;
 
-out_nomem:
-	inode = ERR_PTR(-ENOMEM);
+out_err:
+	inode = ERR_PTR(err);
 	goto out;
 }
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 09/31] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (7 preceding siblings ...)
  2018-04-26 19:09 ` [PATCH v14 08/31] ovl: Use out_err instead of out_nomem Vivek Goyal
@ 2018-04-26 19:09 ` Vivek Goyal
  2018-04-28  8:14   ` Amir Goldstein
  2018-04-26 19:09 ` [PATCH v14 10/31] ovl: Copy up meta inode data from lowest data inode Vivek Goyal
                   ` (21 subsequent siblings)
  30 siblings, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:09 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

This patch modifies ovl_lookup() and friends to lookup metacopy dentries.
It also allows for presence of metacopy dentries in lower layer.

During lookup, check for presence of OVL_XATTR_METACOPY and if not present,
set OVL_UPPERDATA bit in flags.

We don't support metacopy feature with nfs_export. So in nfs_export code,
we set OVL_UPPERDATA flag set unconditionally if upper inode exists.

Do not follow metacopy origin if we find a metacopy only inode and metacopy
feature is not enabled for that mount. Like redirect, this can have security
implications where an attacker could hand craft upper and try to gain
access to file on lower which it should not have to begin with.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/export.c    |   3 ++
 fs/overlayfs/inode.c     |  11 ++++-
 fs/overlayfs/namei.c     | 102 ++++++++++++++++++++++++++++++++++++++++-------
 fs/overlayfs/overlayfs.h |   1 +
 fs/overlayfs/util.c      |  22 ++++++++++
 5 files changed, 124 insertions(+), 15 deletions(-)

diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
index 5b7fee1a81ec..e1f6546b6c84 100644
--- a/fs/overlayfs/export.c
+++ b/fs/overlayfs/export.c
@@ -311,6 +311,9 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
 		return ERR_CAST(inode);
 	}
 
+	if (upper)
+		ovl_set_flag(OVL_UPPERDATA, inode);
+
 	dentry = d_find_any_alias(inode);
 	if (!dentry) {
 		dentry = d_alloc_anon(inode->i_sb);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 59b1c84076ec..40509b7a50d2 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -776,7 +776,7 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
 	struct dentry *lowerdentry = lowerpath ? lowerpath->dentry : NULL;
 	bool bylower = ovl_hash_bylower(sb, upperdentry, lowerdentry, index);
 	int fsid = bylower ? lowerpath->layer->fsid : 0;
-	bool is_dir;
+	bool is_dir, metacopy = false;
 	unsigned long ino = 0;
 	int err = -ENOMEM;
 
@@ -836,6 +836,15 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
 	if (index)
 		ovl_set_flag(OVL_INDEX, inode);
 
+	if (upperdentry) {
+		err = ovl_check_metacopy_xattr(upperdentry);
+		if (err < 0)
+			goto out_err;
+		metacopy = err;
+		if (!metacopy)
+			ovl_set_flag(OVL_UPPERDATA, inode);
+	}
+
 	OVL_I(inode)->redirect = redirect;
 
 	/* Check for non-merge dir that may have whiteouts */
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index ac81e2c2c976..682f0d402986 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -24,6 +24,7 @@ struct ovl_lookup_data {
 	bool stop;
 	bool last;
 	char *redirect;
+	bool metacopy;
 };
 
 static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
@@ -253,19 +254,29 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
 		goto put_and_out;
 	}
 	if (!d_can_lookup(this)) {
-		d->stop = true;
-		if (d->is_dir)
+		if (d->is_dir) {
+			d->stop = true;
 			goto put_and_out;
-
+		}
 		/*
 		 * NB: handle failure to lookup non-last element when non-dir
 		 * redirects become possible
 		 */
 		WARN_ON(!last_element);
+		err = ovl_check_metacopy_xattr(this);
+		if (err < 0)
+			goto out_err;
+		d->stop = !err;
+		d->metacopy = !!err;
 		goto out;
 	}
-	if (last_element)
+	if (last_element) {
+		if (d->metacopy) {
+			err = -ESTALE;
+			goto out_err;
+		}
 		d->is_dir = true;
+	}
 	if (d->last)
 		goto out;
 
@@ -823,7 +834,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
 	struct ovl_entry *poe = dentry->d_parent->d_fsdata;
 	struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata;
-	struct ovl_path *stack = NULL;
+	struct ovl_path *stack = NULL, *origin_path = NULL;
 	struct dentry *upperdir, *upperdentry = NULL;
 	struct dentry *origin = NULL;
 	struct dentry *index = NULL;
@@ -834,6 +845,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	struct dentry *this;
 	unsigned int i;
 	int err;
+	bool metacopy = false;
 	struct ovl_lookup_data d = {
 		.name = dentry->d_name,
 		.is_dir = false,
@@ -841,6 +853,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		.stop = false,
 		.last = ofs->config.redirect_follow ? false : !poe->numlower,
 		.redirect = NULL,
+		.metacopy = false,
 	};
 
 	if (dentry->d_name.len > ofs->namelen)
@@ -859,7 +872,8 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 			goto out;
 		}
 		if (upperdentry && !d.is_dir) {
-			BUG_ON(!d.stop || d.redirect);
+			unsigned int origin_ctr = 0;
+			BUG_ON(d.redirect);
 			/*
 			 * Lookup copy up origin by decoding origin file handle.
 			 * We may get a disconnected dentry, which is fine,
@@ -870,9 +884,13 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 			 * number - it's the same as if we held a reference
 			 * to a dentry in lower layer that was moved under us.
 			 */
-			err = ovl_check_origin(ofs, upperdentry, &stack, &ctr);
+			err = ovl_check_origin(ofs, upperdentry, &origin_path,
+					       &origin_ctr);
 			if (err)
 				goto out_put_upper;
+
+			if (d.metacopy)
+				metacopy = true;
 		}
 
 		if (d.redirect) {
@@ -913,7 +931,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		 * If no origin fh is stored in upper of a merge dir, store fh
 		 * of lower dir and set upper parent "impure".
 		 */
-		if (upperdentry && !ctr && !ofs->noxattr) {
+		if (upperdentry && !ctr && !ofs->noxattr && d.is_dir) {
 			err = ovl_fix_origin(dentry, this, upperdentry);
 			if (err) {
 				dput(this);
@@ -925,18 +943,36 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		 * When "verify_lower" feature is enabled, do not merge with a
 		 * lower dir that does not match a stored origin xattr. In any
 		 * case, only verified origin is used for index lookup.
+		 *
+		 * For non-dir dentry, make sure dentry found by lookup
+		 * matches the origin stored in upper. Otherwise its an
+		 * error.
 		 */
-		if (upperdentry && !ctr && ovl_verify_lower(dentry->d_sb)) {
+		if (upperdentry && !ctr &&
+		    ((d.is_dir && ovl_verify_lower(dentry->d_sb)) ||
+		     (!d.is_dir && origin_path))) {
 			err = ovl_verify_origin(upperdentry, this, false);
 			if (err) {
 				dput(this);
-				break;
+				if (d.is_dir)
+					break;
+				goto out_put;
 			}
-
-			/* Bless lower dir as verified origin */
+			/* Bless lower as verified origin */
 			origin = this;
 		}
 
+		if (d.metacopy)
+			metacopy = true;
+		/*
+		 * Do not store intermediate metacopy dentries in chain,
+		 * except top most lower metacopy dentry
+		 */
+		if (d.metacopy && ctr) {
+			dput(this);
+			continue;
+		}
+
 		stack[ctr].dentry = this;
 		stack[ctr].layer = lower.layer;
 		ctr++;
@@ -968,13 +1004,43 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		}
 	}
 
+	if (metacopy) {
+		/*
+		 * Found a metacopy dentry but did not find corresponding
+		 * data dentry
+		 */
+		if (d.metacopy) {
+			err = -ESTALE;
+			goto out_put;
+		}
+
+		err = -EPERM;
+		if (!ofs->config.metacopy) {
+			pr_warn_ratelimited("overlay: refusing to follow"
+					    " metacopy origin for (%pd2)\n",
+					    dentry);
+			goto out_put;
+		}
+	} else if (!d.is_dir && upperdentry && !ctr && origin_path) {
+		if (WARN_ON(stack != NULL)) {
+			err = -EIO;
+			goto out_put;
+		}
+		stack = origin_path;
+		ctr = 1;
+		origin_path = NULL;
+	}
+
 	/*
 	 * Lookup index by lower inode and verify it matches upper inode.
 	 * We only trust dir index if we verified that lower dir matches
 	 * origin, otherwise dir index entries may be inconsistent and we
-	 * ignore them. Always lookup index of non-dir and non-upper.
+	 * ignore them.
+	 * For non-dir metacopy dentry, we already set origin if we verified
+	 * that lower matched origin. If not, skip looking up index.
+	 * Always lookup index of non-dir non-metacopy and non-upper.
 	 */
-	if (ctr && (!upperdentry || !d.is_dir))
+	if (ctr && (!upperdentry || (!d.is_dir && !metacopy)))
 		origin = stack[0].dentry;
 
 	if (origin && ovl_indexdir(dentry->d_sb) &&
@@ -1012,6 +1078,10 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	}
 
 	revert_creds(old_cred);
+	if (origin_path) {
+		dput(origin_path->dentry);
+		kfree(origin_path);
+	}
 	dput(index);
 	kfree(stack);
 	kfree(d.redirect);
@@ -1026,6 +1096,10 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		dput(stack[i].dentry);
 	kfree(stack);
 out_put_upper:
+	if (origin_path) {
+		dput(origin_path->dentry);
+		kfree(origin_path);
+	}
 	dput(upperdentry);
 	kfree(upperredirect);
 out:
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index e21156425361..790314079950 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -275,6 +275,7 @@ int ovl_nlink_start(struct dentry *dentry, bool *locked);
 void ovl_nlink_end(struct dentry *dentry, bool locked);
 int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir);
 void ovl_copytimes(struct inode *inode);
+int ovl_check_metacopy_xattr(struct dentry *dentry);
 
 static inline void ovl_copytimes_with_parent(struct dentry *dentry)
 {
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 3b64ecf5cb9b..a32509814f54 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -789,3 +789,25 @@ void ovl_copytimes(struct inode *inode)
 		inode->i_ctime = upperinode->i_ctime;
 	}
 }
+
+/* err < 0, 0 if no metacopy xattr, 1 if metacopy xattr found */
+int ovl_check_metacopy_xattr(struct dentry *dentry)
+{
+	int res;
+
+	/* Only regular files can have metacopy xattr */
+	if (!S_ISREG(d_inode(dentry)->i_mode))
+		return 0;
+
+	res = vfs_getxattr(dentry, OVL_XATTR_METACOPY, NULL, 0);
+	if (res < 0) {
+		if (res == -ENODATA || res == -EOPNOTSUPP)
+			return 0;
+		goto out;
+	}
+
+	return 1;
+out:
+	pr_warn_ratelimited("overlayfs: failed to get metacopy (%i)\n", res);
+	return res;
+}
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 10/31] ovl: Copy up meta inode data from lowest data inode
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (8 preceding siblings ...)
  2018-04-26 19:09 ` [PATCH v14 09/31] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry Vivek Goyal
@ 2018-04-26 19:09 ` Vivek Goyal
  2018-04-26 19:09 ` [PATCH v14 11/31] ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry Vivek Goyal
                   ` (20 subsequent siblings)
  30 siblings, 0 replies; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:09 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

So far lower could not be a meta inode. So whenever it was time to copy
up data of a meta inode, we could copy it up from top most lower dentry.

But now lower itself can be a metacopy inode. That means data copy up
needs to take place from a data inode in metacopy inode chain. Find
lower data inode in the chain and use that for data copy up.

Introduced a helper called ovl_path_lowerdata() to find the lower
data inode chain.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/copy_up.c   | 14 ++++++++++----
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/util.c      | 14 ++++++++++++++
 3 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 7453913ca61b..5a865a4cf3d7 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -565,13 +565,15 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
 	}
 
 	if (S_ISREG(c->stat.mode) && !c->metacopy) {
-		struct path upperpath;
+		struct path upperpath, datapath;
 
 		ovl_path_upper(c->dentry, &upperpath);
 		BUG_ON(upperpath.dentry != NULL);
 		upperpath.dentry = temp;
 
-		err = ovl_copy_up_data(&c->lowerpath, &upperpath, c->stat.size);
+		ovl_path_lowerdata(c->dentry, &datapath);
+		BUG_ON(datapath.dentry == NULL);
+		err = ovl_copy_up_data(&datapath, &upperpath, c->stat.size);
 		if (err)
 			return err;
 	}
@@ -754,14 +756,18 @@ static bool ovl_need_meta_copy_up(struct dentry *dentry, umode_t mode,
 /* Copy up data of an inode which was copied up metadata only in the past. */
 static int ovl_copy_up_meta_inode_data(struct ovl_copy_up_ctx *c)
 {
-	struct path upperpath;
+	struct path upperpath, datapath;
 	int err;
 
 	ovl_path_upper(c->dentry, &upperpath);
 	if (WARN_ON(upperpath.dentry == NULL))
 		return -EIO;
 
-	err = ovl_copy_up_data(&c->lowerpath, &upperpath, c->stat.size);
+	ovl_path_lowerdata(c->dentry, &datapath);
+	if (WARN_ON(datapath.dentry == NULL))
+		return -EIO;
+
+	err = ovl_copy_up_data(&datapath, &upperpath, c->stat.size);
 	if (err)
 		return err;
 
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 790314079950..cb6fad91147d 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -223,6 +223,7 @@ bool ovl_dentry_weird(struct dentry *dentry);
 enum ovl_path_type ovl_path_type(struct dentry *dentry);
 void ovl_path_upper(struct dentry *dentry, struct path *path);
 void ovl_path_lower(struct dentry *dentry, struct path *path);
+void ovl_path_lowerdata(struct dentry *dentry, struct path *path);
 enum ovl_path_type ovl_path_real(struct dentry *dentry, struct path *path);
 struct dentry *ovl_dentry_upper(struct dentry *dentry);
 struct dentry *ovl_dentry_lower(struct dentry *dentry);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index a32509814f54..42b24308e016 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -164,6 +164,20 @@ void ovl_path_lower(struct dentry *dentry, struct path *path)
 	}
 }
 
+void ovl_path_lowerdata(struct dentry *dentry, struct path *path)
+{
+	struct ovl_entry *oe = dentry->d_fsdata;
+	int idx = oe->numlower - 1;
+
+	if (!oe->numlower) {
+		*path = (struct path) { };
+		return;
+	}
+
+	path->mnt = oe->lowerstack[idx].layer->mnt;
+	path->dentry = oe->lowerstack[idx].dentry;
+}
+
 enum ovl_path_type ovl_path_real(struct dentry *dentry, struct path *path)
 {
 	enum ovl_path_type type = ovl_path_type(dentry);
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 11/31] ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (9 preceding siblings ...)
  2018-04-26 19:09 ` [PATCH v14 10/31] ovl: Copy up meta inode data from lowest data inode Vivek Goyal
@ 2018-04-26 19:09 ` Vivek Goyal
  2018-04-26 19:09 ` [PATCH v14 12/31] ovl: Add an helper to get real " Vivek Goyal
                   ` (19 subsequent siblings)
  30 siblings, 0 replies; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:09 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Now we have the notion of data dentry and metacopy dentry. ovl_dentry_lower()
will return uppermost lower dentry, but it could be either data or metacopy
dentry. Now we support metacopy dentries in lower layers so it is possible
that lowerstack[0] is metacopy dentry while lowerstack[1] is actual data
dentry.

So add an helper which returns lowest most dentry which is supposed to be
data dentry.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/util.c      | 14 ++++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index cb6fad91147d..08df75f473fe 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -227,6 +227,7 @@ void ovl_path_lowerdata(struct dentry *dentry, struct path *path);
 enum ovl_path_type ovl_path_real(struct dentry *dentry, struct path *path);
 struct dentry *ovl_dentry_upper(struct dentry *dentry);
 struct dentry *ovl_dentry_lower(struct dentry *dentry);
+struct dentry *ovl_dentry_lowerdata(struct dentry *dentry);
 struct ovl_layer *ovl_layer_lower(struct dentry *dentry);
 struct dentry *ovl_dentry_real(struct dentry *dentry);
 struct dentry *ovl_i_dentry_upper(struct inode *inode);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 42b24308e016..7029036b9aa3 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -209,6 +209,20 @@ struct ovl_layer *ovl_layer_lower(struct dentry *dentry)
 	return oe->numlower ? oe->lowerstack[0].layer : NULL;
 }
 
+/*
+ * ovl_dentry_lower() could return either a data dentry or metacopy dentry
+ * dependig on what is stored in lowerstack[0]. At times we need to find
+ * lower dentry which has data (and not metacopy dentry). This helper
+ * returns the lower data dentry.
+ */
+struct dentry *ovl_dentry_lowerdata(struct dentry *dentry)
+{
+	struct ovl_entry *oe = dentry->d_fsdata;
+	int idx = oe->numlower - 1;
+
+	return idx >= 0 ? oe->lowerstack[idx].dentry : NULL;
+}
+
 struct dentry *ovl_dentry_real(struct dentry *dentry)
 {
 	return ovl_dentry_upper(dentry) ?: ovl_dentry_lower(dentry);
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 12/31] ovl: Add an helper to get real data dentry
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (10 preceding siblings ...)
  2018-04-26 19:09 ` [PATCH v14 11/31] ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry Vivek Goyal
@ 2018-04-26 19:09 ` Vivek Goyal
  2018-04-26 20:33   ` Amir Goldstein
  2018-04-26 19:09 ` [PATCH v14 13/31] ovl: Fix ovl_getattr() to get number of blocks from lower Vivek Goyal
                   ` (18 subsequent siblings)
  30 siblings, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:09 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

ovl_dentry_real() returns the real dentry. (Either upper or lower). We
also need an helper to ignore metacopy dentries and return "real data"
dentry. This helper returns an upper/lower dentry which contains data.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/util.c      | 12 ++++++++++++
 2 files changed, 13 insertions(+)

diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 08df75f473fe..5077d2992d07 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -230,6 +230,7 @@ struct dentry *ovl_dentry_lower(struct dentry *dentry);
 struct dentry *ovl_dentry_lowerdata(struct dentry *dentry);
 struct ovl_layer *ovl_layer_lower(struct dentry *dentry);
 struct dentry *ovl_dentry_real(struct dentry *dentry);
+struct dentry *ovl_dentry_real_data(struct dentry *dentry);
 struct dentry *ovl_i_dentry_upper(struct inode *inode);
 struct inode *ovl_inode_upper(struct inode *inode);
 struct inode *ovl_inode_lower(struct inode *inode);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 7029036b9aa3..2bef013ca432 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -228,6 +228,18 @@ struct dentry *ovl_dentry_real(struct dentry *dentry)
 	return ovl_dentry_upper(dentry) ?: ovl_dentry_lower(dentry);
 }
 
+/* Return real dentry which contains data. Skip metacopy dentries */
+struct dentry *ovl_dentry_real_data(struct dentry *dentry)
+{
+	struct dentry *upperdentry;
+
+	upperdentry = ovl_dentry_upper(dentry);
+	if (upperdentry && ovl_has_upperdata(d_inode(dentry)))
+		return upperdentry;
+
+	return ovl_dentry_lowerdata(dentry);
+}
+
 struct dentry *ovl_i_dentry_upper(struct inode *inode)
 {
 	return ovl_upperdentry_dereference(OVL_I(inode));
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 13/31] ovl: Fix ovl_getattr() to get number of blocks from lower
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (11 preceding siblings ...)
  2018-04-26 19:09 ` [PATCH v14 12/31] ovl: Add an helper to get real " Vivek Goyal
@ 2018-04-26 19:09 ` Vivek Goyal
  2018-04-26 19:09 ` [PATCH v14 14/31] ovl: Store lower data inode in ovl_inode Vivek Goyal
                   ` (17 subsequent siblings)
  30 siblings, 0 replies; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:09 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

If an inode has been copied up metadata only, then we need to query the
number of blocks from lower and fill up the stat->st_blocks.

We need to be careful about races where we are doing stat on one cpu and
data copy up is taking place on other cpu. We want to return
stat->st_blocks either from lower or stable upper and not something in
between. Hence, ovl_has_upperdata() is called first to figure out whether
block reporting will take place from lower or upper.

We now support metacopy dentries in middle layer. That means number of
blocks reporting needs to come from lowest data dentry and this could
be different from lower dentry. Hence we end up making a separate
vfs_getxattr() call for metacopy dentries to get number of blocks.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/inode.c     | 35 ++++++++++++++++++++++++++++++++++-
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/util.c      | 16 ++++++++++++++++
 3 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 40509b7a50d2..9d17f15480de 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -154,6 +154,9 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 	bool samefs = ovl_same_sb(dentry->d_sb);
 	struct ovl_layer *lower_layer = NULL;
 	int err;
+	bool metacopy_blocks = false;
+
+	metacopy_blocks = ovl_is_metacopy_dentry(dentry);
 
 	type = ovl_path_real(dentry, &realpath);
 	old_cred = ovl_override_creds(dentry->d_sb);
@@ -175,7 +178,8 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 			lower_layer = ovl_layer_lower(dentry);
 		} else if (OVL_TYPE_ORIGIN(type)) {
 			struct kstat lowerstat;
-			u32 lowermask = STATX_INO | (!is_dir ? STATX_NLINK : 0);
+			u32 lowermask = STATX_INO | STATX_BLOCKS |
+					(!is_dir ? STATX_NLINK : 0);
 
 			ovl_path_lower(dentry, &realpath);
 			err = vfs_getattr(&realpath, &lowerstat,
@@ -204,6 +208,35 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 				stat->ino = lowerstat.ino;
 				lower_layer = ovl_layer_lower(dentry);
 			}
+
+			/*
+			 * If we are querying a metacopy dentry and lower
+			 * dentry is data dentry, then use the blocks we
+			 * queried just now. We don't have to do additional
+			 * vfs_getattr(). If lower itself is metacopy, then
+			 * additional vfs_getattr() is unavoidable.
+			 */
+			if (metacopy_blocks &&
+			    realpath.dentry == ovl_dentry_lowerdata(dentry)) {
+				stat->blocks = lowerstat.blocks;
+				metacopy_blocks = false;
+			}
+		}
+
+		if (metacopy_blocks) {
+			/*
+			 * If lower is not same as lowerdata or if there was
+			 * no origin on upper, we can end up here.
+			 */
+			struct kstat lowerdatastat;
+			u32 lowermask = STATX_BLOCKS;
+
+			ovl_path_lowerdata(dentry, &realpath);
+			err = vfs_getattr(&realpath, &lowerdatastat,
+					  lowermask, flags);
+			if (err)
+				goto out;
+			stat->blocks = lowerdatastat.blocks;
 		}
 	}
 
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 5077d2992d07..29b28ee50b0f 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -279,6 +279,7 @@ void ovl_nlink_end(struct dentry *dentry, bool locked);
 int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir);
 void ovl_copytimes(struct inode *inode);
 int ovl_check_metacopy_xattr(struct dentry *dentry);
+bool ovl_is_metacopy_dentry(struct dentry *dentry);
 
 static inline void ovl_copytimes_with_parent(struct dentry *dentry)
 {
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 2bef013ca432..338de6bdae07 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -851,3 +851,19 @@ int ovl_check_metacopy_xattr(struct dentry *dentry)
 	pr_warn_ratelimited("overlayfs: failed to get metacopy (%i)\n", res);
 	return res;
 }
+
+bool ovl_is_metacopy_dentry(struct dentry *dentry)
+{
+	struct ovl_entry *oe = dentry->d_fsdata;
+
+	if (!d_is_reg(dentry))
+		return false;
+
+	if (ovl_dentry_upper(dentry)) {
+		if (!ovl_has_upperdata(d_inode(dentry)))
+			return true;
+		return false;
+	}
+
+	return (oe->numlower > 1);
+}
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 14/31] ovl: Store lower data inode in ovl_inode
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (12 preceding siblings ...)
  2018-04-26 19:09 ` [PATCH v14 13/31] ovl: Fix ovl_getattr() to get number of blocks from lower Vivek Goyal
@ 2018-04-26 19:09 ` Vivek Goyal
  2018-04-26 20:31   ` Amir Goldstein
  2018-04-26 19:09 ` [PATCH v14 15/31] ovl: Add helper ovl_inode_real_data() Vivek Goyal
                   ` (16 subsequent siblings)
  30 siblings, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:09 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Right now ovl_inode stores inode pointer for lower inode. This helps
with quickly getting lower inode given overlay inode (ovl_inode_lower()).

Now with metadata only copy-up, we can have metacopy inode in middle
layer as well and inode containing data can be different from ->lower.
I need to be able to open the real file in ovl_open_realfile() and
for that I need to quickly find the lower data inode.

Hence store lower data inode also in ovl_inode. Also provide an
helper ovl_inode_lowerdata() to access this field.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/export.c    |  3 ++-
 fs/overlayfs/inode.c     |  5 +++--
 fs/overlayfs/namei.c     |  5 ++++-
 fs/overlayfs/overlayfs.h |  6 ++++--
 fs/overlayfs/ovl_entry.h |  1 +
 fs/overlayfs/super.c     |  4 +++-
 fs/overlayfs/util.c      | 10 +++++++++-
 7 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
index e1f6546b6c84..d43d926ae68e 100644
--- a/fs/overlayfs/export.c
+++ b/fs/overlayfs/export.c
@@ -305,7 +305,8 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
 	if (d_is_dir(upper ?: lower))
 		return ERR_PTR(-EIO);
 
-	inode = ovl_get_inode(sb, dget(upper), lowerpath, index, !!lower, NULL);
+	inode = ovl_get_inode(sb, dget(upper), lowerpath, index, !!lower, NULL,
+			      NULL);
 	if (IS_ERR(inode)) {
 		dput(upper);
 		return ERR_CAST(inode);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 9d17f15480de..a6cd57741a1f 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -802,7 +802,8 @@ static bool ovl_hash_bylower(struct super_block *sb, struct dentry *upper,
 
 struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
 			    struct ovl_path *lowerpath, struct dentry *index,
-			    unsigned int numlower, char *redirect)
+			    unsigned int numlower, char *redirect,
+			    struct dentry *lowerdata)
 {
 	struct inode *realinode = upperdentry ? d_inode(upperdentry) : NULL;
 	struct inode *inode;
@@ -861,7 +862,7 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
 		}
 	}
 	ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev, ino, fsid);
-	ovl_inode_init(inode, upperdentry, lowerdentry);
+	ovl_inode_init(inode, upperdentry, lowerdentry, lowerdata);
 
 	if (upperdentry && ovl_is_impuredir(upperdentry))
 		ovl_set_flag(OVL_IMPURE, inode);
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 682f0d402986..b25e32cf5916 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -1070,8 +1070,11 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		upperdentry = dget(index);
 
 	if (upperdentry || ctr) {
+		struct dentry *lowerdata = NULL;
+		if (ctr > 1 && !d.is_dir)
+			lowerdata = stack[ctr - 1].dentry;
 		inode = ovl_get_inode(dentry->d_sb, upperdentry, stack, index,
-				      ctr, upperredirect);
+				      ctr, upperredirect, lowerdata);
 		err = PTR_ERR(inode);
 		if (IS_ERR(inode))
 			goto out_free_oe;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 29b28ee50b0f..29c3498a7e61 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -234,6 +234,7 @@ struct dentry *ovl_dentry_real_data(struct dentry *dentry);
 struct dentry *ovl_i_dentry_upper(struct inode *inode);
 struct inode *ovl_inode_upper(struct inode *inode);
 struct inode *ovl_inode_lower(struct inode *inode);
+struct inode *ovl_inode_lowerdata(struct inode *inode);
 struct inode *ovl_inode_real(struct inode *inode);
 struct ovl_dir_cache *ovl_dir_cache(struct inode *inode);
 void ovl_set_dir_cache(struct inode *inode, struct ovl_dir_cache *cache);
@@ -253,7 +254,7 @@ bool ovl_redirect_dir(struct super_block *sb);
 const char *ovl_dentry_get_redirect(struct dentry *dentry);
 void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
 void ovl_inode_init(struct inode *inode, struct dentry *upperdentry,
-		    struct dentry *lowerdentry);
+		    struct dentry *lowerdentry, struct dentry *lowerdata);
 void ovl_inode_update(struct inode *inode, struct dentry *upperdentry);
 void ovl_dentry_version_inc(struct dentry *dentry, bool impurity);
 u64 ovl_dentry_version_get(struct dentry *dentry);
@@ -373,7 +374,8 @@ struct inode *ovl_lookup_inode(struct super_block *sb, struct dentry *real,
 			       bool is_upper);
 struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
 			    struct ovl_path *lowerpath, struct dentry *index,
-			    unsigned int numlower, char *redirect);
+			    unsigned int numlower, char *redirect,
+			    struct dentry *lowerdata);
 static inline void ovl_copyattr(struct inode *from, struct inode *to)
 {
 	to->i_uid = from->i_uid;
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index da65118a0567..83e031e31ff9 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -97,6 +97,7 @@ struct ovl_inode {
 	struct inode vfs_inode;
 	struct dentry *__upperdentry;
 	struct inode *lower;
+	struct inode *lowerdata;
 
 	/* synchronize copy up and more */
 	struct mutex lock;
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 6f5e7ccc3d8f..b81ec9d22ac5 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -182,6 +182,7 @@ static struct inode *ovl_alloc_inode(struct super_block *sb)
 	oi->flags = 0;
 	oi->__upperdentry = NULL;
 	oi->lower = NULL;
+	oi->lowerdata = NULL;
 	mutex_init(&oi->lock);
 
 	return &oi->vfs_inode;
@@ -200,6 +201,7 @@ static void ovl_destroy_inode(struct inode *inode)
 
 	dput(oi->__upperdentry);
 	iput(oi->lower);
+	iput(oi->lowerdata);
 	kfree(oi->redirect);
 	ovl_dir_cache_free(inode);
 	mutex_destroy(&oi->lock);
@@ -1505,7 +1507,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	ovl_dentry_set_flag(OVL_E_CONNECTED, root_dentry);
 	ovl_set_upperdata(d_inode(root_dentry));
 	ovl_inode_init(d_inode(root_dentry), upperpath.dentry,
-		       ovl_dentry_lower(root_dentry));
+		       ovl_dentry_lower(root_dentry), NULL);
 
 	sb->s_root = root_dentry;
 
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 338de6bdae07..58aa14a02bb0 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -262,6 +262,12 @@ struct inode *ovl_inode_real(struct inode *inode)
 	return ovl_inode_upper(inode) ?: ovl_inode_lower(inode);
 }
 
+/* Return inode which containers lower data. Do not return metacopy */
+struct inode *ovl_inode_lowerdata(struct inode *inode)
+{
+	return OVL_I(inode)->lowerdata ?: ovl_inode_lower(inode);
+}
+
 
 struct ovl_dir_cache *ovl_dir_cache(struct inode *inode)
 {
@@ -396,7 +402,7 @@ void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect)
 }
 
 void ovl_inode_init(struct inode *inode, struct dentry *upperdentry,
-		    struct dentry *lowerdentry)
+		    struct dentry *lowerdentry, struct dentry *lowerdata)
 {
 	struct inode *realinode = d_inode(upperdentry ?: lowerdentry);
 
@@ -404,6 +410,8 @@ void ovl_inode_init(struct inode *inode, struct dentry *upperdentry,
 		OVL_I(inode)->__upperdentry = upperdentry;
 	if (lowerdentry)
 		OVL_I(inode)->lower = igrab(d_inode(lowerdentry));
+	if (lowerdata)
+		OVL_I(inode)->lowerdata = igrab(d_inode(lowerdata));
 
 	ovl_copyattr(realinode, inode);
 	ovl_copyflags(realinode, inode);
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 15/31] ovl: Add helper ovl_inode_real_data()
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (13 preceding siblings ...)
  2018-04-26 19:09 ` [PATCH v14 14/31] ovl: Store lower data inode in ovl_inode Vivek Goyal
@ 2018-04-26 19:09 ` Vivek Goyal
  2018-04-26 20:35   ` Amir Goldstein
  2018-04-26 19:09 ` [PATCH v14 16/31] ovl: Always open file/inode which contains data and not metacopy Vivek Goyal
                   ` (15 subsequent siblings)
  30 siblings, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:09 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Add an helper to retrieve real data inode associated with overlay inode.
This helper will ignore all metacopy inodes and will return only the
real inode which has data.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/util.c      | 12 ++++++++++++
 2 files changed, 13 insertions(+)

diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 29c3498a7e61..e24af484967f 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -236,6 +236,7 @@ struct inode *ovl_inode_upper(struct inode *inode);
 struct inode *ovl_inode_lower(struct inode *inode);
 struct inode *ovl_inode_lowerdata(struct inode *inode);
 struct inode *ovl_inode_real(struct inode *inode);
+struct inode *ovl_inode_real_data(struct inode *inode);
 struct ovl_dir_cache *ovl_dir_cache(struct inode *inode);
 void ovl_set_dir_cache(struct inode *inode, struct ovl_dir_cache *cache);
 void ovl_dentry_set_flag(unsigned long flag, struct dentry *dentry);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 58aa14a02bb0..c7a1bd40dc6a 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -268,6 +268,18 @@ struct inode *ovl_inode_lowerdata(struct inode *inode)
 	return OVL_I(inode)->lowerdata ?: ovl_inode_lower(inode);
 }
 
+/* Return real inode which contains data. Does not return metacopy inode */
+struct inode *ovl_inode_real_data(struct inode *inode)
+{
+	struct inode *upperinode;
+
+	upperinode = ovl_inode_upper(inode);
+	if (upperinode && ovl_has_upperdata(inode))
+		return upperinode;
+
+	return ovl_inode_lowerdata(inode);
+}
+
 
 struct ovl_dir_cache *ovl_dir_cache(struct inode *inode)
 {
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 16/31] ovl: Always open file/inode which contains data and not metacopy
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (14 preceding siblings ...)
  2018-04-26 19:09 ` [PATCH v14 15/31] ovl: Add helper ovl_inode_real_data() Vivek Goyal
@ 2018-04-26 19:09 ` Vivek Goyal
  2018-04-28  8:49   ` Amir Goldstein
  2018-04-26 19:09 ` [PATCH v14 17/31] ovl: Do not expose metacopy only dentry from d_real() Vivek Goyal
                   ` (14 subsequent siblings)
  30 siblings, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:09 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

ovl_open() should open file which contains data and not open metacopy
inode.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/file.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 23638d8ebab5..558a859b2658 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -18,18 +18,26 @@ static struct file *ovl_open_realfile(const struct file *file)
 {
 	struct inode *inode = file_inode(file);
 	struct inode *upperinode = ovl_inode_upper(inode);
-	struct inode *realinode = upperinode ?: ovl_inode_lower(inode);
+	struct inode *realinode;
 	struct file *realfile;
+	bool upperreal = false;
 	const struct cred *old_cred;
 
+	if (upperinode && ovl_has_upperdata(inode))
+		upperreal = true;
+
+	/* Always open file which contains data. Do not open metacopy. */
+	realinode = upperreal ? upperinode : ovl_inode_lowerdata(inode);
+
 	old_cred = ovl_override_creds(inode->i_sb);
 	realfile = path_open(&file->f_path, file->f_flags | O_NOATIME,
 			     realinode, current_cred(), false);
 	revert_creds(old_cred);
 
 	pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n",
-		 file, file, upperinode ? 'u' : 'l', file->f_flags,
-		 realfile, IS_ERR(realfile) ? 0 : realfile->f_flags);
+		 file, file, upperreal ? 'u' : 'l',
+		 file->f_flags, realfile,
+		 IS_ERR(realfile) ? 0 : realfile->f_flags);
 
 	return realfile;
 }
@@ -80,7 +88,7 @@ static int ovl_real_file(const struct file *file, struct fd *real)
 	real->file = file->private_data;
 
 	/* Has it been copied up since we'd opened it? */
-	if (unlikely(file_inode(real->file) != ovl_inode_real(inode))) {
+	if (unlikely(file_inode(real->file) != ovl_inode_real_data(inode))) {
 		real->flags = FDPUT_FPUT;
 		real->file = ovl_open_realfile(file);
 
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 17/31] ovl: Do not expose metacopy only dentry from d_real()
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (15 preceding siblings ...)
  2018-04-26 19:09 ` [PATCH v14 16/31] ovl: Always open file/inode which contains data and not metacopy Vivek Goyal
@ 2018-04-26 19:09 ` Vivek Goyal
  2018-04-28  7:36   ` Amir Goldstein
  2018-04-26 19:10 ` [PATCH v14 18/31] ovl: Move some dir related ovl_lookup_single() code in else block Vivek Goyal
                   ` (13 subsequent siblings)
  30 siblings, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:09 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Metacopy dentry/inode is internal to overlay and is never exposed
outside of it. Modify d_real() to look for only dentries/inode which
have data and which are not metacopy.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/super.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index b81ec9d22ac5..4efb7cd76cc1 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -100,7 +100,7 @@ static struct dentry *ovl_d_real(struct dentry *dentry,
 	if (!d_is_reg(dentry))
 		goto bug;
 
-	real = ovl_dentry_real(dentry);
+	real = ovl_dentry_real_data(dentry);
 	if (inode == d_inode(real))
 		return real;
 
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 18/31] ovl: Move some dir related ovl_lookup_single() code in else block
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (16 preceding siblings ...)
  2018-04-26 19:09 ` [PATCH v14 17/31] ovl: Do not expose metacopy only dentry from d_real() Vivek Goyal
@ 2018-04-26 19:10 ` Vivek Goyal
  2018-04-28  7:34   ` Amir Goldstein
  2018-04-26 19:10 ` [PATCH v14 19/31] ovl: Check redirects for metacopy files Vivek Goyal
                   ` (12 subsequent siblings)
  30 siblings, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:10 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Move some directory related code in else block. This is pure code
reorganization and no functionality change.

Next patch enables redirect processing on metacopy files and needs this
change.  By keeping non-functional changes in a separate patch, next patch
looks much smaller and cleaner.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/namei.c | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index b25e32cf5916..880c7faea41a 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -269,22 +269,23 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
 		d->stop = !err;
 		d->metacopy = !!err;
 		goto out;
-	}
-	if (last_element) {
-		if (d->metacopy) {
-			err = -ESTALE;
-			goto out_err;
+	} else {
+		if (last_element) {
+			if (d->metacopy) {
+				err = -ESTALE;
+				goto out_err;
+			}
+			d->is_dir = true;
 		}
-		d->is_dir = true;
-	}
-	if (d->last)
-		goto out;
+		if (d->last)
+			goto out;
 
-	if (ovl_is_opaquedir(this)) {
-		d->stop = true;
-		if (last_element)
-			d->opaque = true;
-		goto out;
+		if (ovl_is_opaquedir(this)) {
+			d->stop = true;
+			if (last_element)
+				d->opaque = true;
+			goto out;
+		}
 	}
 	err = ovl_check_redirect(this, d, prelen, post);
 	if (err)
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 19/31] ovl: Check redirects for metacopy files
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (17 preceding siblings ...)
  2018-04-26 19:10 ` [PATCH v14 18/31] ovl: Move some dir related ovl_lookup_single() code in else block Vivek Goyal
@ 2018-04-26 19:10 ` Vivek Goyal
  2018-04-28  8:32   ` Amir Goldstein
  2018-04-26 19:10 ` [PATCH v14 20/31] ovl: Treat metacopy dentries as type OVL_PATH_MERGE Vivek Goyal
                   ` (11 subsequent siblings)
  30 siblings, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:10 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Right now we rely on path based lookup for data origin of metacopy upper.
This will work only if upper has not been renamed. We solved this problem
already for merged directories using redirect. Use same logic for metacopy
files.

This patch just goes on to check redirects for metacopy files.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/namei.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 880c7faea41a..41cbde97a212 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -268,7 +268,8 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
 			goto out_err;
 		d->stop = !err;
 		d->metacopy = !!err;
-		goto out;
+		if (!d->metacopy || d->last)
+			goto out;
 	} else {
 		if (last_element) {
 			if (d->metacopy) {
@@ -874,7 +875,6 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		}
 		if (upperdentry && !d.is_dir) {
 			unsigned int origin_ctr = 0;
-			BUG_ON(d.redirect);
 			/*
 			 * Lookup copy up origin by decoding origin file handle.
 			 * We may get a disconnected dentry, which is fine,
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 20/31] ovl: Treat metacopy dentries as type OVL_PATH_MERGE
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (18 preceding siblings ...)
  2018-04-26 19:10 ` [PATCH v14 19/31] ovl: Check redirects for metacopy files Vivek Goyal
@ 2018-04-26 19:10 ` Vivek Goyal
  2018-04-26 19:10 ` [PATCH v14 21/31] ovl: Add an inode flag OVL_CONST_INO Vivek Goyal
                   ` (10 subsequent siblings)
  30 siblings, 0 replies; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:10 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Right now OVL_PATH_MERGE is used only for merged directories.
But conceptually, a metacopy dentry (backed by a lower data dentry) is
a merged entity as well.

So mark metacopy dentries as OVL_PATH_MERGE and ovl_rename() makes use
of this property later to set redirect on a metacopy file.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/util.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index c7a1bd40dc6a..f85ff5714852 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -134,7 +134,8 @@ enum ovl_path_type ovl_path_type(struct dentry *dentry)
 		 */
 		if (oe->numlower) {
 			type |= __OVL_PATH_ORIGIN;
-			if (d_is_dir(dentry))
+			if (d_is_dir(dentry) ||
+			    !ovl_has_upperdata(d_inode(dentry)))
 				type |= __OVL_PATH_MERGE;
 		}
 	} else {
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 21/31] ovl: Add an inode flag OVL_CONST_INO
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (19 preceding siblings ...)
  2018-04-26 19:10 ` [PATCH v14 20/31] ovl: Treat metacopy dentries as type OVL_PATH_MERGE Vivek Goyal
@ 2018-04-26 19:10 ` Vivek Goyal
  2018-04-28  8:33   ` Amir Goldstein
  2018-04-26 19:10 ` [PATCH v14 22/31] ovl: Do not set dentry type ORIGIN for broken hardlinks Vivek Goyal
                   ` (9 subsequent siblings)
  30 siblings, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:10 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Add an ovl_inode flag OVL_CONST_INO. This flag signifies if inode number
will remain constant over copy up or not. This flag does not get updated
over copy up and remains unmodifed after setting once.

Next patch in the series will make use of this flag. It will basically
figure out if dentry is of type ORIGIN or not. And this can be derived
by this flag.

ORIGIN = (upperdentry && ovl_test_flag(OVL_CONST_INO, inode)).

Suggested-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/inode.c     | 3 +++
 fs/overlayfs/overlayfs.h | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index a6cd57741a1f..19e96d4717bd 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -881,6 +881,9 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
 
 	OVL_I(inode)->redirect = redirect;
 
+	if (bylower)
+		ovl_set_flag(OVL_CONST_INO, inode);
+
 	/* Check for non-merge dir that may have whiteouts */
 	if (is_dir) {
 		if (((upperdentry && lowerdentry) || numlower > 1) ||
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index e24af484967f..e2c6a2f5addf 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -38,6 +38,8 @@ enum ovl_inode_flag {
 	OVL_WHITEOUTS,
 	OVL_INDEX,
 	OVL_UPPERDATA,
+	/* Inode number will remain constant over copy up. */
+	OVL_CONST_INO,
 };
 
 enum ovl_entry_flag {
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 22/31] ovl: Do not set dentry type ORIGIN for broken hardlinks
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (20 preceding siblings ...)
  2018-04-26 19:10 ` [PATCH v14 21/31] ovl: Add an inode flag OVL_CONST_INO Vivek Goyal
@ 2018-04-26 19:10 ` Vivek Goyal
  2018-04-28  9:14   ` Amir Goldstein
  2018-04-26 19:10 ` [PATCH v14 23/31] ovl: Set redirect on metacopy files upon rename Vivek Goyal
                   ` (8 subsequent siblings)
  30 siblings, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:10 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

If a dentry has copy up origin, we set flag OVL_PATH_ORIGIN. So far
this decision was easy that we had to check only for oe->numlower
and if it is non-zero, we knew there is copy up origin. (For non-dir
we installed origin dentry in lowerstack[0]).

But we don't create ORGIN xattr for broken hardlinks (index=off). And
with metacopy feature it is possible that we will install lowerstack[0]
but ORIGIN xattr is not there. It is data dentry of upper metacopy dentry
which has been found using regular name based lookup or using REDIRECT.
So with addition of this new case, just presence of oe->numlower is
not sufficient to guarantee that ORIGIN xattr is present.

So two differentiate between two cases, look at OVL_CONST_INO flag. If
this flag is set and upperdentry is there, that means ORIGIN xattr
must be present. OVL_CONST_INO is not set if lower hardlink is broken
or will be broken over copy up.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/util.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index f85ff5714852..69acce1c9026 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -133,7 +133,8 @@ enum ovl_path_type ovl_path_type(struct dentry *dentry)
 		 * Non-dir dentry can hold lower dentry of its copy up origin.
 		 */
 		if (oe->numlower) {
-			type |= __OVL_PATH_ORIGIN;
+			if (ovl_test_flag(OVL_CONST_INO, d_inode(dentry)))
+				type |= __OVL_PATH_ORIGIN;
 			if (d_is_dir(dentry) ||
 			    !ovl_has_upperdata(d_inode(dentry)))
 				type |= __OVL_PATH_MERGE;
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 23/31] ovl: Set redirect on metacopy files upon rename
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (21 preceding siblings ...)
  2018-04-26 19:10 ` [PATCH v14 22/31] ovl: Do not set dentry type ORIGIN for broken hardlinks Vivek Goyal
@ 2018-04-26 19:10 ` Vivek Goyal
  2018-04-28  8:25   ` Amir Goldstein
  2018-04-26 19:10 ` [PATCH v14 24/31] ovl: Set redirect on upper inode when it is linked Vivek Goyal
                   ` (7 subsequent siblings)
  30 siblings, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:10 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Set redirect on metacopy files upon rename. This will help find data dentry
in lower dirs.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/dir.c | 54 ++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 38 insertions(+), 16 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 34745ccea682..1acea6887b05 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -890,6 +890,30 @@ static int ovl_set_redirect(struct dentry *dentry, bool samedir)
 	return err;
 }
 
+static bool ovl_rel_redirect(struct dentry *dentry, bool samedir)
+{
+	struct dentry *lowerdentry;
+
+	if (d_is_dir(dentry) || !samedir)
+		return samedir;
+
+	/*
+	 * For non-dir hardlinked files, we need absolute redirects
+	 * in general as two upper hardlinks could be in different
+	 * dirs. We could put a relative redirect now and convert
+	 * it to absolute redirect later. But when nlink > 1 and
+	 * indexing is on, that means relative redirect needs to be
+	 * converted to absolute during copy up of another lower
+	 * hardllink as well.
+	 *
+	 * So without optimizing too much, just check if lower is
+	 * a hard link or not. If lower is hard link, put absolute
+	 * redirect.
+	 */
+	lowerdentry = ovl_dentry_lower(dentry);
+	return (d_inode(lowerdentry)->i_nlink > 1 ? false : true);
+}
+
 static int ovl_rename(struct inode *olddir, struct dentry *old,
 		      struct inode *newdir, struct dentry *new,
 		      unsigned int flags)
@@ -1045,22 +1069,20 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
 		goto out_dput;
 
 	err = 0;
-	if (is_dir) {
-		if (ovl_type_merge_or_lower(old))
-			err = ovl_set_redirect(old, samedir);
-		else if (!old_opaque && ovl_type_merge(new->d_parent))
-			err = ovl_set_opaque_xerr(old, olddentry, -EXDEV);
-		if (err)
-			goto out_dput;
-	}
-	if (!overwrite && new_is_dir) {
-		if (ovl_type_merge_or_lower(new))
-			err = ovl_set_redirect(new, samedir);
-		else if (!new_opaque && ovl_type_merge(old->d_parent))
-			err = ovl_set_opaque_xerr(new, newdentry, -EXDEV);
-		if (err)
-			goto out_dput;
-	}
+	if (ovl_type_merge_or_lower(old))
+		err = ovl_set_redirect(old, ovl_rel_redirect(old, samedir));
+	else if (is_dir && !old_opaque && ovl_type_merge(new->d_parent))
+		err = ovl_set_opaque_xerr(old, olddentry, -EXDEV);
+	if (err)
+		goto out_dput;
+
+	if (!overwrite && ovl_type_merge_or_lower(new))
+		err = ovl_set_redirect(new, ovl_rel_redirect(new, samedir));
+	else if (!overwrite && new_is_dir && !new_opaque &&
+		 ovl_type_merge(old->d_parent))
+		err = ovl_set_opaque_xerr(new, newdentry, -EXDEV);
+	if (err)
+		goto out_dput;
 
 	err = ovl_do_rename(old_upperdir->d_inode, olddentry,
 			    new_upperdir->d_inode, newdentry, flags);
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 24/31] ovl: Set redirect on upper inode when it is linked
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (22 preceding siblings ...)
  2018-04-26 19:10 ` [PATCH v14 23/31] ovl: Set redirect on metacopy files upon rename Vivek Goyal
@ 2018-04-26 19:10 ` Vivek Goyal
  2018-04-28  8:40   ` Amir Goldstein
  2018-04-26 19:10 ` [PATCH v14 25/31] ovl: Check redirect on index as well Vivek Goyal
                   ` (6 subsequent siblings)
  30 siblings, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:10 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

When we create a hardlink to a metacopy upper file, first the redirect
on that inode. Path based lookup will not work with newly created link
and redirect will solve that issue.

Also use absolute redirect as two hardlinks could be in different directores
and relative redirect will not work.

I have not put any additional locking around setting redirects while
introducing redirects for non-dir files. For now it feels like existing
locking is sufficient. If that's not the case, we will have add more
locking. Following is my rationale about why do I think current locking
seems ok.

Basic problem for non-dir files is that more than on dentry could be
pointing to same inode and in theory only relying on dentry based locks
(d->d_lock) did not seem sufficient.

We set redirect upon rename and upon link creation. In both the paths
for non-dir file, VFS locks both source and target inodes (->i_rwsem).
That means vfs rename and link operations on same source and target
can't he happening in parallel (Even if there are multiple dentries
pointing to same inode). So that probably means that at a time on an inode,
only one call of ovl_set_redirect() could be working and we don't need
additional locking in ovl_set_redirect().

ovl_inode->redirect is initialized only when inode is created new. That
means it should not race with any other path and setting ovl_inode->redirect
should be fine.

Reading of ovl_inode->redirect happens in ovl_get_redirect() path. And this
called only in ovl_set_redirect(). And ovl_set_redirect() already seemed
to be protected using ->i_rwsem. That means ovl_set_redirect() and
ovl_get_redirect() on source/target inode should not make progress in
parallel and is mutually exclusive. Hence no additional locking required.

Now, only case where ovl_set_redirect() and ovl_get_redirect() could race
seems to be case of absolute redirects where ovl_get_redirect() has to
travel up the tree. In that case we already take d->d_lock and that should
be sufficient as directories will not have multiple dentries pointing to
same inode.

So given VFS locking and current usage of redirect, current locking around
redirect seems to be ok for non-dir as well. Once we have the logic to
remove redirect when metacopy file gets copied up, then we probably will
need additional locking.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/dir.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 1acea6887b05..546ad7d34160 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -24,6 +24,8 @@ module_param_named(redirect_max, ovl_redirect_max, ushort, 0644);
 MODULE_PARM_DESC(ovl_redirect_max,
 		 "Maximum length of absolute redirect xattr value");
 
+static int ovl_set_redirect(struct dentry *dentry, bool samedir);
+
 int ovl_cleanup(struct inode *wdir, struct dentry *wdentry)
 {
 	int err;
@@ -468,6 +470,9 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
 	const struct cred *old_cred;
 	struct cred *override_cred;
 	struct dentry *parent = dentry->d_parent;
+	struct dentry *hardlink_upper;
+
+	hardlink_upper = hardlink ? ovl_dentry_upper(hardlink) : NULL;
 
 	err = ovl_copy_up(parent);
 	if (err)
@@ -502,12 +507,19 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
 		put_cred(override_creds(override_cred));
 		put_cred(override_cred);
 
+		if (hardlink && ovl_is_metacopy_dentry(hardlink)) {
+			err = ovl_set_redirect(hardlink, false);
+			if (err)
+				goto out_revert_creds;
+		}
+
+
 		if (!ovl_dentry_is_whiteout(dentry))
 			err = ovl_create_upper(dentry, inode, attr,
-						hardlink);
+						hardlink_upper);
 		else
 			err = ovl_create_over_whiteout(dentry, inode, attr,
-							hardlink);
+							hardlink_upper);
 		ovl_copytimes_with_parent(dentry);
 	}
 out_revert_creds:
@@ -603,8 +615,7 @@ static int ovl_link(struct dentry *old, struct inode *newdir,
 	inode = d_inode(old);
 	ihold(inode);
 
-	err = ovl_create_or_link(new, inode, NULL, ovl_dentry_upper(old),
-				 ovl_type_origin(old));
+	err = ovl_create_or_link(new, inode, NULL, old, ovl_type_origin(old));
 	if (err)
 		iput(inode);
 
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 25/31] ovl: Check redirect on index as well
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (23 preceding siblings ...)
  2018-04-26 19:10 ` [PATCH v14 24/31] ovl: Set redirect on upper inode when it is linked Vivek Goyal
@ 2018-04-26 19:10 ` Vivek Goyal
  2018-04-28  9:26   ` Amir Goldstein
  2018-04-26 19:10 ` [PATCH v14 26/31] ovl: Allow ovl_open_realfile() to open metacopy inode Vivek Goyal
                   ` (5 subsequent siblings)
  30 siblings, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:10 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Right now we seem to check redirect only if upperdentry is found. But it
is possible that there is no upperdentry but later we found an index.

We need to check redirect on index as well and set it in ovl_inode->redirect.
Otherwise link code can assume that dentry does not have redirect and
place a new one which breaks things. In my testing overlay/033 test
started failing in xfstests. Following are the details.

For example do following.

$ mkdir lower upper work merged

- Make lower dir with 4 links.
  $ echo "foo" > lower/l0.txt
  $ ln  lower/l0.txt lower/l1.txt
  $ ln  lower/l0.txt lower/l2.txt
  $ ln  lower/l0.txt lower/l3.txt

- Mount with index on and metacopy on.

  $ mount -t overlay -o lowerdir=lower,upperdir=upper,workdir=work,index=on,metacopy=on none merged

- Link lower

  $ ln merged/l0.txt merged/l4.txt
    (This will metadata copy up of l0.txt and put an absolute redirect
     /l0.txt)

  $ echo 2 > /proc/sys/vm/drop/caches

  $ ls merged/l1.txt
  (Now l1.txt will be looked up. There is no upper dentry but there is
   lower dentry and index will be found. We don't check for redirect on
   index, hence ovl_inode->redirect will be NULL.)

- Link Upper

  $ ln merged/l4.txt merged/l5.txt
  (Lookup of l4.txt will use inode from l1.txt lookup which is still in
   cache. It has ovl_inode->redirect NULL, hence link will put a new
   redirect and replace /l0.txt with /l4.txt

- Drop caches.
  echo 2 > /proc/sys/vm/drop_caches

- List l1.txt and it returns -ESTALE

  $ ls merged/l0.txt

  (It returns stale because, we found a metacopy of l0.txt in upper
   and it has redirect l4.txt but there is no file named l4.txt in
   lower layer. So lower data copy is not found and -ESTALE is returned.)

So problem here is that we did not process redirect on index. Check
redirect on index as well and then problem is fixed.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/namei.c     |  9 ++++++++-
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/util.c      | 42 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 41cbde97a212..c6ad8117a1f5 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -1067,8 +1067,15 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 
 	if (upperdentry)
 		ovl_dentry_set_upper_alias(dentry);
-	else if (index)
+	else if (index) {
 		upperdentry = dget(index);
+		upperredirect = ovl_get_redirect_xattr(upperdentry);
+		if (IS_ERR(upperredirect)) {
+			err = PTR_ERR(upperredirect);
+			upperredirect = NULL;
+			goto out_free_oe;
+		}
+	}
 
 	if (upperdentry || ctr) {
 		struct dentry *lowerdata = NULL;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index e2c6a2f5addf..a0f3a6fa3029 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -284,6 +284,7 @@ int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir);
 void ovl_copytimes(struct inode *inode);
 int ovl_check_metacopy_xattr(struct dentry *dentry);
 bool ovl_is_metacopy_dentry(struct dentry *dentry);
+char *ovl_get_redirect_xattr(struct dentry *dentry);
 
 static inline void ovl_copytimes_with_parent(struct dentry *dentry)
 {
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 69acce1c9026..1a5691a9e425 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -889,3 +889,45 @@ bool ovl_is_metacopy_dentry(struct dentry *dentry)
 
 	return (oe->numlower > 1);
 }
+
+char *ovl_get_redirect_xattr(struct dentry *dentry)
+{
+	int res;
+	char *s, *next, *buf = NULL;
+
+	res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, NULL, 0);
+	if (res < 0) {
+		if (res == -ENODATA || res == -EOPNOTSUPP)
+			return NULL;
+		return ERR_PTR(res);
+	}
+
+	buf = kzalloc(res + 1, GFP_KERNEL);
+	if (!buf)
+		return ERR_PTR(-ENOMEM);
+
+	res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, buf, res);
+	if (res < 0) {
+		kfree(buf);
+		return ERR_PTR(res);
+        }
+	if (res == 0)
+		goto invalid;
+
+	if (buf[0] == '/') {
+		for (s = buf; *s++ == '/'; s = next) {
+			next = strchrnul(s, '/');
+			if (s == next)
+				goto invalid;
+		}
+	} else {
+		if (strchr(buf, '/') != NULL)
+			goto invalid;
+	}
+
+	return buf;
+invalid:
+	pr_warn_ratelimited("overlayfs: invalid redirect (%s)\n", buf);
+	kfree(buf);
+	return ERR_PTR(-EINVAL);
+}
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 26/31] ovl: Allow ovl_open_realfile() to open metacopy inode
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (24 preceding siblings ...)
  2018-04-26 19:10 ` [PATCH v14 25/31] ovl: Check redirect on index as well Vivek Goyal
@ 2018-04-26 19:10 ` Vivek Goyal
  2018-04-28  9:05   ` Amir Goldstein
  2018-04-26 19:10 ` [PATCH v14 27/31] ovl: Issue fsync on upper metacopy inode as well Vivek Goyal
                   ` (4 subsequent siblings)
  30 siblings, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:10 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Most of the time we need to open inode containing data (and not metacopy)
but when fsync happens, in that case we need to make sure upper metacopy
inode is fsynced too. In that case we need to open metacopy inode
temporarily.

Add a parameter to ovl_open_realfile() which specifies whether to open
data inode or metacopy inode. Later fsync patch will make use of this
functionality.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/file.c | 34 ++++++++++++++++++++++++++++------
 1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 558a859b2658..08387639ba6e 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -14,7 +14,7 @@
 #include <linux/uio.h>
 #include "overlayfs.h"
 
-static struct file *ovl_open_realfile(const struct file *file)
+static struct file *ovl_open_realfile(const struct file *file, bool meta)
 {
 	struct inode *inode = file_inode(file);
 	struct inode *upperinode = ovl_inode_upper(inode);
@@ -28,7 +28,13 @@ static struct file *ovl_open_realfile(const struct file *file)
 
 	/* Always open file which contains data. Do not open metacopy. */
 	realinode = upperreal ? upperinode : ovl_inode_lowerdata(inode);
-
+	if (upperinode && (meta || ovl_has_upperdata(inode))) {
+		realinode = upperinode;
+		upperreal = true;
+	} else {
+		realinode = meta ? ovl_inode_lower(inode) :
+				 ovl_inode_lowerdata(inode);
+	}
 	old_cred = ovl_override_creds(inode->i_sb);
 	realfile = path_open(&file->f_path, file->f_flags | O_NOATIME,
 			     realinode, current_cred(), false);
@@ -80,17 +86,23 @@ static int ovl_change_flags(struct file *file, unsigned int flags)
 	return 0;
 }
 
-static int ovl_real_file(const struct file *file, struct fd *real)
+static int _ovl_real_file(const struct file *file, struct fd *real, bool meta)
 {
 	struct inode *inode = file_inode(file);
+	struct inode *real_inode;
 
 	real->flags = 0;
 	real->file = file->private_data;
 
+	if (meta)
+		real_inode = ovl_inode_real(inode);
+	else
+		real_inode = ovl_inode_real_data(inode);
+
 	/* Has it been copied up since we'd opened it? */
-	if (unlikely(file_inode(real->file) != ovl_inode_real_data(inode))) {
+	if (unlikely(file_inode(real->file) != real_inode)) {
 		real->flags = FDPUT_FPUT;
-		real->file = ovl_open_realfile(file);
+		real->file = ovl_open_realfile(file, meta);
 
 		return PTR_ERR_OR_ZERO(real->file);
 	}
@@ -102,6 +114,16 @@ static int ovl_real_file(const struct file *file, struct fd *real)
 	return 0;
 }
 
+static int ovl_real_file(const struct file *file, struct fd *real)
+{
+	return _ovl_real_file(file, real, false);
+}
+
+static int ovl_real_meta_file(const struct file *file, struct fd *real)
+{
+	return _ovl_real_file(file, real, true);
+}
+
 static int ovl_open(struct inode *inode, struct file *file)
 {
 	struct dentry *dentry = file_dentry(file);
@@ -115,7 +137,7 @@ static int ovl_open(struct inode *inode, struct file *file)
 	/* No longer need these flags, so don't pass them on to underlying fs */
 	file->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);
 
-	realfile = ovl_open_realfile(file);
+	realfile = ovl_open_realfile(file, false);
 	if (IS_ERR(realfile))
 		return PTR_ERR(realfile);
 
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 27/31] ovl: Issue fsync on upper metacopy inode as well
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (25 preceding siblings ...)
  2018-04-26 19:10 ` [PATCH v14 26/31] ovl: Allow ovl_open_realfile() to open metacopy inode Vivek Goyal
@ 2018-04-26 19:10 ` Vivek Goyal
  2018-04-28  8:54   ` Amir Goldstein
  2018-04-26 19:10 ` [PATCH v14 28/31] ovl: Disbale metacopy for MAP_SHARED mmap() Vivek Goyal
                   ` (3 subsequent siblings)
  30 siblings, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:10 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

When a fsync happens, it calls fsync on real underlying file. If a file
has been metadata only copied up on upper, then file containing data
is the real file and fsync will be called on lower data file.

We should be issuing fsync on upper metacopy inode as well to make sure
upper inode has been made persistent on disk.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/file.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 08387639ba6e..be531b79d2e5 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -259,18 +259,31 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter)
 
 static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync)
 {
-	struct fd real;
+	struct fd real, metareal = {0};
 	const struct cred *old_cred;
 	int ret;
+	struct inode *upperinode = NULL;
 
 	ret = ovl_real_file(file, &real);
 	if (ret)
 		return ret;
 
+	/* If we have upper metacopy, issue additional fsync on it */
+	upperinode = ovl_inode_upper(file_inode(file));
+	if (!datasync && upperinode && (file_inode(real.file) != upperinode)) {
+		ret = ovl_real_meta_file(file, &metareal);
+		if (ret)
+			goto put_real;
+	}
+
 	old_cred = ovl_override_creds(file_inode(file)->i_sb);
 	ret = vfs_fsync_range(real.file, start, end, datasync);
+	if (!ret && metareal.file)
+		ret = vfs_fsync(metareal.file, datasync);
 	revert_creds(old_cred);
 
+	fdput(metareal);
+put_real:
 	fdput(real);
 
 	return ret;
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 28/31] ovl: Disbale metacopy for MAP_SHARED mmap()
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (26 preceding siblings ...)
  2018-04-26 19:10 ` [PATCH v14 27/31] ovl: Issue fsync on upper metacopy inode as well Vivek Goyal
@ 2018-04-26 19:10 ` Vivek Goyal
  2018-04-28  9:28   ` Amir Goldstein
  2018-04-26 19:10 ` [PATCH v14 29/31] ovl: Do not do metadata only copy-up for truncate operation Vivek Goyal
                   ` (2 subsequent siblings)
  30 siblings, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:10 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

When user chose the option of copying up a file when mmap(MAP_SHARED)
happens, then do full copy up and not just metacopy.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index be531b79d2e5..d61c8355b892 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -304,7 +304,7 @@ static int ovl_pre_mmap(struct file *file, unsigned long prot,
 	 * later.
 	 */
 	if ((flag & MAP_SHARED) && ovl_copy_up_shared(file_inode(file)->i_sb))
-		err = ovl_copy_up(file_dentry(file));
+		err = ovl_copy_up_flags(file_dentry(file), O_WRONLY);
 
 	return err;
 }
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 29/31] ovl: Do not do metadata only copy-up for truncate operation
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (27 preceding siblings ...)
  2018-04-26 19:10 ` [PATCH v14 28/31] ovl: Disbale metacopy for MAP_SHARED mmap() Vivek Goyal
@ 2018-04-26 19:10 ` Vivek Goyal
  2018-04-28  8:35   ` Amir Goldstein
  2018-04-26 19:10 ` [PATCH v14 30/31] ovl: Do not do metacopy only for ioctl modifying file attr Vivek Goyal
  2018-04-26 19:10 ` [PATCH v14 31/31] ovl: Enable metadata only feature Vivek Goyal
  30 siblings, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:10 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

truncate should copy up full file (and not do metacopy only), otherwise
it will be broken. For example, use truncate to increase size of a file
so that any read beyong existing size will return null bytes. If we
don't copy up full file, then we end up opening lower file and read
from it only reads upto the old size (and not new size after truncate).
Hence to avoid such situations, copy up data as well when file size
changes.

So far it was being done by d_real(O_WRONLY) call in truncate()
path. Now that patch has been reverted. So force full copy up in
ovl_setattr() if size of file is changing.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/inode.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 19e96d4717bd..0802b67e0776 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -18,7 +18,7 @@
 
 int ovl_setattr(struct dentry *dentry, struct iattr *attr)
 {
-	int err;
+	int err, copy_up_flags = 0;
 	struct dentry *upperdentry;
 	const struct cred *old_cred;
 
@@ -45,9 +45,12 @@ int ovl_setattr(struct dentry *dentry, struct iattr *attr)
 		err = -ETXTBSY;
 		if (atomic_read(&realinode->i_writecount) < 0)
 			goto out_drop_write;
+
+		/* Truncate should trigger data copy up as well */
+		copy_up_flags = O_WRONLY;
 	}
 
-	err = ovl_copy_up(dentry);
+	err = ovl_copy_up_flags(dentry, copy_up_flags);
 	if (!err) {
 		struct inode *winode = NULL;
 
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 30/31] ovl: Do not do metacopy only for ioctl modifying file attr
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (28 preceding siblings ...)
  2018-04-26 19:10 ` [PATCH v14 29/31] ovl: Do not do metadata only copy-up for truncate operation Vivek Goyal
@ 2018-04-26 19:10 ` Vivek Goyal
  2018-04-28  8:31   ` Amir Goldstein
  2018-04-26 19:10 ` [PATCH v14 31/31] ovl: Enable metadata only feature Vivek Goyal
  30 siblings, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:10 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

ovl_copy_up() by default will only do metadata only copy up (if enabled).
That means when ovl_real_ioctl() calls ovl_real_file(), it will still
get the lower file (as ovl_real_file() opens data file and not metacopy).
And that means "chattr +i" will end up modifying lower inode.

There seem to be two ways to solve this.
A. Open metacopy file in ovl_real_ioctl() and do operations on that
B. Force full copy up when FS_IOC_SETFLAGS is called.

I am resorting to option B for now as it feels little safer option. If
there are performance issues due to this, we can revisit it.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index d61c8355b892..9a9efaac71f4 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -397,7 +397,7 @@ long ovl_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		if (ret)
 			return ret;
 
-		ret = ovl_copy_up(file_dentry(file));
+		ret = ovl_copy_up_flags(file_dentry(file), O_WRONLY);
 		if (!ret) {
 			ret = ovl_real_ioctl(file, cmd, arg);
 
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH v14 31/31] ovl: Enable metadata only feature
  2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (29 preceding siblings ...)
  2018-04-26 19:10 ` [PATCH v14 30/31] ovl: Do not do metacopy only for ioctl modifying file attr Vivek Goyal
@ 2018-04-26 19:10 ` Vivek Goyal
  30 siblings, 0 replies; 67+ messages in thread
From: Vivek Goyal @ 2018-04-26 19:10 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

All the bits are in patches before this. So it is time to enable the
metadata only copy up feature.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 5a865a4cf3d7..c38ebd8ea916 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -738,9 +738,6 @@ static bool ovl_need_meta_copy_up(struct dentry *dentry, umode_t mode,
 {
 	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
 
-	/* TODO: Will enable metacopy in last patch of series */
-	return false;
-
 	if (!ofs->config.metacopy)
 		return false;
 
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 14/31] ovl: Store lower data inode in ovl_inode
  2018-04-26 19:09 ` [PATCH v14 14/31] ovl: Store lower data inode in ovl_inode Vivek Goyal
@ 2018-04-26 20:31   ` Amir Goldstein
  0 siblings, 0 replies; 67+ messages in thread
From: Amir Goldstein @ 2018-04-26 20:31 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Apr 26, 2018 at 12:09 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> Right now ovl_inode stores inode pointer for lower inode. This helps
> with quickly getting lower inode given overlay inode (ovl_inode_lower()).
>
> Now with metadata only copy-up, we can have metacopy inode in middle
> layer as well and inode containing data can be different from ->lower.
> I need to be able to open the real file in ovl_open_realfile() and
> for that I need to quickly find the lower data inode.
>
> Hence store lower data inode also in ovl_inode. Also provide an
> helper ovl_inode_lowerdata() to access this field.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> ---
>  fs/overlayfs/export.c    |  3 ++-
>  fs/overlayfs/inode.c     |  5 +++--
>  fs/overlayfs/namei.c     |  5 ++++-
>  fs/overlayfs/overlayfs.h |  6 ++++--
>  fs/overlayfs/ovl_entry.h |  1 +
>  fs/overlayfs/super.c     |  4 +++-
>  fs/overlayfs/util.c      | 10 +++++++++-
>  7 files changed, 26 insertions(+), 8 deletions(-)
>
> diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
> index e1f6546b6c84..d43d926ae68e 100644
> --- a/fs/overlayfs/export.c
> +++ b/fs/overlayfs/export.c
> @@ -305,7 +305,8 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
>         if (d_is_dir(upper ?: lower))
>                 return ERR_PTR(-EIO);
>
> -       inode = ovl_get_inode(sb, dget(upper), lowerpath, index, !!lower, NULL);
> +       inode = ovl_get_inode(sb, dget(upper), lowerpath, index, !!lower, NULL,
> +                             NULL);
>         if (IS_ERR(inode)) {
>                 dput(upper);
>                 return ERR_CAST(inode);
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 9d17f15480de..a6cd57741a1f 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -802,7 +802,8 @@ static bool ovl_hash_bylower(struct super_block *sb, struct dentry *upper,
>
>  struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
>                             struct ovl_path *lowerpath, struct dentry *index,
> -                           unsigned int numlower, char *redirect)
> +                           unsigned int numlower, char *redirect,
> +                           struct dentry *lowerdata)
>  {
>         struct inode *realinode = upperdentry ? d_inode(upperdentry) : NULL;
>         struct inode *inode;
> @@ -861,7 +862,7 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
>                 }
>         }
>         ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev, ino, fsid);
> -       ovl_inode_init(inode, upperdentry, lowerdentry);
> +       ovl_inode_init(inode, upperdentry, lowerdentry, lowerdata);
>
>         if (upperdentry && ovl_is_impuredir(upperdentry))
>                 ovl_set_flag(OVL_IMPURE, inode);
> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> index 682f0d402986..b25e32cf5916 100644
> --- a/fs/overlayfs/namei.c
> +++ b/fs/overlayfs/namei.c
> @@ -1070,8 +1070,11 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                 upperdentry = dget(index);
>
>         if (upperdentry || ctr) {
> +               struct dentry *lowerdata = NULL;
> +               if (ctr > 1 && !d.is_dir)
> +                       lowerdata = stack[ctr - 1].dentry;
>                 inode = ovl_get_inode(dentry->d_sb, upperdentry, stack, index,
> -                                     ctr, upperredirect);
> +                                     ctr, upperredirect, lowerdata);
>                 err = PTR_ERR(inode);
>                 if (IS_ERR(inode))
>                         goto out_free_oe;
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 29b28ee50b0f..29c3498a7e61 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -234,6 +234,7 @@ struct dentry *ovl_dentry_real_data(struct dentry *dentry);
>  struct dentry *ovl_i_dentry_upper(struct inode *inode);
>  struct inode *ovl_inode_upper(struct inode *inode);
>  struct inode *ovl_inode_lower(struct inode *inode);
> +struct inode *ovl_inode_lowerdata(struct inode *inode);
>  struct inode *ovl_inode_real(struct inode *inode);
>  struct ovl_dir_cache *ovl_dir_cache(struct inode *inode);
>  void ovl_set_dir_cache(struct inode *inode, struct ovl_dir_cache *cache);
> @@ -253,7 +254,7 @@ bool ovl_redirect_dir(struct super_block *sb);
>  const char *ovl_dentry_get_redirect(struct dentry *dentry);
>  void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
>  void ovl_inode_init(struct inode *inode, struct dentry *upperdentry,
> -                   struct dentry *lowerdentry);
> +                   struct dentry *lowerdentry, struct dentry *lowerdata);
>  void ovl_inode_update(struct inode *inode, struct dentry *upperdentry);
>  void ovl_dentry_version_inc(struct dentry *dentry, bool impurity);
>  u64 ovl_dentry_version_get(struct dentry *dentry);
> @@ -373,7 +374,8 @@ struct inode *ovl_lookup_inode(struct super_block *sb, struct dentry *real,
>                                bool is_upper);
>  struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
>                             struct ovl_path *lowerpath, struct dentry *index,
> -                           unsigned int numlower, char *redirect);
> +                           unsigned int numlower, char *redirect,
> +                           struct dentry *lowerdata);

I don't like the fact that the argument list keeps on growing.
It might be the right time to add an "inode context" or extend struct
ovl_lookup_data to include also all or many of the arguments passed
into ovl_get_inode().
You don't have to to that if you don't want to...

>  static inline void ovl_copyattr(struct inode *from, struct inode *to)
>  {
>         to->i_uid = from->i_uid;
> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> index da65118a0567..83e031e31ff9 100644
> --- a/fs/overlayfs/ovl_entry.h
> +++ b/fs/overlayfs/ovl_entry.h
> @@ -97,6 +97,7 @@ struct ovl_inode {
>         struct inode vfs_inode;
>         struct dentry *__upperdentry;
>         struct inode *lower;
> +       struct inode *lowerdata;

Please union this field with struct ovl_dir_cache *cache,
but need to be careful when freeing.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 12/31] ovl: Add an helper to get real data dentry
  2018-04-26 19:09 ` [PATCH v14 12/31] ovl: Add an helper to get real " Vivek Goyal
@ 2018-04-26 20:33   ` Amir Goldstein
  2018-04-30 13:06     ` Vivek Goyal
  0 siblings, 1 reply; 67+ messages in thread
From: Amir Goldstein @ 2018-04-26 20:33 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Apr 26, 2018 at 12:09 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> ovl_dentry_real() returns the real dentry. (Either upper or lower). We
> also need an helper to ignore metacopy dentries and return "real data"
> dentry. This helper returns an upper/lower dentry which contains data.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

+suggestion

> ---
>  fs/overlayfs/overlayfs.h |  1 +
>  fs/overlayfs/util.c      | 12 ++++++++++++
>  2 files changed, 13 insertions(+)
>
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 08df75f473fe..5077d2992d07 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -230,6 +230,7 @@ struct dentry *ovl_dentry_lower(struct dentry *dentry);
>  struct dentry *ovl_dentry_lowerdata(struct dentry *dentry);
>  struct ovl_layer *ovl_layer_lower(struct dentry *dentry);
>  struct dentry *ovl_dentry_real(struct dentry *dentry);
> +struct dentry *ovl_dentry_real_data(struct dentry *dentry);
>  struct dentry *ovl_i_dentry_upper(struct inode *inode);
>  struct inode *ovl_inode_upper(struct inode *inode);
>  struct inode *ovl_inode_lower(struct inode *inode);
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 7029036b9aa3..2bef013ca432 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -228,6 +228,18 @@ struct dentry *ovl_dentry_real(struct dentry *dentry)
>         return ovl_dentry_upper(dentry) ?: ovl_dentry_lower(dentry);
>  }
>
> +/* Return real dentry which contains data. Skip metacopy dentries */
> +struct dentry *ovl_dentry_real_data(struct dentry *dentry)

The name real_data is a bit inconsistent with _upperdata and
_lowerdata. Don't you think _realdata would be more coherent.

> +{
> +       struct dentry *upperdentry;
> +
> +       upperdentry = ovl_dentry_upper(dentry);
> +       if (upperdentry && ovl_has_upperdata(d_inode(dentry)))
> +               return upperdentry;
> +
> +       return ovl_dentry_lowerdata(dentry);
> +}
> +
>  struct dentry *ovl_i_dentry_upper(struct inode *inode)
>  {
>         return ovl_upperdentry_dereference(OVL_I(inode));
> --
> 2.13.6
>

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 15/31] ovl: Add helper ovl_inode_real_data()
  2018-04-26 19:09 ` [PATCH v14 15/31] ovl: Add helper ovl_inode_real_data() Vivek Goyal
@ 2018-04-26 20:35   ` Amir Goldstein
  0 siblings, 0 replies; 67+ messages in thread
From: Amir Goldstein @ 2018-04-26 20:35 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Apr 26, 2018 at 12:09 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> Add an helper to retrieve real data inode associated with overlay inode.
> This helper will ignore all metacopy inodes and will return only the
> real inode which has data.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>

Suggest to rename to ovl_inode_realdata

> ---
>  fs/overlayfs/overlayfs.h |  1 +
>  fs/overlayfs/util.c      | 12 ++++++++++++
>  2 files changed, 13 insertions(+)
>
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 29c3498a7e61..e24af484967f 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -236,6 +236,7 @@ struct inode *ovl_inode_upper(struct inode *inode);
>  struct inode *ovl_inode_lower(struct inode *inode);
>  struct inode *ovl_inode_lowerdata(struct inode *inode);
>  struct inode *ovl_inode_real(struct inode *inode);
> +struct inode *ovl_inode_real_data(struct inode *inode);
>  struct ovl_dir_cache *ovl_dir_cache(struct inode *inode);
>  void ovl_set_dir_cache(struct inode *inode, struct ovl_dir_cache *cache);
>  void ovl_dentry_set_flag(unsigned long flag, struct dentry *dentry);
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 58aa14a02bb0..c7a1bd40dc6a 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -268,6 +268,18 @@ struct inode *ovl_inode_lowerdata(struct inode *inode)
>         return OVL_I(inode)->lowerdata ?: ovl_inode_lower(inode);
>  }
>
> +/* Return real inode which contains data. Does not return metacopy inode */
> +struct inode *ovl_inode_real_data(struct inode *inode)
> +{
> +       struct inode *upperinode;
> +
> +       upperinode = ovl_inode_upper(inode);
> +       if (upperinode && ovl_has_upperdata(inode))
> +               return upperinode;
> +
> +       return ovl_inode_lowerdata(inode);
> +}
> +
>
>  struct ovl_dir_cache *ovl_dir_cache(struct inode *inode)
>  {
> --
> 2.13.6
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 18/31] ovl: Move some dir related ovl_lookup_single() code in else block
  2018-04-26 19:10 ` [PATCH v14 18/31] ovl: Move some dir related ovl_lookup_single() code in else block Vivek Goyal
@ 2018-04-28  7:34   ` Amir Goldstein
  0 siblings, 0 replies; 67+ messages in thread
From: Amir Goldstein @ 2018-04-28  7:34 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Apr 26, 2018 at 12:10 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> Move some directory related code in else block. This is pure code
> reorganization and no functionality change.
>
> Next patch enables redirect processing on metacopy files and needs this
> change.  By keeping non-functional changes in a separate patch, next patch
> looks much smaller and cleaner.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 17/31] ovl: Do not expose metacopy only dentry from d_real()
  2018-04-26 19:09 ` [PATCH v14 17/31] ovl: Do not expose metacopy only dentry from d_real() Vivek Goyal
@ 2018-04-28  7:36   ` Amir Goldstein
  0 siblings, 0 replies; 67+ messages in thread
From: Amir Goldstein @ 2018-04-28  7:36 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Apr 26, 2018 at 12:09 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> Metacopy dentry/inode is internal to overlay and is never exposed
> outside of it. Modify d_real() to look for only dentries/inode which
> have data and which are not metacopy.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>

Looks fine, but not adding reviewed-by for patches that
are about to change after rebase over newer version of -rorw.

> ---
>  fs/overlayfs/super.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index b81ec9d22ac5..4efb7cd76cc1 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -100,7 +100,7 @@ static struct dentry *ovl_d_real(struct dentry *dentry,
>         if (!d_is_reg(dentry))
>                 goto bug;
>
> -       real = ovl_dentry_real(dentry);
> +       real = ovl_dentry_real_data(dentry);
>         if (inode == d_inode(real))
>                 return real;
>
> --
> 2.13.6
>

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 09/31] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-04-26 19:09 ` [PATCH v14 09/31] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry Vivek Goyal
@ 2018-04-28  8:14   ` Amir Goldstein
  2018-04-30 14:02     ` Vivek Goyal
  2018-04-30 19:32     ` Vivek Goyal
  0 siblings, 2 replies; 67+ messages in thread
From: Amir Goldstein @ 2018-04-28  8:14 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Apr 26, 2018 at 12:09 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> This patch modifies ovl_lookup() and friends to lookup metacopy dentries.
> It also allows for presence of metacopy dentries in lower layer.
>
> During lookup, check for presence of OVL_XATTR_METACOPY and if not present,
> set OVL_UPPERDATA bit in flags.
>
> We don't support metacopy feature with nfs_export. So in nfs_export code,
> we set OVL_UPPERDATA flag set unconditionally if upper inode exists.
>
> Do not follow metacopy origin if we find a metacopy only inode and metacopy
> feature is not enabled for that mount. Like redirect, this can have security
> implications where an attacker could hand craft upper and try to gain
> access to file on lower which it should not have to begin with.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

Except for one comment below...

> ---
>  fs/overlayfs/export.c    |   3 ++
>  fs/overlayfs/inode.c     |  11 ++++-
>  fs/overlayfs/namei.c     | 102 ++++++++++++++++++++++++++++++++++++++++-------
>  fs/overlayfs/overlayfs.h |   1 +
>  fs/overlayfs/util.c      |  22 ++++++++++
>  5 files changed, 124 insertions(+), 15 deletions(-)
>
> diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
> index 5b7fee1a81ec..e1f6546b6c84 100644
> --- a/fs/overlayfs/export.c
> +++ b/fs/overlayfs/export.c
> @@ -311,6 +311,9 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
>                 return ERR_CAST(inode);
>         }
>
> +       if (upper)
> +               ovl_set_flag(OVL_UPPERDATA, inode);
> +
>         dentry = d_find_any_alias(inode);
>         if (!dentry) {
>                 dentry = d_alloc_anon(inode->i_sb);
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 59b1c84076ec..40509b7a50d2 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -776,7 +776,7 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
>         struct dentry *lowerdentry = lowerpath ? lowerpath->dentry : NULL;
>         bool bylower = ovl_hash_bylower(sb, upperdentry, lowerdentry, index);
>         int fsid = bylower ? lowerpath->layer->fsid : 0;
> -       bool is_dir;
> +       bool is_dir, metacopy = false;
>         unsigned long ino = 0;
>         int err = -ENOMEM;
>
> @@ -836,6 +836,15 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
>         if (index)
>                 ovl_set_flag(OVL_INDEX, inode);
>
> +       if (upperdentry) {
> +               err = ovl_check_metacopy_xattr(upperdentry);
> +               if (err < 0)
> +                       goto out_err;
> +               metacopy = err;
> +               if (!metacopy)
> +                       ovl_set_flag(OVL_UPPERDATA, inode);
> +       }
> +

There is no reason to ovl_check_metacopy_xattr again here, right?
so it should get the same treatment as upperredirect.
This would make 3 new arguments to ovl_get_inode added by your patch set.

How about initializing this struct during lookup and passing it into
ovl_get_inode()?:

struct ovl_inode_info {
        struct ovl_dir_cache *cache;
        const char *redirect;
        u64 version;
        unsigned long flags;
        struct dentry *__upperdentry;
        struct inode *lower;
};

struct ovl_inode {
        struct ovl_inode_info info;
        struct inode vfs_inode;

        /* synchronize copy up and more */
        struct mutex lock;
};

static inline struct ovl_inode_info *OVL_I_INFO(struct inode *inode)
{
        return &OVL_I(inode)->info;
}

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 23/31] ovl: Set redirect on metacopy files upon rename
  2018-04-26 19:10 ` [PATCH v14 23/31] ovl: Set redirect on metacopy files upon rename Vivek Goyal
@ 2018-04-28  8:25   ` Amir Goldstein
  0 siblings, 0 replies; 67+ messages in thread
From: Amir Goldstein @ 2018-04-28  8:25 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Apr 26, 2018 at 12:10 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> Set redirect on metacopy files upon rename. This will help find data dentry
> in lower dirs.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

Minus some nits below..

> ---
>  fs/overlayfs/dir.c | 54 ++++++++++++++++++++++++++++++++++++++----------------
>  1 file changed, 38 insertions(+), 16 deletions(-)
>
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index 34745ccea682..1acea6887b05 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -890,6 +890,30 @@ static int ovl_set_redirect(struct dentry *dentry, bool samedir)
>         return err;
>  }
>
> +static bool ovl_rel_redirect(struct dentry *dentry, bool samedir)

If this needs to be a helper at all it should be called from within
ovl_sel_redirect() instead of all call sites and name should be
something like ovl_need_absolute_redirect()

> +{
> +       struct dentry *lowerdentry;
> +
> +       if (d_is_dir(dentry) || !samedir)
> +               return samedir;
> +
> +       /*
> +        * For non-dir hardlinked files, we need absolute redirects
> +        * in general as two upper hardlinks could be in different
> +        * dirs. We could put a relative redirect now and convert
> +        * it to absolute redirect later. But when nlink > 1 and
> +        * indexing is on, that means relative redirect needs to be
> +        * converted to absolute during copy up of another lower
> +        * hardllink as well.
> +        *
> +        * So without optimizing too much, just check if lower is
> +        * a hard link or not. If lower is hard link, put absolute
> +        * redirect.
> +        */
> +       lowerdentry = ovl_dentry_lower(dentry);
> +       return (d_inode(lowerdentry)->i_nlink > 1 ? false : true);

return d_inode(lowerdentry)->i_nlink == 1

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 30/31] ovl: Do not do metacopy only for ioctl modifying file attr
  2018-04-26 19:10 ` [PATCH v14 30/31] ovl: Do not do metacopy only for ioctl modifying file attr Vivek Goyal
@ 2018-04-28  8:31   ` Amir Goldstein
  0 siblings, 0 replies; 67+ messages in thread
From: Amir Goldstein @ 2018-04-28  8:31 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Apr 26, 2018 at 12:10 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> ovl_copy_up() by default will only do metadata only copy up (if enabled).
> That means when ovl_real_ioctl() calls ovl_real_file(), it will still
> get the lower file (as ovl_real_file() opens data file and not metacopy).
> And that means "chattr +i" will end up modifying lower inode.
>
> There seem to be two ways to solve this.
> A. Open metacopy file in ovl_real_ioctl() and do operations on that
> B. Force full copy up when FS_IOC_SETFLAGS is called.
>
> I am resorting to option B for now as it feels little safer option. If
> there are performance issues due to this, we can revisit it.

I agree with your choice. This doesn't seem like a performance
sensitive use case, so simplicity is better.

>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

+ nit

> ---
>  fs/overlayfs/file.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> index d61c8355b892..9a9efaac71f4 100644
> --- a/fs/overlayfs/file.c
> +++ b/fs/overlayfs/file.c
> @@ -397,7 +397,7 @@ long ovl_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
>                 if (ret)
>                         return ret;
>
> -               ret = ovl_copy_up(file_dentry(file));
> +               ret = ovl_copy_up_flags(file_dentry(file), O_WRONLY);

2 options: either add a comment or a wrapper with meaningful name like
ovl_copy_up_with_data.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 19/31] ovl: Check redirects for metacopy files
  2018-04-26 19:10 ` [PATCH v14 19/31] ovl: Check redirects for metacopy files Vivek Goyal
@ 2018-04-28  8:32   ` Amir Goldstein
  0 siblings, 0 replies; 67+ messages in thread
From: Amir Goldstein @ 2018-04-28  8:32 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Apr 26, 2018 at 12:10 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> Right now we rely on path based lookup for data origin of metacopy upper.
> This will work only if upper has not been renamed. We solved this problem
> already for merged directories using redirect. Use same logic for metacopy
> files.
>
> This patch just goes on to check redirects for metacopy files.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>

Reviewed-by: Amir Goldstein <amir73il@gmail.com>


> ---
>  fs/overlayfs/namei.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> index 880c7faea41a..41cbde97a212 100644
> --- a/fs/overlayfs/namei.c
> +++ b/fs/overlayfs/namei.c
> @@ -268,7 +268,8 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
>                         goto out_err;
>                 d->stop = !err;
>                 d->metacopy = !!err;
> -               goto out;
> +               if (!d->metacopy || d->last)
> +                       goto out;
>         } else {
>                 if (last_element) {
>                         if (d->metacopy) {
> @@ -874,7 +875,6 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                 }
>                 if (upperdentry && !d.is_dir) {
>                         unsigned int origin_ctr = 0;
> -                       BUG_ON(d.redirect);
>                         /*
>                          * Lookup copy up origin by decoding origin file handle.
>                          * We may get a disconnected dentry, which is fine,
> --
> 2.13.6
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 21/31] ovl: Add an inode flag OVL_CONST_INO
  2018-04-26 19:10 ` [PATCH v14 21/31] ovl: Add an inode flag OVL_CONST_INO Vivek Goyal
@ 2018-04-28  8:33   ` Amir Goldstein
  0 siblings, 0 replies; 67+ messages in thread
From: Amir Goldstein @ 2018-04-28  8:33 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Apr 26, 2018 at 12:10 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> Add an ovl_inode flag OVL_CONST_INO. This flag signifies if inode number
> will remain constant over copy up or not. This flag does not get updated
> over copy up and remains unmodifed after setting once.
>
> Next patch in the series will make use of this flag. It will basically
> figure out if dentry is of type ORIGIN or not. And this can be derived
> by this flag.
>
> ORIGIN = (upperdentry && ovl_test_flag(OVL_CONST_INO, inode)).
>
> Suggested-by: Amir Goldstein <amir73il@gmail.com>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

> ---
>  fs/overlayfs/inode.c     | 3 +++
>  fs/overlayfs/overlayfs.h | 2 ++
>  2 files changed, 5 insertions(+)
>
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index a6cd57741a1f..19e96d4717bd 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -881,6 +881,9 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
>
>         OVL_I(inode)->redirect = redirect;
>
> +       if (bylower)
> +               ovl_set_flag(OVL_CONST_INO, inode);
> +
>         /* Check for non-merge dir that may have whiteouts */
>         if (is_dir) {
>                 if (((upperdentry && lowerdentry) || numlower > 1) ||
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index e24af484967f..e2c6a2f5addf 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -38,6 +38,8 @@ enum ovl_inode_flag {
>         OVL_WHITEOUTS,
>         OVL_INDEX,
>         OVL_UPPERDATA,
> +       /* Inode number will remain constant over copy up. */
> +       OVL_CONST_INO,
>  };
>
>  enum ovl_entry_flag {
> --
> 2.13.6
>

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 29/31] ovl: Do not do metadata only copy-up for truncate operation
  2018-04-26 19:10 ` [PATCH v14 29/31] ovl: Do not do metadata only copy-up for truncate operation Vivek Goyal
@ 2018-04-28  8:35   ` Amir Goldstein
  0 siblings, 0 replies; 67+ messages in thread
From: Amir Goldstein @ 2018-04-28  8:35 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Apr 26, 2018 at 12:10 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> truncate should copy up full file (and not do metacopy only), otherwise
> it will be broken. For example, use truncate to increase size of a file
> so that any read beyong existing size will return null bytes. If we
> don't copy up full file, then we end up opening lower file and read
> from it only reads upto the old size (and not new size after truncate).
> Hence to avoid such situations, copy up data as well when file size
> changes.
>
> So far it was being done by d_real(O_WRONLY) call in truncate()
> path. Now that patch has been reverted. So force full copy up in
> ovl_setattr() if size of file is changing.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>

> ---
>  fs/overlayfs/inode.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 19e96d4717bd..0802b67e0776 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -18,7 +18,7 @@
>
>  int ovl_setattr(struct dentry *dentry, struct iattr *attr)
>  {
> -       int err;
> +       int err, copy_up_flags = 0;
>         struct dentry *upperdentry;
>         const struct cred *old_cred;
>
> @@ -45,9 +45,12 @@ int ovl_setattr(struct dentry *dentry, struct iattr *attr)
>                 err = -ETXTBSY;
>                 if (atomic_read(&realinode->i_writecount) < 0)
>                         goto out_drop_write;
> +
> +               /* Truncate should trigger data copy up as well */
> +               copy_up_flags = O_WRONLY;
>         }
>
> -       err = ovl_copy_up(dentry);
> +       err = ovl_copy_up_flags(dentry, copy_up_flags);
>         if (!err) {
>                 struct inode *winode = NULL;
>
> --
> 2.13.6
>

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 24/31] ovl: Set redirect on upper inode when it is linked
  2018-04-26 19:10 ` [PATCH v14 24/31] ovl: Set redirect on upper inode when it is linked Vivek Goyal
@ 2018-04-28  8:40   ` Amir Goldstein
  0 siblings, 0 replies; 67+ messages in thread
From: Amir Goldstein @ 2018-04-28  8:40 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Apr 26, 2018 at 12:10 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> When we create a hardlink to a metacopy upper file, first the redirect
> on that inode. Path based lookup will not work with newly created link
> and redirect will solve that issue.
>
> Also use absolute redirect as two hardlinks could be in different directores
> and relative redirect will not work.
>
> I have not put any additional locking around setting redirects while
> introducing redirects for non-dir files. For now it feels like existing
> locking is sufficient. If that's not the case, we will have add more
> locking. Following is my rationale about why do I think current locking
> seems ok.
>
> Basic problem for non-dir files is that more than on dentry could be
> pointing to same inode and in theory only relying on dentry based locks
> (d->d_lock) did not seem sufficient.
>
> We set redirect upon rename and upon link creation. In both the paths
> for non-dir file, VFS locks both source and target inodes (->i_rwsem).
> That means vfs rename and link operations on same source and target
> can't he happening in parallel (Even if there are multiple dentries
> pointing to same inode). So that probably means that at a time on an inode,
> only one call of ovl_set_redirect() could be working and we don't need
> additional locking in ovl_set_redirect().
>
> ovl_inode->redirect is initialized only when inode is created new. That
> means it should not race with any other path and setting ovl_inode->redirect
> should be fine.
>
> Reading of ovl_inode->redirect happens in ovl_get_redirect() path. And this
> called only in ovl_set_redirect(). And ovl_set_redirect() already seemed
> to be protected using ->i_rwsem. That means ovl_set_redirect() and
> ovl_get_redirect() on source/target inode should not make progress in
> parallel and is mutually exclusive. Hence no additional locking required.
>
> Now, only case where ovl_set_redirect() and ovl_get_redirect() could race
> seems to be case of absolute redirects where ovl_get_redirect() has to
> travel up the tree. In that case we already take d->d_lock and that should
> be sufficient as directories will not have multiple dentries pointing to
> same inode.
>
> So given VFS locking and current usage of redirect, current locking around
> redirect seems to be ok for non-dir as well. Once we have the logic to
> remove redirect when metacopy file gets copied up, then we probably will
> need additional locking.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

> ---
>  fs/overlayfs/dir.c | 19 +++++++++++++++----
>  1 file changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index 1acea6887b05..546ad7d34160 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -24,6 +24,8 @@ module_param_named(redirect_max, ovl_redirect_max, ushort, 0644);
>  MODULE_PARM_DESC(ovl_redirect_max,
>                  "Maximum length of absolute redirect xattr value");
>
> +static int ovl_set_redirect(struct dentry *dentry, bool samedir);
> +
>  int ovl_cleanup(struct inode *wdir, struct dentry *wdentry)
>  {
>         int err;
> @@ -468,6 +470,9 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
>         const struct cred *old_cred;
>         struct cred *override_cred;
>         struct dentry *parent = dentry->d_parent;
> +       struct dentry *hardlink_upper;
> +
> +       hardlink_upper = hardlink ? ovl_dentry_upper(hardlink) : NULL;
>
>         err = ovl_copy_up(parent);
>         if (err)
> @@ -502,12 +507,19 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
>                 put_cred(override_creds(override_cred));
>                 put_cred(override_cred);
>
> +               if (hardlink && ovl_is_metacopy_dentry(hardlink)) {
> +                       err = ovl_set_redirect(hardlink, false);
> +                       if (err)
> +                               goto out_revert_creds;
> +               }
> +
> +
>                 if (!ovl_dentry_is_whiteout(dentry))
>                         err = ovl_create_upper(dentry, inode, attr,
> -                                               hardlink);
> +                                               hardlink_upper);
>                 else
>                         err = ovl_create_over_whiteout(dentry, inode, attr,
> -                                                       hardlink);
> +                                                       hardlink_upper);
>                 ovl_copytimes_with_parent(dentry);
>         }
>  out_revert_creds:
> @@ -603,8 +615,7 @@ static int ovl_link(struct dentry *old, struct inode *newdir,
>         inode = d_inode(old);
>         ihold(inode);
>
> -       err = ovl_create_or_link(new, inode, NULL, ovl_dentry_upper(old),
> -                                ovl_type_origin(old));
> +       err = ovl_create_or_link(new, inode, NULL, old, ovl_type_origin(old));
>         if (err)
>                 iput(inode);
>
> --
> 2.13.6
>

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 16/31] ovl: Always open file/inode which contains data and not metacopy
  2018-04-26 19:09 ` [PATCH v14 16/31] ovl: Always open file/inode which contains data and not metacopy Vivek Goyal
@ 2018-04-28  8:49   ` Amir Goldstein
  2018-05-03 20:31     ` Vivek Goyal
  0 siblings, 1 reply; 67+ messages in thread
From: Amir Goldstein @ 2018-04-28  8:49 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Apr 26, 2018 at 12:09 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> ovl_open() should open file which contains data and not open metacopy
> inode.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>

This looks fine, but I don't see a reason to separate it from 26/31
first allow only data open, then allow also meta open..
Anyway, I am not sure about the cleanest way to sort all the cases.

> ---
>  fs/overlayfs/file.c | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> index 23638d8ebab5..558a859b2658 100644
> --- a/fs/overlayfs/file.c
> +++ b/fs/overlayfs/file.c
> @@ -18,18 +18,26 @@ static struct file *ovl_open_realfile(const struct file *file)
>  {
>         struct inode *inode = file_inode(file);
>         struct inode *upperinode = ovl_inode_upper(inode);
> -       struct inode *realinode = upperinode ?: ovl_inode_lower(inode);
> +       struct inode *realinode;
>         struct file *realfile;
> +       bool upperreal = false;
>         const struct cred *old_cred;
>
> +       if (upperinode && ovl_has_upperdata(inode))
> +               upperreal = true;
> +
> +       /* Always open file which contains data. Do not open metacopy. */
> +       realinode = upperreal ? upperinode : ovl_inode_lowerdata(inode);
> +
>         old_cred = ovl_override_creds(inode->i_sb);
>         realfile = path_open(&file->f_path, file->f_flags | O_NOATIME,
>                              realinode, current_cred(), false);
>         revert_creds(old_cred);
>
>         pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n",
> -                file, file, upperinode ? 'u' : 'l', file->f_flags,
> -                realfile, IS_ERR(realfile) ? 0 : realfile->f_flags);
> +                file, file, upperreal ? 'u' : 'l',
> +                file->f_flags, realfile,
> +                IS_ERR(realfile) ? 0 : realfile->f_flags);
>
>         return realfile;
>  }
> @@ -80,7 +88,7 @@ static int ovl_real_file(const struct file *file, struct fd *real)
>         real->file = file->private_data;
>
>         /* Has it been copied up since we'd opened it? */
> -       if (unlikely(file_inode(real->file) != ovl_inode_real(inode))) {
> +       if (unlikely(file_inode(real->file) != ovl_inode_real_data(inode))) {
>                 real->flags = FDPUT_FPUT;
>                 real->file = ovl_open_realfile(file);
>
> --
> 2.13.6
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 27/31] ovl: Issue fsync on upper metacopy inode as well
  2018-04-26 19:10 ` [PATCH v14 27/31] ovl: Issue fsync on upper metacopy inode as well Vivek Goyal
@ 2018-04-28  8:54   ` Amir Goldstein
  0 siblings, 0 replies; 67+ messages in thread
From: Amir Goldstein @ 2018-04-28  8:54 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Apr 26, 2018 at 12:10 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> When a fsync happens, it calls fsync on real underlying file. If a file
> has been metadata only copied up on upper, then file containing data
> is the real file and fsync will be called on lower data file.
>
> We should be issuing fsync on upper metacopy inode as well to make sure
> upper inode has been made persistent on disk.

This is papering over a bug in Miklos's patch.
fsync should ONLY ever sync upper inode.
same as a bug that was recently fixed in ovl_dir_fsync().

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 26/31] ovl: Allow ovl_open_realfile() to open metacopy inode
  2018-04-26 19:10 ` [PATCH v14 26/31] ovl: Allow ovl_open_realfile() to open metacopy inode Vivek Goyal
@ 2018-04-28  9:05   ` Amir Goldstein
  0 siblings, 0 replies; 67+ messages in thread
From: Amir Goldstein @ 2018-04-28  9:05 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Apr 26, 2018 at 12:10 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> Most of the time we need to open inode containing data (and not metacopy)
> but when fsync happens, in that case we need to make sure upper metacopy
> inode is fsynced too. In that case we need to open metacopy inode
> temporarily.
>
> Add a parameter to ovl_open_realfile() which specifies whether to open
> data inode or metacopy inode. Later fsync patch will make use of this
> functionality.

I don't think fsync patch is needed so
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> ---
>  fs/overlayfs/file.c | 34 ++++++++++++++++++++++++++++------
>  1 file changed, 28 insertions(+), 6 deletions(-)
>
> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> index 558a859b2658..08387639ba6e 100644
> --- a/fs/overlayfs/file.c
> +++ b/fs/overlayfs/file.c
> @@ -14,7 +14,7 @@
>  #include <linux/uio.h>
>  #include "overlayfs.h"
>
> -static struct file *ovl_open_realfile(const struct file *file)
> +static struct file *ovl_open_realfile(const struct file *file, bool meta)
>  {
>         struct inode *inode = file_inode(file);
>         struct inode *upperinode = ovl_inode_upper(inode);
> @@ -28,7 +28,13 @@ static struct file *ovl_open_realfile(const struct file *file)
>
>         /* Always open file which contains data. Do not open metacopy. */
>         realinode = upperreal ? upperinode : ovl_inode_lowerdata(inode);
> -
> +       if (upperinode && (meta || ovl_has_upperdata(inode))) {
> +               realinode = upperinode;
> +               upperreal = true;
> +       } else {
> +               realinode = meta ? ovl_inode_lower(inode) :
> +                                ovl_inode_lowerdata(inode);
> +       }
>         old_cred = ovl_override_creds(inode->i_sb);
>         realfile = path_open(&file->f_path, file->f_flags | O_NOATIME,
>                              realinode, current_cred(), false);
> @@ -80,17 +86,23 @@ static int ovl_change_flags(struct file *file, unsigned int flags)
>         return 0;
>  }
>
> -static int ovl_real_file(const struct file *file, struct fd *real)
> +static int _ovl_real_file(const struct file *file, struct fd *real, bool meta)

I don't think you need to change this function.
I think you need to use ovl_open_realfile() from ovl_fsync_file()
if you realize this is metacopy case.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 22/31] ovl: Do not set dentry type ORIGIN for broken hardlinks
  2018-04-26 19:10 ` [PATCH v14 22/31] ovl: Do not set dentry type ORIGIN for broken hardlinks Vivek Goyal
@ 2018-04-28  9:14   ` Amir Goldstein
  0 siblings, 0 replies; 67+ messages in thread
From: Amir Goldstein @ 2018-04-28  9:14 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Apr 26, 2018 at 12:10 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> If a dentry has copy up origin, we set flag OVL_PATH_ORIGIN. So far
> this decision was easy that we had to check only for oe->numlower
> and if it is non-zero, we knew there is copy up origin. (For non-dir
> we installed origin dentry in lowerstack[0]).
>
> But we don't create ORGIN xattr for broken hardlinks (index=off). And
> with metacopy feature it is possible that we will install lowerstack[0]
> but ORIGIN xattr is not there. It is data dentry of upper metacopy dentry
> which has been found using regular name based lookup or using REDIRECT.
> So with addition of this new case, just presence of oe->numlower is
> not sufficient to guarantee that ORIGIN xattr is present.
>
> So two differentiate between two cases, look at OVL_CONST_INO flag. If

typo: to differentiate..

> this flag is set and upperdentry is there, that means ORIGIN xattr
> must be present.


This statement is not accurate. For dir we set OVL_CONST_INO
regardless of of ORIGIN xattr, but that is ok because dir is not
a broken hardlink...

> OVL_CONST_INO is not set if lower hardlink is broken
> or will be broken over copy up.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

> ---
>  fs/overlayfs/util.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index f85ff5714852..69acce1c9026 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -133,7 +133,8 @@ enum ovl_path_type ovl_path_type(struct dentry *dentry)
>                  * Non-dir dentry can hold lower dentry of its copy up origin.
>                  */
>                 if (oe->numlower) {
> -                       type |= __OVL_PATH_ORIGIN;
> +                       if (ovl_test_flag(OVL_CONST_INO, d_inode(dentry)))
> +                               type |= __OVL_PATH_ORIGIN;
>                         if (d_is_dir(dentry) ||
>                             !ovl_has_upperdata(d_inode(dentry)))
>                                 type |= __OVL_PATH_MERGE;
> --
> 2.13.6
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 25/31] ovl: Check redirect on index as well
  2018-04-26 19:10 ` [PATCH v14 25/31] ovl: Check redirect on index as well Vivek Goyal
@ 2018-04-28  9:26   ` Amir Goldstein
  0 siblings, 0 replies; 67+ messages in thread
From: Amir Goldstein @ 2018-04-28  9:26 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Apr 26, 2018 at 12:10 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> Right now we seem to check redirect only if upperdentry is found. But it
> is possible that there is no upperdentry but later we found an index.
>
> We need to check redirect on index as well and set it in ovl_inode->redirect.
> Otherwise link code can assume that dentry does not have redirect and
> place a new one which breaks things. In my testing overlay/033 test
> started failing in xfstests. Following are the details.
>
> For example do following.
>
> $ mkdir lower upper work merged
>
> - Make lower dir with 4 links.
>   $ echo "foo" > lower/l0.txt
>   $ ln  lower/l0.txt lower/l1.txt
>   $ ln  lower/l0.txt lower/l2.txt
>   $ ln  lower/l0.txt lower/l3.txt
>
> - Mount with index on and metacopy on.
>
>   $ mount -t overlay -o lowerdir=lower,upperdir=upper,workdir=work,index=on,metacopy=on none merged
>
> - Link lower
>
>   $ ln merged/l0.txt merged/l4.txt
>     (This will metadata copy up of l0.txt and put an absolute redirect
>      /l0.txt)
>
>   $ echo 2 > /proc/sys/vm/drop/caches
>
>   $ ls merged/l1.txt
>   (Now l1.txt will be looked up. There is no upper dentry but there is
>    lower dentry and index will be found. We don't check for redirect on
>    index, hence ovl_inode->redirect will be NULL.)
>
> - Link Upper
>
>   $ ln merged/l4.txt merged/l5.txt
>   (Lookup of l4.txt will use inode from l1.txt lookup which is still in
>    cache. It has ovl_inode->redirect NULL, hence link will put a new
>    redirect and replace /l0.txt with /l4.txt
>
> - Drop caches.
>   echo 2 > /proc/sys/vm/drop_caches
>
> - List l1.txt and it returns -ESTALE
>
>   $ ls merged/l0.txt
>
>   (It returns stale because, we found a metacopy of l0.txt in upper
>    and it has redirect l4.txt but there is no file named l4.txt in
>    lower layer. So lower data copy is not found and -ESTALE is returned.)
>
> So problem here is that we did not process redirect on index. Check
> redirect on index as well and then problem is fixed.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

After addressing comment below

> ---
>  fs/overlayfs/namei.c     |  9 ++++++++-
>  fs/overlayfs/overlayfs.h |  1 +
>  fs/overlayfs/util.c      | 42 ++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 51 insertions(+), 1 deletion(-)
>
> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> index 41cbde97a212..c6ad8117a1f5 100644
> --- a/fs/overlayfs/namei.c
> +++ b/fs/overlayfs/namei.c
> @@ -1067,8 +1067,15 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>
>         if (upperdentry)
>                 ovl_dentry_set_upper_alias(dentry);
> -       else if (index)
> +       else if (index) {
>                 upperdentry = dget(index);
> +               upperredirect = ovl_get_redirect_xattr(upperdentry);
> +               if (IS_ERR(upperredirect)) {
> +                       err = PTR_ERR(upperredirect);
> +                       upperredirect = NULL;
> +                       goto out_free_oe;
> +               }
> +       }
>
>         if (upperdentry || ctr) {
>                 struct dentry *lowerdata = NULL;
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index e2c6a2f5addf..a0f3a6fa3029 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -284,6 +284,7 @@ int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir);
>  void ovl_copytimes(struct inode *inode);
>  int ovl_check_metacopy_xattr(struct dentry *dentry);
>  bool ovl_is_metacopy_dentry(struct dentry *dentry);
> +char *ovl_get_redirect_xattr(struct dentry *dentry);
>
>  static inline void ovl_copytimes_with_parent(struct dentry *dentry)
>  {
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 69acce1c9026..1a5691a9e425 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -889,3 +889,45 @@ bool ovl_is_metacopy_dentry(struct dentry *dentry)
>
>         return (oe->numlower > 1);
>  }
> +
> +char *ovl_get_redirect_xattr(struct dentry *dentry)
> +{
> +       int res;
> +       char *s, *next, *buf = NULL;
> +
> +       res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, NULL, 0);
> +       if (res < 0) {
> +               if (res == -ENODATA || res == -EOPNOTSUPP)
> +                       return NULL;
> +               return ERR_PTR(res);
> +       }
> +
> +       buf = kzalloc(res + 1, GFP_KERNEL);
> +       if (!buf)
> +               return ERR_PTR(-ENOMEM);
> +
> +       res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, buf, res);
> +       if (res < 0) {
> +               kfree(buf);
> +               return ERR_PTR(res);
> +        }
> +       if (res == 0)
> +               goto invalid;
> +
> +       if (buf[0] == '/') {
> +               for (s = buf; *s++ == '/'; s = next) {
> +                       next = strchrnul(s, '/');
> +                       if (s == next)
> +                               goto invalid;
> +               }
> +       } else {
> +               if (strchr(buf, '/') != NULL)
> +                       goto invalid;
> +       }
> +
> +       return buf;
> +invalid:
> +       pr_warn_ratelimited("overlayfs: invalid redirect (%s)\n", buf);
> +       kfree(buf);
> +       return ERR_PTR(-EINVAL);
> +}
> --

Please add argument 'postlen' or 'padding' and use this helper
from ovl_check_redirect() instead of duplicating code.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 28/31] ovl: Disbale metacopy for MAP_SHARED mmap()
  2018-04-26 19:10 ` [PATCH v14 28/31] ovl: Disbale metacopy for MAP_SHARED mmap() Vivek Goyal
@ 2018-04-28  9:28   ` Amir Goldstein
  0 siblings, 0 replies; 67+ messages in thread
From: Amir Goldstein @ 2018-04-28  9:28 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, Apr 26, 2018 at 12:10 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> When user chose the option of copying up a file when mmap(MAP_SHARED)
> happens, then do full copy up and not just metacopy.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

+ nit

> ---
>  fs/overlayfs/file.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> index be531b79d2e5..d61c8355b892 100644
> --- a/fs/overlayfs/file.c
> +++ b/fs/overlayfs/file.c
> @@ -304,7 +304,7 @@ static int ovl_pre_mmap(struct file *file, unsigned long prot,
>          * later.
>          */
>         if ((flag & MAP_SHARED) && ovl_copy_up_shared(file_inode(file)->i_sb))
> -               err = ovl_copy_up(file_dentry(file));
> +               err = ovl_copy_up_flags(file_dentry(file), O_WRONLY);

Either add comment or use wrapper ovl_full_copy_up().

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 12/31] ovl: Add an helper to get real data dentry
  2018-04-26 20:33   ` Amir Goldstein
@ 2018-04-30 13:06     ` Vivek Goyal
  0 siblings, 0 replies; 67+ messages in thread
From: Vivek Goyal @ 2018-04-30 13:06 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Thu, Apr 26, 2018 at 01:33:50PM -0700, Amir Goldstein wrote:

[..]
> > +/* Return real dentry which contains data. Skip metacopy dentries */
> > +struct dentry *ovl_dentry_real_data(struct dentry *dentry)
> 
> The name real_data is a bit inconsistent with _upperdata and
> _lowerdata. Don't you think _realdata would be more coherent.

_realdata sounds good. I was wondering about that too. Will change.

Vivek

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 09/31] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-04-28  8:14   ` Amir Goldstein
@ 2018-04-30 14:02     ` Vivek Goyal
  2018-04-30 18:08       ` Amir Goldstein
  2018-04-30 19:32     ` Vivek Goyal
  1 sibling, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-04-30 14:02 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Sat, Apr 28, 2018 at 01:14:24AM -0700, Amir Goldstein wrote:
> On Thu, Apr 26, 2018 at 12:09 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > This patch modifies ovl_lookup() and friends to lookup metacopy dentries.
> > It also allows for presence of metacopy dentries in lower layer.
> >
> > During lookup, check for presence of OVL_XATTR_METACOPY and if not present,
> > set OVL_UPPERDATA bit in flags.
> >
> > We don't support metacopy feature with nfs_export. So in nfs_export code,
> > we set OVL_UPPERDATA flag set unconditionally if upper inode exists.
> >
> > Do not follow metacopy origin if we find a metacopy only inode and metacopy
> > feature is not enabled for that mount. Like redirect, this can have security
> > implications where an attacker could hand craft upper and try to gain
> > access to file on lower which it should not have to begin with.
> >
> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> 
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> 
> Except for one comment below...
> 
> > ---
> >  fs/overlayfs/export.c    |   3 ++
> >  fs/overlayfs/inode.c     |  11 ++++-
> >  fs/overlayfs/namei.c     | 102 ++++++++++++++++++++++++++++++++++++++++-------
> >  fs/overlayfs/overlayfs.h |   1 +
> >  fs/overlayfs/util.c      |  22 ++++++++++
> >  5 files changed, 124 insertions(+), 15 deletions(-)
> >
> > diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
> > index 5b7fee1a81ec..e1f6546b6c84 100644
> > --- a/fs/overlayfs/export.c
> > +++ b/fs/overlayfs/export.c
> > @@ -311,6 +311,9 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
> >                 return ERR_CAST(inode);
> >         }
> >
> > +       if (upper)
> > +               ovl_set_flag(OVL_UPPERDATA, inode);
> > +
> >         dentry = d_find_any_alias(inode);
> >         if (!dentry) {
> >                 dentry = d_alloc_anon(inode->i_sb);
> > diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> > index 59b1c84076ec..40509b7a50d2 100644
> > --- a/fs/overlayfs/inode.c
> > +++ b/fs/overlayfs/inode.c
> > @@ -776,7 +776,7 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
> >         struct dentry *lowerdentry = lowerpath ? lowerpath->dentry : NULL;
> >         bool bylower = ovl_hash_bylower(sb, upperdentry, lowerdentry, index);
> >         int fsid = bylower ? lowerpath->layer->fsid : 0;
> > -       bool is_dir;
> > +       bool is_dir, metacopy = false;
> >         unsigned long ino = 0;
> >         int err = -ENOMEM;
> >
> > @@ -836,6 +836,15 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
> >         if (index)
> >                 ovl_set_flag(OVL_INDEX, inode);
> >
> > +       if (upperdentry) {
> > +               err = ovl_check_metacopy_xattr(upperdentry);
> > +               if (err < 0)
> > +                       goto out_err;
> > +               metacopy = err;
> > +               if (!metacopy)
> > +                       ovl_set_flag(OVL_UPPERDATA, inode);
> > +       }
> > +
> 
> There is no reason to ovl_check_metacopy_xattr again here, right?

I think we need to check metacopy here otherwise it becomes racy. For
example, what if there is a hard link (say, l1 and l2) with metacopy xattr.
ovl_lookup(l1) will think metacopy is on while another thread on another
cpu might have trigged copy up, remove metacopy xattr. And it is possible
that inode got flushed out of cache. So by the time ovl_lookup(l1), calls
iget5_locked(), it will get a new inode and it will initialize inode
with wrong information.

I had done similar thing for REDIRECT, but once we removed logic to
remove REDIRECT on copy up, I felt I did not have to check redirect
again here.

In general, I feel that once we have the inode lock, we should check
metacopy and redirect both and then initialize inode. And not rely
on information which was checked outside the lock and might have
been stale by now.


> so it should get the same treatment as upperredirect.
> This would make 3 new arguments to ovl_get_inode added by your patch set.
> 
> How about initializing this struct during lookup and passing it into
> ovl_get_inode()?:

I will look into introducing ovl_inode_info.

Vivek

> 
> struct ovl_inode_info {
>         struct ovl_dir_cache *cache;
>         const char *redirect;
>         u64 version;
>         unsigned long flags;
>         struct dentry *__upperdentry;
>         struct inode *lower;
> };
> 
> struct ovl_inode {
>         struct ovl_inode_info info;
>         struct inode vfs_inode;
> 
>         /* synchronize copy up and more */
>         struct mutex lock;
> };
> 
> static inline struct ovl_inode_info *OVL_I_INFO(struct inode *inode)
> {
>         return &OVL_I(inode)->info;
> }
> 
> Thanks,
> Amir.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 09/31] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-04-30 14:02     ` Vivek Goyal
@ 2018-04-30 18:08       ` Amir Goldstein
  2018-05-01 14:37         ` Amir Goldstein
  2018-05-02 19:09         ` Vivek Goyal
  0 siblings, 2 replies; 67+ messages in thread
From: Amir Goldstein @ 2018-04-30 18:08 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Mon, Apr 30, 2018 at 5:02 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Sat, Apr 28, 2018 at 01:14:24AM -0700, Amir Goldstein wrote:
>> On Thu, Apr 26, 2018 at 12:09 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> > This patch modifies ovl_lookup() and friends to lookup metacopy dentries.
>> > It also allows for presence of metacopy dentries in lower layer.
>> >
>> > During lookup, check for presence of OVL_XATTR_METACOPY and if not present,
>> > set OVL_UPPERDATA bit in flags.
>> >
>> > We don't support metacopy feature with nfs_export. So in nfs_export code,
>> > we set OVL_UPPERDATA flag set unconditionally if upper inode exists.
>> >
>> > Do not follow metacopy origin if we find a metacopy only inode and metacopy
>> > feature is not enabled for that mount. Like redirect, this can have security
>> > implications where an attacker could hand craft upper and try to gain
>> > access to file on lower which it should not have to begin with.
>> >
>> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
>>
>> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
>>
>> Except for one comment below...
>>
>> > ---
>> >  fs/overlayfs/export.c    |   3 ++
>> >  fs/overlayfs/inode.c     |  11 ++++-
>> >  fs/overlayfs/namei.c     | 102 ++++++++++++++++++++++++++++++++++++++++-------
>> >  fs/overlayfs/overlayfs.h |   1 +
>> >  fs/overlayfs/util.c      |  22 ++++++++++
>> >  5 files changed, 124 insertions(+), 15 deletions(-)
>> >
>> > diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
>> > index 5b7fee1a81ec..e1f6546b6c84 100644
>> > --- a/fs/overlayfs/export.c
>> > +++ b/fs/overlayfs/export.c
>> > @@ -311,6 +311,9 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
>> >                 return ERR_CAST(inode);
>> >         }
>> >
>> > +       if (upper)
>> > +               ovl_set_flag(OVL_UPPERDATA, inode);
>> > +
>> >         dentry = d_find_any_alias(inode);
>> >         if (!dentry) {
>> >                 dentry = d_alloc_anon(inode->i_sb);
>> > diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
>> > index 59b1c84076ec..40509b7a50d2 100644
>> > --- a/fs/overlayfs/inode.c
>> > +++ b/fs/overlayfs/inode.c
>> > @@ -776,7 +776,7 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
>> >         struct dentry *lowerdentry = lowerpath ? lowerpath->dentry : NULL;
>> >         bool bylower = ovl_hash_bylower(sb, upperdentry, lowerdentry, index);
>> >         int fsid = bylower ? lowerpath->layer->fsid : 0;
>> > -       bool is_dir;
>> > +       bool is_dir, metacopy = false;
>> >         unsigned long ino = 0;
>> >         int err = -ENOMEM;
>> >
>> > @@ -836,6 +836,15 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
>> >         if (index)
>> >                 ovl_set_flag(OVL_INDEX, inode);
>> >
>> > +       if (upperdentry) {
>> > +               err = ovl_check_metacopy_xattr(upperdentry);
>> > +               if (err < 0)
>> > +                       goto out_err;
>> > +               metacopy = err;
>> > +               if (!metacopy)
>> > +                       ovl_set_flag(OVL_UPPERDATA, inode);
>> > +       }
>> > +
>>
>> There is no reason to ovl_check_metacopy_xattr again here, right?
>
> I think we need to check metacopy here otherwise it becomes racy. For
> example, what if there is a hard link (say, l1 and l2) with metacopy xattr.
> ovl_lookup(l1) will think metacopy is on while another thread on another
> cpu might have trigged copy up, remove metacopy xattr. And it is possible
> that inode got flushed out of cache. So by the time ovl_lookup(l1), calls
> iget5_locked(), it will get a new inode and it will initialize inode
> with wrong information.
>
> I had done similar thing for REDIRECT, but once we removed logic to
> remove REDIRECT on copy up, I felt I did not have to check redirect
> again here.
>
> In general, I feel that once we have the inode lock, we should check
> metacopy and redirect both and then initialize inode. And not rely
> on information which was checked outside the lock and might have
> been stale by now.
>

Hmmm... so as you once already said, we have a race with INDEX as well.
In general, I feel it would be better to do a minimal sanity check inside
inode lock for INDEX and METACOPY (maybe REDIRECT) and make
sure they are consistent with the info in ovl_inode_info.
If they are not, return -ESTALE and that may result in re-calling
lookup.

Maybe Miklos has a better idea?

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 09/31] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-04-28  8:14   ` Amir Goldstein
  2018-04-30 14:02     ` Vivek Goyal
@ 2018-04-30 19:32     ` Vivek Goyal
  2018-04-30 20:21       ` Amir Goldstein
  1 sibling, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-04-30 19:32 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Sat, Apr 28, 2018 at 01:14:24AM -0700, Amir Goldstein wrote:

[..]
> > @@ -836,6 +836,15 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
> >         if (index)
> >                 ovl_set_flag(OVL_INDEX, inode);
> >
> > +       if (upperdentry) {
> > +               err = ovl_check_metacopy_xattr(upperdentry);
> > +               if (err < 0)
> > +                       goto out_err;
> > +               metacopy = err;
> > +               if (!metacopy)
> > +                       ovl_set_flag(OVL_UPPERDATA, inode);
> > +       }
> > +
> 
> There is no reason to ovl_check_metacopy_xattr again here, right?
> so it should get the same treatment as upperredirect.
> This would make 3 new arguments to ovl_get_inode added by your patch set.
> 
> How about initializing this struct during lookup and passing it into
> ovl_get_inode()?:

How about just creating another structure to pass in parameters to
ovl_get_inode() and then let it extract relevant info and initialize
inode.  Something like this.

---
 fs/overlayfs/export.c    |    6 ++++--
 fs/overlayfs/inode.c     |   26 +++++++++++++-------------
 fs/overlayfs/namei.c     |    6 ++++--
 fs/overlayfs/overlayfs.h |    5 +----
 fs/overlayfs/ovl_entry.h |   10 ++++++++++
 5 files changed, 32 insertions(+), 21 deletions(-)

Index: rhvgoyal-linux/fs/overlayfs/ovl_entry.h
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/ovl_entry.h	2018-04-30 14:45:57.945867987 -0400
+++ rhvgoyal-linux/fs/overlayfs/ovl_entry.h	2018-04-30 14:53:48.275867987 -0400
@@ -103,6 +103,16 @@ struct ovl_inode {
 	struct mutex lock;
 };
 
+struct ovl_inode_params {
+	struct super_block *sb;
+	struct dentry *upperdentry;
+	struct ovl_path *lowerpath;
+	struct dentry *index;
+	unsigned int numlower;
+	char *redirect;
+	struct dentry *lowerdata;
+};
+
 static inline struct ovl_inode *OVL_I(struct inode *inode)
 {
 	return container_of(inode, struct ovl_inode, vfs_inode);
Index: rhvgoyal-linux/fs/overlayfs/namei.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/namei.c	2018-04-30 14:32:37.608867987 -0400
+++ rhvgoyal-linux/fs/overlayfs/namei.c	2018-04-30 15:27:28.393867987 -0400
@@ -1072,10 +1072,12 @@ struct dentry *ovl_lookup(struct inode *
 
 	if (upperdentry || ctr) {
 		struct dentry *lowerdata = NULL;
+		struct ovl_inode_params oip = {dentry->d_sb, upperdentry, stack,					       index, ctr, upperredirect};
 		if (ctr > 1 && !d.is_dir)
 			lowerdata = stack[ctr - 1].dentry;
-		inode = ovl_get_inode(dentry->d_sb, upperdentry, stack, index,
-				      ctr, upperredirect, lowerdata);
+
+		oip.lowerdata = lowerdata;
+		inode = ovl_get_inode(&oip);
 		err = PTR_ERR(inode);
 		if (IS_ERR(inode))
 			goto out_free_oe;
Index: rhvgoyal-linux/fs/overlayfs/inode.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/inode.c	2018-04-30 14:45:51.749867987 -0400
+++ rhvgoyal-linux/fs/overlayfs/inode.c	2018-04-30 15:20:20.009867987 -0400
@@ -800,16 +800,16 @@ static bool ovl_hash_bylower(struct supe
 	return true;
 }
 
-struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
-			    struct ovl_path *lowerpath, struct dentry *index,
-			    unsigned int numlower, char *redirect,
-			    struct dentry *lowerdata)
+struct inode *ovl_get_inode(struct ovl_inode_params *oip)
 {
+	struct dentry *upperdentry = oip->upperdentry;
+	struct ovl_path *lowerpath = oip->lowerpath;
 	struct inode *realinode = upperdentry ? d_inode(upperdentry) : NULL;
 	struct inode *inode;
 	struct dentry *lowerdentry = lowerpath ? lowerpath->dentry : NULL;
-	bool bylower = ovl_hash_bylower(sb, upperdentry, lowerdentry, index);
-	int fsid = bylower ? lowerpath->layer->fsid : 0;
+	bool bylower = ovl_hash_bylower(oip->sb, upperdentry, lowerdentry,
+					oip->index);
+	int fsid = bylower ? oip->lowerpath->layer->fsid : 0;
 	bool is_dir, metacopy = false;
 	unsigned long ino = 0;
 	int err = -ENOMEM;
@@ -827,7 +827,7 @@ struct inode *ovl_get_inode(struct super
 						      upperdentry);
 		unsigned int nlink = is_dir ? 1 : realinode->i_nlink;
 
-		inode = iget5_locked(sb, (unsigned long) key,
+		inode = iget5_locked(oip->sb, (unsigned long) key,
 				     ovl_inode_test, ovl_inode_set, key);
 		if (!inode)
 			goto out_err;
@@ -844,7 +844,7 @@ struct inode *ovl_get_inode(struct super
 			}
 
 			dput(upperdentry);
-			kfree(redirect);
+			kfree(oip->redirect);
 			goto out;
 		}
 
@@ -855,19 +855,19 @@ struct inode *ovl_get_inode(struct super
 		ino = key->i_ino;
 	} else {
 		/* Lower hardlink that will be broken on copy up */
-		inode = new_inode(sb);
+		inode = new_inode(oip->sb);
 		if (!inode) {
 			err = -ENOMEM;
 			goto out_err;
 		}
 	}
 	ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev, ino, fsid);
-	ovl_inode_init(inode, upperdentry, lowerdentry, lowerdata);
+	ovl_inode_init(inode, upperdentry, lowerdentry, oip->lowerdata);
 
 	if (upperdentry && ovl_is_impuredir(upperdentry))
 		ovl_set_flag(OVL_IMPURE, inode);
 
-	if (index)
+	if (oip->index)
 		ovl_set_flag(OVL_INDEX, inode);
 
 	if (upperdentry) {
@@ -879,14 +879,14 @@ struct inode *ovl_get_inode(struct super
 			ovl_set_flag(OVL_UPPERDATA, inode);
 	}
 
-	OVL_I(inode)->redirect = redirect;
+	OVL_I(inode)->redirect = oip->redirect;
 
 	if (bylower)
 		ovl_set_flag(OVL_CONST_INO, inode);
 
 	/* Check for non-merge dir that may have whiteouts */
 	if (is_dir) {
-		if (((upperdentry && lowerdentry) || numlower > 1) ||
+		if (((upperdentry && lowerdentry) || oip->numlower > 1) ||
 		    ovl_check_origin_xattr(upperdentry ?: lowerdentry)) {
 			ovl_set_flag(OVL_WHITEOUTS, inode);
 		}
Index: rhvgoyal-linux/fs/overlayfs/overlayfs.h
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/overlayfs.h	2018-04-30 14:32:37.608867987 -0400
+++ rhvgoyal-linux/fs/overlayfs/overlayfs.h	2018-04-30 15:07:09.204867987 -0400
@@ -375,10 +375,7 @@ bool ovl_is_private_xattr(const char *na
 struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev);
 struct inode *ovl_lookup_inode(struct super_block *sb, struct dentry *real,
 			       bool is_upper);
-struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
-			    struct ovl_path *lowerpath, struct dentry *index,
-			    unsigned int numlower, char *redirect,
-			    struct dentry *lowerdata);
+struct inode *ovl_get_inode(struct ovl_inode_params *oip);
 static inline void ovl_copyattr(struct inode *from, struct inode *to)
 {
 	to->i_uid = from->i_uid;
Index: rhvgoyal-linux/fs/overlayfs/export.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/export.c	2018-04-30 09:41:53.656867987 -0400
+++ rhvgoyal-linux/fs/overlayfs/export.c	2018-04-30 15:26:45.957867987 -0400
@@ -300,13 +300,15 @@ static struct dentry *ovl_obtain_alias(s
 	struct dentry *dentry;
 	struct inode *inode;
 	struct ovl_entry *oe;
+	struct ovl_inode_params oip = {sb, NULL, lowerpath, index, !!lower,
+				       NULL, NULL};
 
 	/* We get overlay directory dentries with ovl_lookup_real() */
 	if (d_is_dir(upper ?: lower))
 		return ERR_PTR(-EIO);
 
-	inode = ovl_get_inode(sb, dget(upper), lowerpath, index, !!lower, NULL,
-			      NULL);
+	oip.upperdentry = dget(upper);
+	inode = ovl_get_inode(&oip);
 	if (IS_ERR(inode)) {
 		dput(upper);
 		return ERR_CAST(inode);

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 09/31] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-04-30 19:32     ` Vivek Goyal
@ 2018-04-30 20:21       ` Amir Goldstein
  0 siblings, 0 replies; 67+ messages in thread
From: Amir Goldstein @ 2018-04-30 20:21 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Mon, Apr 30, 2018 at 10:32 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Sat, Apr 28, 2018 at 01:14:24AM -0700, Amir Goldstein wrote:
>
> [..]
>> > @@ -836,6 +836,15 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
>> >         if (index)
>> >                 ovl_set_flag(OVL_INDEX, inode);
>> >
>> > +       if (upperdentry) {
>> > +               err = ovl_check_metacopy_xattr(upperdentry);
>> > +               if (err < 0)
>> > +                       goto out_err;
>> > +               metacopy = err;
>> > +               if (!metacopy)
>> > +                       ovl_set_flag(OVL_UPPERDATA, inode);
>> > +       }
>> > +
>>
>> There is no reason to ovl_check_metacopy_xattr again here, right?
>> so it should get the same treatment as upperredirect.
>> This would make 3 new arguments to ovl_get_inode added by your patch set.
>>
>> How about initializing this struct during lookup and passing it into
>> ovl_get_inode()?:
>
> How about just creating another structure to pass in parameters to
> ovl_get_inode() and then let it extract relevant info and initialize
> inode.  Something like this.

Its definitely better than current state of affairs.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 09/31] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-04-30 18:08       ` Amir Goldstein
@ 2018-05-01 14:37         ` Amir Goldstein
  2018-05-03 15:12           ` Vivek Goyal
  2018-05-02 19:09         ` Vivek Goyal
  1 sibling, 1 reply; 67+ messages in thread
From: Amir Goldstein @ 2018-05-01 14:37 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Mon, Apr 30, 2018 at 9:08 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Mon, Apr 30, 2018 at 5:02 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
[...]
>>
>> I think we need to check metacopy here otherwise it becomes racy. For
>> example, what if there is a hard link (say, l1 and l2) with metacopy xattr.
>> ovl_lookup(l1) will think metacopy is on while another thread on another
>> cpu might have trigged copy up, remove metacopy xattr. And it is possible
>> that inode got flushed out of cache. So by the time ovl_lookup(l1), calls
>> iget5_locked(), it will get a new inode and it will initialize inode
>> with wrong information.
>>
>> I had done similar thing for REDIRECT, but once we removed logic to
>> remove REDIRECT on copy up, I felt I did not have to check redirect
>> again here.
>>
>> In general, I feel that once we have the inode lock, we should check
>> metacopy and redirect both and then initialize inode. And not rely
>> on information which was checked outside the lock and might have
>> been stale by now.
>>
>
> Hmmm... so as you once already said, we have a race with INDEX as well.
> In general, I feel it would be better to do a minimal sanity check inside
> inode lock for INDEX and METACOPY (maybe REDIRECT) and make
> sure they are consistent with the info in ovl_inode_info.
> If they are not, return -ESTALE and that may result in re-calling
> lookup.
>

After thinking about it some more, I believe the way we should do it
is implement ovl_iget_locked() and call it from ovl_lookup() *before*
we lookup index.

Then, *after* looking up index,
If this turns out to be a broken hardlink, we throw away the
inode that we got by lower and get a new inode hashed by upper.

Then we can initialize the inode with another helper
or directly in ovl_lookup() and unlock_new_inode().

w.r.t METACOPY things will get complicated if we remove
METACOPY xattr and the upper does not have ORIGIN.
In that case we may have an inode that is already hashed by lower,
but for a new lookup that does not see METACOPY xattr it
looks up the inode in cache by upper and this is not good.

There is one solution that you will surely not like -
instead of removing METACOPY xattr set METACOPY=n.
That means we still need to lookup lower layers for origin inode,
but we do set UPPERDATA flag.

There may be other solutions to the problem, but I cannot think
of any right now.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 09/31] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-04-30 18:08       ` Amir Goldstein
  2018-05-01 14:37         ` Amir Goldstein
@ 2018-05-02 19:09         ` Vivek Goyal
  2018-05-02 19:40           ` Vivek Goyal
  1 sibling, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-05-02 19:09 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Mon, Apr 30, 2018 at 09:08:01PM +0300, Amir Goldstein wrote:
[..]
> >> > @@ -836,6 +836,15 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
> >> >         if (index)
> >> >                 ovl_set_flag(OVL_INDEX, inode);
> >> >
> >> > +       if (upperdentry) {
> >> > +               err = ovl_check_metacopy_xattr(upperdentry);
> >> > +               if (err < 0)
> >> > +                       goto out_err;
> >> > +               metacopy = err;
> >> > +               if (!metacopy)
> >> > +                       ovl_set_flag(OVL_UPPERDATA, inode);
> >> > +       }
> >> > +
> >>
> >> There is no reason to ovl_check_metacopy_xattr again here, right?
> >
> > I think we need to check metacopy here otherwise it becomes racy. For
> > example, what if there is a hard link (say, l1 and l2) with metacopy xattr.
> > ovl_lookup(l1) will think metacopy is on while another thread on another
> > cpu might have trigged copy up, remove metacopy xattr. And it is possible
> > that inode got flushed out of cache. So by the time ovl_lookup(l1), calls
> > iget5_locked(), it will get a new inode and it will initialize inode
> > with wrong information.
> >
> > I had done similar thing for REDIRECT, but once we removed logic to
> > remove REDIRECT on copy up, I felt I did not have to check redirect
> > again here.
> >
> > In general, I feel that once we have the inode lock, we should check
> > metacopy and redirect both and then initialize inode. And not rely
> > on information which was checked outside the lock and might have
> > been stale by now.
> >
> 
> Hmmm... so as you once already said, we have a race with INDEX as well.

I am not sure about INDEX race. I remember talking about INDEX flag race
in ovl_inode but w.r.t ovl_get_inode(), what's the race you are seeing
with index. Once ovl_lookup() has found index, can that index go away by
the time ovl_get_inode() is called? If not, then there should not be
any race.

Vivek

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 09/31] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-02 19:09         ` Vivek Goyal
@ 2018-05-02 19:40           ` Vivek Goyal
  2018-05-02 19:57             ` Amir Goldstein
  0 siblings, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-05-02 19:40 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Wed, May 02, 2018 at 03:09:57PM -0400, Vivek Goyal wrote:
> On Mon, Apr 30, 2018 at 09:08:01PM +0300, Amir Goldstein wrote:
> [..]
> > >> > @@ -836,6 +836,15 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
> > >> >         if (index)
> > >> >                 ovl_set_flag(OVL_INDEX, inode);
> > >> >
> > >> > +       if (upperdentry) {
> > >> > +               err = ovl_check_metacopy_xattr(upperdentry);
> > >> > +               if (err < 0)
> > >> > +                       goto out_err;
> > >> > +               metacopy = err;
> > >> > +               if (!metacopy)
> > >> > +                       ovl_set_flag(OVL_UPPERDATA, inode);
> > >> > +       }
> > >> > +
> > >>
> > >> There is no reason to ovl_check_metacopy_xattr again here, right?
> > >
> > > I think we need to check metacopy here otherwise it becomes racy. For
> > > example, what if there is a hard link (say, l1 and l2) with metacopy xattr.
> > > ovl_lookup(l1) will think metacopy is on while another thread on another
> > > cpu might have trigged copy up, remove metacopy xattr. And it is possible
> > > that inode got flushed out of cache. So by the time ovl_lookup(l1), calls
> > > iget5_locked(), it will get a new inode and it will initialize inode
> > > with wrong information.
> > >
> > > I had done similar thing for REDIRECT, but once we removed logic to
> > > remove REDIRECT on copy up, I felt I did not have to check redirect
> > > again here.
> > >
> > > In general, I feel that once we have the inode lock, we should check
> > > metacopy and redirect both and then initialize inode. And not rely
> > > on information which was checked outside the lock and might have
> > > been stale by now.
> > >
> > 
> > Hmmm... so as you once already said, we have a race with INDEX as well.
> 
> I am not sure about INDEX race. I remember talking about INDEX flag race
> in ovl_inode but w.r.t ovl_get_inode(), what's the race you are seeing
> with index. Once ovl_lookup() has found index, can that index go away by
> the time ovl_get_inode() is called? If not, then there should not be
> any race.

May be you are referring to race when ovl_lookup() looks for index
its not there (all lower), and then thread gets blocked, other cpu
triggers a copy up and creates index, flushs ovl_inode and now first
thread continues and does ovl_get_inode() and does not set OVL_INDEX
flag (Despite the fact there is index)?

Thanks
Vivek

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 09/31] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-02 19:40           ` Vivek Goyal
@ 2018-05-02 19:57             ` Amir Goldstein
  2018-05-03 14:33               ` Vivek Goyal
  0 siblings, 1 reply; 67+ messages in thread
From: Amir Goldstein @ 2018-05-02 19:57 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Wed, May 2, 2018 at 10:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Wed, May 02, 2018 at 03:09:57PM -0400, Vivek Goyal wrote:
>> On Mon, Apr 30, 2018 at 09:08:01PM +0300, Amir Goldstein wrote:
>> [..]
>> > >> > @@ -836,6 +836,15 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
>> > >> >         if (index)
>> > >> >                 ovl_set_flag(OVL_INDEX, inode);
>> > >> >
>> > >> > +       if (upperdentry) {
>> > >> > +               err = ovl_check_metacopy_xattr(upperdentry);
>> > >> > +               if (err < 0)
>> > >> > +                       goto out_err;
>> > >> > +               metacopy = err;
>> > >> > +               if (!metacopy)
>> > >> > +                       ovl_set_flag(OVL_UPPERDATA, inode);
>> > >> > +       }
>> > >> > +
>> > >>
>> > >> There is no reason to ovl_check_metacopy_xattr again here, right?
>> > >
>> > > I think we need to check metacopy here otherwise it becomes racy. For
>> > > example, what if there is a hard link (say, l1 and l2) with metacopy xattr.
>> > > ovl_lookup(l1) will think metacopy is on while another thread on another
>> > > cpu might have trigged copy up, remove metacopy xattr. And it is possible
>> > > that inode got flushed out of cache. So by the time ovl_lookup(l1), calls
>> > > iget5_locked(), it will get a new inode and it will initialize inode
>> > > with wrong information.
>> > >
>> > > I had done similar thing for REDIRECT, but once we removed logic to
>> > > remove REDIRECT on copy up, I felt I did not have to check redirect
>> > > again here.
>> > >
>> > > In general, I feel that once we have the inode lock, we should check
>> > > metacopy and redirect both and then initialize inode. And not rely
>> > > on information which was checked outside the lock and might have
>> > > been stale by now.
>> > >
>> >
>> > Hmmm... so as you once already said, we have a race with INDEX as well.
>>
>> I am not sure about INDEX race. I remember talking about INDEX flag race
>> in ovl_inode but w.r.t ovl_get_inode(), what's the race you are seeing
>> with index. Once ovl_lookup() has found index, can that index go away by
>> the time ovl_get_inode() is called? If not, then there should not be
>> any race.
>
> May be you are referring to race when ovl_lookup() looks for index
> its not there (all lower), and then thread gets blocked, other cpu
> triggers a copy up and creates index, flushs ovl_inode and now first
> thread continues and does ovl_get_inode() and does not set OVL_INDEX
> flag (Despite the fact there is index)?
>

Yes, that's what I figured when you described the potential METACOPY race.

Besides that I am still worried about corners we missed in setup of
metacopy=on,index=off.

I may be confused about whether lower with nlink=1 with index=off
after metacopy up uses upper st_ino and hashed by upper inode?
Need to understand whether the "hashed by" state is consistent with
metacopy=on,index=off in all cases.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 09/31] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-02 19:57             ` Amir Goldstein
@ 2018-05-03 14:33               ` Vivek Goyal
  2018-05-03 20:05                 ` Amir Goldstein
  0 siblings, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-05-03 14:33 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Wed, May 02, 2018 at 10:57:16PM +0300, Amir Goldstein wrote:
> On Wed, May 2, 2018 at 10:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > On Wed, May 02, 2018 at 03:09:57PM -0400, Vivek Goyal wrote:
> >> On Mon, Apr 30, 2018 at 09:08:01PM +0300, Amir Goldstein wrote:
> >> [..]
> >> > >> > @@ -836,6 +836,15 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
> >> > >> >         if (index)
> >> > >> >                 ovl_set_flag(OVL_INDEX, inode);
> >> > >> >
> >> > >> > +       if (upperdentry) {
> >> > >> > +               err = ovl_check_metacopy_xattr(upperdentry);
> >> > >> > +               if (err < 0)
> >> > >> > +                       goto out_err;
> >> > >> > +               metacopy = err;
> >> > >> > +               if (!metacopy)
> >> > >> > +                       ovl_set_flag(OVL_UPPERDATA, inode);
> >> > >> > +       }
> >> > >> > +
> >> > >>
> >> > >> There is no reason to ovl_check_metacopy_xattr again here, right?
> >> > >
> >> > > I think we need to check metacopy here otherwise it becomes racy. For
> >> > > example, what if there is a hard link (say, l1 and l2) with metacopy xattr.
> >> > > ovl_lookup(l1) will think metacopy is on while another thread on another
> >> > > cpu might have trigged copy up, remove metacopy xattr. And it is possible
> >> > > that inode got flushed out of cache. So by the time ovl_lookup(l1), calls
> >> > > iget5_locked(), it will get a new inode and it will initialize inode
> >> > > with wrong information.
> >> > >
> >> > > I had done similar thing for REDIRECT, but once we removed logic to
> >> > > remove REDIRECT on copy up, I felt I did not have to check redirect
> >> > > again here.
> >> > >
> >> > > In general, I feel that once we have the inode lock, we should check
> >> > > metacopy and redirect both and then initialize inode. And not rely
> >> > > on information which was checked outside the lock and might have
> >> > > been stale by now.
> >> > >
> >> >
> >> > Hmmm... so as you once already said, we have a race with INDEX as well.
> >>
> >> I am not sure about INDEX race. I remember talking about INDEX flag race
> >> in ovl_inode but w.r.t ovl_get_inode(), what's the race you are seeing
> >> with index. Once ovl_lookup() has found index, can that index go away by
> >> the time ovl_get_inode() is called? If not, then there should not be
> >> any race.
> >
> > May be you are referring to race when ovl_lookup() looks for index
> > its not there (all lower), and then thread gets blocked, other cpu
> > triggers a copy up and creates index, flushs ovl_inode and now first
> > thread continues and does ovl_get_inode() and does not set OVL_INDEX
> > flag (Despite the fact there is index)?
> >
> 
> Yes, that's what I figured when you described the potential METACOPY race.

Ok. I think these races have been around for quite some time. I would 
like to address it separately and not as part of this patch series.

> 
> Besides that I am still worried about corners we missed in setup of
> metacopy=on,index=off.
> 
> I may be confused about whether lower with nlink=1 with index=off
> after metacopy up uses upper st_ino and hashed by upper inode?
> Need to understand whether the "hashed by" state is consistent with
> metacopy=on,index=off in all cases.

metacopy=on, index=off should not change inode number reporting. For
the case of nlink=1 and index=off, st_ino of lower is reported and
metacopy=on/off does not matter at all. Similarly, ovl_hash_bylower()
remains untouched due to metacopy. So for the case of nlink=1, index=off,
inode should still be hashed using lower inode number.

Only thing metacopy feature changes in stat() is st_blocks. It should
not impact st_ino or st_dev reporting at all.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 09/31] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-01 14:37         ` Amir Goldstein
@ 2018-05-03 15:12           ` Vivek Goyal
  2018-05-03 20:07             ` Amir Goldstein
  0 siblings, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-05-03 15:12 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Tue, May 01, 2018 at 05:37:09PM +0300, Amir Goldstein wrote:
> On Mon, Apr 30, 2018 at 9:08 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> > On Mon, Apr 30, 2018 at 5:02 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> [...]
> >>
> >> I think we need to check metacopy here otherwise it becomes racy. For
> >> example, what if there is a hard link (say, l1 and l2) with metacopy xattr.
> >> ovl_lookup(l1) will think metacopy is on while another thread on another
> >> cpu might have trigged copy up, remove metacopy xattr. And it is possible
> >> that inode got flushed out of cache. So by the time ovl_lookup(l1), calls
> >> iget5_locked(), it will get a new inode and it will initialize inode
> >> with wrong information.
> >>
> >> I had done similar thing for REDIRECT, but once we removed logic to
> >> remove REDIRECT on copy up, I felt I did not have to check redirect
> >> again here.
> >>
> >> In general, I feel that once we have the inode lock, we should check
> >> metacopy and redirect both and then initialize inode. And not rely
> >> on information which was checked outside the lock and might have
> >> been stale by now.
> >>
> >
> > Hmmm... so as you once already said, we have a race with INDEX as well.
> > In general, I feel it would be better to do a minimal sanity check inside
> > inode lock for INDEX and METACOPY (maybe REDIRECT) and make
> > sure they are consistent with the info in ovl_inode_info.
> > If they are not, return -ESTALE and that may result in re-calling
> > lookup.
> >
> 
> After thinking about it some more, I believe the way we should do it
> is implement ovl_iget_locked() and call it from ovl_lookup() *before*
> we lookup index.
> 
> Then, *after* looking up index,
> If this turns out to be a broken hardlink, we throw away the
> inode that we got by lower and get a new inode hashed by upper.
> 
> Then we can initialize the inode with another helper
> or directly in ovl_lookup() and unlock_new_inode().
> 
> w.r.t METACOPY things will get complicated if we remove
> METACOPY xattr and the upper does not have ORIGIN.

I am not able to understand the problem you are referring to.

> In that case we may have an inode that is already hashed by lower,
> but for a new lookup that does not see METACOPY xattr it
> looks up the inode in cache by upper and this is not good.

Here is my understanding. Following happens without metacopy feature.

- As of now, only time we do not create ORIGIN on upper is for the
  case of broken hardlink. (for a non-dir file).

- In case of broken hard link (index=off), if file is on lower only, we
  do not put new inode in inode cache at all. Once file gets copied up,
  we hash it by upper inode. And reported inode number changes over
  copy up.

Now above behavior should remain same even with metacopy=on. For the
case of broken hard link.

- If file is on lower only, we will not put inode in cache. And once
  file gets copied up metadata only, we will hash it with upper inode.
  
  The only difference here is that without metacopy non-dir file will
  have lowerdentry=NULL while with metacopy on, lowerdentry will be
  non-null. But bylower should still be false because of following
  check in ovl_hash_bylower().

  /* No, if lower hardlink is or will be broken on copy up */
   if ((upper || !ovl_indexdir(sb)) &&
            !d_is_dir(lower) && d_inode(lower)->i_nlink > 1)
                return false;

And if bylower is false, we hash inode using upper inode as key.

                struct inode *key = d_inode(bylower ? lowerdentry :
                                                      upperdentry);

IOW, despite the fact lowerdentry is there (without ORIGIN on upper),
we don't use that lowerdentry for inode hashing. And hence behavior
should not change.

What am I missing?

Thanks
Vivek

> 
> There is one solution that you will surely not like -
> instead of removing METACOPY xattr set METACOPY=n.
> That means we still need to lookup lower layers for origin inode,
> but we do set UPPERDATA flag.
> 
> There may be other solutions to the problem, but I cannot think
> of any right now.
> 
> Thanks,
> Amir.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 09/31] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-03 14:33               ` Vivek Goyal
@ 2018-05-03 20:05                 ` Amir Goldstein
  0 siblings, 0 replies; 67+ messages in thread
From: Amir Goldstein @ 2018-05-03 20:05 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

[...]
>>
>> Besides that I am still worried about corners we missed in setup of
>> metacopy=on,index=off.
>>
>> I may be confused about whether lower with nlink=1 with index=off
>> after metacopy up uses upper st_ino and hashed by upper inode?
>> Need to understand whether the "hashed by" state is consistent with
>> metacopy=on,index=off in all cases.
>
> metacopy=on, index=off should not change inode number reporting. For
> the case of nlink=1 and index=off, st_ino of lower is reported and
> metacopy=on/off does not matter at all. Similarly, ovl_hash_bylower()
> remains untouched due to metacopy. So for the case of nlink=1, index=off,
> inode should still be hashed using lower inode number.
>
> Only thing metacopy feature changes in stat() is st_blocks. It should
> not impact st_ino or st_dev reporting at all.
>

I did not articulate my concern correctly. The problem is not with
index=off, but with !ovl_can_decode_fh() which implies index=off,
but does not imply metacopy=off.

With !ovl_can_decode_fh(), lower cannot be followed by ORIGIN
and therefore cannot have CONST_INO, but with METACOPY
lower is followed... only until METACOPY xattr is removed and
then lower is no longer followed.

The solution is probably to fix (or at least audit) ovl_hash_bylower()
to recognize this case and treat those lowers as broken hardlinks
even though they have nlink=1, so METACOPY won't hash by lower
and won't use lower st_ino.
As I wrote, I am not sure there are problems here, but would like
to make sure that there aren't any problems.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 09/31] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-03 15:12           ` Vivek Goyal
@ 2018-05-03 20:07             ` Amir Goldstein
  2018-05-04  7:04               ` Amir Goldstein
  0 siblings, 1 reply; 67+ messages in thread
From: Amir Goldstein @ 2018-05-03 20:07 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, May 3, 2018 at 6:12 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Tue, May 01, 2018 at 05:37:09PM +0300, Amir Goldstein wrote:
>> On Mon, Apr 30, 2018 at 9:08 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>> > On Mon, Apr 30, 2018 at 5:02 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> [...]
>> >>
>> >> I think we need to check metacopy here otherwise it becomes racy. For
>> >> example, what if there is a hard link (say, l1 and l2) with metacopy xattr.
>> >> ovl_lookup(l1) will think metacopy is on while another thread on another
>> >> cpu might have trigged copy up, remove metacopy xattr. And it is possible
>> >> that inode got flushed out of cache. So by the time ovl_lookup(l1), calls
>> >> iget5_locked(), it will get a new inode and it will initialize inode
>> >> with wrong information.
>> >>
>> >> I had done similar thing for REDIRECT, but once we removed logic to
>> >> remove REDIRECT on copy up, I felt I did not have to check redirect
>> >> again here.
>> >>
>> >> In general, I feel that once we have the inode lock, we should check
>> >> metacopy and redirect both and then initialize inode. And not rely
>> >> on information which was checked outside the lock and might have
>> >> been stale by now.
>> >>
>> >
>> > Hmmm... so as you once already said, we have a race with INDEX as well.
>> > In general, I feel it would be better to do a minimal sanity check inside
>> > inode lock for INDEX and METACOPY (maybe REDIRECT) and make
>> > sure they are consistent with the info in ovl_inode_info.
>> > If they are not, return -ESTALE and that may result in re-calling
>> > lookup.
>> >
>>
>> After thinking about it some more, I believe the way we should do it
>> is implement ovl_iget_locked() and call it from ovl_lookup() *before*
>> we lookup index.
>>
>> Then, *after* looking up index,
>> If this turns out to be a broken hardlink, we throw away the
>> inode that we got by lower and get a new inode hashed by upper.
>>
>> Then we can initialize the inode with another helper
>> or directly in ovl_lookup() and unlock_new_inode().
>>
>> w.r.t METACOPY things will get complicated if we remove
>> METACOPY xattr and the upper does not have ORIGIN.
>
> I am not able to understand the problem you are referring to.
>
>> In that case we may have an inode that is already hashed by lower,
>> but for a new lookup that does not see METACOPY xattr it
>> looks up the inode in cache by upper and this is not good.
>
> Here is my understanding. Following happens without metacopy feature.
>
> - As of now, only time we do not create ORIGIN on upper is for the
>   case of broken hardlink. (for a non-dir file).
>
> - In case of broken hard link (index=off), if file is on lower only, we
>   do not put new inode in inode cache at all. Once file gets copied up,
>   we hash it by upper inode. And reported inode number changes over
>   copy up.
>
> Now above behavior should remain same even with metacopy=on. For the
> case of broken hard link.
>
> - If file is on lower only, we will not put inode in cache. And once
>   file gets copied up metadata only, we will hash it with upper inode.
>
>   The only difference here is that without metacopy non-dir file will
>   have lowerdentry=NULL while with metacopy on, lowerdentry will be
>   non-null. But bylower should still be false because of following
>   check in ovl_hash_bylower().
>
>   /* No, if lower hardlink is or will be broken on copy up */
>    if ((upper || !ovl_indexdir(sb)) &&
>             !d_is_dir(lower) && d_inode(lower)->i_nlink > 1)
>                 return false;
>
> And if bylower is false, we hash inode using upper inode as key.
>
>                 struct inode *key = d_inode(bylower ? lowerdentry :
>                                                       upperdentry);
>
> IOW, despite the fact lowerdentry is there (without ORIGIN on upper),
> we don't use that lowerdentry for inode hashing. And hence behavior
> should not change.
>
> What am I missing?
>

Just maybe missing the case of nlink=1 and ORIGIN cannot be
followed because lower fs doesn't support file handles.
I just wanted to make sure this case is handled correctly.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 16/31] ovl: Always open file/inode which contains data and not metacopy
  2018-04-28  8:49   ` Amir Goldstein
@ 2018-05-03 20:31     ` Vivek Goyal
  0 siblings, 0 replies; 67+ messages in thread
From: Vivek Goyal @ 2018-05-03 20:31 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Sat, Apr 28, 2018 at 01:49:38AM -0700, Amir Goldstein wrote:
> On Thu, Apr 26, 2018 at 12:09 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > ovl_open() should open file which contains data and not open metacopy
> > inode.
> >
> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> 
> This looks fine, but I don't see a reason to separate it from 26/31
> first allow only data open, then allow also meta open..
> Anyway, I am not sure about the cleanest way to sort all the cases.

Ok, I have simplified it now. Merged fsync and patch 26 in one. Also
got rid of separate function to open meta file. I now have just added
a parameter to ovl_open_realfile() and ovl_real_file() which specifies
that its fine to open metacopy inode.

That feels much cleaner to me now.

Vivek
> 
> > ---
> >  fs/overlayfs/file.c | 16 ++++++++++++----
> >  1 file changed, 12 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> > index 23638d8ebab5..558a859b2658 100644
> > --- a/fs/overlayfs/file.c
> > +++ b/fs/overlayfs/file.c
> > @@ -18,18 +18,26 @@ static struct file *ovl_open_realfile(const struct file *file)
> >  {
> >         struct inode *inode = file_inode(file);
> >         struct inode *upperinode = ovl_inode_upper(inode);
> > -       struct inode *realinode = upperinode ?: ovl_inode_lower(inode);
> > +       struct inode *realinode;
> >         struct file *realfile;
> > +       bool upperreal = false;
> >         const struct cred *old_cred;
> >
> > +       if (upperinode && ovl_has_upperdata(inode))
> > +               upperreal = true;
> > +
> > +       /* Always open file which contains data. Do not open metacopy. */
> > +       realinode = upperreal ? upperinode : ovl_inode_lowerdata(inode);
> > +
> >         old_cred = ovl_override_creds(inode->i_sb);
> >         realfile = path_open(&file->f_path, file->f_flags | O_NOATIME,
> >                              realinode, current_cred(), false);
> >         revert_creds(old_cred);
> >
> >         pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n",
> > -                file, file, upperinode ? 'u' : 'l', file->f_flags,
> > -                realfile, IS_ERR(realfile) ? 0 : realfile->f_flags);
> > +                file, file, upperreal ? 'u' : 'l',
> > +                file->f_flags, realfile,
> > +                IS_ERR(realfile) ? 0 : realfile->f_flags);
> >
> >         return realfile;
> >  }
> > @@ -80,7 +88,7 @@ static int ovl_real_file(const struct file *file, struct fd *real)
> >         real->file = file->private_data;
> >
> >         /* Has it been copied up since we'd opened it? */
> > -       if (unlikely(file_inode(real->file) != ovl_inode_real(inode))) {
> > +       if (unlikely(file_inode(real->file) != ovl_inode_real_data(inode))) {
> >                 real->flags = FDPUT_FPUT;
> >                 real->file = ovl_open_realfile(file);
> >
> > --
> > 2.13.6
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 09/31] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-03 20:07             ` Amir Goldstein
@ 2018-05-04  7:04               ` Amir Goldstein
  2018-05-04 15:54                 ` Vivek Goyal
  0 siblings, 1 reply; 67+ messages in thread
From: Amir Goldstein @ 2018-05-04  7:04 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Thu, May 3, 2018 at 11:07 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Thu, May 3, 2018 at 6:12 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> On Tue, May 01, 2018 at 05:37:09PM +0300, Amir Goldstein wrote:
[...]

>>>
>>> w.r.t METACOPY things will get complicated if we remove
>>> METACOPY xattr and the upper does not have ORIGIN.
>>
>> I am not able to understand the problem you are referring to.
>>
>>> In that case we may have an inode that is already hashed by lower,
>>> but for a new lookup that does not see METACOPY xattr it
>>> looks up the inode in cache by upper and this is not good.
>>
>> Here is my understanding. Following happens without metacopy feature.
>>
>> - As of now, only time we do not create ORIGIN on upper is for the
>>   case of broken hardlink. (for a non-dir file).
>>
>> - In case of broken hard link (index=off), if file is on lower only, we
>>   do not put new inode in inode cache at all. Once file gets copied up,
>>   we hash it by upper inode. And reported inode number changes over
>>   copy up.
>>
>> Now above behavior should remain same even with metacopy=on. For the
>> case of broken hard link.
>>
>> - If file is on lower only, we will not put inode in cache. And once
>>   file gets copied up metadata only, we will hash it with upper inode.
>>
>>   The only difference here is that without metacopy non-dir file will
>>   have lowerdentry=NULL while with metacopy on, lowerdentry will be
>>   non-null. But bylower should still be false because of following
>>   check in ovl_hash_bylower().
>>
>>   /* No, if lower hardlink is or will be broken on copy up */
>>    if ((upper || !ovl_indexdir(sb)) &&
>>             !d_is_dir(lower) && d_inode(lower)->i_nlink > 1)
>>                 return false;
>>
>> And if bylower is false, we hash inode using upper inode as key.
>>
>>                 struct inode *key = d_inode(bylower ? lowerdentry :
>>                                                       upperdentry);
>>
>> IOW, despite the fact lowerdentry is there (without ORIGIN on upper),
>> we don't use that lowerdentry for inode hashing. And hence behavior
>> should not change.
>>
>> What am I missing?
>>
>
> Just maybe missing the case of nlink=1 and ORIGIN cannot be
> followed because lower fs doesn't support file handles.
> I just wanted to make sure this case is handled correctly.
>

OK. re-read the code and ready to explain the situation.

The meaning of new OVL_CONST_INO flag is:
"inode preserves st_ino across copy up".
If lower fs support file handles it also means:
"...and across eviction from inode cache".
but if lower fs does not support file handles it means:
"...while non-dir inode remains in cache"

How does METACOPY change the situation?
very slightly - with METACOPY, instead of st_ino change
after copy_up+cache_evict, st_ino will not change after
meta_copy_up+cache_evict, but will change after
data_copy_up+cache_evict, which is not a problem as
far as I can tell.

Specifically, removing METACOPY xattr will not change
st_ino nor hash_by_lower until after cache eviction and as
long as we have one-to-one mapping of upper inode to
lower inode that's good enough.

Things will only break with filesystem inconsistencies
like in xfstest overlay/049, but AFAICT, METACOPY does
not introduce any real regressions, the only thing it does
is add another way of creating an inconsistency with
redirect on non-dir, which probably calls for a new xfstest.

I hope I got it all right.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 09/31] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-04  7:04               ` Amir Goldstein
@ 2018-05-04 15:54                 ` Vivek Goyal
  2018-05-04 18:47                   ` Amir Goldstein
  0 siblings, 1 reply; 67+ messages in thread
From: Vivek Goyal @ 2018-05-04 15:54 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Fri, May 04, 2018 at 10:04:22AM +0300, Amir Goldstein wrote:
> On Thu, May 3, 2018 at 11:07 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> > On Thu, May 3, 2018 at 6:12 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> >> On Tue, May 01, 2018 at 05:37:09PM +0300, Amir Goldstein wrote:
> [...]
> 
> >>>
> >>> w.r.t METACOPY things will get complicated if we remove
> >>> METACOPY xattr and the upper does not have ORIGIN.
> >>
> >> I am not able to understand the problem you are referring to.
> >>
> >>> In that case we may have an inode that is already hashed by lower,
> >>> but for a new lookup that does not see METACOPY xattr it
> >>> looks up the inode in cache by upper and this is not good.
> >>
> >> Here is my understanding. Following happens without metacopy feature.
> >>
> >> - As of now, only time we do not create ORIGIN on upper is for the
> >>   case of broken hardlink. (for a non-dir file).
> >>
> >> - In case of broken hard link (index=off), if file is on lower only, we
> >>   do not put new inode in inode cache at all. Once file gets copied up,
> >>   we hash it by upper inode. And reported inode number changes over
> >>   copy up.
> >>
> >> Now above behavior should remain same even with metacopy=on. For the
> >> case of broken hard link.
> >>
> >> - If file is on lower only, we will not put inode in cache. And once
> >>   file gets copied up metadata only, we will hash it with upper inode.
> >>
> >>   The only difference here is that without metacopy non-dir file will
> >>   have lowerdentry=NULL while with metacopy on, lowerdentry will be
> >>   non-null. But bylower should still be false because of following
> >>   check in ovl_hash_bylower().
> >>
> >>   /* No, if lower hardlink is or will be broken on copy up */
> >>    if ((upper || !ovl_indexdir(sb)) &&
> >>             !d_is_dir(lower) && d_inode(lower)->i_nlink > 1)
> >>                 return false;
> >>
> >> And if bylower is false, we hash inode using upper inode as key.
> >>
> >>                 struct inode *key = d_inode(bylower ? lowerdentry :
> >>                                                       upperdentry);
> >>
> >> IOW, despite the fact lowerdentry is there (without ORIGIN on upper),
> >> we don't use that lowerdentry for inode hashing. And hence behavior
> >> should not change.
> >>
> >> What am I missing?
> >>
> >
> > Just maybe missing the case of nlink=1 and ORIGIN cannot be
> > followed because lower fs doesn't support file handles.
> > I just wanted to make sure this case is handled correctly.
> >
> 
> OK. re-read the code and ready to explain the situation.
> 
> The meaning of new OVL_CONST_INO flag is:
> "inode preserves st_ino across copy up".
> If lower fs support file handles it also means:
> "...and across eviction from inode cache".
> but if lower fs does not support file handles it means:
> "...while non-dir inode remains in cache"
> 
> How does METACOPY change the situation?
> very slightly - with METACOPY, instead of st_ino change
> after copy_up+cache_evict, st_ino will not change after
> meta_copy_up+cache_evict, but will change after
> data_copy_up+cache_evict, which is not a problem as
> far as I can tell.
> 
> Specifically, removing METACOPY xattr will not change
> st_ino nor hash_by_lower until after cache eviction and as
> long as we have one-to-one mapping of upper inode to
> lower inode that's good enough.

Hi Amir,

Thanks for detailed explanation. I now understand your concern.

How about I pass in the origin information in ovl_get_inode() and
to ovl_hash_bylower(). And we set bylower=1 for a metacopy dentry
only if an upper origin could be found and verified? 

That way behavior will be similar to a file with nlink=1 and ORIGIN
not there. And we will not have to worry about thinking another
special corner case and writing test cases for it.

Too many corner cases are bad for future code maintenance and easy
to break.

WDYT?

Thanks
Vivek
> 
> Things will only break with filesystem inconsistencies
> like in xfstest overlay/049, but AFAICT, METACOPY does
> not introduce any real regressions, the only thing it does
> is add another way of creating an inconsistency with
> redirect on non-dir, which probably calls for a new xfstest.
> 
> I hope I got it all right.
> 
> Thanks,
> Amir.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v14 09/31] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-04 15:54                 ` Vivek Goyal
@ 2018-05-04 18:47                   ` Amir Goldstein
  0 siblings, 0 replies; 67+ messages in thread
From: Amir Goldstein @ 2018-05-04 18:47 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Fri, May 4, 2018 at 6:54 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Fri, May 04, 2018 at 10:04:22AM +0300, Amir Goldstein wrote:
[...]
>> >
>> > Just maybe missing the case of nlink=1 and ORIGIN cannot be
>> > followed because lower fs doesn't support file handles.
>> > I just wanted to make sure this case is handled correctly.
>> >
>>
>> OK. re-read the code and ready to explain the situation.
>>
>> The meaning of new OVL_CONST_INO flag is:
>> "inode preserves st_ino across copy up".
>> If lower fs support file handles it also means:
>> "...and across eviction from inode cache".
>> but if lower fs does not support file handles it means:
>> "...while non-dir inode remains in cache"
>>
>> How does METACOPY change the situation?
>> very slightly - with METACOPY, instead of st_ino change
>> after copy_up+cache_evict, st_ino will not change after
>> meta_copy_up+cache_evict, but will change after
>> data_copy_up+cache_evict, which is not a problem as
>> far as I can tell.
>>
>> Specifically, removing METACOPY xattr will not change
>> st_ino nor hash_by_lower until after cache eviction and as
>> long as we have one-to-one mapping of upper inode to
>> lower inode that's good enough.
>
> Hi Amir,
>
> Thanks for detailed explanation. I now understand your concern.
>
> How about I pass in the origin information in ovl_get_inode() and
> to ovl_hash_bylower(). And we set bylower=1 for a metacopy dentry
> only if an upper origin could be found and verified?
>
> That way behavior will be similar to a file with nlink=1 and ORIGIN
> not there. And we will not have to worry about thinking another
> special corner case and writing test cases for it.
>
> Too many corner cases are bad for future code maintenance and easy
> to break.
>
> WDYT?
>

I think you should leave it for now and maybe we will address
it later. The reason I am not very concerned anymore is because
I realize we are already exposed to situation of lower without
ORIGIN - this is how a just copied up inode looks like when
lower fs doesn't support file handles. So METACOPY is not
adding a corner case, it just increases the window of time
where the corner case exists beyond inode eviction from cache.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 67+ messages in thread

end of thread, other threads:[~2018-05-04 18:47 UTC | newest]

Thread overview: 67+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-26 19:09 [PATCH v14 00/31] overlayfs: Delayed copy up of data Vivek Goyal
2018-04-26 19:09 ` [PATCH v14 01/31] ovl: Initialize ovl_inode->redirect in ovl_get_inode() Vivek Goyal
2018-04-26 19:09 ` [PATCH v14 02/31] ovl: Move the copy up helpers to copy_up.c Vivek Goyal
2018-04-26 19:09 ` [PATCH v14 03/31] ovl: Provide a mount option metacopy=on/off for metadata copyup Vivek Goyal
2018-04-26 19:09 ` [PATCH v14 04/31] ovl: During copy up, first copy up metadata and then data Vivek Goyal
2018-04-26 19:09 ` [PATCH v14 05/31] ovl: Copy up only metadata during copy up where it makes sense Vivek Goyal
2018-04-26 19:09 ` [PATCH v14 06/31] ovl: Add helper ovl_already_copied_up() Vivek Goyal
2018-04-26 19:09 ` [PATCH v14 07/31] ovl: A new xattr OVL_XATTR_METACOPY for file on upper Vivek Goyal
2018-04-26 19:09 ` [PATCH v14 08/31] ovl: Use out_err instead of out_nomem Vivek Goyal
2018-04-26 19:09 ` [PATCH v14 09/31] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry Vivek Goyal
2018-04-28  8:14   ` Amir Goldstein
2018-04-30 14:02     ` Vivek Goyal
2018-04-30 18:08       ` Amir Goldstein
2018-05-01 14:37         ` Amir Goldstein
2018-05-03 15:12           ` Vivek Goyal
2018-05-03 20:07             ` Amir Goldstein
2018-05-04  7:04               ` Amir Goldstein
2018-05-04 15:54                 ` Vivek Goyal
2018-05-04 18:47                   ` Amir Goldstein
2018-05-02 19:09         ` Vivek Goyal
2018-05-02 19:40           ` Vivek Goyal
2018-05-02 19:57             ` Amir Goldstein
2018-05-03 14:33               ` Vivek Goyal
2018-05-03 20:05                 ` Amir Goldstein
2018-04-30 19:32     ` Vivek Goyal
2018-04-30 20:21       ` Amir Goldstein
2018-04-26 19:09 ` [PATCH v14 10/31] ovl: Copy up meta inode data from lowest data inode Vivek Goyal
2018-04-26 19:09 ` [PATCH v14 11/31] ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry Vivek Goyal
2018-04-26 19:09 ` [PATCH v14 12/31] ovl: Add an helper to get real " Vivek Goyal
2018-04-26 20:33   ` Amir Goldstein
2018-04-30 13:06     ` Vivek Goyal
2018-04-26 19:09 ` [PATCH v14 13/31] ovl: Fix ovl_getattr() to get number of blocks from lower Vivek Goyal
2018-04-26 19:09 ` [PATCH v14 14/31] ovl: Store lower data inode in ovl_inode Vivek Goyal
2018-04-26 20:31   ` Amir Goldstein
2018-04-26 19:09 ` [PATCH v14 15/31] ovl: Add helper ovl_inode_real_data() Vivek Goyal
2018-04-26 20:35   ` Amir Goldstein
2018-04-26 19:09 ` [PATCH v14 16/31] ovl: Always open file/inode which contains data and not metacopy Vivek Goyal
2018-04-28  8:49   ` Amir Goldstein
2018-05-03 20:31     ` Vivek Goyal
2018-04-26 19:09 ` [PATCH v14 17/31] ovl: Do not expose metacopy only dentry from d_real() Vivek Goyal
2018-04-28  7:36   ` Amir Goldstein
2018-04-26 19:10 ` [PATCH v14 18/31] ovl: Move some dir related ovl_lookup_single() code in else block Vivek Goyal
2018-04-28  7:34   ` Amir Goldstein
2018-04-26 19:10 ` [PATCH v14 19/31] ovl: Check redirects for metacopy files Vivek Goyal
2018-04-28  8:32   ` Amir Goldstein
2018-04-26 19:10 ` [PATCH v14 20/31] ovl: Treat metacopy dentries as type OVL_PATH_MERGE Vivek Goyal
2018-04-26 19:10 ` [PATCH v14 21/31] ovl: Add an inode flag OVL_CONST_INO Vivek Goyal
2018-04-28  8:33   ` Amir Goldstein
2018-04-26 19:10 ` [PATCH v14 22/31] ovl: Do not set dentry type ORIGIN for broken hardlinks Vivek Goyal
2018-04-28  9:14   ` Amir Goldstein
2018-04-26 19:10 ` [PATCH v14 23/31] ovl: Set redirect on metacopy files upon rename Vivek Goyal
2018-04-28  8:25   ` Amir Goldstein
2018-04-26 19:10 ` [PATCH v14 24/31] ovl: Set redirect on upper inode when it is linked Vivek Goyal
2018-04-28  8:40   ` Amir Goldstein
2018-04-26 19:10 ` [PATCH v14 25/31] ovl: Check redirect on index as well Vivek Goyal
2018-04-28  9:26   ` Amir Goldstein
2018-04-26 19:10 ` [PATCH v14 26/31] ovl: Allow ovl_open_realfile() to open metacopy inode Vivek Goyal
2018-04-28  9:05   ` Amir Goldstein
2018-04-26 19:10 ` [PATCH v14 27/31] ovl: Issue fsync on upper metacopy inode as well Vivek Goyal
2018-04-28  8:54   ` Amir Goldstein
2018-04-26 19:10 ` [PATCH v14 28/31] ovl: Disbale metacopy for MAP_SHARED mmap() Vivek Goyal
2018-04-28  9:28   ` Amir Goldstein
2018-04-26 19:10 ` [PATCH v14 29/31] ovl: Do not do metadata only copy-up for truncate operation Vivek Goyal
2018-04-28  8:35   ` Amir Goldstein
2018-04-26 19:10 ` [PATCH v14 30/31] ovl: Do not do metacopy only for ioctl modifying file attr Vivek Goyal
2018-04-28  8:31   ` Amir Goldstein
2018-04-26 19:10 ` [PATCH v14 31/31] ovl: Enable metadata only feature Vivek Goyal

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.