* [PATCH v15 00/30] overlayfs: Delayed copy up of data
@ 2018-05-07 17:40 Vivek Goyal
  2018-05-07 17:40 ` [PATCH v15 01/30] ovl: Pass argument to ovl_get_inode() in a structure Vivek Goyal
                   ` (31 more replies)
  0 siblings, 32 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Hi,

This is V15 of the overlayfs metadata only copy-up feature. I have rebased
these patches on top of the overlayfs-rorw branch of Miklos' overlayfs-next
tree.

git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git overlayfs-rorw

Patches are also available here.

https://github.com/rhvgoyal/linux/commits/metacopy-v15

I have run unionmount-testsuite and "./check -overlay -g quick". Only 4
overlay tests fail, and those fail on a vanilla kernel too.

Changes from V14:

- Rebased on top of latest overlayfs-rorw branch.
- Took care of Amir's comments on V14.
- Started passing arguments to ovl_get_inode() in a structure, as
  argument list was getting too long.
- Simplified ovl_fsync() and helper patches a bit.
- Unioned the new field lowerdata with the dir cache.
- Renamed few helper functions with suffix _realdata instead of
- Renamed ovl_rel_redirect() to ovl_need_absolute_redirect() and also
  started calling it from ovl_set_redirect() instead of calling it
  from ovl_rename() directly.
- Added argument padding to ovl_get_redirect_xattr() and reused this
  function in ovl_check_redirect().
- Added a helper function ovl_copy_up_with_data() to force full copy
  up (metadata plus data).

Thanks
Vivek

Vivek Goyal (30):
  ovl: Pass argument to ovl_get_inode() in a structure
  ovl: Initialize ovl_inode->redirect in ovl_get_inode()
  ovl: Move the copy up helpers to copy_up.c
  ovl: Provide a mount option metacopy=on/off for metadata copyup
  ovl: During copy up, first copy up metadata and then data
  ovl: Copy up only metadata during copy up where it makes sense
  ovl: Add helper ovl_already_copied_up()
  ovl: A new xattr OVL_XATTR_METACOPY for file on upper
  ovl: Use out_err instead of out_nomem
  ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  ovl: Copy up meta inode data from lowest data inode
  ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry
  ovl: Add an helper to get real data dentry
  ovl: Fix ovl_getattr() to get number of blocks from lower
  ovl: Store lower data inode in ovl_inode
  ovl: Add helper ovl_inode_real_data()
  ovl: Open file with data except for the case of fsync
  ovl: Do not expose metacopy only dentry from d_real()
  ovl: Move some dir related ovl_lookup_single() code in else block
  ovl: Check redirects for metacopy files
  ovl: Treat metacopy dentries as type OVL_PATH_MERGE
  ovl: Add an inode flag OVL_CONST_INO
  ovl: Do not set dentry type ORIGIN for broken hardlinks
  ovl: Set redirect on metacopy files upon rename
  ovl: Set redirect on upper inode when it is linked
  ovl: Check redirect on index as well
  ovl: Disbale metacopy for MAP_SHARED mmap()
  ovl: Do not do metadata only copy-up for truncate operation
  ovl: Do not do metacopy only for ioctl modifying file attr
  ovl: Enable metadata only feature

 Documentation/filesystems/overlayfs.txt |  30 +++-
 fs/overlayfs/Kconfig                    |  19 +++
 fs/overlayfs/copy_up.c                  | 161 ++++++++++++++++-----
 fs/overlayfs/dir.c                      |  85 ++++++++---
 fs/overlayfs/export.c                   |   8 +-
 fs/overlayfs/file.c                     |  53 ++++---
 fs/overlayfs/inode.c                    | 131 ++++++++++-------
 fs/overlayfs/namei.c                    | 199 ++++++++++++++++---------
 fs/overlayfs/overlayfs.h                |  36 ++++-
 fs/overlayfs/ovl_entry.h                |  16 +-
 fs/overlayfs/super.c                    |  60 +++++++-
 fs/overlayfs/util.c                     | 249 +++++++++++++++++++++++++++++++-
 12 files changed, 829 insertions(+), 218 deletions(-)

-- 
2.13.6

* [PATCH v15 01/30] ovl: Pass argument to ovl_get_inode() in a structure
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 19:26   ` Amir Goldstein
  2018-05-07 17:40 ` [PATCH v15 02/30] ovl: Initialize ovl_inode->redirect in ovl_get_inode() Vivek Goyal
                   ` (30 subsequent siblings)
  31 siblings, 1 reply; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

ovl_get_inode() currently takes 5 parameters. This patch series will soon
add 2 more, at which point the argument list becomes too long.

Hence pass the arguments to ovl_get_inode() in a structure, which looks a
little cleaner.
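
As an illustration of the parameter-structure pattern described above, here
is a minimal userspace sketch. It is not the kernel code: struct
inode_params, params_has_index() and params_is_merge() are hypothetical
names, and plain strings stand in for dentries and lower paths.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the parameter structure; fields are illustrative. */
struct inode_params {
	const char *upper;	/* models the upper dentry, NULL if absent */
	const char *lower;	/* models the lower path, NULL if absent */
	const char *index;	/* models the index dentry, NULL if absent */
	unsigned int numlower;	/* number of lower layers found */
};

/* Models the "if (oip->index)" check: non-zero when an index entry exists. */
static int params_has_index(const struct inode_params *p)
{
	assert(p != NULL);
	return p->index != NULL;
}

/* Models the merge check "(upper && lower) || numlower > 1". */
static int params_is_merge(const struct inode_params *p)
{
	assert(p != NULL);
	return (p->upper && p->lower) || p->numlower > 1;
}
```

A caller could fill the structure with designated initializers, for example
struct inode_params p = { .lower = "l", .numlower = 2 };, so that fields
added later default to zero. The patch itself uses positional initializers.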

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/export.c    |  4 +++-
 fs/overlayfs/inode.c     | 19 ++++++++++---------
 fs/overlayfs/namei.c     |  6 ++++--
 fs/overlayfs/overlayfs.h |  4 +---
 fs/overlayfs/ovl_entry.h |  8 ++++++++
 5 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
index 425a94672300..867946adcbc5 100644
--- a/fs/overlayfs/export.c
+++ b/fs/overlayfs/export.c
@@ -300,12 +300,14 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
 	struct dentry *dentry;
 	struct inode *inode;
 	struct ovl_entry *oe;
+	struct ovl_inode_params oip = {sb, NULL, lowerpath, index, !!lower};
 
 	/* We get overlay directory dentries with ovl_lookup_real() */
 	if (d_is_dir(upper ?: lower))
 		return ERR_PTR(-EIO);
 
-	inode = ovl_get_inode(sb, dget(upper), lowerpath, index, !!lower);
+	oip.upperdentry = dget(upper);
+	inode = ovl_get_inode(&oip);
 	if (IS_ERR(inode)) {
 		dput(upper);
 		return ERR_CAST(inode);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 7abcf96e94fc..2fe9538fffc9 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -792,15 +792,16 @@ static bool ovl_hash_bylower(struct super_block *sb, struct dentry *upper,
 	return true;
 }
 
-struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
-			    struct ovl_path *lowerpath, struct dentry *index,
-			    unsigned int numlower)
+struct inode *ovl_get_inode(struct ovl_inode_params *oip)
 {
+	struct dentry *upperdentry = oip->upperdentry;
+	struct ovl_path *lowerpath = oip->lowerpath;
 	struct inode *realinode = upperdentry ? d_inode(upperdentry) : NULL;
 	struct inode *inode;
 	struct dentry *lowerdentry = lowerpath ? lowerpath->dentry : NULL;
-	bool bylower = ovl_hash_bylower(sb, upperdentry, lowerdentry, index);
-	int fsid = bylower ? lowerpath->layer->fsid : 0;
+	bool bylower = ovl_hash_bylower(oip->sb, upperdentry, lowerdentry,
+					oip->index);
+	int fsid = bylower ? oip->lowerpath->layer->fsid : 0;
 	bool is_dir;
 	unsigned long ino = 0;
 
@@ -817,7 +818,7 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
 						      upperdentry);
 		unsigned int nlink = is_dir ? 1 : realinode->i_nlink;
 
-		inode = iget5_locked(sb, (unsigned long) key,
+		inode = iget5_locked(oip->sb, (unsigned long) key,
 				     ovl_inode_test, ovl_inode_set, key);
 		if (!inode)
 			goto out_nomem;
@@ -844,7 +845,7 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
 		ino = key->i_ino;
 	} else {
 		/* Lower hardlink that will be broken on copy up */
-		inode = new_inode(sb);
+		inode = new_inode(oip->sb);
 		if (!inode)
 			goto out_nomem;
 	}
@@ -854,12 +855,12 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
 	if (upperdentry && ovl_is_impuredir(upperdentry))
 		ovl_set_flag(OVL_IMPURE, inode);
 
-	if (index)
+	if (oip->index)
 		ovl_set_flag(OVL_INDEX, inode);
 
 	/* Check for non-merge dir that may have whiteouts */
 	if (is_dir) {
-		if (((upperdentry && lowerdentry) || numlower > 1) ||
+		if (((upperdentry && lowerdentry) || oip->numlower > 1) ||
 		    ovl_check_origin_xattr(upperdentry ?: lowerdentry)) {
 			ovl_set_flag(OVL_WHITEOUTS, inode);
 		}
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 2dba29eadde6..1ab68454bfb6 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -1004,8 +1004,10 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		upperdentry = dget(index);
 
 	if (upperdentry || ctr) {
-		inode = ovl_get_inode(dentry->d_sb, upperdentry, stack, index,
-				      ctr);
+		struct ovl_inode_params oip = {dentry->d_sb, upperdentry,
+					       stack, index, ctr};
+
+		inode = ovl_get_inode(&oip);
 		err = PTR_ERR(inode);
 		if (IS_ERR(inode))
 			goto out_free_oe;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index caaa47cea2aa..4b6c90a0aa2c 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -345,9 +345,7 @@ bool ovl_is_private_xattr(const char *name);
 struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev);
 struct inode *ovl_lookup_inode(struct super_block *sb, struct dentry *real,
 			       bool is_upper);
-struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
-			    struct ovl_path *lowerpath, struct dentry *index,
-			    unsigned int numlower);
+struct inode *ovl_get_inode(struct ovl_inode_params *oip);
 static inline void ovl_copyattr(struct inode *from, struct inode *to)
 {
 	to->i_uid = from->i_uid;
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 3bea47c63fd9..11e9abe7e381 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -101,6 +101,14 @@ struct ovl_inode {
 	struct mutex lock;
 };
 
+struct ovl_inode_params {
+	struct super_block *sb;
+	struct dentry *upperdentry;
+	struct ovl_path *lowerpath;
+	struct dentry *index;
+	unsigned int numlower;
+};
+
 static inline struct ovl_inode *OVL_I(struct inode *inode)
 {
 	return container_of(inode, struct ovl_inode, vfs_inode);
-- 
2.13.6

* [PATCH v15 02/30] ovl: Initialize ovl_inode->redirect in ovl_get_inode()
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
  2018-05-07 17:40 ` [PATCH v15 01/30] ovl: Pass argument to ovl_get_inode() in a structure Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-08 13:56   ` Vivek Goyal
  2018-05-07 17:40 ` [PATCH v15 03/30] ovl: Move the copy up helpers to copy_up.c Vivek Goyal
                   ` (29 subsequent siblings)
  31 siblings, 1 reply; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

ovl_inode->redirect is an inode property and should be initialized
in ovl_get_inode() only when we are adding a new inode to the cache. If
the inode is already in the cache, it is already initialized and we should
not touch the ovl_inode->redirect field.

As of now this is not a problem, as redirects are used only for directories,
which don't share inodes. But soon I want to use redirects for regular files
as well, and there it can become an issue.

Hence, move the ->redirect initialization into ovl_get_inode().
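
The ownership rule described above (store the redirect string only when the
inode is newly added to the cache, and free it on a cache hit) can be
sketched in userspace as follows. This is only a toy model: struct
toy_inode, toy_get_inode() and toy_strdup() are hypothetical names, and a
single slot stands in for the inode cache.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical one-slot "inode cache" used only for this illustration. */
struct toy_inode {
	int in_cache;	/* non-zero once the inode has been initialized */
	char *redirect;	/* owned by the inode after first initialization */
};

/*
 * Takes ownership of 'redirect' in both cases:
 *  - cache hit: the inode is already initialized, so the new string is
 *    freed and the existing field is left untouched;
 *  - cache miss: the string is stored in the freshly initialized inode.
 */
static struct toy_inode *toy_get_inode(struct toy_inode *slot, char *redirect)
{
	assert(slot != NULL);
	if (slot->in_cache) {
		free(redirect);
		return slot;
	}
	slot->redirect = redirect;
	slot->in_cache = 1;
	return slot;
}

/* Duplicate a string with malloc() so the sketch needs only ISO C. */
static char *toy_strdup(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p)
		memcpy(p, s, n);
	return p;
}
```

Calling toy_get_inode() twice on the same slot with different strings leaves
the first string in place, which mirrors why a cached inode's redirect must
not be overwritten by a later lookup.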

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/export.c    |  3 ++-
 fs/overlayfs/inode.c     |  3 +++
 fs/overlayfs/namei.c     | 10 ++--------
 fs/overlayfs/ovl_entry.h |  1 +
 4 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
index 867946adcbc5..0549286cc55e 100644
--- a/fs/overlayfs/export.c
+++ b/fs/overlayfs/export.c
@@ -300,7 +300,8 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
 	struct dentry *dentry;
 	struct inode *inode;
 	struct ovl_entry *oe;
-	struct ovl_inode_params oip = {sb, NULL, lowerpath, index, !!lower};
+	struct ovl_inode_params oip = {sb, NULL, lowerpath, index, !!lower,
+				       NULL};
 
 	/* We get overlay directory dentries with ovl_lookup_real() */
 	if (d_is_dir(upper ?: lower))
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 2fe9538fffc9..d85d753f23fc 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -835,6 +835,7 @@ struct inode *ovl_get_inode(struct ovl_inode_params *oip)
 			}
 
 			dput(upperdentry);
+			kfree(oip->redirect);
 			goto out;
 		}
 
@@ -858,6 +859,8 @@ struct inode *ovl_get_inode(struct ovl_inode_params *oip)
 	if (oip->index)
 		ovl_set_flag(OVL_INDEX, inode);
 
+	OVL_I(inode)->redirect = oip->redirect;
+
 	/* Check for non-merge dir that may have whiteouts */
 	if (is_dir) {
 		if (((upperdentry && lowerdentry) || oip->numlower > 1) ||
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 1ab68454bfb6..8fd817bf5529 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -1005,19 +1005,13 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 
 	if (upperdentry || ctr) {
 		struct ovl_inode_params oip = {dentry->d_sb, upperdentry,
-					       stack, index, ctr};
+					       stack, index, ctr,
+					       upperredirect};
 
 		inode = ovl_get_inode(&oip);
 		err = PTR_ERR(inode);
 		if (IS_ERR(inode))
 			goto out_free_oe;
-
-		/*
-		 * NB: handle redirected hard links when non-dir redirects
-		 * become possible
-		 */
-		WARN_ON(OVL_I(inode)->redirect);
-		OVL_I(inode)->redirect = upperredirect;
 	}
 
 	revert_creds(old_cred);
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 11e9abe7e381..07b8cb2785c8 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -107,6 +107,7 @@ struct ovl_inode_params {
 	struct ovl_path *lowerpath;
 	struct dentry *index;
 	unsigned int numlower;
+	char *redirect;
 };
 
 static inline struct ovl_inode *OVL_I(struct inode *inode)
-- 
2.13.6

* [PATCH v15 03/30] ovl: Move the copy up helpers to copy_up.c
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
  2018-05-07 17:40 ` [PATCH v15 01/30] ovl: Pass argument to ovl_get_inode() in a structure Vivek Goyal
  2018-05-07 17:40 ` [PATCH v15 02/30] ovl: Initialize ovl_inode->redirect in ovl_get_inode() Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 17:40 ` [PATCH v15 04/30] ovl: Provide a mount option metacopy=on/off for metadata copyup Vivek Goyal
                   ` (28 subsequent siblings)
  31 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Right now two copy up helpers are in inode.c. Amir suggested it might
be better to move these to copy_up.c.

There will be one more related function, which will come in a later patch.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/copy_up.c   | 32 ++++++++++++++++++++++++++++++++
 fs/overlayfs/inode.c     | 32 --------------------------------
 fs/overlayfs/overlayfs.h |  2 +-
 3 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 8bede0742619..4afc97872c2d 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -828,6 +828,38 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
 	return err;
 }
 
+static bool ovl_open_need_copy_up(struct dentry *dentry, int flags)
+{
+	/* Copy up of disconnected dentry does not set upper alias */
+	if (ovl_dentry_upper(dentry) &&
+	    (ovl_dentry_has_upper_alias(dentry) ||
+	     (dentry->d_flags & DCACHE_DISCONNECTED)))
+		return false;
+
+	if (special_file(d_inode(dentry)->i_mode))
+		return false;
+
+	if (!(OPEN_FMODE(flags) & FMODE_WRITE) && !(flags & O_TRUNC))
+		return false;
+
+	return true;
+}
+
+int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags)
+{
+	int err = 0;
+
+	if (ovl_open_need_copy_up(dentry, file_flags)) {
+		err = ovl_want_write(dentry);
+		if (!err) {
+			err = ovl_copy_up_flags(dentry, file_flags);
+			ovl_drop_write(dentry);
+		}
+	}
+
+	return err;
+}
+
 int ovl_copy_up(struct dentry *dentry)
 {
 	return ovl_copy_up_flags(dentry, 0);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index d85d753f23fc..daab9358db07 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -399,38 +399,6 @@ struct posix_acl *ovl_get_acl(struct inode *inode, int type)
 	return acl;
 }
 
-static bool ovl_open_need_copy_up(struct dentry *dentry, int flags)
-{
-	/* Copy up of disconnected dentry does not set upper alias */
-	if (ovl_dentry_upper(dentry) &&
-	    (ovl_dentry_has_upper_alias(dentry) ||
-	     (dentry->d_flags & DCACHE_DISCONNECTED)))
-		return false;
-
-	if (special_file(d_inode(dentry)->i_mode))
-		return false;
-
-	if (!(OPEN_FMODE(flags) & FMODE_WRITE) && !(flags & O_TRUNC))
-		return false;
-
-	return true;
-}
-
-int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags)
-{
-	int err = 0;
-
-	if (ovl_open_need_copy_up(dentry, file_flags)) {
-		err = ovl_want_write(dentry);
-		if (!err) {
-			err = ovl_copy_up_flags(dentry, file_flags);
-			ovl_drop_write(dentry);
-		}
-	}
-
-	return err;
-}
-
 int ovl_update_time(struct inode *inode, struct timespec *ts, int flags)
 {
 	if (flags & S_ATIME) {
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 4b6c90a0aa2c..396d9ecca919 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -338,7 +338,6 @@ int ovl_xattr_get(struct dentry *dentry, struct inode *inode, const char *name,
 		  void *value, size_t size);
 ssize_t ovl_listxattr(struct dentry *dentry, char *list, size_t size);
 struct posix_acl *ovl_get_acl(struct inode *inode, int type);
-int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags);
 int ovl_update_time(struct inode *inode, struct timespec *ts, int flags);
 bool ovl_is_private_xattr(const char *name);
 
@@ -385,6 +384,7 @@ extern const struct file_operations ovl_file_operations;
 /* copy_up.c */
 int ovl_copy_up(struct dentry *dentry);
 int ovl_copy_up_flags(struct dentry *dentry, int flags);
+int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags);
 int ovl_copy_xattr(struct dentry *old, struct dentry *new);
 int ovl_set_attr(struct dentry *upper, struct kstat *stat);
 struct ovl_fh *ovl_encode_real_fh(struct dentry *real, bool is_upper);
-- 
2.13.6

* [PATCH v15 04/30] ovl: Provide a mount option metacopy=on/off for metadata copyup
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (2 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 03/30] ovl: Move the copy up helpers to copy_up.c Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 17:40 ` [PATCH v15 05/30] ovl: During copy up, first copy up metadata and then data Vivek Goyal
                   ` (27 subsequent siblings)
  31 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

By default, metadata only copy up is disabled. Provide a mount option so
that users can choose one way or the other.

Also provide a kernel config option and a module option to enable/disable
the metacopy feature.

The metacopy feature requires redirect_dir=on when upper is present.
Otherwise, it requires at least redirect_dir=follow.

As of now, metacopy does not work with nfs_export=on. So if both metacopy=on
and nfs_export=on are specified, nfs_export is disabled.
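
The fallback rules above can be summarized in a small userspace sketch.
struct toy_config and toy_resolve_metacopy() are hypothetical names that
only model the decision order; the sketch also assumes, as a labeled
assumption, that redirect_follow is already true whenever redirect_dir is
on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the relevant mount configuration bits. */
struct toy_config {
	bool has_upper;		/* an upperdir was given */
	bool redirect_dir;	/* redirect_dir=on */
	bool redirect_follow;	/* redirect_dir=on or redirect_dir=follow */
	bool metacopy;
	bool nfs_export;
};

/*
 * Apply the fallbacks in the order the commit message describes:
 *  1. with an upper layer, metacopy needs redirect_dir=on;
 *  2. otherwise, metacopy needs at least redirect_dir=follow;
 *  3. if metacopy stays on, nfs_export is turned off.
 */
static void toy_resolve_metacopy(struct toy_config *c)
{
	assert(c != NULL);
	if (c->has_upper && c->metacopy && !c->redirect_dir)
		c->metacopy = false;
	else if (c->metacopy && !c->redirect_follow)
		c->metacopy = false;

	if (c->metacopy && c->nfs_export)
		c->nfs_export = false;
}
```

Note that when metacopy is dropped by rule 1 or 2, nfs_export is left
untouched, since the conflict only exists while metacopy remains enabled.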

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 Documentation/filesystems/overlayfs.txt | 30 ++++++++++++++++++++-
 fs/overlayfs/Kconfig                    | 19 ++++++++++++++
 fs/overlayfs/ovl_entry.h                |  1 +
 fs/overlayfs/super.c                    | 46 ++++++++++++++++++++++++++++++---
 4 files changed, 92 insertions(+), 4 deletions(-)

diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
index 97eae826adf9..35f9b7f40aff 100644
--- a/Documentation/filesystems/overlayfs.txt
+++ b/Documentation/filesystems/overlayfs.txt
@@ -262,6 +262,30 @@ rightmost one and going left.  In the above example lower1 will be the
 top, lower2 the middle and lower3 the bottom layer.
 
 
+Metadata only copy up
+---------------------
+
+When metadata only copy up feature is enabled, overlayfs will only copy
+up metadata (as opposed to whole file), when a metadata specific operation
+like chown/chmod is performed. Full file will be copied up later when
+file is opened for WRITE operation.
+
+In other words, this is delayed data copy up operation and data is copied
+up when there is a need to actually modify data.
+
+There are multiple ways to enable/disable this feature. A config option
+CONFIG_OVERLAY_FS_METACOPY can be set/unset to enable/disable this feature
+by default. Or one can enable/disable it at module load time with module
+parameter metacopy=on/off. Lastly, there is also a per mount option
+metacopy=on/off to enable/disable this feature per mount.
+
+Do not use metacopy=on with untrusted upper/lower directories. Otherwise
+it is possible that an attacker can create a handcrafted file with
+appropriate REDIRECT and METACOPY xattrs, and gain access to file on lower
+pointed by REDIRECT. This should not be possible on local system as setting
+"trusted." xattrs will require CAP_SYS_ADMIN. But it should be possible
+for untrusted layers like from a pen drive.
+
 Sharing and copying layers
 --------------------------
 
@@ -280,7 +304,7 @@ though it will not result in a crash or deadlock.
 Mounting an overlay using an upper layer path, where the upper layer path
 was previously used by another mounted overlay in combination with a
 different lower layer path, is allowed, unless the "inodes index" feature
-is enabled.
+or "metadata only copy up" feature is enabled.
 
 With the "inodes index" feature, on the first time mount, an NFS file
 handle of the lower layer root directory, along with the UUID of the lower
@@ -293,6 +317,10 @@ lower root origin, mount will fail with ESTALE.  An overlayfs mount with
 does not support NFS export, lower filesystem does not have a valid UUID or
 if the upper filesystem does not support extended attributes.
 
+For "metadata only copy up" feature there is no verification mechanism at
+mount time. So if same upper is mounted with different set of lower, mount
+probably will succeed but expect the unexpected later on. So don't do it.
+
 It is quite a common practice to copy overlay layers to a different
 directory tree on the same or different underlying filesystem, and even
 to a different machine.  With the "inodes index" feature, trying to mount
diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
index 991c0a5a0e00..2e75a35e2726 100644
--- a/fs/overlayfs/Kconfig
+++ b/fs/overlayfs/Kconfig
@@ -64,6 +64,7 @@ config OVERLAY_FS_NFS_EXPORT
 	bool "Overlayfs: turn on NFS export feature by default"
 	depends on OVERLAY_FS
 	depends on OVERLAY_FS_INDEX
+	depends on !OVERLAY_FS_METACOPY
 	help
 	  If this config option is enabled then overlay filesystems will use
 	  the inodes index dir to decode overlay NFS file handles by default.
@@ -124,3 +125,21 @@ config OVERLAY_FS_COPY_UP_SHARED
 	 To get a maximally backward compatible kernel, disable this option.
 
 	 If unsure, say N.
+
+config OVERLAY_FS_METACOPY
+	bool "Overlayfs: turn on metadata only copy up feature by default"
+	depends on OVERLAY_FS
+	select OVERLAY_FS_REDIRECT_DIR
+	help
+	  If this config option is enabled then overlay filesystems will
+	  copy up only metadata where appropriate and data copy up will
+	  happen when a file is opened for WRITE operation. It is still
+	  possible to turn off this feature globally with the "metacopy=off"
+	  module option or on a filesystem instance basis with the
+	  "metacopy=off" mount option.
+
+	  Note, that this feature is not backward compatible.  That is,
+	  mounting an overlay which has metacopy only inodes on a kernel
+	  that doesn't support this feature will have unexpected results.
+
+	  If unsure, say N.
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 07b8cb2785c8..422896406048 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -20,6 +20,7 @@ struct ovl_config {
 	bool nfs_export;
 	bool copy_up_shared;
 	int xino;
+	bool metacopy;
 };
 
 struct ovl_sb {
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 492d534058ae..2c5cc94e1c5f 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -70,6 +70,11 @@ static void ovl_entry_stack_free(struct ovl_entry *oe)
 		dput(oe->lowerstack[i].dentry);
 }
 
+static bool ovl_metacopy_def = IS_ENABLED(CONFIG_OVERLAY_FS_METACOPY);
+module_param_named(metacopy, ovl_metacopy_def, bool, 0644);
+MODULE_PARM_DESC(metacopy,
+		 "Default to on or off for the metadata only copy up feature");
+
 static void ovl_dentry_release(struct dentry *dentry)
 {
 	struct ovl_entry *oe = dentry->d_fsdata;
@@ -356,6 +361,9 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
 	if (ofs->config.copy_up_shared != ovl_copy_up_shared_def)
 		seq_printf(m, ",copy_up_shared=%s",
 			   ofs->config.copy_up_shared ? "on" : "off");
+	if (ofs->config.metacopy != ovl_metacopy_def)
+		seq_printf(m, ",metacopy=%s",
+			   ofs->config.metacopy ? "on" : "off");
 	return 0;
 }
 
@@ -395,6 +403,8 @@ enum {
 	OPT_XINO_AUTO,
 	OPT_COPY_UP_SHARED_ON,
 	OPT_COPY_UP_SHARED_OFF,
+	OPT_METACOPY_ON,
+	OPT_METACOPY_OFF,
 	OPT_ERR,
 };
 
@@ -413,6 +423,8 @@ static const match_table_t ovl_tokens = {
 	{OPT_XINO_AUTO,			"xino=auto"},
 	{OPT_COPY_UP_SHARED_ON,		"copy_up_shared=on"},
 	{OPT_COPY_UP_SHARED_OFF,	"copy_up_shared=off"},
+	{OPT_METACOPY_ON,		"metacopy=on"},
+	{OPT_METACOPY_OFF,		"metacopy=off"},
 	{OPT_ERR,			NULL}
 };
 
@@ -465,6 +477,7 @@ static int ovl_parse_redirect_mode(struct ovl_config *config, const char *mode)
 static int ovl_parse_opt(char *opt, struct ovl_config *config)
 {
 	char *p;
+	int err;
 
 	config->redirect_mode = kstrdup(ovl_redirect_mode_def(), GFP_KERNEL);
 	if (!config->redirect_mode)
@@ -547,6 +560,14 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
 			config->copy_up_shared = false;
 			break;
 
+		case OPT_METACOPY_ON:
+			config->metacopy = true;
+			break;
+
+		case OPT_METACOPY_OFF:
+			config->metacopy = false;
+			break;
+
 		default:
 			pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p);
 			return -EINVAL;
@@ -561,7 +582,20 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
 		config->workdir = NULL;
 	}
 
-	return ovl_parse_redirect_mode(config, config->redirect_mode);
+	err = ovl_parse_redirect_mode(config, config->redirect_mode);
+	if (err)
+		return err;
+
+	/* metacopy feature with upper requires redirect_dir=on */
+	if (config->upperdir && config->metacopy && !config->redirect_dir) {
+		pr_warn("overlayfs: metadata only copy up requires \"redirect_dir=on\", falling back to metacopy=off.\n");
+		config->metacopy = false;
+	} else if (config->metacopy && !config->redirect_follow) {
+		pr_warn("overlayfs: metadata only copy up requires \"redirect_dir=follow\" on non-upper mount, falling back to metacopy=off.\n");
+		config->metacopy = false;
+	}
+
+	return 0;
 }
 
 #define OVL_WORKDIR_NAME "work"
@@ -1035,7 +1069,8 @@ static int ovl_make_workdir(struct ovl_fs *ofs, struct path *workpath)
 	if (err) {
 		ofs->noxattr = true;
 		ofs->config.index = false;
-		pr_warn("overlayfs: upper fs does not support xattr, falling back to index=off.\n");
+		ofs->config.metacopy = false;
+		pr_warn("overlayfs: upper fs does not support xattr, falling back to index=off and metacopy=off.\n");
 		err = 0;
 	} else {
 		vfs_removexattr(ofs->workdir, OVL_XATTR_OPAQUE);
@@ -1057,7 +1092,6 @@ static int ovl_make_workdir(struct ovl_fs *ofs, struct path *workpath)
 		pr_warn("overlayfs: NFS export requires \"index=on\", falling back to nfs_export=off.\n");
 		ofs->config.nfs_export = false;
 	}
-
 out:
 	mnt_drop_write(mnt);
 	return err;
@@ -1369,6 +1403,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	ofs->config.nfs_export = ovl_nfs_export_def;
 	ofs->config.xino = ovl_xino_def();
 	ofs->config.copy_up_shared = ovl_copy_up_shared_def;
+	ofs->config.metacopy = ovl_metacopy_def;
 	err = ovl_parse_opt((char *) data, &ofs->config);
 	if (err)
 		goto out_err;
@@ -1439,6 +1474,11 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 		}
 	}
 
+	if (ofs->config.metacopy && ofs->config.nfs_export) {
+		pr_warn("overlayfs: NFS export is not supported with metadata only copy up, falling back to nfs_export=off.\n");
+		ofs->config.nfs_export = false;
+	}
+
 	if (ofs->config.nfs_export)
 		sb->s_export_op = &ovl_export_operations;
 
-- 
2.13.6

* [PATCH v15 05/30] ovl: During copy up, first copy up metadata and then data
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (3 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 04/30] ovl: Provide a mount option metacopy=on/off for metadata copyup Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 17:40 ` [PATCH v15 06/30] ovl: Copy up only metadata during copy up where it makes sense Vivek Goyal
                   ` (26 subsequent siblings)
  31 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Just a little re-ordering of code. This helps with the next patch, where
after copying up metadata we skip the data copying step if needed.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/copy_up.c | 36 +++++++++++++++++-------------------
 1 file changed, 17 insertions(+), 19 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 4afc97872c2d..7e2dc41ad693 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -536,28 +536,10 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
 {
 	int err;
 
-	if (S_ISREG(c->stat.mode)) {
-		struct path upperpath;
-
-		ovl_path_upper(c->dentry, &upperpath);
-		BUG_ON(upperpath.dentry != NULL);
-		upperpath.dentry = temp;
-
-		err = ovl_copy_up_data(&c->lowerpath, &upperpath, c->stat.size);
-		if (err)
-			return err;
-	}
-
 	err = ovl_copy_xattr(c->lowerpath.dentry, temp);
 	if (err)
 		return err;
 
-	inode_lock(temp->d_inode);
-	err = ovl_set_attr(temp, &c->stat);
-	inode_unlock(temp->d_inode);
-	if (err)
-		return err;
-
 	/*
 	 * Store identifier of lower inode in upper inode xattr to
 	 * allow lookup of the copy up origin inode.
@@ -571,7 +553,23 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
 			return err;
 	}
 
-	return 0;
+	if (S_ISREG(c->stat.mode)) {
+		struct path upperpath;
+
+		ovl_path_upper(c->dentry, &upperpath);
+		BUG_ON(upperpath.dentry != NULL);
+		upperpath.dentry = temp;
+
+		err = ovl_copy_up_data(&c->lowerpath, &upperpath, c->stat.size);
+		if (err)
+			return err;
+	}
+
+	inode_lock(temp->d_inode);
+	err = ovl_set_attr(temp, &c->stat);
+	inode_unlock(temp->d_inode);
+
+	return err;
 }
 
 static int ovl_copy_up_locked(struct ovl_copy_up_ctx *c)
-- 
2.13.6


* [PATCH v15 06/30] ovl: Copy up only metadata during copy up where it makes sense
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (4 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 05/30] ovl: During copy up, first copy up metadata and then data Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 17:40 ` [PATCH v15 07/30] ovl: Add helper ovl_already_copied_up() Vivek Goyal
                   ` (25 subsequent siblings)
  31 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

If it makes sense to copy up only metadata during copy up, do it. This
is done for regular files which are not opened for WRITE.

Right now ->metacopy is always set to 0. The last patch in the series will
remove the hard-coded statement and enable the metacopy feature.
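
As a rough illustration of the intended policy once the hard-coded
"return false" is gone, here is a userspace sketch (not kernel code).
The name need_meta_copy_up() and its parameters are hypothetical:
metacopy_enabled stands in for ofs->config.metacopy, is_regular for
S_ISREG(mode), and the O_ACCMODE test for OPEN_FMODE(flags) & FMODE_WRITE.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/*
 * Userspace model of the decision made by ovl_need_meta_copy_up().
 * All names here are illustrative; none of this is the kernel API.
 */
static bool need_meta_copy_up(bool metacopy_enabled, bool is_regular,
			      int flags)
{
	int acc = flags & O_ACCMODE;

	/* Feature switched off for this mount: always do a full copy up. */
	if (!metacopy_enabled)
		return false;

	/* Only regular files have data that can be left on lower. */
	if (!is_regular)
		return false;

	/* Opening for write or truncate needs the data in upper. */
	if (flags && (acc == O_WRONLY || acc == O_RDWR || (flags & O_TRUNC)))
		return false;

	return true;
}
```

With this model a read-only open of a regular file yields a metadata only
copy up, while O_WRONLY, O_RDWR or O_TRUNC force a full copy up.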

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/copy_up.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 7e2dc41ad693..38b332d7c7e4 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -416,6 +416,7 @@ struct ovl_copy_up_ctx {
 	bool tmpfile;
 	bool origin;
 	bool indexed;
+	bool metacopy;
 };
 
 static int ovl_link_up(struct ovl_copy_up_ctx *c)
@@ -553,7 +554,7 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
 			return err;
 	}
 
-	if (S_ISREG(c->stat.mode)) {
+	if (S_ISREG(c->stat.mode) && !c->metacopy) {
 		struct path upperpath;
 
 		ovl_path_upper(c->dentry, &upperpath);
@@ -708,6 +709,26 @@ static int ovl_do_copy_up(struct ovl_copy_up_ctx *c)
 	return err;
 }
 
+static bool ovl_need_meta_copy_up(struct dentry *dentry, umode_t mode,
+				  int flags)
+{
+	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
+
+	/* TODO: Will enable metacopy in last patch of series */
+	return false;
+
+	if (!ofs->config.metacopy)
+		return false;
+
+	if (!S_ISREG(mode))
+		return false;
+
+	if (flags && ((OPEN_FMODE(flags) & FMODE_WRITE) || (flags & O_TRUNC)))
+		return false;
+
+	return true;
+}
+
 static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 			   int flags)
 {
@@ -729,6 +750,8 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 	if (err)
 		return err;
 
+	ctx.metacopy = ovl_need_meta_copy_up(dentry, ctx.stat.mode, flags);
+
 	if (parent) {
 		ovl_path_upper(parent, &parentpath);
 		ctx.destdir = parentpath.dentry;
-- 
2.13.6


* [PATCH v15 07/30] ovl: Add helper ovl_already_copied_up()
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (5 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 06/30] ovl: Copy up only metadata during copy up where it makes sense Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 17:40 ` [PATCH v15 08/30] ovl: A new xattr OVL_XATTR_METACOPY for file on upper Vivek Goyal
                   ` (24 subsequent siblings)
  31 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

There are a couple of places where we need to know if a file is already
copied up (in a lockless manner). Right now it's open coded and there are
only two conditions to check. Soon this patch series will introduce another
condition to check and Amir wants to introduce one more. So introduce
a helper to check this instead, so that the code is easier to read.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c   | 20 ++------------------
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/util.c      | 26 +++++++++++++++++++++++++-
 3 files changed, 28 insertions(+), 19 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 38b332d7c7e4..6ac3220b0834 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -810,21 +810,7 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
 		struct dentry *next;
 		struct dentry *parent = NULL;
 
-		/*
-		 * Check if copy-up has happened as well as for upper alias (in
-		 * case of hard links) is there.
-		 *
-		 * Both checks are lockless:
-		 *  - false negatives: will recheck under oi->lock
-		 *  - false positives:
-		 *    + ovl_dentry_upper() uses memory barriers to ensure the
-		 *      upper dentry is up-to-date
-		 *    + ovl_dentry_has_upper_alias() relies on locking of
-		 *      upper parent i_rwsem to prevent reordering copy-up
-		 *      with rename.
-		 */
-		if (ovl_dentry_upper(dentry) &&
-		    (ovl_dentry_has_upper_alias(dentry) || disconnected))
+		if (ovl_already_copied_up(dentry))
 			break;
 
 		next = dget(dentry);
@@ -852,9 +838,7 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
 static bool ovl_open_need_copy_up(struct dentry *dentry, int flags)
 {
 	/* Copy up of disconnected dentry does not set upper alias */
-	if (ovl_dentry_upper(dentry) &&
-	    (ovl_dentry_has_upper_alias(dentry) ||
-	     (dentry->d_flags & DCACHE_DISCONNECTED)))
+	if (ovl_already_copied_up(dentry))
 		return false;
 
 	if (special_file(d_inode(dentry)->i_mode))
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 396d9ecca919..d8c182f9ec4f 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -244,6 +244,7 @@ bool ovl_is_whiteout(struct dentry *dentry);
 struct file *ovl_path_open(struct path *path, int flags);
 int ovl_copy_up_start(struct dentry *dentry);
 void ovl_copy_up_end(struct dentry *dentry);
+bool ovl_already_copied_up(struct dentry *dentry);
 bool ovl_check_origin_xattr(struct dentry *dentry);
 bool ovl_check_dir_xattr(struct dentry *dentry, const char *name);
 int ovl_check_setxattr(struct dentry *dentry, struct dentry *upperdentry,
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 25d202b47326..43235294e77b 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -377,13 +377,37 @@ struct file *ovl_path_open(struct path *path, int flags)
 	return dentry_open(path, flags | O_NOATIME, current_cred());
 }
 
+bool ovl_already_copied_up(struct dentry *dentry)
+{
+	bool disconnected = dentry->d_flags & DCACHE_DISCONNECTED;
+
+	/*
+	 * Check if copy-up has happened as well as for upper alias (in
+	 * case of hard links) is there.
+	 *
+	 * Both checks are lockless:
+	 *  - false negatives: will recheck under oi->lock
+	 *  - false positives:
+	 *    + ovl_dentry_upper() uses memory barriers to ensure the
+	 *      upper dentry is up-to-date
+	 *    + ovl_dentry_has_upper_alias() relies on locking of
+	 *      upper parent i_rwsem to prevent reordering copy-up
+	 *      with rename.
+	 */
+	if (ovl_dentry_upper(dentry) &&
+	    (ovl_dentry_has_upper_alias(dentry) || disconnected))
+		return true;
+
+	return false;
+}
+
 int ovl_copy_up_start(struct dentry *dentry)
 {
 	struct ovl_inode *oi = OVL_I(d_inode(dentry));
 	int err;
 
 	err = mutex_lock_interruptible(&oi->lock);
-	if (!err && ovl_dentry_has_upper_alias(dentry)) {
+	if (!err && ovl_already_copied_up(dentry)) {
 		err = 1; /* Already copied up */
 		mutex_unlock(&oi->lock);
 	}
-- 
2.13.6



* [PATCH v15 08/30] ovl: A new xattr OVL_XATTR_METACOPY for file on upper
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (6 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 07/30] ovl: Add helper ovl_already_copied_up() Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 17:40 ` [PATCH v15 09/30] ovl: Use out_err instead of out_nomem Vivek Goyal
                   ` (23 subsequent siblings)
  31 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Now we will have the capability to have upper inodes which might be only
metadata copy up and data is still on lower inode. So add a new xattr
OVL_XATTR_METACOPY to distinguish between two cases.

Presence of OVL_XATTR_METACOPY reflects that file has been copied up
metadata only and data will be copied up later from lower origin.
So this xattr is set when a metadata copy takes place and cleared when
data copy takes place.

We also use a bit in ovl_inode->flags to cache OVL_UPPERDATA which reflects
whether ovl inode has data or not (as opposed to metadata only copy up).

If a file is copied up metadata only and later when same file is opened
for WRITE, then data copy up takes place. We copy up data, remove METACOPY
xattr and then set the UPPERDATA flag in ovl_inode->flags. While all
these operations happen with oi->lock held, read side of oi->flags can be
lockless. That is another thread on another cpu can check if UPPERDATA
flag is set or not.

So this gives us an ordering requirement w.r.t UPPERDATA flag. That is, if
another cpu sees UPPERDATA flag set, then it should be guaranteed that
effects of data copy up and remove xattr operations are also visible.

For example.

	CPU1				CPU2
ovl_open()				acquire(oi->lock)
 ovl_open_maybe_copy_up()                ovl_copy_up_data()
  ovl_open_need_copy_up()		 vfs_removexattr()
   ovl_already_copied_up()
    ovl_dentry_needs_data_copy_up()	 ovl_set_flag(OVL_UPPERDATA)
     ovl_test_flag(OVL_UPPERDATA)       release(oi->lock)

Say CPU2 is copying up data and in the end sets UPPERDATA flag. But if
CPU1 perceives the effects of setting UPPERDATA flag but not the effects
of preceding operations (e.g. upper that is not fully copied up), it will be
a problem.

Hence this patch introduces smp_wmb() on setting UPPERDATA flag operation
and smp_rmb() on UPPERDATA flag test operation.

Maybe some other lock or barrier is already covering it. But I am not sure
what that is, or whether it is obvious enough that we will not break it in
the future.

So, to be safe, introduce barriers explicitly for the UPPERDATA flag/bit.
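
The pairing described above can be sketched in userspace with C11 atomics.
This is only a model under stated assumptions, not the kernel code:
struct model_inode and the model_*() helpers are hypothetical, and the
release/acquire fences are the closest portable analogue of smp_wmb() and
smp_rmb() (they are somewhat stronger than the kernel barriers).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an overlay inode; not the kernel structure. */
struct model_inode {
	int data;		/* stands in for the copied up file data */
	bool metacopy_xattr;	/* stands in for OVL_XATTR_METACOPY */
	atomic_bool upperdata;	/* stands in for the OVL_UPPERDATA bit */
};

/* Writer side, modelled on ovl_set_upperdata(). */
static void model_set_upperdata(struct model_inode *inode)
{
	/* Plays the role of smp_wmb(): earlier writes are ordered first. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&inode->upperdata, true, memory_order_relaxed);
}

/* Lockless reader side, modelled on ovl_has_upperdata(). */
static bool model_has_upperdata(struct model_inode *inode)
{
	if (!atomic_load_explicit(&inode->upperdata, memory_order_relaxed))
		return false;
	/* Plays the role of smp_rmb(): later reads see the earlier writes. */
	atomic_thread_fence(memory_order_acquire);
	return true;
}

/* Models what CPU2 does under oi->lock: copy data, drop xattr, set flag. */
static void model_copy_up_data(struct model_inode *inode, int data)
{
	assert(inode != NULL);
	inode->data = data;
	inode->metacopy_xattr = false;
	model_set_upperdata(inode);
}
```

A reader that sees model_has_upperdata() return true is then guaranteed to
also see the updated data and the cleared metacopy_xattr field, which is the
property the commit message asks for.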

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/copy_up.c   | 56 ++++++++++++++++++++++++++++++----
 fs/overlayfs/dir.c       |  1 +
 fs/overlayfs/overlayfs.h | 18 +++++++++--
 fs/overlayfs/super.c     |  1 +
 fs/overlayfs/util.c      | 78 +++++++++++++++++++++++++++++++++++++++++++++---
 5 files changed, 143 insertions(+), 11 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 6ac3220b0834..7453913ca61b 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -195,6 +195,16 @@ static int ovl_copy_up_data(struct path *old, struct path *new, loff_t len)
 	return error;
 }
 
+static int ovl_set_size(struct dentry *upperdentry, struct kstat *stat)
+{
+	struct iattr attr = {
+		.ia_valid = ATTR_SIZE,
+		.ia_size = stat->size,
+	};
+
+	return notify_change(upperdentry, &attr, NULL);
+}
+
 static int ovl_set_timestamps(struct dentry *upperdentry, struct kstat *stat)
 {
 	struct iattr attr = {
@@ -566,8 +576,18 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
 			return err;
 	}
 
+	if (c->metacopy) {
+		err = ovl_check_setxattr(c->dentry, temp, OVL_XATTR_METACOPY,
+					 NULL, 0, -EOPNOTSUPP);
+		if (err)
+			return err;
+	}
+
 	inode_lock(temp->d_inode);
-	err = ovl_set_attr(temp, &c->stat);
+	if (c->metacopy)
+		err = ovl_set_size(temp, &c->stat);
+	if (!err)
+		err = ovl_set_attr(temp, &c->stat);
 	inode_unlock(temp->d_inode);
 
 	return err;
@@ -605,6 +625,8 @@ static int ovl_copy_up_locked(struct ovl_copy_up_ctx *c)
 	if (err)
 		goto out_cleanup;
 
+	if (!c->metacopy)
+		ovl_set_upperdata(d_inode(c->dentry));
 	inode = d_inode(c->dentry);
 	ovl_inode_update(inode, newdentry);
 	if (S_ISDIR(inode->i_mode))
@@ -729,6 +751,28 @@ static bool ovl_need_meta_copy_up(struct dentry *dentry, umode_t mode,
 	return true;
 }
 
+/* Copy up data of an inode which was copied up metadata only in the past. */
+static int ovl_copy_up_meta_inode_data(struct ovl_copy_up_ctx *c)
+{
+	struct path upperpath;
+	int err;
+
+	ovl_path_upper(c->dentry, &upperpath);
+	if (WARN_ON(upperpath.dentry == NULL))
+		return -EIO;
+
+	err = ovl_copy_up_data(&c->lowerpath, &upperpath, c->stat.size);
+	if (err)
+		return err;
+
+	err = vfs_removexattr(upperpath.dentry, OVL_XATTR_METACOPY);
+	if (err)
+		return err;
+
+	ovl_set_upperdata(d_inode(c->dentry));
+	return err;
+}
+
 static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 			   int flags)
 {
@@ -775,7 +819,7 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 	}
 	ovl_do_check_copy_up(ctx.lowerpath.dentry);
 
-	err = ovl_copy_up_start(dentry);
+	err = ovl_copy_up_start(dentry, flags);
 	/* err < 0: interrupted, err > 0: raced with another copy-up */
 	if (unlikely(err)) {
 		if (err > 0)
@@ -785,6 +829,8 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 			err = ovl_do_copy_up(&ctx);
 		if (!err && parent && !ovl_dentry_has_upper_alias(dentry))
 			err = ovl_link_up(&ctx);
+		if (!err && ovl_dentry_needs_data_copy_up_locked(dentry, flags))
+			err = ovl_copy_up_meta_inode_data(&ctx);
 		ovl_copy_up_end(dentry);
 	}
 	do_delayed_call(&done);
@@ -810,7 +856,7 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
 		struct dentry *next;
 		struct dentry *parent = NULL;
 
-		if (ovl_already_copied_up(dentry))
+		if (ovl_already_copied_up(dentry, flags))
 			break;
 
 		next = dget(dentry);
@@ -838,13 +884,13 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
 static bool ovl_open_need_copy_up(struct dentry *dentry, int flags)
 {
 	/* Copy up of disconnected dentry does not set upper alias */
-	if (ovl_already_copied_up(dentry))
+	if (ovl_already_copied_up(dentry, flags))
 		return false;
 
 	if (special_file(d_inode(dentry)->i_mode))
 		return false;
 
-	if (!(OPEN_FMODE(flags) & FMODE_WRITE) && !(flags & O_TRUNC))
+	if (!ovl_open_flags_need_copy_up(flags))
 		return false;
 
 	return true;
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 47dc980e8b33..bdaa5e4fa603 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -190,6 +190,7 @@ static void ovl_instantiate(struct dentry *dentry, struct inode *inode,
 	ovl_copyattr(d_inode(newdentry), inode);
 	ovl_dentry_set_upper_alias(dentry);
 	if (!hardlink) {
+		ovl_set_upperdata(inode);
 		ovl_inode_update(inode, newdentry);
 	} else {
 		WARN_ON(ovl_inode_real(inode) != d_inode(newdentry));
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index d8c182f9ec4f..2daea529b7eb 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -29,6 +29,7 @@ enum ovl_path_type {
 #define OVL_XATTR_IMPURE OVL_XATTR_PREFIX "impure"
 #define OVL_XATTR_NLINK OVL_XATTR_PREFIX "nlink"
 #define OVL_XATTR_UPPER OVL_XATTR_PREFIX "upper"
+#define OVL_XATTR_METACOPY OVL_XATTR_PREFIX "metacopy"
 
 enum ovl_inode_flag {
 	/* Pure upper dir that may contain non pure upper entries */
@@ -36,6 +37,7 @@ enum ovl_inode_flag {
 	/* Non-merge dir that may contain whiteout entries */
 	OVL_WHITEOUTS,
 	OVL_INDEX,
+	OVL_UPPERDATA,
 };
 
 enum ovl_entry_flag {
@@ -197,6 +199,14 @@ static inline struct dentry *ovl_do_tmpfile(struct dentry *dentry, umode_t mode)
 	return ret;
 }
 
+static inline bool ovl_open_flags_need_copy_up(int flags)
+{
+	if (!flags)
+		return false;
+
+	return ((OPEN_FMODE(flags) & FMODE_WRITE) || (flags & O_TRUNC));
+}
+
 /* util.c */
 int ovl_want_write(struct dentry *dentry);
 void ovl_drop_write(struct dentry *dentry);
@@ -232,6 +242,10 @@ bool ovl_dentry_is_whiteout(struct dentry *dentry);
 void ovl_dentry_set_opaque(struct dentry *dentry);
 bool ovl_dentry_has_upper_alias(struct dentry *dentry);
 void ovl_dentry_set_upper_alias(struct dentry *dentry);
+bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags);
+bool ovl_dentry_needs_data_copy_up_locked(struct dentry *dentry, int flags);
+bool ovl_has_upperdata(struct inode *inode);
+void ovl_set_upperdata(struct inode *inode);
 bool ovl_redirect_dir(struct super_block *sb);
 const char *ovl_dentry_get_redirect(struct dentry *dentry);
 void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
@@ -242,9 +256,9 @@ void ovl_dir_modified(struct dentry *dentry, bool impurity);
 u64 ovl_dentry_version_get(struct dentry *dentry);
 bool ovl_is_whiteout(struct dentry *dentry);
 struct file *ovl_path_open(struct path *path, int flags);
-int ovl_copy_up_start(struct dentry *dentry);
+int ovl_copy_up_start(struct dentry *dentry, int flags);
 void ovl_copy_up_end(struct dentry *dentry);
-bool ovl_already_copied_up(struct dentry *dentry);
+bool ovl_already_copied_up(struct dentry *dentry, int flags);
 bool ovl_check_origin_xattr(struct dentry *dentry);
 bool ovl_check_dir_xattr(struct dentry *dentry, const char *name);
 int ovl_check_setxattr(struct dentry *dentry, struct dentry *upperdentry,
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 2c5cc94e1c5f..1213c035e9ad 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1508,6 +1508,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	/* Root is always merge -> can have whiteouts */
 	ovl_set_flag(OVL_WHITEOUTS, d_inode(root_dentry));
 	ovl_dentry_set_flag(OVL_E_CONNECTED, root_dentry);
+	ovl_set_upperdata(d_inode(root_dentry));
 	ovl_inode_init(d_inode(root_dentry), upperpath.dentry,
 		       ovl_dentry_lower(root_dentry));
 
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 43235294e77b..f8e3c95711b8 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -279,6 +279,62 @@ void ovl_dentry_set_upper_alias(struct dentry *dentry)
 	ovl_dentry_set_flag(OVL_E_UPPER_ALIAS, dentry);
 }
 
+static bool ovl_should_check_upperdata(struct inode *inode)
+{
+	if (!S_ISREG(inode->i_mode))
+		return false;
+
+	if (!ovl_inode_lower(inode))
+		return false;
+
+	return true;
+}
+
+bool ovl_has_upperdata(struct inode *inode)
+{
+	if (!ovl_should_check_upperdata(inode))
+		return true;
+
+	if (!ovl_test_flag(OVL_UPPERDATA, inode))
+		return false;
+	/*
+	 * Pairs with smp_wmb() in ovl_set_upperdata(). Main user of
+	 * ovl_has_upperdata() is ovl_copy_up_meta_inode_data(). Make sure
+	 * if setting of OVL_UPPERDATA is visible, then effects of writes
+	 * before that are visible too.
+	 */
+	smp_rmb();
+	return true;
+}
+
+void ovl_set_upperdata(struct inode *inode)
+{
+	/*
+	 * Pairs with smp_rmb() in ovl_has_upperdata(). Make sure
+	 * if OVL_UPPERDATA flag is visible, then effects of write operations
+	 * before it are visible as well.
+	 */
+	smp_wmb();
+	ovl_set_flag(OVL_UPPERDATA, inode);
+}
+
+/* Caller should hold ovl_inode->lock */
+bool ovl_dentry_needs_data_copy_up_locked(struct dentry *dentry, int flags)
+{
+	if (!ovl_open_flags_need_copy_up(flags))
+		return false;
+
+	return !ovl_test_flag(OVL_UPPERDATA, d_inode(dentry));
+}
+
+bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags)
+{
+	if (!ovl_open_flags_need_copy_up(flags))
+		return false;
+
+	return !ovl_has_upperdata(d_inode(dentry));
+}
+
 bool ovl_redirect_dir(struct super_block *sb)
 {
 	struct ovl_fs *ofs = sb->s_fs_info;
@@ -377,7 +433,20 @@ struct file *ovl_path_open(struct path *path, int flags)
 	return dentry_open(path, flags | O_NOATIME, current_cred());
 }
 
-bool ovl_already_copied_up(struct dentry *dentry)
+/* Caller should hold ovl_inode->lock */
+static bool ovl_already_copied_up_locked(struct dentry *dentry, int flags)
+{
+	bool disconnected = dentry->d_flags & DCACHE_DISCONNECTED;
+
+	if (ovl_dentry_upper(dentry) &&
+	    (ovl_dentry_has_upper_alias(dentry) || disconnected) &&
+	    !ovl_dentry_needs_data_copy_up_locked(dentry, flags))
+		return true;
+
+	return false;
+}
+
+bool ovl_already_copied_up(struct dentry *dentry, int flags)
 {
 	bool disconnected = dentry->d_flags & DCACHE_DISCONNECTED;
 
@@ -395,19 +464,20 @@ bool ovl_already_copied_up(struct dentry *dentry)
 	 *      with rename.
 	 */
 	if (ovl_dentry_upper(dentry) &&
-	    (ovl_dentry_has_upper_alias(dentry) || disconnected))
+	    (ovl_dentry_has_upper_alias(dentry) || disconnected) &&
+	    !ovl_dentry_needs_data_copy_up(dentry, flags))
 		return true;
 
 	return false;
 }
 
-int ovl_copy_up_start(struct dentry *dentry)
+int ovl_copy_up_start(struct dentry *dentry, int flags)
 {
 	struct ovl_inode *oi = OVL_I(d_inode(dentry));
 	int err;
 
 	err = mutex_lock_interruptible(&oi->lock);
-	if (!err && ovl_already_copied_up(dentry)) {
+	if (!err && ovl_already_copied_up_locked(dentry, flags)) {
 		err = 1; /* Already copied up */
 		mutex_unlock(&oi->lock);
 	}
-- 
2.13.6


* [PATCH v15 09/30] ovl: Use out_err instead of out_nomem
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (7 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 08/30] ovl: A new xattr OVL_XATTR_METACOPY for file on upper Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 17:40 ` [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry Vivek Goyal
                   ` (22 subsequent siblings)
  31 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Right now we use goto out_nomem which assumes error code is -ENOMEM. But
there are other errors returned like -ESTALE as well. So instead of out_nomem,
use out_err which will do ERR_PTR(err). That way one can put the error code
in err and jump to out_err.

This is just code reorganization and no change of functionality.

I am about to add more code and this organization helps laying more code
and error paths on top of it.
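
The shape of the reorganized error path can be shown with a small userspace
sketch. It is an illustration only: err_ptr(), ptr_err(), is_err() and
get_object() are hypothetical stand-ins written for this example (the first
three mimic the kernel's ERR_PTR(), PTR_ERR() and IS_ERR() helpers).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace imitations of the kernel error-pointer helpers. */
static inline void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static inline int is_err(const void *p)
{
	/* Error codes occupy the last 4095 values of the address space. */
	return (uintptr_t)p >= (uintptr_t)-4095;
}

/* One shared out_err label; each failure site only sets err. */
static int *get_object(int fail_alloc, int stale)
{
	int *obj;
	int err = -ENOMEM;

	obj = fail_alloc ? NULL : malloc(sizeof(*obj));
	if (!obj)
		goto out_err;

	if (stale) {
		free(obj);
		err = -ESTALE;
		goto out_err;
	}

	*obj = 1;
out:
	return obj;

out_err:
	obj = err_ptr(err);
	goto out;
}
```

Adding a new failure case then only needs an assignment to err and a jump
to out_err, rather than a new label per error code.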

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/inode.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index daab9358db07..c128d5d54d0f 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -772,6 +772,7 @@ struct inode *ovl_get_inode(struct ovl_inode_params *oip)
 	int fsid = bylower ? oip->lowerpath->layer->fsid : 0;
 	bool is_dir;
 	unsigned long ino = 0;
+	int err = -ENOMEM;
 
 	if (!realinode)
 		realinode = d_inode(lowerdentry);
@@ -789,7 +790,7 @@ struct inode *ovl_get_inode(struct ovl_inode_params *oip)
 		inode = iget5_locked(oip->sb, (unsigned long) key,
 				     ovl_inode_test, ovl_inode_set, key);
 		if (!inode)
-			goto out_nomem;
+			goto out_err;
 		if (!(inode->i_state & I_NEW)) {
 			/*
 			 * Verify that the underlying files stored in the inode
@@ -798,8 +799,8 @@ struct inode *ovl_get_inode(struct ovl_inode_params *oip)
 			if (!ovl_verify_inode(inode, lowerdentry, upperdentry,
 					      true)) {
 				iput(inode);
-				inode = ERR_PTR(-ESTALE);
-				goto out;
+				err = -ESTALE;
+				goto out_err;
 			}
 
 			dput(upperdentry);
@@ -815,8 +816,10 @@ struct inode *ovl_get_inode(struct ovl_inode_params *oip)
 	} else {
 		/* Lower hardlink that will be broken on copy up */
 		inode = new_inode(oip->sb);
-		if (!inode)
-			goto out_nomem;
+		if (!inode) {
+			err = -ENOMEM;
+			goto out_err;
+		}
 	}
 	ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev, ino, fsid);
 	ovl_inode_init(inode, upperdentry, lowerdentry);
@@ -842,7 +845,7 @@ struct inode *ovl_get_inode(struct ovl_inode_params *oip)
 out:
 	return inode;
 
-out_nomem:
-	inode = ERR_PTR(-ENOMEM);
+out_err:
+	inode = ERR_PTR(err);
 	goto out;
 }
-- 
2.13.6



* [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (8 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 09/30] ovl: Use out_err instead of out_nomem Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 19:14   ` Amir Goldstein
                     ` (3 more replies)
  2018-05-07 17:40 ` [PATCH v15 11/30] ovl: Copy up meta inode data from lowest data inode Vivek Goyal
                   ` (21 subsequent siblings)
  31 siblings, 4 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

This patch modifies ovl_lookup() and friends to lookup metacopy dentries.
It also allows for presence of metacopy dentries in lower layer.

During lookup, check for presence of OVL_XATTR_METACOPY and if not present,
set OVL_UPPERDATA bit in flags.

We don't support metacopy feature with nfs_export. So in nfs_export code,
we set OVL_UPPERDATA flag unconditionally if upper inode exists.

Do not follow metacopy origin if we find a metacopy only inode and metacopy
feature is not enabled for that mount. Like redirect, this can have security
implications where an attacker could hand craft upper and try to gain
access to file on lower which it should not have to begin with.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/export.c    |   3 ++
 fs/overlayfs/inode.c     |  11 ++++-
 fs/overlayfs/namei.c     | 108 +++++++++++++++++++++++++++++++++++++++++------
 fs/overlayfs/overlayfs.h |   1 +
 fs/overlayfs/util.c      |  22 ++++++++++
 5 files changed, 130 insertions(+), 15 deletions(-)

diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
index 0549286cc55e..52a09a9f74b7 100644
--- a/fs/overlayfs/export.c
+++ b/fs/overlayfs/export.c
@@ -314,6 +314,9 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
 		return ERR_CAST(inode);
 	}
 
+	if (upper)
+		ovl_set_flag(OVL_UPPERDATA, inode);
+
 	dentry = d_find_any_alias(inode);
 	if (!dentry) {
 		dentry = d_alloc_anon(inode->i_sb);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index c128d5d54d0f..83b276ce0240 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -770,7 +770,7 @@ struct inode *ovl_get_inode(struct ovl_inode_params *oip)
 	bool bylower = ovl_hash_bylower(oip->sb, upperdentry, lowerdentry,
 					oip->index);
 	int fsid = bylower ? oip->lowerpath->layer->fsid : 0;
-	bool is_dir;
+	bool is_dir, metacopy = false;
 	unsigned long ino = 0;
 	int err = -ENOMEM;
 
@@ -830,6 +830,15 @@ struct inode *ovl_get_inode(struct ovl_inode_params *oip)
 	if (oip->index)
 		ovl_set_flag(OVL_INDEX, inode);
 
+	if (upperdentry) {
+		err = ovl_check_metacopy_xattr(upperdentry);
+		if (err < 0)
+			goto out_err;
+		metacopy = err;
+		if (!metacopy)
+			ovl_set_flag(OVL_UPPERDATA, inode);
+	}
+
 	OVL_I(inode)->redirect = oip->redirect;
 
 	/* Check for non-merge dir that may have whiteouts */
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 8fd817bf5529..b2ff08985e29 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -24,6 +24,7 @@ struct ovl_lookup_data {
 	bool stop;
 	bool last;
 	char *redirect;
+	bool metacopy;
 };
 
 static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
@@ -253,19 +254,29 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
 		goto put_and_out;
 	}
 	if (!d_can_lookup(this)) {
-		d->stop = true;
-		if (d->is_dir)
+		if (d->is_dir) {
+			d->stop = true;
 			goto put_and_out;
-
+		}
 		/*
 		 * NB: handle failure to lookup non-last element when non-dir
 		 * redirects become possible
 		 */
 		WARN_ON(!last_element);
+		err = ovl_check_metacopy_xattr(this);
+		if (err < 0)
+			goto out_err;
+		d->stop = !err;
+		d->metacopy = !!err;
 		goto out;
 	}
-	if (last_element)
+	if (last_element) {
+		if (d->metacopy) {
+			err = -ESTALE;
+			goto out_err;
+		}
 		d->is_dir = true;
+	}
 	if (d->last)
 		goto out;
 
@@ -823,7 +834,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
 	struct ovl_entry *poe = dentry->d_parent->d_fsdata;
 	struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata;
-	struct ovl_path *stack = NULL;
+	struct ovl_path *stack = NULL, *origin_path = NULL;
 	struct dentry *upperdir, *upperdentry = NULL;
 	struct dentry *origin = NULL;
 	struct dentry *index = NULL;
@@ -834,6 +845,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	struct dentry *this;
 	unsigned int i;
 	int err;
+	bool metacopy = false;
 	struct ovl_lookup_data d = {
 		.name = dentry->d_name,
 		.is_dir = false,
@@ -841,6 +853,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		.stop = false,
 		.last = ofs->config.redirect_follow ? false : !poe->numlower,
 		.redirect = NULL,
+		.metacopy = false,
 	};
 
 	if (dentry->d_name.len > ofs->namelen)
@@ -859,7 +872,8 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 			goto out;
 		}
 		if (upperdentry && !d.is_dir) {
-			BUG_ON(!d.stop || d.redirect);
+			unsigned int origin_ctr = 0;
+			BUG_ON(d.redirect);
 			/*
 			 * Lookup copy up origin by decoding origin file handle.
 			 * We may get a disconnected dentry, which is fine,
@@ -870,9 +884,13 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 			 * number - it's the same as if we held a reference
 			 * to a dentry in lower layer that was moved under us.
 			 */
-			err = ovl_check_origin(ofs, upperdentry, &stack, &ctr);
+			err = ovl_check_origin(ofs, upperdentry, &origin_path,
+					       &origin_ctr);
 			if (err)
 				goto out_put_upper;
+
+			if (d.metacopy)
+				metacopy = true;
 		}
 
 		if (d.redirect) {
@@ -913,7 +931,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		 * If no origin fh is stored in upper of a merge dir, store fh
 		 * of lower dir and set upper parent "impure".
 		 */
-		if (upperdentry && !ctr && !ofs->noxattr) {
+		if (upperdentry && !ctr && !ofs->noxattr && d.is_dir) {
 			err = ovl_fix_origin(dentry, this, upperdentry);
 			if (err) {
 				dput(this);
@@ -925,18 +943,36 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		 * When "verify_lower" feature is enabled, do not merge with a
 		 * lower dir that does not match a stored origin xattr. In any
 		 * case, only verified origin is used for index lookup.
+		 *
+		 * For non-dir dentry, make sure dentry found by lookup
+		 * matches the origin stored in upper. Otherwise its an
+		 * error.
 		 */
-		if (upperdentry && !ctr && ovl_verify_lower(dentry->d_sb)) {
+		if (upperdentry && !ctr &&
+		    ((d.is_dir && ovl_verify_lower(dentry->d_sb)) ||
+		     (!d.is_dir && origin_path))) {
 			err = ovl_verify_origin(upperdentry, this, false);
 			if (err) {
 				dput(this);
-				break;
+				if (d.is_dir)
+					break;
+				goto out_put;
 			}
-
-			/* Bless lower dir as verified origin */
+			/* Bless lower as verified origin */
 			origin = this;
 		}
 
+		if (d.metacopy)
+			metacopy = true;
+		/*
+		 * Do not store intermediate metacopy dentries in chain,
+		 * except top most lower metacopy dentry
+		 */
+		if (d.metacopy && ctr) {
+			dput(this);
+			continue;
+		}
+
 		stack[ctr].dentry = this;
 		stack[ctr].layer = lower.layer;
 		ctr++;
@@ -968,13 +1004,49 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		}
 	}
 
+	if (metacopy) {
+		/*
+		 * Found a metacopy dentry but did not find corresponding
+		 * data dentry
+		 */
+		if (d.metacopy) {
+			err = -ESTALE;
+			goto out_put;
+		}
+
+		err = -EPERM;
+		if (!ofs->config.metacopy) {
+			pr_warn_ratelimited("overlay: refusing to follow"
+					    " metacopy origin for (%pd2)\n",
+					    dentry);
+			goto out_put;
+		}
+	} else if (!d.is_dir && upperdentry && !ctr && origin_path) {
+		if (WARN_ON(stack != NULL)) {
+			err = -EIO;
+			goto out_put;
+		}
+		stack = origin_path;
+		ctr = 1;
+		origin_path = NULL;
+	}
+
 	/*
 	 * Lookup index by lower inode and verify it matches upper inode.
 	 * We only trust dir index if we verified that lower dir matches
 	 * origin, otherwise dir index entries may be inconsistent and we
-	 * ignore them. Always lookup index of non-dir and non-upper.
+	 * ignore them.
+	 *
+	 * For non-dir upper metacopy dentry, we already set "origin" if we
+	 * verified that lower matched upper origin. If upper origin was
+	 * not present (because lower layer did not support fh encode/decode),
+	 * do not set "origin" and skip looking up index. This case should
+	 * be handled in same way as a non-dir upper without ORIGIN is
+	 * handled.
+	 *
+	 * Always lookup index of non-dir non-metacopy and non-upper.
 	 */
-	if (ctr && (!upperdentry || !d.is_dir))
+	if (ctr && (!upperdentry || (!d.is_dir && !metacopy)))
 		origin = stack[0].dentry;
 
 	if (origin && ovl_indexdir(dentry->d_sb) &&
@@ -1015,6 +1087,10 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	}
 
 	revert_creds(old_cred);
+	if (origin_path) {
+		dput(origin_path->dentry);
+		kfree(origin_path);
+	}
 	dput(index);
 	kfree(stack);
 	kfree(d.redirect);
@@ -1029,6 +1105,10 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		dput(stack[i].dentry);
 	kfree(stack);
 out_put_upper:
+	if (origin_path) {
+		dput(origin_path->dentry);
+		kfree(origin_path);
+	}
 	dput(upperdentry);
 	kfree(upperredirect);
 out:
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 2daea529b7eb..e8954fff1c45 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -274,6 +274,7 @@ bool ovl_need_index(struct dentry *dentry);
 int ovl_nlink_start(struct dentry *dentry, bool *locked);
 void ovl_nlink_end(struct dentry *dentry, bool locked);
 int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir);
+int ovl_check_metacopy_xattr(struct dentry *dentry);
 
 static inline bool ovl_is_impuredir(struct dentry *dentry)
 {
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index f8e3c95711b8..ab9a8fae0f99 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -778,3 +778,25 @@ int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir)
 	pr_err("overlayfs: failed to lock workdir+upperdir\n");
 	return -EIO;
 }
+
+/* err < 0, 0 if no metacopy xattr, 1 if metacopy xattr found */
+int ovl_check_metacopy_xattr(struct dentry *dentry)
+{
+	int res;
+
+	/* Only regular files can have metacopy xattr */
+	if (!S_ISREG(d_inode(dentry)->i_mode))
+		return 0;
+
+	res = vfs_getxattr(dentry, OVL_XATTR_METACOPY, NULL, 0);
+	if (res < 0) {
+		if (res == -ENODATA || res == -EOPNOTSUPP)
+			return 0;
+		goto out;
+	}
+
+	return 1;
+out:
+	pr_warn_ratelimited("overlayfs: failed to get metacopy (%i)\n", res);
+	return res;
+}
-- 
2.13.6
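
The return convention of ovl_check_metacopy_xattr() above can be sketched
in userspace as follows. This is only an illustrative model, not kernel
code: model_check_metacopy() and its parameters are hypothetical names,
and getxattr_res stands in for whatever vfs_getxattr() returned.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model of the ovl_check_metacopy_xattr() result
 * mapping: negative errno on error, 0 if no metacopy xattr, 1 if found.
 */
int model_check_metacopy(bool is_regular, int getxattr_res)
{
	/* Only regular files can carry the metacopy xattr */
	if (!is_regular)
		return 0;

	if (getxattr_res < 0) {
		/* Missing xattr or no xattr support means "not metacopy" */
		if (getxattr_res == -ENODATA || getxattr_res == -EOPNOTSUPP)
			return 0;
		/* Any other failure is a genuine error and is passed up */
		return getxattr_res;
	}

	/* The xattr exists; its size is irrelevant, it may even be zero */
	return 1;
}
```

Under this model a caller only has to distinguish three outcomes, which
is presumably why the two "absent" error codes are folded into 0.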


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH v15 11/30] ovl: Copy up meta inode data from lowest data inode
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (9 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 17:40 ` [PATCH v15 12/30] ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry Vivek Goyal
                   ` (20 subsequent siblings)
  31 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

So far lower could not be a meta inode. So whenever it was time to copy
up data of a meta inode, we could copy it up from the topmost lower dentry.

But now lower itself can be a metacopy inode. That means data copy up
needs to take place from a data inode in the metacopy inode chain. Find
the lower data inode in the chain and use that for data copy up.

Introduce a helper called ovl_path_lowerdata() to find the lower data
inode in the chain.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/copy_up.c   | 14 ++++++++++----
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/util.c      | 14 ++++++++++++++
 3 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 7453913ca61b..5a865a4cf3d7 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -565,13 +565,15 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp)
 	}
 
 	if (S_ISREG(c->stat.mode) && !c->metacopy) {
-		struct path upperpath;
+		struct path upperpath, datapath;
 
 		ovl_path_upper(c->dentry, &upperpath);
 		BUG_ON(upperpath.dentry != NULL);
 		upperpath.dentry = temp;
 
-		err = ovl_copy_up_data(&c->lowerpath, &upperpath, c->stat.size);
+		ovl_path_lowerdata(c->dentry, &datapath);
+		BUG_ON(datapath.dentry == NULL);
+		err = ovl_copy_up_data(&datapath, &upperpath, c->stat.size);
 		if (err)
 			return err;
 	}
@@ -754,14 +756,18 @@ static bool ovl_need_meta_copy_up(struct dentry *dentry, umode_t mode,
 /* Copy up data of an inode which was copied up metadata only in the past. */
 static int ovl_copy_up_meta_inode_data(struct ovl_copy_up_ctx *c)
 {
-	struct path upperpath;
+	struct path upperpath, datapath;
 	int err;
 
 	ovl_path_upper(c->dentry, &upperpath);
 	if (WARN_ON(upperpath.dentry == NULL))
 		return -EIO;
 
-	err = ovl_copy_up_data(&c->lowerpath, &upperpath, c->stat.size);
+	ovl_path_lowerdata(c->dentry, &datapath);
+	if (WARN_ON(datapath.dentry == NULL))
+		return -EIO;
+
+	err = ovl_copy_up_data(&datapath, &upperpath, c->stat.size);
 	if (err)
 		return err;
 
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index e8954fff1c45..ee66307451f0 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -223,6 +223,7 @@ bool ovl_dentry_weird(struct dentry *dentry);
 enum ovl_path_type ovl_path_type(struct dentry *dentry);
 void ovl_path_upper(struct dentry *dentry, struct path *path);
 void ovl_path_lower(struct dentry *dentry, struct path *path);
+void ovl_path_lowerdata(struct dentry *dentry, struct path *path);
 enum ovl_path_type ovl_path_real(struct dentry *dentry, struct path *path);
 struct dentry *ovl_dentry_upper(struct dentry *dentry);
 struct dentry *ovl_dentry_lower(struct dentry *dentry);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index ab9a8fae0f99..74b38e17a476 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -164,6 +164,20 @@ void ovl_path_lower(struct dentry *dentry, struct path *path)
 	}
 }
 
+void ovl_path_lowerdata(struct dentry *dentry, struct path *path)
+{
+	struct ovl_entry *oe = dentry->d_fsdata;
+	int idx = oe->numlower - 1;
+
+	if (!oe->numlower) {
+		*path = (struct path) { };
+		return;
+	}
+
+	path->mnt = oe->lowerstack[idx].layer->mnt;
+	path->dentry = oe->lowerstack[idx].dentry;
+}
+
 enum ovl_path_type ovl_path_real(struct dentry *dentry, struct path *path)
 {
 	enum ovl_path_type type = ovl_path_type(dentry);
-- 
2.13.6
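
A minimal userspace sketch of the selection rule in ovl_path_lowerdata(),
assuming (as the patch does) that the last lowerstack entry is the one
holding data. The struct and function names below are hypothetical
stand-ins for struct ovl_entry and its lowerstack, not kernel types.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_LOWER 4

/* Hypothetical stand-in for one lowerstack slot */
struct model_lower {
	const char *name;	/* stands in for the dentry */
	int layer_idx;		/* stands in for layer->mnt */
};

/* Hypothetical stand-in for struct ovl_entry */
struct model_entry {
	unsigned int numlower;
	struct model_lower lowerstack[MODEL_MAX_LOWER];
};

/*
 * The last lowerstack entry is treated as the data source. Return NULL
 * when there is no lower at all, mirroring the zeroed path in the patch.
 */
const struct model_lower *model_path_lowerdata(const struct model_entry *oe)
{
	if (!oe->numlower)
		return NULL;
	return &oe->lowerstack[oe->numlower - 1];
}
```

With a two-entry stack (metacopy on top, data below) this picks the
second entry, which is the case the copy-up paths above care about.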


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH v15 12/30] ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (10 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 11/30] ovl: Copy up meta inode data from lowest data inode Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 17:40 ` [PATCH v15 13/30] ovl: Add an helper to get real " Vivek Goyal
                   ` (19 subsequent siblings)
  31 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Now we have the notion of data dentry and metacopy dentry. ovl_dentry_lower()
will return the uppermost lower dentry, but it could be either a data or a
metacopy dentry. Now we support metacopy dentries in lower layers, so it is
possible that lowerstack[0] is a metacopy dentry while lowerstack[1] is the
actual data dentry.

So add a helper which returns the lowest dentry, which is supposed to be
the data dentry.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/util.c      | 14 ++++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index ee66307451f0..4f2bb472c07b 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -227,6 +227,7 @@ void ovl_path_lowerdata(struct dentry *dentry, struct path *path);
 enum ovl_path_type ovl_path_real(struct dentry *dentry, struct path *path);
 struct dentry *ovl_dentry_upper(struct dentry *dentry);
 struct dentry *ovl_dentry_lower(struct dentry *dentry);
+struct dentry *ovl_dentry_lowerdata(struct dentry *dentry);
 struct ovl_layer *ovl_layer_lower(struct dentry *dentry);
 struct dentry *ovl_dentry_real(struct dentry *dentry);
 struct dentry *ovl_i_dentry_upper(struct inode *inode);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 74b38e17a476..58c4031c1b88 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -209,6 +209,20 @@ struct ovl_layer *ovl_layer_lower(struct dentry *dentry)
 	return oe->numlower ? oe->lowerstack[0].layer : NULL;
 }
 
+/*
+ * ovl_dentry_lower() could return either a data dentry or metacopy dentry
+ * depending on what is stored in lowerstack[0]. At times we need to find
+ * lower dentry which has data (and not metacopy dentry). This helper
+ * returns the lower data dentry.
+ */
+struct dentry *ovl_dentry_lowerdata(struct dentry *dentry)
+{
+	struct ovl_entry *oe = dentry->d_fsdata;
+	int idx = oe->numlower - 1;
+
+	return idx >= 0 ? oe->lowerstack[idx].dentry : NULL;
+}
+
 struct dentry *ovl_dentry_real(struct dentry *dentry)
 {
 	return ovl_dentry_upper(dentry) ?: ovl_dentry_lower(dentry);
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH v15 13/30] ovl: Add an helper to get real data dentry
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (11 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 12/30] ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 17:40 ` [PATCH v15 14/30] ovl: Fix ovl_getattr() to get number of blocks from lower Vivek Goyal
                   ` (18 subsequent siblings)
  31 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

ovl_dentry_real() returns the real dentry (either upper or lower). We
also need a helper to ignore metacopy dentries and return the "real data"
dentry. This helper returns an upper/lower dentry which contains data.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/util.c      | 12 ++++++++++++
 2 files changed, 13 insertions(+)

diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 4f2bb472c07b..a38aa95f4795 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -230,6 +230,7 @@ struct dentry *ovl_dentry_lower(struct dentry *dentry);
 struct dentry *ovl_dentry_lowerdata(struct dentry *dentry);
 struct ovl_layer *ovl_layer_lower(struct dentry *dentry);
 struct dentry *ovl_dentry_real(struct dentry *dentry);
+struct dentry *ovl_dentry_realdata(struct dentry *dentry);
 struct dentry *ovl_i_dentry_upper(struct inode *inode);
 struct inode *ovl_inode_upper(struct inode *inode);
 struct inode *ovl_inode_lower(struct inode *inode);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 58c4031c1b88..873d4e56e21b 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -228,6 +228,18 @@ struct dentry *ovl_dentry_real(struct dentry *dentry)
 	return ovl_dentry_upper(dentry) ?: ovl_dentry_lower(dentry);
 }
 
+/* Return real dentry which contains data. Skip metacopy dentries */
+struct dentry *ovl_dentry_realdata(struct dentry *dentry)
+{
+	struct dentry *upperdentry;
+
+	upperdentry = ovl_dentry_upper(dentry);
+	if (upperdentry && ovl_has_upperdata(d_inode(dentry)))
+		return upperdentry;
+
+	return ovl_dentry_lowerdata(dentry);
+}
+
 struct dentry *ovl_i_dentry_upper(struct inode *inode)
 {
 	return ovl_upperdentry_dereference(OVL_I(inode));
-- 
2.13.6
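
The two helpers from patches 12 and 13 compose in a simple way, which the
following userspace sketch tries to show. It is a model under stated
assumptions: strings stand in for dentries, has_upperdata stands in for
ovl_has_upperdata(), and all model_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an overlay dentry and its real dentries */
struct model_dentry {
	const char *upper;		/* NULL when nothing is copied up */
	bool has_upperdata;		/* false after a metadata only copy up */
	const char *lowerstack[3];	/* [0] is topmost, last one has data */
	int numlower;
};

/* Lowest entry of the lower stack, or NULL if there is no lower */
const char *model_dentry_lowerdata(const struct model_dentry *d)
{
	int idx = d->numlower - 1;

	return idx >= 0 ? d->lowerstack[idx] : NULL;
}

/* Upper only counts as "real data" once its data has been copied up */
const char *model_dentry_realdata(const struct model_dentry *d)
{
	if (d->upper && d->has_upperdata)
		return d->upper;

	return model_dentry_lowerdata(d);
}
```

Note that in this model an upper metacopy dentry with no lower at all
yields NULL, so callers would have to be prepared for that.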

^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH v15 14/30] ovl: Fix ovl_getattr() to get number of blocks from lower
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (12 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 13/30] ovl: Add an helper to get real " Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 17:40 ` [PATCH v15 15/30] ovl: Store lower data inode in ovl_inode Vivek Goyal
                   ` (17 subsequent siblings)
  31 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

If an inode has been copied up metadata only, then we need to query the
number of blocks from lower and fill in stat->st_blocks.

We need to be careful about races where we are doing stat on one cpu and
data copy up is taking place on another cpu. We want to return
stat->st_blocks either from lower or from a stable upper and not something
in between. Hence, ovl_has_upperdata() is called first to figure out
whether block reporting will take place from lower or upper.

We now support metacopy dentries in middle layers. That means the number
of blocks needs to come from the lowest data dentry, and this could
be different from the lower dentry. Hence we end up making a separate
vfs_getattr() call for metacopy dentries to get the number of blocks.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/inode.c     | 35 ++++++++++++++++++++++++++++++++++-
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/util.c      | 16 ++++++++++++++++
 3 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 83b276ce0240..5d461cd57b48 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -145,6 +145,9 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 	bool samefs = ovl_same_sb(dentry->d_sb);
 	struct ovl_layer *lower_layer = NULL;
 	int err;
+	bool metacopy_blocks = false;
+
+	metacopy_blocks = ovl_is_metacopy_dentry(dentry);
 
 	type = ovl_path_real(dentry, &realpath);
 	old_cred = ovl_override_creds(dentry->d_sb);
@@ -166,7 +169,8 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 			lower_layer = ovl_layer_lower(dentry);
 		} else if (OVL_TYPE_ORIGIN(type)) {
 			struct kstat lowerstat;
-			u32 lowermask = STATX_INO | (!is_dir ? STATX_NLINK : 0);
+			u32 lowermask = STATX_INO | STATX_BLOCKS |
+					(!is_dir ? STATX_NLINK : 0);
 
 			ovl_path_lower(dentry, &realpath);
 			err = vfs_getattr(&realpath, &lowerstat,
@@ -195,6 +199,35 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 				stat->ino = lowerstat.ino;
 				lower_layer = ovl_layer_lower(dentry);
 			}
+
+			/*
+			 * If we are querying a metacopy dentry and lower
+			 * dentry is data dentry, then use the blocks we
+			 * queried just now. We don't have to do additional
+			 * vfs_getattr(). If lower itself is metacopy, then
+			 * additional vfs_getattr() is unavoidable.
+			 */
+			if (metacopy_blocks &&
+			    realpath.dentry == ovl_dentry_lowerdata(dentry)) {
+				stat->blocks = lowerstat.blocks;
+				metacopy_blocks = false;
+			}
+		}
+
+		if (metacopy_blocks) {
+			/*
+			 * If lower is not same as lowerdata or if there was
+			 * no origin on upper, we can end up here.
+			 */
+			struct kstat lowerdatastat;
+			u32 lowermask = STATX_BLOCKS;
+
+			ovl_path_lowerdata(dentry, &realpath);
+			err = vfs_getattr(&realpath, &lowerdatastat,
+					  lowermask, flags);
+			if (err)
+				goto out;
+			stat->blocks = lowerdatastat.blocks;
 		}
 	}
 
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index a38aa95f4795..6d64796e0060 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -278,6 +278,7 @@ int ovl_nlink_start(struct dentry *dentry, bool *locked);
 void ovl_nlink_end(struct dentry *dentry, bool locked);
 int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir);
 int ovl_check_metacopy_xattr(struct dentry *dentry);
+bool ovl_is_metacopy_dentry(struct dentry *dentry);
 
 static inline bool ovl_is_impuredir(struct dentry *dentry)
 {
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 873d4e56e21b..7b7d4623df16 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -840,3 +840,19 @@ int ovl_check_metacopy_xattr(struct dentry *dentry)
 	pr_warn_ratelimited("overlayfs: failed to get metacopy (%i)\n", res);
 	return res;
 }
+
+bool ovl_is_metacopy_dentry(struct dentry *dentry)
+{
+	struct ovl_entry *oe = dentry->d_fsdata;
+
+	if (!d_is_reg(dentry))
+		return false;
+
+	if (ovl_dentry_upper(dentry)) {
+		if (!ovl_has_upperdata(d_inode(dentry)))
+			return true;
+		return false;
+	}
+
+	return (oe->numlower > 1);
+}
-- 
2.13.6
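
The block reporting rule described in this patch can be modelled roughly
as below. This is a simplified userspace sketch, not the ovl_getattr()
code: struct model_file and its fields are hypothetical, real_blocks
stands in for the blocks of whatever ovl_path_real() selected, and the
optimisation that reuses lowerstat when lower is already the data dentry
is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flattened view of an overlay file for stat purposes */
struct model_file {
	bool is_reg;
	bool has_upper;
	bool has_upperdata;
	unsigned int numlower;
	unsigned long long real_blocks;		/* blocks of the real inode */
	unsigned long long lowerdata_blocks;	/* blocks of the lowest data inode */
};

/* Same decision table as ovl_is_metacopy_dentry() in the patch */
bool model_is_metacopy(const struct model_file *f)
{
	if (!f->is_reg)
		return false;

	if (f->has_upper)
		return !f->has_upperdata;

	return f->numlower > 1;
}

/* Metacopy files report the data inode's blocks, everything else its own */
unsigned long long model_stat_blocks(const struct model_file *f)
{
	return model_is_metacopy(f) ? f->lowerdata_blocks : f->real_blocks;
}
```

The metacopy check is evaluated once, before any block count is read,
which corresponds to the ordering the commit message asks for to avoid
mixing upper and lower values during a concurrent data copy up.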


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH v15 15/30] ovl: Store lower data inode in ovl_inode
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (13 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 14/30] ovl: Fix ovl_getattr() to get number of blocks from lower Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 18:59   ` Amir Goldstein
  2018-05-07 17:40 ` [PATCH v15 16/30] ovl: Add helper ovl_inode_real_data() Vivek Goyal
                   ` (16 subsequent siblings)
  31 siblings, 1 reply; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Right now ovl_inode stores an inode pointer for the lower inode. This helps
with quickly getting the lower inode given an overlay inode
(ovl_inode_lower()).

Now with metadata only copy-up, we can have a metacopy inode in a middle
layer as well, and the inode containing data can be different from ->lower.
I need to be able to open the real file in ovl_open_realfile() and
for that I need to quickly find the lower data inode.

Hence store the lower data inode in ovl_inode as well. Also provide a
helper ovl_inode_lowerdata() to access this field.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/export.c    |  2 +-
 fs/overlayfs/inode.c     |  2 +-
 fs/overlayfs/namei.c     |  7 +++++--
 fs/overlayfs/overlayfs.h |  3 ++-
 fs/overlayfs/ovl_entry.h |  6 +++++-
 fs/overlayfs/super.c     |  8 ++++++--
 fs/overlayfs/util.c      | 10 +++++++++-
 7 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
index 52a09a9f74b7..77d98aa7f118 100644
--- a/fs/overlayfs/export.c
+++ b/fs/overlayfs/export.c
@@ -301,7 +301,7 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
 	struct inode *inode;
 	struct ovl_entry *oe;
 	struct ovl_inode_params oip = {sb, NULL, lowerpath, index, !!lower,
-				       NULL};
+				       NULL, NULL};
 
 	/* We get overlay directory dentries with ovl_lookup_real() */
 	if (d_is_dir(upper ?: lower))
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 5d461cd57b48..949ddc7c6f59 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -855,7 +855,7 @@ struct inode *ovl_get_inode(struct ovl_inode_params *oip)
 		}
 	}
 	ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev, ino, fsid);
-	ovl_inode_init(inode, upperdentry, lowerdentry);
+	ovl_inode_init(inode, upperdentry, lowerdentry, oip->lowerdata);
 
 	if (upperdentry && ovl_is_impuredir(upperdentry))
 		ovl_set_flag(OVL_IMPURE, inode);
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index b2ff08985e29..a2556f981d3e 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -1076,10 +1076,13 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		upperdentry = dget(index);
 
 	if (upperdentry || ctr) {
+		struct dentry *lowerdata = NULL;
 		struct ovl_inode_params oip = {dentry->d_sb, upperdentry,
 					       stack, index, ctr,
-					       upperredirect};
-
+					       upperredirect, NULL};
+		if (ctr > 1 && !d.is_dir)
+			lowerdata = stack[ctr - 1].dentry;
+		oip.lowerdata = lowerdata;
 		inode = ovl_get_inode(&oip);
 		err = PTR_ERR(inode);
 		if (IS_ERR(inode))
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 6d64796e0060..8c68387efe87 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -234,6 +234,7 @@ struct dentry *ovl_dentry_realdata(struct dentry *dentry);
 struct dentry *ovl_i_dentry_upper(struct inode *inode);
 struct inode *ovl_inode_upper(struct inode *inode);
 struct inode *ovl_inode_lower(struct inode *inode);
+struct inode *ovl_inode_lowerdata(struct inode *inode);
 struct inode *ovl_inode_real(struct inode *inode);
 struct ovl_dir_cache *ovl_dir_cache(struct inode *inode);
 void ovl_set_dir_cache(struct inode *inode, struct ovl_dir_cache *cache);
@@ -253,7 +254,7 @@ bool ovl_redirect_dir(struct super_block *sb);
 const char *ovl_dentry_get_redirect(struct dentry *dentry);
 void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
 void ovl_inode_init(struct inode *inode, struct dentry *upperdentry,
-		    struct dentry *lowerdentry);
+		    struct dentry *lowerdentry, struct dentry *lowerdata);
 void ovl_inode_update(struct inode *inode, struct dentry *upperdentry);
 void ovl_dir_modified(struct dentry *dentry, bool impurity);
 u64 ovl_dentry_version_get(struct dentry *dentry);
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 422896406048..f72d6191357e 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -90,7 +90,10 @@ static inline struct ovl_entry *OVL_E(struct dentry *dentry)
 }
 
 struct ovl_inode {
-	struct ovl_dir_cache *cache;
+	union {
+		struct ovl_dir_cache *cache;
+		struct inode *lowerdata;
+	};
 	const char *redirect;
 	u64 version;
 	unsigned long flags;
@@ -109,6 +112,7 @@ struct ovl_inode_params {
 	struct dentry *index;
 	unsigned int numlower;
 	char *redirect;
+	struct dentry *lowerdata;
 };
 
 static inline struct ovl_inode *OVL_I(struct inode *inode)
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 1213c035e9ad..ce90b6e3ce76 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -187,6 +187,7 @@ static struct inode *ovl_alloc_inode(struct super_block *sb)
 	oi->flags = 0;
 	oi->__upperdentry = NULL;
 	oi->lower = NULL;
+	oi->lowerdata = NULL;
 	mutex_init(&oi->lock);
 
 	return &oi->vfs_inode;
@@ -205,8 +206,11 @@ static void ovl_destroy_inode(struct inode *inode)
 
 	dput(oi->__upperdentry);
 	iput(oi->lower);
+	if (S_ISDIR(inode->i_mode))
+		ovl_dir_cache_free(inode);
+	else
+		iput(oi->lowerdata);
 	kfree(oi->redirect);
-	ovl_dir_cache_free(inode);
 	mutex_destroy(&oi->lock);
 
 	call_rcu(&inode->i_rcu, ovl_i_callback);
@@ -1510,7 +1514,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	ovl_dentry_set_flag(OVL_E_CONNECTED, root_dentry);
 	ovl_set_upperdata(d_inode(root_dentry));
 	ovl_inode_init(d_inode(root_dentry), upperpath.dentry,
-		       ovl_dentry_lower(root_dentry));
+		       ovl_dentry_lower(root_dentry), NULL);
 
 	sb->s_root = root_dentry;
 
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 7b7d4623df16..90ea0a1622c7 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -262,6 +262,12 @@ struct inode *ovl_inode_real(struct inode *inode)
 	return ovl_inode_upper(inode) ?: ovl_inode_lower(inode);
 }
 
+/* Return inode which contains lower data. Do not return metacopy */
+struct inode *ovl_inode_lowerdata(struct inode *inode)
+{
+	return OVL_I(inode)->lowerdata ?: ovl_inode_lower(inode);
+}
+
 
 struct ovl_dir_cache *ovl_dir_cache(struct inode *inode)
 {
@@ -396,7 +402,7 @@ void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect)
 }
 
 void ovl_inode_init(struct inode *inode, struct dentry *upperdentry,
-		    struct dentry *lowerdentry)
+		    struct dentry *lowerdentry, struct dentry *lowerdata)
 {
 	struct inode *realinode = d_inode(upperdentry ?: lowerdentry);
 
@@ -404,6 +410,8 @@ void ovl_inode_init(struct inode *inode, struct dentry *upperdentry,
 		OVL_I(inode)->__upperdentry = upperdentry;
 	if (lowerdentry)
 		OVL_I(inode)->lower = igrab(d_inode(lowerdentry));
+	if (lowerdata)
+		OVL_I(inode)->lowerdata = igrab(d_inode(lowerdata));
 
 	ovl_copyattr(realinode, inode);
 	ovl_copyflags(realinode, inode);
-- 
2.13.6
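
Because lowerdata shares storage with the dir cache, teardown has to pick
the union member by file type. A small userspace sketch of that idea
follows; the model_* types are hypothetical, a refcount decrement stands
in for iput(), a flag stands in for ovl_dir_cache_free(), and the
anonymous union needs C11 or the GNU extension the kernel already uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_inode_ref { int refcount; };	/* stands in for struct inode */
struct model_dir_cache { int freed; };		/* stands in for ovl_dir_cache */

/* Hypothetical cut-down ovl_inode: one slot, two meanings */
struct model_ovl_inode {
	bool is_dir;	/* stands in for S_ISDIR(inode->i_mode) */
	union {
		struct model_dir_cache *cache;		/* directories only */
		struct model_inode_ref *lowerdata;	/* non-directories only */
	};
};

/* Release whichever member the file type says is live */
void model_destroy(struct model_ovl_inode *oi)
{
	if (oi->is_dir) {
		if (oi->cache)
			oi->cache->freed = 1;
	} else if (oi->lowerdata) {
		oi->lowerdata->refcount--;
	}
}
```

The file type acts as the union's discriminator here, so any reader of
either member has to perform the same check first.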


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH v15 16/30] ovl: Add helper ovl_inode_real_data()
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (14 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 15/30] ovl: Store lower data inode in ovl_inode Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 18:18   ` Amir Goldstein
  2018-05-07 17:40 ` [PATCH v15 17/30] ovl: Open file with data except for the case of fsync Vivek Goyal
                   ` (15 subsequent siblings)
  31 siblings, 1 reply; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Add a helper to retrieve the real data inode associated with an overlay
inode. This helper will ignore all metacopy inodes and will return only
the real inode which has data.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/util.c      | 12 ++++++++++++
 2 files changed, 13 insertions(+)

diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 8c68387efe87..2dabe5f11d22 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -236,6 +236,7 @@ struct inode *ovl_inode_upper(struct inode *inode);
 struct inode *ovl_inode_lower(struct inode *inode);
 struct inode *ovl_inode_lowerdata(struct inode *inode);
 struct inode *ovl_inode_real(struct inode *inode);
+struct inode *ovl_inode_realdata(struct inode *inode);
 struct ovl_dir_cache *ovl_dir_cache(struct inode *inode);
 void ovl_set_dir_cache(struct inode *inode, struct ovl_dir_cache *cache);
 void ovl_dentry_set_flag(unsigned long flag, struct dentry *dentry);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 90ea0a1622c7..eb6453e5cfa6 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -268,6 +268,18 @@ struct inode *ovl_inode_lowerdata(struct inode *inode)
 	return OVL_I(inode)->lowerdata ?: ovl_inode_lower(inode);
 }
 
+/* Return real inode which contains data. Does not return metacopy inode */
+struct inode *ovl_inode_realdata(struct inode *inode)
+{
+	struct inode *upperinode;
+
+	upperinode = ovl_inode_upper(inode);
+	if (upperinode && ovl_has_upperdata(inode))
+		return upperinode;
+
+	return ovl_inode_lowerdata(inode);
+}
+
 
 struct ovl_dir_cache *ovl_dir_cache(struct inode *inode)
 {
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH v15 17/30] ovl: Open file with data except for the case of fsync
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (15 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 16/30] ovl: Add helper ovl_inode_real_data() Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 19:47   ` Amir Goldstein
  2018-05-07 17:40 ` [PATCH v15 18/30] ovl: Do not expose metacopy only dentry from d_real() Vivek Goyal
                   ` (14 subsequent siblings)
  31 siblings, 1 reply; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

ovl_open() should open the file which contains data and not open the
metacopy inode. With the introduction of metacopy inodes, the current
implementation would end up opening the metacopy inode as well.

But there can be certain circumstances, like ovl_fsync(), where we
want to allow opening a metacopy inode instead.

Hence, change ovl_open_realfile() and ovl_real_fdget() and add an extra
parameter which specifies whether to allow opening a metacopy inode or not.
If this parameter is false, we look for the data inode and open that.

This should allow covering both the cases.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/file.c | 49 +++++++++++++++++++++++++++++++++----------------
 1 file changed, 33 insertions(+), 16 deletions(-)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index a60734ec89ec..885151e8d0cb 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -14,22 +14,32 @@
 #include <linux/uio.h>
 #include "overlayfs.h"
 
-static struct file *ovl_open_realfile(const struct file *file)
+static struct file *ovl_open_realfile(const struct file *file,
+				      bool allow_metacopy)
 {
 	struct inode *inode = file_inode(file);
 	struct inode *upperinode = ovl_inode_upper(inode);
-	struct inode *realinode = upperinode ?: ovl_inode_lower(inode);
+	struct inode *realinode;
 	struct file *realfile;
+	bool upperopen = false;
 	const struct cred *old_cred;
 
+	if (upperinode && (allow_metacopy || ovl_has_upperdata(inode))) {
+		realinode = upperinode;
+		upperopen = true;
+	} else {
+		realinode = allow_metacopy ? ovl_inode_lower(inode) :
+				 ovl_inode_lowerdata(inode);
+	}
 	old_cred = ovl_override_creds(inode->i_sb);
 	realfile = path_open(&file->f_path, file->f_flags | O_NOATIME,
 			     realinode, current_cred(), false);
 	revert_creds(old_cred);
 
 	pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n",
-		 file, file, upperinode ? 'u' : 'l', file->f_flags,
-		 realfile, IS_ERR(realfile) ? 0 : realfile->f_flags);
+		 file, file, upperopen ? 'u' : 'l',
+		 file->f_flags, realfile,
+		 IS_ERR(realfile) ? 0 : realfile->f_flags);
 
 	return realfile;
 }
@@ -72,17 +82,24 @@ static int ovl_change_flags(struct file *file, unsigned int flags)
 	return 0;
 }
 
-static int ovl_real_fdget(const struct file *file, struct fd *real)
+static int ovl_real_fdget(const struct file *file, struct fd *real,
+			  bool allow_metacopy)
 {
 	struct inode *inode = file_inode(file);
+	struct inode *realinode;
 
 	real->flags = 0;
 	real->file = file->private_data;
 
+	if (allow_metacopy)
+		realinode = ovl_inode_real(inode);
+	else
+		realinode = ovl_inode_realdata(inode);
+
 	/* Has it been copied up since we'd opened it? */
-	if (unlikely(file_inode(real->file) != ovl_inode_real(inode))) {
+	if (unlikely(file_inode(real->file) != realinode)) {
 		real->flags = FDPUT_FPUT;
-		real->file = ovl_open_realfile(file);
+		real->file = ovl_open_realfile(file, allow_metacopy);
 
 		return PTR_ERR_OR_ZERO(real->file);
 	}
@@ -107,7 +124,7 @@ static int ovl_open(struct inode *inode, struct file *file)
 	/* No longer need these flags, so don't pass them on to underlying fs */
 	file->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);
 
-	realfile = ovl_open_realfile(file);
+	realfile = ovl_open_realfile(file, false);
 	if (IS_ERR(realfile))
 		return PTR_ERR(realfile);
 
@@ -184,7 +201,7 @@ static ssize_t ovl_read_iter(struct kiocb *iocb, struct iov_iter *iter)
 	if (!iov_iter_count(iter))
 		return 0;
 
-	ret = ovl_real_fdget(file, &real);
+	ret = ovl_real_fdget(file, &real, false);
 	if (ret)
 		return ret;
 
@@ -218,7 +235,7 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter)
 	if (ret)
 		goto out_unlock;
 
-	ret = ovl_real_fdget(file, &real);
+	ret = ovl_real_fdget(file, &real, false);
 	if (ret)
 		goto out_unlock;
 
@@ -244,7 +261,7 @@ static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync)
 	const struct cred *old_cred;
 	int ret;
 
-	ret = ovl_real_fdget(file, &real);
+	ret = ovl_real_fdget(file, &real, true);
 	if (ret)
 		return ret;
 
@@ -283,7 +300,7 @@ static int ovl_mmap(struct file *file, struct vm_area_struct *vma)
 	const struct cred *old_cred;
 	int ret;
 
-	ret = ovl_real_fdget(file, &real);
+	ret = ovl_real_fdget(file, &real, false);
 	if (ret)
 		return ret;
 
@@ -311,7 +328,7 @@ static long ovl_fallocate(struct file *file, int mode, loff_t offset, loff_t len
 	const struct cred *old_cred;
 	int ret;
 
-	ret = ovl_real_fdget(file, &real);
+	ret = ovl_real_fdget(file, &real, false);
 	if (ret)
 		return ret;
 
@@ -334,7 +351,7 @@ static long ovl_real_ioctl(struct file *file, unsigned int cmd,
 	const struct cred *old_cred;
 	long ret;
 
-	ret = ovl_real_fdget(file, &real);
+	ret = ovl_real_fdget(file, &real, false);
 	if (ret)
 		return ret;
 
@@ -418,11 +435,11 @@ static s64 ovl_copyfile(struct file *file_in, loff_t pos_in,
 	const struct cred *old_cred;
 	s64 ret;
 
-	ret = ovl_real_fdget(file_out, &real_out);
+	ret = ovl_real_fdget(file_out, &real_out, false);
 	if (ret)
 		return ret;
 
-	ret = ovl_real_fdget(file_in, &real_in);
+	ret = ovl_real_fdget(file_in, &real_in, false);
 	if (ret) {
 		fdput(real_out);
 		return ret;
-- 
2.13.6



* [PATCH v15 18/30] ovl: Do not expose metacopy only dentry from d_real()
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (16 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 17/30] ovl: Open file with data except for the case of fsync Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 19:39   ` Amir Goldstein
  2018-05-07 17:40 ` [PATCH v15 19/30] ovl: Move some dir related ovl_lookup_single() code in else block Vivek Goyal
                   ` (13 subsequent siblings)
  31 siblings, 1 reply; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Metacopy dentry/inode is internal to overlay and is never exposed
outside of it. Modify d_real() to look for only dentries/inode which
have data and which are not metacopy.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/super.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index ce90b6e3ce76..c97b5abda954 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -101,10 +101,11 @@ static struct dentry *ovl_d_real(struct dentry *dentry,
 	}
 
 	real = ovl_dentry_upper(dentry);
-	if (real && (!inode || inode == d_inode(real)))
+	if (real && ovl_has_upperdata(d_inode(dentry)) &&
+	    (!inode || inode == d_inode(real)))
 		return real;
 
-	real = ovl_dentry_lower(dentry);
+	real = ovl_dentry_lowerdata(dentry);
 	if (!real)
 		goto bug;
 
-- 
2.13.6



* [PATCH v15 19/30] ovl: Move some dir related ovl_lookup_single() code in else block
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (17 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 18/30] ovl: Do not expose metacopy only dentry from d_real() Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 17:40 ` [PATCH v15 20/30] ovl: Check redirects for metacopy files Vivek Goyal
                   ` (12 subsequent siblings)
  31 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Move some directory related code into an else block. This is pure code
reorganization with no functional change.

Next patch enables redirect processing on metacopy files and needs this
change.  By keeping non-functional changes in a separate patch, next patch
looks much smaller and cleaner.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/namei.c | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index a2556f981d3e..51e5c1bc5678 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -269,22 +269,23 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
 		d->stop = !err;
 		d->metacopy = !!err;
 		goto out;
-	}
-	if (last_element) {
-		if (d->metacopy) {
-			err = -ESTALE;
-			goto out_err;
+	} else {
+		if (last_element) {
+			if (d->metacopy) {
+				err = -ESTALE;
+				goto out_err;
+			}
+			d->is_dir = true;
 		}
-		d->is_dir = true;
-	}
-	if (d->last)
-		goto out;
+		if (d->last)
+			goto out;
 
-	if (ovl_is_opaquedir(this)) {
-		d->stop = true;
-		if (last_element)
-			d->opaque = true;
-		goto out;
+		if (ovl_is_opaquedir(this)) {
+			d->stop = true;
+			if (last_element)
+				d->opaque = true;
+			goto out;
+		}
 	}
 	err = ovl_check_redirect(this, d, prelen, post);
 	if (err)
-- 
2.13.6



* [PATCH v15 20/30] ovl: Check redirects for metacopy files
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (18 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 19/30] ovl: Move some dir related ovl_lookup_single() code in else block Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 17:40 ` [PATCH v15 21/30] ovl: Treat metacopy dentries as type OVL_PATH_MERGE Vivek Goyal
                   ` (11 subsequent siblings)
  31 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Right now we rely on path based lookup for data origin of metacopy upper.
This will work only if upper has not been renamed. We solved this problem
already for merged directories using redirect. Use same logic for metacopy
files.

This patch just goes on to check redirects for metacopy files.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/namei.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 51e5c1bc5678..cd06e7ff9fd1 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -268,7 +268,8 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
 			goto out_err;
 		d->stop = !err;
 		d->metacopy = !!err;
-		goto out;
+		if (!d->metacopy || d->last)
+			goto out;
 	} else {
 		if (last_element) {
 			if (d->metacopy) {
@@ -874,7 +875,6 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		}
 		if (upperdentry && !d.is_dir) {
 			unsigned int origin_ctr = 0;
-			BUG_ON(d.redirect);
 			/*
 			 * Lookup copy up origin by decoding origin file handle.
 			 * We may get a disconnected dentry, which is fine,
-- 
2.13.6



* [PATCH v15 21/30] ovl: Treat metacopy dentries as type OVL_PATH_MERGE
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (19 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 20/30] ovl: Check redirects for metacopy files Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 17:40 ` [PATCH v15 22/30] ovl: Add an inode flag OVL_CONST_INO Vivek Goyal
                   ` (10 subsequent siblings)
  31 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Right now OVL_PATH_MERGE is used only for merged directories.
But conceptually, a metacopy dentry (backed by a lower data dentry) is
a merged entity as well.

So mark metacopy dentries as OVL_PATH_MERGE and ovl_rename() makes use
of this property later to set redirect on a metacopy file.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/util.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index eb6453e5cfa6..92ead380f3ec 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -134,7 +134,8 @@ enum ovl_path_type ovl_path_type(struct dentry *dentry)
 		 */
 		if (oe->numlower) {
 			type |= __OVL_PATH_ORIGIN;
-			if (d_is_dir(dentry))
+			if (d_is_dir(dentry) ||
+			    !ovl_has_upperdata(d_inode(dentry)))
 				type |= __OVL_PATH_MERGE;
 		}
 	} else {
-- 
2.13.6


* [PATCH v15 22/30] ovl: Add an inode flag OVL_CONST_INO
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (20 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 21/30] ovl: Treat metacopy dentries as type OVL_PATH_MERGE Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 17:40 ` [PATCH v15 23/30] ovl: Do not set dentry type ORIGIN for broken hardlinks Vivek Goyal
                   ` (9 subsequent siblings)
  31 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Add an ovl_inode flag OVL_CONST_INO. This flag signifies whether the inode
number will remain constant over copy up. The flag is not updated over
copy up and remains unmodified once set.

Next patch in the series will make use of this flag. It will basically
figure out if dentry is of type ORIGIN or not. And this can be derived
by this flag.

ORIGIN = (upperdentry && ovl_test_flag(OVL_CONST_INO, inode)).

Suggested-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c     | 3 +++
 fs/overlayfs/overlayfs.h | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 949ddc7c6f59..3ac5a684798c 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -874,6 +874,9 @@ struct inode *ovl_get_inode(struct ovl_inode_params *oip)
 
 	OVL_I(inode)->redirect = oip->redirect;
 
+	if (bylower)
+		ovl_set_flag(OVL_CONST_INO, inode);
+
 	/* Check for non-merge dir that may have whiteouts */
 	if (is_dir) {
 		if (((upperdentry && lowerdentry) || oip->numlower > 1) ||
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 2dabe5f11d22..ea2cf5b6bb85 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -38,6 +38,8 @@ enum ovl_inode_flag {
 	OVL_WHITEOUTS,
 	OVL_INDEX,
 	OVL_UPPERDATA,
+	/* Inode number will remain constant over copy up. */
+	OVL_CONST_INO,
 };
 
 enum ovl_entry_flag {
-- 
2.13.6



* [PATCH v15 23/30] ovl: Do not set dentry type ORIGIN for broken hardlinks
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (21 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 22/30] ovl: Add an inode flag OVL_CONST_INO Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 17:40 ` [PATCH v15 24/30] ovl: Set redirect on metacopy files upon rename Vivek Goyal
                   ` (8 subsequent siblings)
  31 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

If a dentry has a copy up origin, we set flag OVL_PATH_ORIGIN. So far
this decision was easy: we only had to check oe->numlower, and if it
was non-zero, we knew there was a copy up origin. (For non-dir we
installed the origin dentry in lowerstack[0].)

But we don't create the ORIGIN xattr for broken hardlinks (index=off). And
with the metacopy feature it is possible that we install lowerstack[0]
but the ORIGIN xattr is not there. It is the data dentry of an upper
metacopy dentry which has been found using regular name based lookup or
using REDIRECT. So with the addition of this new case, the mere presence
of oe->numlower is not sufficient to guarantee that the ORIGIN xattr is
present.

So to differentiate between two cases, look at OVL_CONST_INO flag. If
this flag is set and upperdentry is there, that means it can be marked
as type ORIGIN. OVL_CONST_INO is not set if lower hardlink is broken
or will be broken over copy up.
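
To make the combined effect of this patch and the OVL_PATH_MERGE change
for metacopy dentries concrete, here is a small userspace model of the
upper branch of ovl_path_type(). It is only a sketch: struct model_dentry,
the MODEL_PATH_* bit values and model_upper_path_type() are made up for
illustration and are not kernel definitions, and the lower-only branch is
not modelled.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's __OVL_PATH_* bits. */
enum model_path_type {
	MODEL_PATH_UPPER  = 1 << 0,
	MODEL_PATH_MERGE  = 1 << 1,
	MODEL_PATH_ORIGIN = 1 << 2,
};

/* Hypothetical flattened view of the state ovl_path_type() looks at. */
struct model_dentry {
	bool has_upper;        /* ovl_dentry_upper() != NULL */
	unsigned int numlower; /* oe->numlower */
	bool is_dir;           /* d_is_dir() */
	bool has_upperdata;    /* ovl_has_upperdata() */
	bool const_ino;        /* OVL_CONST_INO set on the inode */
};

static unsigned int model_upper_path_type(const struct model_dentry *d)
{
	unsigned int type = 0;

	if (!d->has_upper)
		return type; /* lower-only case is not modelled here */

	type = MODEL_PATH_UPPER;
	if (d->numlower) {
		/* ORIGIN only if the inode number is constant over copy up */
		if (d->const_ino)
			type |= MODEL_PATH_ORIGIN;
		/* Merged dirs and metacopy files are both MERGE */
		if (d->is_dir || !d->has_upperdata)
			type |= MODEL_PATH_MERGE;
	}
	return type;
}
```

In this model a metacopy upper whose lower data dentry was found by name
lookup, without OVL_CONST_INO, ends up as UPPER|MERGE but not ORIGIN.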

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/util.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 92ead380f3ec..73b4129ffeff 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -133,7 +133,8 @@ enum ovl_path_type ovl_path_type(struct dentry *dentry)
 		 * Non-dir dentry can hold lower dentry of its copy up origin.
 		 */
 		if (oe->numlower) {
-			type |= __OVL_PATH_ORIGIN;
+			if (ovl_test_flag(OVL_CONST_INO, d_inode(dentry)))
+				type |= __OVL_PATH_ORIGIN;
 			if (d_is_dir(dentry) ||
 			    !ovl_has_upperdata(d_inode(dentry)))
 				type |= __OVL_PATH_MERGE;
-- 
2.13.6



* [PATCH v15 24/30] ovl: Set redirect on metacopy files upon rename
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (22 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 23/30] ovl: Do not set dentry type ORIGIN for broken hardlinks Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 18:21   ` Amir Goldstein
  2018-05-07 17:40 ` [PATCH v15 25/30] ovl: Set redirect on upper inode when it is linked Vivek Goyal
                   ` (7 subsequent siblings)
  31 siblings, 1 reply; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Set redirect on metacopy files upon rename. This will help find data dentry
in lower dirs.
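
As a rough illustration of the decision this patch makes (a userspace
sketch, not the kernel code), the choice between a relative and an
absolute redirect, and the check for whether an existing redirect is
already good enough, can be modelled as below. The model_* names and the
flattened arguments are hypothetical; the logic follows
ovl_need_absolute_redirect() and the early return in ovl_set_redirect()
in the diff.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Model of ovl_need_absolute_redirect(): a cross-directory rename always
 * needs an absolute redirect; within the same directory, a dir can use a
 * relative one, and a non-dir needs absolute only if lower is a hardlink.
 */
static bool model_need_absolute_redirect(bool samedir, bool is_dir,
					 unsigned int lower_nlink)
{
	if (!samedir)
		return true;
	if (is_dir)
		return false;
	return lower_nlink > 1;
}

/*
 * Model of the early return in ovl_set_redirect(): an existing redirect
 * can be kept if a relative one is enough, or if it is already absolute.
 */
static bool model_redirect_is_sufficient(const char *redirect,
					 bool need_absolute)
{
	return redirect != NULL &&
	       (!need_absolute || redirect[0] == '/');
}
```

For example, renaming a metacopy file within its directory when the lower
file has i_nlink == 4 asks for an absolute redirect in this model, so an
existing relative redirect would be replaced.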

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/dir.c | 66 +++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 46 insertions(+), 20 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index bdaa5e4fa603..602b0ac1f4d4 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -811,13 +811,13 @@ static bool ovl_can_move(struct dentry *dentry)
 		!d_is_dir(dentry) || !ovl_type_merge_or_lower(dentry);
 }
 
-static char *ovl_get_redirect(struct dentry *dentry, bool samedir)
+static char *ovl_get_redirect(struct dentry *dentry, bool abs_redirect)
 {
 	char *buf, *ret;
 	struct dentry *d, *tmp;
 	int buflen = ovl_redirect_max + 1;
 
-	if (samedir) {
+	if (!abs_redirect) {
 		ret = kstrndup(dentry->d_name.name, dentry->d_name.len,
 			       GFP_KERNEL);
 		goto out;
@@ -871,15 +871,43 @@ static char *ovl_get_redirect(struct dentry *dentry, bool samedir)
 	return ret ? ret : ERR_PTR(-ENOMEM);
 }
 
+static bool ovl_need_absolute_redirect(struct dentry *dentry, bool samedir)
+{
+	struct dentry *lowerdentry;
+
+	if (!samedir)
+		return true;
+
+	if (d_is_dir(dentry))
+		return false;
+
+	/*
+	 * For non-dir hardlinked files, we need absolute redirects
+	 * in general as two upper hardlinks could be in different
+	 * dirs. We could put a relative redirect now and convert
+	 * it to absolute redirect later. But when nlink > 1 and
+	 * indexing is on, that means relative redirect needs to be
+	 * converted to absolute during copy up of another lower
+	 * hardlink as well.
+	 *
+	 * So without optimizing too much, just check if lower is
+	 * a hard link or not. If lower is hard link, put absolute
+	 * redirect.
+	 */
+	lowerdentry = ovl_dentry_lower(dentry);
+	return (d_inode(lowerdentry)->i_nlink > 1);
+}
+
 static int ovl_set_redirect(struct dentry *dentry, bool samedir)
 {
 	int err;
 	const char *redirect = ovl_dentry_get_redirect(dentry);
+	bool absolute_redirect = ovl_need_absolute_redirect(dentry, samedir);
 
-	if (redirect && (samedir || redirect[0] == '/'))
+	if (redirect && (!absolute_redirect || redirect[0] == '/'))
 		return 0;
 
-	redirect = ovl_get_redirect(dentry, samedir);
+	redirect = ovl_get_redirect(dentry, absolute_redirect);
 	if (IS_ERR(redirect))
 		return PTR_ERR(redirect);
 
@@ -1055,22 +1083,20 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
 		goto out_dput;
 
 	err = 0;
-	if (is_dir) {
-		if (ovl_type_merge_or_lower(old))
-			err = ovl_set_redirect(old, samedir);
-		else if (!old_opaque && ovl_type_merge(new->d_parent))
-			err = ovl_set_opaque_xerr(old, olddentry, -EXDEV);
-		if (err)
-			goto out_dput;
-	}
-	if (!overwrite && new_is_dir) {
-		if (ovl_type_merge_or_lower(new))
-			err = ovl_set_redirect(new, samedir);
-		else if (!new_opaque && ovl_type_merge(old->d_parent))
-			err = ovl_set_opaque_xerr(new, newdentry, -EXDEV);
-		if (err)
-			goto out_dput;
-	}
+	if (ovl_type_merge_or_lower(old))
+		err = ovl_set_redirect(old, samedir);
+	else if (is_dir && !old_opaque && ovl_type_merge(new->d_parent))
+		err = ovl_set_opaque_xerr(old, olddentry, -EXDEV);
+	if (err)
+		goto out_dput;
+
+	if (!overwrite && ovl_type_merge_or_lower(new))
+		err = ovl_set_redirect(new, samedir);
+	else if (!overwrite && new_is_dir && !new_opaque &&
+		 ovl_type_merge(old->d_parent))
+		err = ovl_set_opaque_xerr(new, newdentry, -EXDEV);
+	if (err)
+		goto out_dput;
 
 	err = ovl_do_rename(old_upperdir->d_inode, olddentry,
 			    new_upperdir->d_inode, newdentry, flags);
-- 
2.13.6



* [PATCH v15 25/30] ovl: Set redirect on upper inode when it is linked
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (23 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 24/30] ovl: Set redirect on metacopy files upon rename Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 17:40 ` [PATCH v15 26/30] ovl: Check redirect on index as well Vivek Goyal
                   ` (6 subsequent siblings)
  31 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

When we create a hardlink to a metacopy upper file, first set the redirect
on that inode. Path based lookup will not work with the newly created link,
and the redirect will solve that issue.

Also use an absolute redirect, as the two hardlinks could be in different
directories and a relative redirect will not work.

I have not put any additional locking around setting redirects while
introducing redirects for non-dir files. For now it feels like existing
locking is sufficient. If that's not the case, we will have to add more
locking. Following is my rationale for why I think current locking
seems ok.

The basic problem for non-dir files is that more than one dentry could be
pointing to the same inode, and in theory relying only on dentry based
locks (d->d_lock) did not seem sufficient.

We set redirect upon rename and upon link creation. In both paths,
for a non-dir file, VFS locks both source and target inodes (->i_rwsem).
That means vfs rename and link operations on the same source and target
can't be happening in parallel (even if there are multiple dentries
pointing to the same inode). So that probably means that at any one time
only one call of ovl_set_redirect() can be working on an inode, and we
don't need additional locking in ovl_set_redirect().

ovl_inode->redirect is initialized only when inode is created new. That
means it should not race with any other path and setting ovl_inode->redirect
should be fine.

Reading of ovl_inode->redirect happens in the ovl_get_redirect() path, and
this is called only from ovl_set_redirect(). And ovl_set_redirect() already
seems to be protected using ->i_rwsem. That means ovl_set_redirect() and
ovl_get_redirect() on the source/target inode should not make progress in
parallel and are mutually exclusive. Hence no additional locking is required.

Now, only case where ovl_set_redirect() and ovl_get_redirect() could race
seems to be case of absolute redirects where ovl_get_redirect() has to
travel up the tree. In that case we already take d->d_lock and that should
be sufficient as directories will not have multiple dentries pointing to
same inode.

So given VFS locking and current usage of redirect, current locking around
redirect seems to be ok for non-dir as well. Once we have the logic to
remove redirect when metacopy file gets copied up, then we probably will
need additional locking.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/dir.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 602b0ac1f4d4..c28354141baf 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -24,6 +24,8 @@ module_param_named(redirect_max, ovl_redirect_max, ushort, 0644);
 MODULE_PARM_DESC(ovl_redirect_max,
 		 "Maximum length of absolute redirect xattr value");
 
+static int ovl_set_redirect(struct dentry *dentry, bool samedir);
+
 int ovl_cleanup(struct inode *wdir, struct dentry *wdentry)
 {
 	int err;
@@ -468,6 +470,9 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
 	const struct cred *old_cred;
 	struct cred *override_cred;
 	struct dentry *parent = dentry->d_parent;
+	struct dentry *hardlink_upper;
+
+	hardlink_upper = hardlink ? ovl_dentry_upper(hardlink) : NULL;
 
 	err = ovl_copy_up(parent);
 	if (err)
@@ -502,12 +507,18 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
 		put_cred(override_creds(override_cred));
 		put_cred(override_cred);
 
+		if (hardlink && ovl_is_metacopy_dentry(hardlink)) {
+			err = ovl_set_redirect(hardlink, false);
+			if (err)
+				goto out_revert_creds;
+		}
+
 		if (!ovl_dentry_is_whiteout(dentry))
 			err = ovl_create_upper(dentry, inode, attr,
-						hardlink);
+						hardlink_upper);
 		else
 			err = ovl_create_over_whiteout(dentry, inode, attr,
-							hardlink);
+							hardlink_upper);
 	}
 out_revert_creds:
 	revert_creds(old_cred);
@@ -602,8 +613,7 @@ static int ovl_link(struct dentry *old, struct inode *newdir,
 	inode = d_inode(old);
 	ihold(inode);
 
-	err = ovl_create_or_link(new, inode, NULL, ovl_dentry_upper(old),
-				 ovl_type_origin(old));
+	err = ovl_create_or_link(new, inode, NULL, old, ovl_type_origin(old));
 	if (err)
 		iput(inode);
 
-- 
2.13.6



* [PATCH v15 26/30] ovl: Check redirect on index as well
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (24 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 25/30] ovl: Set redirect on upper inode when it is linked Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 18:43   ` Amir Goldstein
  2018-05-07 17:40 ` [PATCH v15 27/30] ovl: Disbale metacopy for MAP_SHARED mmap() Vivek Goyal
                   ` (5 subsequent siblings)
  31 siblings, 1 reply; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

Right now we seem to check redirect only if upperdentry is found. But it
is possible that there is no upperdentry and we later find an index.

We need to check redirect on index as well and set it in ovl_inode->redirect.
Otherwise link code can assume that dentry does not have redirect and
place a new one which breaks things. In my testing overlay/033 test
started failing in xfstests. Following are the details.

For example, do the following.

$ mkdir lower upper work merged

- Make lower dir with 4 links.
  $ echo "foo" > lower/l0.txt
  $ ln  lower/l0.txt lower/l1.txt
  $ ln  lower/l0.txt lower/l2.txt
  $ ln  lower/l0.txt lower/l3.txt

- Mount with index on and metacopy on.

  $ mount -t overlay -o lowerdir=lower,upperdir=upper,workdir=work,index=on,metacopy=on none merged

- Link lower

  $ ln merged/l0.txt merged/l4.txt
    (This will do a metadata copy up of l0.txt and put an absolute
     redirect /l0.txt)

  $ echo 2 > /proc/sys/vm/drop_caches

  $ ls merged/l1.txt
  (Now l1.txt will be looked up. There is no upper dentry but there is
   lower dentry and index will be found. We don't check for redirect on
   index, hence ovl_inode->redirect will be NULL.)

- Link Upper

  $ ln merged/l4.txt merged/l5.txt
  (Lookup of l4.txt will use inode from l1.txt lookup which is still in
   cache. It has ovl_inode->redirect NULL, hence link will put a new
   redirect and replace /l0.txt with /l4.txt.)

- Drop caches.
  $ echo 2 > /proc/sys/vm/drop_caches

- List l0.txt and it returns -ESTALE

  $ ls merged/l0.txt

  (It returns stale because we found a metacopy of l0.txt in upper
   and it has redirect /l4.txt but there is no file named l4.txt in
   lower layer. So lower data copy is not found and -ESTALE is returned.)

So problem here is that we did not process redirect on index. Check
redirect on index as well and then problem is fixed.
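
Since the new helper ovl_get_redirect_xattr() also carries the redirect
format checks that used to live in ovl_check_redirect(), here is a small
userspace sketch of just those checks. It is an approximation for
illustration: model_redirect_is_valid() is a made-up name, and it works
on a plain C string rather than on an xattr value read with
vfs_getxattr().

```c
#include <stdbool.h>
#include <string.h>

/*
 * An empty redirect is invalid. An absolute redirect must consist of
 * non-empty components separated by single slashes, with no trailing
 * slash. A relative redirect must be a single name with no slash.
 */
static bool model_redirect_is_valid(const char *buf)
{
	const char *s, *next;

	if (buf[0] == '\0')
		return false;

	if (buf[0] != '/')
		return strchr(buf, '/') == NULL;

	for (s = buf; *s == '/'; s = next) {
		s++;                         /* skip the separator */
		next = s + strcspn(s, "/");  /* end of this component */
		if (s == next)               /* empty component */
			return false;
	}
	return true;
}
```

In this model "/l0.txt" and "l0.txt" are accepted, while "/", "/a//b",
"/a/" and "a/b" are rejected.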

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/namei.c     | 53 ++++++++++++++++--------------------------------
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/util.c      | 45 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+), 35 deletions(-)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index cd06e7ff9fd1..8d5beed3876b 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -31,32 +31,20 @@ static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
 			      size_t prelen, const char *post)
 {
 	int res;
-	char *s, *next, *buf = NULL;
+	char *buf;
 
-	res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, NULL, 0);
-	if (res < 0) {
-		if (res == -ENODATA || res == -EOPNOTSUPP)
-			return 0;
-		goto fail;
-	}
-	buf = kzalloc(prelen + res + strlen(post) + 1, GFP_KERNEL);
+	buf = ovl_get_redirect_xattr(dentry, prelen + strlen(post));
 	if (!buf)
-		return -ENOMEM;
+		return 0;
 
-	if (res == 0)
-		goto invalid;
+	if (IS_ERR(buf)) {
+		res = PTR_ERR(buf);
+		pr_warn_ratelimited("overlayfs: failed to get redirect (%i)\n",
+				    res);
+		return res;
+	}
 
-	res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, buf, res);
-	if (res < 0)
-		goto fail;
-	if (res == 0)
-		goto invalid;
 	if (buf[0] == '/') {
-		for (s = buf; *s++ == '/'; s = next) {
-			next = strchrnul(s, '/');
-			if (s == next)
-				goto invalid;
-		}
 		/*
 		 * One of the ancestor path elements in an absolute path
 		 * lookup in ovl_lookup_layer() could have been opaque and
@@ -67,9 +55,7 @@ static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
 		 */
 		d->stop = false;
 	} else {
-		if (strchr(buf, '/') != NULL)
-			goto invalid;
-
+		res = strlen(buf) + 1;
 		memmove(buf + prelen, buf, res);
 		memcpy(buf, d->name.name, prelen);
 	}
@@ -81,16 +67,6 @@ static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
 	d->name.len = strlen(d->redirect);
 
 	return 0;
-
-err_free:
-	kfree(buf);
-	return 0;
-fail:
-	pr_warn_ratelimited("overlayfs: failed to get redirect (%i)\n", res);
-	goto err_free;
-invalid:
-	pr_warn_ratelimited("overlayfs: invalid redirect (%s)\n", buf);
-	goto err_free;
 }
 
 static int ovl_acceptable(void *ctx, struct dentry *dentry)
@@ -1073,8 +1049,15 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 
 	if (upperdentry)
 		ovl_dentry_set_upper_alias(dentry);
-	else if (index)
+	else if (index) {
 		upperdentry = dget(index);
+		upperredirect = ovl_get_redirect_xattr(upperdentry, 0);
+		if (IS_ERR(upperredirect)) {
+			err = PTR_ERR(upperredirect);
+			upperredirect = NULL;
+			goto out_free_oe;
+		}
+	}
 
 	if (upperdentry || ctr) {
 		struct dentry *lowerdata = NULL;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index ea2cf5b6bb85..bd83b27d8163 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -283,6 +283,7 @@ void ovl_nlink_end(struct dentry *dentry, bool locked);
 int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir);
 int ovl_check_metacopy_xattr(struct dentry *dentry);
 bool ovl_is_metacopy_dentry(struct dentry *dentry);
+char *ovl_get_redirect_xattr(struct dentry *dentry, int padding);
 
 static inline bool ovl_is_impuredir(struct dentry *dentry)
 {
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 73b4129ffeff..0d8a9ac92390 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -878,3 +878,48 @@ bool ovl_is_metacopy_dentry(struct dentry *dentry)
 
 	return (oe->numlower > 1);
 }
+
+char *ovl_get_redirect_xattr(struct dentry *dentry, int padding)
+{
+	int res;
+	char *s, *next, *buf = NULL;
+
+	res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, NULL, 0);
+	if (res < 0) {
+		if (res == -ENODATA || res == -EOPNOTSUPP)
+			return NULL;
+		return ERR_PTR(res);
+	}
+
+	buf = kzalloc(res + padding + 1, GFP_KERNEL);
+	if (!buf)
+		return ERR_PTR(-ENOMEM);
+
+	if (res == 0)
+		goto invalid;
+
+	res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, buf, res);
+	if (res < 0) {
+		kfree(buf);
+		return ERR_PTR(res);
+        }
+	if (res == 0)
+		goto invalid;
+
+	if (buf[0] == '/') {
+		for (s = buf; *s++ == '/'; s = next) {
+			next = strchrnul(s, '/');
+			if (s == next)
+				goto invalid;
+		}
+	} else {
+		if (strchr(buf, '/') != NULL)
+			goto invalid;
+	}
+
+	return buf;
+invalid:
+	pr_warn_ratelimited("overlayfs: invalid redirect (%s)\n", buf);
+	kfree(buf);
+	return ERR_PTR(-EINVAL);
+}
-- 
2.13.6


* [PATCH v15 27/30] ovl: Disbale metacopy for MAP_SHARED mmap()
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (25 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 26/30] ovl: Check redirect on index as well Vivek Goyal
@ 2018-05-07 17:40 ` Vivek Goyal
  2018-05-07 17:41 ` [PATCH v15 28/30] ovl: Do not do metadata only copy-up for truncate operation Vivek Goyal
                   ` (4 subsequent siblings)
  31 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:40 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

When the user has chosen the option of copying up a file on
mmap(MAP_SHARED), do a full copy up and not just a metacopy.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c   | 5 +++++
 fs/overlayfs/file.c      | 2 +-
 fs/overlayfs/overlayfs.h | 1 +
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 5a865a4cf3d7..06720454861b 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -917,6 +917,11 @@ int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags)
 	return err;
 }
 
+int ovl_copy_up_with_data(struct dentry *dentry)
+{
+	return ovl_copy_up_flags(dentry, O_WRONLY);
+}
+
 int ovl_copy_up(struct dentry *dentry)
 {
 	return ovl_copy_up_flags(dentry, 0);
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 885151e8d0cb..93306ec57683 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -289,7 +289,7 @@ static int ovl_pre_mmap(struct file *file, unsigned long prot,
 	 * later.
 	 */
 	if ((flag & MAP_SHARED) && ovl_copy_up_shared(file_inode(file)->i_sb))
-		err = ovl_copy_up(file_dentry(file));
+		err = ovl_copy_up_with_data(file_dentry(file));
 
 	return err;
 }
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index bd83b27d8163..7d2ba0330267 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -408,6 +408,7 @@ extern const struct file_operations ovl_file_operations;
 
 /* copy_up.c */
 int ovl_copy_up(struct dentry *dentry);
+int ovl_copy_up_with_data(struct dentry *dentry);
 int ovl_copy_up_flags(struct dentry *dentry, int flags);
 int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags);
 int ovl_copy_xattr(struct dentry *old, struct dentry *new);
-- 
2.13.6



* [PATCH v15 28/30] ovl: Do not do metadata only copy-up for truncate operation
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (26 preceding siblings ...)
  2018-05-07 17:40 ` [PATCH v15 27/30] ovl: Disbale metacopy for MAP_SHARED mmap() Vivek Goyal
@ 2018-05-07 17:41 ` Vivek Goyal
  2018-05-07 17:41 ` [PATCH v15 29/30] ovl: Do not do metacopy only for ioctl modifying file attr Vivek Goyal
                   ` (3 subsequent siblings)
  31 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:41 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

truncate should copy up the full file (and not do a metadata-only copy
up), otherwise it will be broken. For example, use truncate to increase
the size of a file so that any read beyond the existing size returns
null bytes. If we don't copy up the full file, then we end up opening
the lower file, and a read from it only reads up to the old size (and
not the new size after truncate). Hence, to avoid such situations, copy
up data as well when the file size changes.

So far this was being done by the d_real(O_WRONLY) call in the
truncate() path. Now that patch has been reverted. So force a full copy
up in ovl_setattr() if the size of the file is changing.
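
The read-past-old-size behaviour described above can be checked from
userspace. What follows is only a minimal sketch and not part of this
patch: the helper name check_truncate_extend() and the scratch path
passed to it are assumptions made for illustration. It writes a few
bytes, grows the file with ftruncate(), and verifies that the extended
range reads back as null bytes.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical userspace check, not part of the patch.
 *
 * Writes old_len bytes of 'a' to path, extends the file to new_len with
 * ftruncate(), then reads the whole file back. Returns 0 if every byte
 * in [old_len, new_len) is '\0', 1 if some byte differs, and -1 on an
 * I/O error or bad arguments. The scratch file is removed afterwards.
 */
static int check_truncate_extend(const char *path, size_t old_len,
				 size_t new_len)
{
	char buf[256];
	size_t i;
	int ret = 0;
	int fd;

	if (old_len > new_len || new_len > sizeof(buf))
		return -1;

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	memset(buf, 'a', sizeof(buf));
	if (write(fd, buf, old_len) != (ssize_t)old_len ||
	    ftruncate(fd, (off_t)new_len) != 0 ||
	    pread(fd, buf, new_len, 0) != (ssize_t)new_len) {
		ret = -1;
		goto out;
	}

	/* Everything past the old size must read back as null bytes. */
	for (i = old_len; i < new_len; i++) {
		if (buf[i] != '\0') {
			ret = 1;
			break;
		}
	}
out:
	close(fd);
	unlink(path);
	return ret;
}
```

Calling check_truncate_extend() with a path inside a metacopy=on
overlay mount should return 0 once truncate forces a full copy up; a
return value of 1 would point at the lower-file read problem described
above.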

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 3ac5a684798c..98fe5e920beb 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -19,6 +19,7 @@
 int ovl_setattr(struct dentry *dentry, struct iattr *attr)
 {
 	int err;
+	bool full_copy_up = false;
 	struct dentry *upperdentry;
 	const struct cred *old_cred;
 
@@ -36,9 +37,15 @@ int ovl_setattr(struct dentry *dentry, struct iattr *attr)
 		err = -ETXTBSY;
 		if (atomic_read(&realinode->i_writecount) < 0)
 			goto out_drop_write;
+
+		/* Truncate should trigger data copy up as well */
+		full_copy_up = true;
 	}
 
-	err = ovl_copy_up(dentry);
+	if (!full_copy_up)
+		err = ovl_copy_up(dentry);
+	else
+		err = ovl_copy_up_with_data(dentry);
 	if (!err) {
 		struct inode *winode = NULL;
 
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH v15 29/30] ovl: Do not do metacopy only for ioctl modifying file attr
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (27 preceding siblings ...)
  2018-05-07 17:41 ` [PATCH v15 28/30] ovl: Do not do metadata only copy-up for truncate operation Vivek Goyal
@ 2018-05-07 17:41 ` Vivek Goyal
  2018-05-07 17:41 ` [PATCH v15 30/30] ovl: Enable metadata only feature Vivek Goyal
                   ` (2 subsequent siblings)
  31 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:41 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

ovl_copy_up() by default does a metadata-only copy up (if enabled).
That means when ovl_real_ioctl() calls ovl_real_file(), it will still
get the lower file (as ovl_real_file() opens the data file and not the
metacopy). And that means "chattr +i" will end up modifying the lower
inode.

There seem to be two ways to solve this.
A. Open the metacopy file in ovl_real_ioctl() and do operations on that.
B. Force a full copy up when FS_IOC_SETFLAGS is called.

I am resorting to option B for now as it feels like the safer option. If
there are performance issues due to this, we can revisit it.
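
For context, "chattr +i" from userspace is a read-modify-write of the
inode flags through FS_IOC_GETFLAGS and FS_IOC_SETFLAGS, which is the
path that reaches ovl_ioctl(). A minimal sketch of that sequence
follows; the helper name add_inode_flags() is hypothetical, and setting
FS_IMMUTABLE_FL additionally needs CAP_LINUX_IMMUTABLE and a filesystem
that supports the flag.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

/*
 * Hypothetical helper, not part of the patch.
 *
 * Reads the current inode flags of fd, ORs in 'add' and writes the
 * result back. Returns 0 on success and -1 on failure, in which case
 * errno is left as set by ioctl().
 */
static int add_inode_flags(int fd, int add)
{
	int flags;

	if (ioctl(fd, FS_IOC_GETFLAGS, &flags) < 0)
		return -1;

	flags |= add;

	if (ioctl(fd, FS_IOC_SETFLAGS, &flags) < 0)
		return -1;

	return 0;
}
```

With this patch applied, a call such as add_inode_flags(fd,
FS_IMMUTABLE_FL) on a file in the overlay should act on a fully copied
up upper file rather than on the lower inode.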

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 93306ec57683..c3e23ea32297 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -382,7 +382,7 @@ static long ovl_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		if (ret)
 			return ret;
 
-		ret = ovl_copy_up(file_dentry(file));
+		ret = ovl_copy_up_with_data(file_dentry(file));
 		if (!ret) {
 			ret = ovl_real_ioctl(file, cmd, arg);
 
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH v15 30/30] ovl: Enable metadata only feature
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (28 preceding siblings ...)
  2018-05-07 17:41 ` [PATCH v15 29/30] ovl: Do not do metacopy only for ioctl modifying file attr Vivek Goyal
@ 2018-05-07 17:41 ` Vivek Goyal
  2018-05-07 18:10 ` [PATCH v15 00/30] overlayfs: Delayed copy up of data Amir Goldstein
  2018-05-08 13:42 ` Vivek Goyal
  31 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 17:41 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il, vgoyal

All the bits are in place in the patches before this one. So it is time
to enable the metadata only copy up feature.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 06720454861b..4b6040382676 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -738,9 +738,6 @@ static bool ovl_need_meta_copy_up(struct dentry *dentry, umode_t mode,
 {
 	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
 
-	/* TODO: Will enable metacopy in last patch of series */
-	return false;
-
 	if (!ofs->config.metacopy)
 		return false;
 
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 00/30] overlayfs: Delayed copy up of data
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (29 preceding siblings ...)
  2018-05-07 17:41 ` [PATCH v15 30/30] ovl: Enable metadata only feature Vivek Goyal
@ 2018-05-07 18:10 ` Amir Goldstein
  2018-05-07 18:24   ` Vivek Goyal
  2018-05-08 13:42 ` Vivek Goyal
  31 siblings, 1 reply; 77+ messages in thread
From: Amir Goldstein @ 2018-05-07 18:10 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Mon, May 7, 2018 at 8:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> Hi,
>
> This is V15 of overlayfs metadata only copy-up feature. These patches I
> have rebased on top of Miklos overlayfs-next tree's branch overlayfs-rorw.
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git overlayfs-rorw
>
> Patches are also available here.
>
> https://github.com/rhvgoyal/linux/commits/metacopy-v15
>
> I have run unionmount-testsuite and "./check -overlay -g quick" and that
> works. Only 4 overlay tests fail, which fail on vanilla kernel too.

I wonder which tests failed? My -g quick as well as -s auto runs got
no fails.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 16/30] ovl: Add helper ovl_inode_real_data()
  2018-05-07 17:40 ` [PATCH v15 16/30] ovl: Add helper ovl_inode_real_data() Vivek Goyal
@ 2018-05-07 18:18   ` Amir Goldstein
  0 siblings, 0 replies; 77+ messages in thread
From: Amir Goldstein @ 2018-05-07 18:18 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Mon, May 7, 2018 at 8:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> Add an helper to retrieve real data inode associated with overlay inode.
> This helper will ignore all metacopy inodes and will return only the
> real inode which has data.
>

Title of the patch uses the old helper name.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 24/30] ovl: Set redirect on metacopy files upon rename
  2018-05-07 17:40 ` [PATCH v15 24/30] ovl: Set redirect on metacopy files upon rename Vivek Goyal
@ 2018-05-07 18:21   ` Amir Goldstein
  0 siblings, 0 replies; 77+ messages in thread
From: Amir Goldstein @ 2018-05-07 18:21 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Mon, May 7, 2018 at 8:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> Set redirect on metacopy files upon rename. This will help find data dentry
> in lower dirs.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>

Looks OK.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 00/30] overlayfs: Delayed copy up of data
  2018-05-07 18:10 ` [PATCH v15 00/30] overlayfs: Delayed copy up of data Amir Goldstein
@ 2018-05-07 18:24   ` Vivek Goyal
  2018-05-07 18:33     ` Amir Goldstein
  0 siblings, 1 reply; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 18:24 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Mon, May 07, 2018 at 09:10:25PM +0300, Amir Goldstein wrote:
> On Mon, May 7, 2018 at 8:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > Hi,
> >
> > This is V15 of overlayfs metadata only copy-up feature. These patches I
> > have rebased on top of Miklos overlayfs-next tree's branch overlayfs-rorw.
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git overlayfs-rorw
> >
> > Patches are also available here.
> >
> > https://github.com/rhvgoyal/linux/commits/metacopy-v15
> >
> > I have run unionmount-testsuite and "./check -overlay -g quick" and that
> > works. Only 4 overlay tests fail, which fail on vanilla kernel too.
> 
> I wonder which tests failed? My -g quick as well as -s auto runs got
> no fails.

Hi Amir,

overlay/16, overlay/41, overlay/43, overlay/44 fail for me, even with
vanilla kernel. I never debugged these to figure out what's happening.

Vivek


overlay/016      - output mismatch (see /root/git/xfstests-dev/results//overlay/016.out.bad)
    --- tests/overlay/016.out   2017-04-19 08:18:17.658511331 -0400
    +++ /root/git/xfstests-dev/results//overlay/016.out.bad     2018-05-07 13:31:09.137091967 -0400
    @@ -8,4 +8,4 @@
     XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
     wrote 16/16 bytes at offset 0
     XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
    -00000000:  61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61  aaaaaaaaaaaaaaaa
    +00000000:  54 68 69 73 20 69 73 20 6f 6c 64 20 6e 65 77 73  This.is.old.news
    ...
    (Run 'diff -u tests/overlay/016.out /root/git/xfstests-dev/results//overlay/016.out.bad'  to see the entire diff)



overlay/041      - output mismatch (see /root/git/xfstests-dev/results//overlay/041.out.bad)
    --- tests/overlay/041.out   2017-11-20 13:35:17.024529298 -0500
    +++ /root/git/xfstests-dev/results//overlay/041.out.bad     2018-05-07 13:31:46.124091967 -0400
    @@ -1,2 +1,19 @@
     QA output created by 041
    +Pure upper dir: Invalid d_ino reported for ..
    +Pure upper dir: Invalid d_ino reported for .
    +Pure upper dir: Invalid d_ino reported for subdir
    +Impure dir: Invalid d_ino reported for ..
    +Impure dir: Invalid d_ino reported for .
    +Impure dir: Invalid d_ino reported for subdir
    ...
overlay/043      - output mismatch (see /root/git/xfstests-dev/results//overlay/043.out.bad)
    --- tests/overlay/043.out   2017-11-27 14:48:31.704797612 -0500
    +++ /root/git/xfstests-dev/results//overlay/043.out.bad     2018-05-07 13:31:48.293091967 -0400
    @@ -1,2 +1,39 @@
     QA output created by 043
    +dir not found by ino 232521 (from /tmp/15394.before)
    +file not found by ino 50367463 (from /tmp/15394.before)
    +symlink not found by ino 50367464 (from /tmp/15394.before)
    +chrdev not found by ino 50367465 (from /tmp/15394.before)
    +blkdev not found by ino 50367466 (from /tmp/15394.before)
    +fifo not found by ino 50367467 (from /tmp/15394.before)
    ...

overlay/044      - output mismatch (see /root/git/xfstests-dev/results//overlay/044.out.bad)
    --- tests/overlay/044.out   2017-11-27 14:48:31.704797612 -0500
    +++ /root/git/xfstests-dev/results//overlay/044.out.bad     2018-05-07 13:31:49.462091967 -0400
    @@ -7,11 +7,13 @@
     one
     zero
     one
    +bar not found by ino 33592805 (from /tmp/15776.before)
     == After mount cycle ==
     zero
     one

> 
> Thanks,
> Amir.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 00/30] overlayfs: Delayed copy up of data
  2018-05-07 18:24   ` Vivek Goyal
@ 2018-05-07 18:33     ` Amir Goldstein
  2018-05-07 19:14       ` Vivek Goyal
  0 siblings, 1 reply; 77+ messages in thread
From: Amir Goldstein @ 2018-05-07 18:33 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Mon, May 7, 2018 at 9:24 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Mon, May 07, 2018 at 09:10:25PM +0300, Amir Goldstein wrote:
>> On Mon, May 7, 2018 at 8:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> > Hi,
>> >
>> > This is V15 of overlayfs metadata only copy-up feature. These patches I
>> > have rebased on top of Miklos overlayfs-next tree's branch overlayfs-rorw.
>> >
>> > git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git overlayfs-rorw
>> >
>> > Patches are also available here.
>> >
>> > https://github.com/rhvgoyal/linux/commits/metacopy-v15
>> >
>> > I have run unionmount-testsuite and "./check -overlay -g quick" and that
>> > works. Only 4 overlay tests fail, which fail on vanilla kernel too.
>>
>> I wonder which tests failed? My -g quick as well as -s auto runs got
>> no fails.
>
> Hi Amir,
>
> overlay/16, overlay/41, overlay/43, overlay/44 fail for me, even with
> vanilla kernel. I never debugged these to figure out what's happening.
>
> Vivek
>
>
> overlay/016      - output mismatch (see /root/git/xfstests-dev/results//overlay/016.out.bad)
>     --- tests/overlay/016.out   2017-04-19 08:18:17.658511331 -0400
>     +++ /root/git/xfstests-dev/results//overlay/016.out.bad     2018-05-07 13:31:09.137091967 -0400
>     @@ -8,4 +8,4 @@
>      XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>      wrote 16/16 bytes at offset 0
>      XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>     -00000000:  61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61  aaaaaaaaaaaaaaaa
>     +00000000:  54 68 69 73 20 69 73 20 6f 6c 64 20 6e 65 77 73  This.is.old.news
>     ...
>     (Run 'diff -u tests/overlay/016.out /root/git/xfstests-dev/results//overlay/016.out.bad'  to see the entire diff)
>
>

This will pass if you set OVERLAY_MOUNT_OPTIONS="-o copy_up_shared=on".
After the rorw patches are merged, we can make the test enable this option.

>
> overlay/041      - output mismatch (see /root/git/xfstests-dev/results//overlay/041.out.bad)
>     --- tests/overlay/041.out   2017-11-20 13:35:17.024529298 -0500
>     +++ /root/git/xfstests-dev/results//overlay/041.out.bad     2018-05-07 13:31:46.124091967 -0400
>     @@ -1,2 +1,19 @@
>      QA output created by 041
>     +Pure upper dir: Invalid d_ino reported for ..
>     +Pure upper dir: Invalid d_ino reported for .
>     +Pure upper dir: Invalid d_ino reported for subdir
>     +Impure dir: Invalid d_ino reported for ..
>     +Impure dir: Invalid d_ino reported for .
>     +Impure dir: Invalid d_ino reported for subdir
>     ...
> overlay/043      - output mismatch (see /root/git/xfstests-dev/results//overlay/043.out.bad)
>     --- tests/overlay/043.out   2017-11-27 14:48:31.704797612 -0500
>     +++ /root/git/xfstests-dev/results//overlay/043.out.bad     2018-05-07 13:31:48.293091967 -0400
>     @@ -1,2 +1,39 @@
>      QA output created by 043
>     +dir not found by ino 232521 (from /tmp/15394.before)
>     +file not found by ino 50367463 (from /tmp/15394.before)
>     +symlink not found by ino 50367464 (from /tmp/15394.before)
>     +chrdev not found by ino 50367465 (from /tmp/15394.before)
>     +blkdev not found by ino 50367466 (from /tmp/15394.before)
>     +fifo not found by ino 50367467 (from /tmp/15394.before)
>     ...
>
> overlay/044      - output mismatch (see /root/git/xfstests-dev/results//overlay/044.out.bad)
>     --- tests/overlay/044.out   2017-11-27 14:48:31.704797612 -0500
>     +++ /root/git/xfstests-dev/results//overlay/044.out.bad     2018-05-07 13:31:49.462091967 -0400
>     @@ -7,11 +7,13 @@
>      one
>      zero
>      one
>     +bar not found by ino 33592805 (from /tmp/15776.before)
>      == After mount cycle ==
>      zero
>      one
>

These 3 tests were fixed in xfstests master last Sunday to enable xino.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 26/30] ovl: Check redirect on index as well
  2018-05-07 17:40 ` [PATCH v15 26/30] ovl: Check redirect on index as well Vivek Goyal
@ 2018-05-07 18:43   ` Amir Goldstein
  2018-05-08 12:58     ` Vivek Goyal
  0 siblings, 1 reply; 77+ messages in thread
From: Amir Goldstein @ 2018-05-07 18:43 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Mon, May 7, 2018 at 8:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> Right now we seem to check redirect only if upperdentry is found. But it
> is possible that there is no upperdentry but later we found an index.
>
> We need to check redirect on index as well and set it in ovl_inode->redirect.
> Otherwise link code can assume that dentry does not have redirect and
> place a new one which breaks things. In my testing overlay/033 test
> started failing in xfstests. Following are the details.
>
> For example do following.
>
> $ mkdir lower upper work merged
>
> - Make lower dir with 4 links.
>   $ echo "foo" > lower/l0.txt
>   $ ln  lower/l0.txt lower/l1.txt
>   $ ln  lower/l0.txt lower/l2.txt
>   $ ln  lower/l0.txt lower/l3.txt
>
> - Mount with index on and metacopy on.
>
>   $ mount -t overlay -o lowerdir=lower,upperdir=upper,workdir=work,index=on,metacopy=on none merged
>
> - Link lower
>
>   $ ln merged/l0.txt merged/l4.txt
>     (This will do a metadata copy up of l0.txt and put an absolute
>      redirect /l0.txt)
>
>   $ echo 2 > /proc/sys/vm/drop_caches
>
>   $ ls merged/l1.txt
>   (Now l1.txt will be looked up. There is no upper dentry but there is
>    lower dentry and index will be found. We don't check for redirect on
>    index, hence ovl_inode->redirect will be NULL.)
>
> - Link Upper
>
>   $ ln merged/l4.txt merged/l5.txt
>   (Lookup of l4.txt will use inode from l1.txt lookup which is still in
>    cache. It has ovl_inode->redirect NULL, hence link will put a new
>    redirect and replace /l0.txt with /l4.txt)
>
> - Drop caches.
>   $ echo 2 > /proc/sys/vm/drop_caches
>
> - List l0.txt and it returns -ESTALE
>
>   $ ls merged/l0.txt
>
>   (It returns stale because, we found a metacopy of l0.txt in upper
>    and it has redirect l4.txt but there is no file named l4.txt in
>    lower layer. So lower data copy is not found and -ESTALE is returned.)
>
> So problem here is that we did not process redirect on index. Check
> redirect on index as well and then problem is fixed.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>

Just one nit below

> ---
>  fs/overlayfs/namei.c     | 53 ++++++++++++++++--------------------------------
>  fs/overlayfs/overlayfs.h |  1 +
>  fs/overlayfs/util.c      | 45 ++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 64 insertions(+), 35 deletions(-)
>
> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> index cd06e7ff9fd1..8d5beed3876b 100644
> --- a/fs/overlayfs/namei.c
> +++ b/fs/overlayfs/namei.c
> @@ -31,32 +31,20 @@ static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
>                               size_t prelen, const char *post)
>  {
>         int res;
> -       char *s, *next, *buf = NULL;
> +       char *buf;
>
> -       res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, NULL, 0);
> -       if (res < 0) {
> -               if (res == -ENODATA || res == -EOPNOTSUPP)
> -                       return 0;
> -               goto fail;
> -       }
> -       buf = kzalloc(prelen + res + strlen(post) + 1, GFP_KERNEL);
> +       buf = ovl_get_redirect_xattr(dentry, prelen + strlen(post));
>         if (!buf)
> -               return -ENOMEM;
> +               return 0;
>
> -       if (res == 0)
> -               goto invalid;
> +       if (IS_ERR(buf)) {
> +               res = PTR_ERR(buf);
> +               pr_warn_ratelimited("overlayfs: failed to get redirect (%i)\n",
> +                                   res);
> +               return res;
> +       }

Don't see why you didn't leave this in the helper.

>
> -       res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, buf, res);
> -       if (res < 0)
> -               goto fail;
> -       if (res == 0)
> -               goto invalid;
>         if (buf[0] == '/') {
> -               for (s = buf; *s++ == '/'; s = next) {
> -                       next = strchrnul(s, '/');
> -                       if (s == next)
> -                               goto invalid;
> -               }
>                 /*
>                  * One of the ancestor path elements in an absolute path
>                  * lookup in ovl_lookup_layer() could have been opaque and
> @@ -67,9 +55,7 @@ static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
>                  */
>                 d->stop = false;
>         } else {
> -               if (strchr(buf, '/') != NULL)
> -                       goto invalid;
> -
> +               res = strlen(buf) + 1;
>                 memmove(buf + prelen, buf, res);
>                 memcpy(buf, d->name.name, prelen);
>         }
> @@ -81,16 +67,6 @@ static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
>         d->name.len = strlen(d->redirect);
>
>         return 0;
> -
> -err_free:
> -       kfree(buf);
> -       return 0;
> -fail:
> -       pr_warn_ratelimited("overlayfs: failed to get redirect (%i)\n", res);
> -       goto err_free;
> -invalid:
> -       pr_warn_ratelimited("overlayfs: invalid redirect (%s)\n", buf);
> -       goto err_free;
>  }
>
>  static int ovl_acceptable(void *ctx, struct dentry *dentry)
> @@ -1073,8 +1049,15 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>
>         if (upperdentry)
>                 ovl_dentry_set_upper_alias(dentry);
> -       else if (index)
> +       else if (index) {
>                 upperdentry = dget(index);
> +               upperredirect = ovl_get_redirect_xattr(upperdentry, 0);
> +               if (IS_ERR(upperredirect)) {
> +                       err = PTR_ERR(upperredirect);
> +                       upperredirect = NULL;
> +                       goto out_free_oe;
> +               }
> +       }
>
>         if (upperdentry || ctr) {
>                 struct dentry *lowerdata = NULL;
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index ea2cf5b6bb85..bd83b27d8163 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -283,6 +283,7 @@ void ovl_nlink_end(struct dentry *dentry, bool locked);
>  int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir);
>  int ovl_check_metacopy_xattr(struct dentry *dentry);
>  bool ovl_is_metacopy_dentry(struct dentry *dentry);
> +char *ovl_get_redirect_xattr(struct dentry *dentry, int padding);
>
>  static inline bool ovl_is_impuredir(struct dentry *dentry)
>  {
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 73b4129ffeff..0d8a9ac92390 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -878,3 +878,48 @@ bool ovl_is_metacopy_dentry(struct dentry *dentry)
>
>         return (oe->numlower > 1);
>  }
> +
> +char *ovl_get_redirect_xattr(struct dentry *dentry, int padding)
> +{
> +       int res;
> +       char *s, *next, *buf = NULL;
> +
> +       res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, NULL, 0);
> +       if (res < 0) {
> +               if (res == -ENODATA || res == -EOPNOTSUPP)
> +                       return NULL;
> +               return ERR_PTR(res);
> +       }
> +
> +       buf = kzalloc(res + padding + 1, GFP_KERNEL);
> +       if (!buf)
> +               return ERR_PTR(-ENOMEM);
> +
> +       if (res == 0)
> +               goto invalid;
> +
> +       res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, buf, res);
> +       if (res < 0) {
> +               kfree(buf);
> +               return ERR_PTR(res);
> +        }
> +       if (res == 0)
> +               goto invalid;
> +
> +       if (buf[0] == '/') {
> +               for (s = buf; *s++ == '/'; s = next) {
> +                       next = strchrnul(s, '/');
> +                       if (s == next)
> +                               goto invalid;
> +               }
> +       } else {
> +               if (strchr(buf, '/') != NULL)
> +                       goto invalid;
> +       }
> +
> +       return buf;
> +invalid:
> +       pr_warn_ratelimited("overlayfs: invalid redirect (%s)\n", buf);
> +       kfree(buf);
> +       return ERR_PTR(-EINVAL);
> +}
> --
> 2.13.6
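
The path validation that the quoted patch moves into
ovl_get_redirect_xattr() can be exercised outside the kernel. The
following is only a userspace sketch of those checks, assuming a
NUL-terminated string as input; redirect_is_valid() and
slash_or_end() are hypothetical names, with slash_or_end() standing in
for strchrnul(). An absolute redirect must consist of non-empty
components, and a relative redirect must not contain '/'.

```c
#include <stdbool.h>
#include <string.h>

/* Portable stand-in for strchrnul(s, '/'): first '/' or the final NUL. */
static const char *slash_or_end(const char *s)
{
	while (*s != '\0' && *s != '/')
		s++;
	return s;
}

/*
 * Hypothetical userspace mirror of the redirect checks.
 * Returns true if buf would be accepted, false where the kernel code
 * would jump to its "invalid" label.
 */
static bool redirect_is_valid(const char *buf)
{
	const char *s, *next;

	/* An empty xattr value is treated as invalid. */
	if (buf[0] == '\0')
		return false;

	if (buf[0] == '/') {
		/* Walk "/comp/comp/...": every component must be non-empty. */
		for (s = buf; *s++ == '/'; s = next) {
			next = slash_or_end(s);
			if (s == next)
				return false;
		}
		return true;
	}

	/* A relative redirect is a single name without any '/'. */
	return strchr(buf, '/') == NULL;
}
```

Under these rules "/l0.txt" and "l0.txt" are accepted, while "/",
"/a//b", "/a/" and "a/b" are rejected.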

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 15/30] ovl: Store lower data inode in ovl_inode
  2018-05-07 17:40 ` [PATCH v15 15/30] ovl: Store lower data inode in ovl_inode Vivek Goyal
@ 2018-05-07 18:59   ` Amir Goldstein
  2018-05-08 13:47     ` Vivek Goyal
  0 siblings, 1 reply; 77+ messages in thread
From: Amir Goldstein @ 2018-05-07 18:59 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Mon, May 7, 2018 at 8:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> Right now ovl_inode stores inode pointer for lower inode. This helps
> with quickly getting lower inode given overlay inode (ovl_inode_lower()).
>
> Now with metadata only copy-up, we can have metacopy inode in middle
> layer as well and inode containing data can be different from ->lower.
> I need to be able to open the real file in ovl_open_realfile() and
> for that I need to quickly find the lower data inode.
>
> Hence store lower data inode also in ovl_inode. Also provide an
> helper ovl_inode_lowerdata() to access this field.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

After fixing some nits below

> ---
>  fs/overlayfs/export.c    |  2 +-
>  fs/overlayfs/inode.c     |  2 +-
>  fs/overlayfs/namei.c     |  7 +++++--
>  fs/overlayfs/overlayfs.h |  3 ++-
>  fs/overlayfs/ovl_entry.h |  6 +++++-
>  fs/overlayfs/super.c     |  8 ++++++--
>  fs/overlayfs/util.c      | 10 +++++++++-
>  7 files changed, 29 insertions(+), 9 deletions(-)
>
> diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
> index 52a09a9f74b7..77d98aa7f118 100644
> --- a/fs/overlayfs/export.c
> +++ b/fs/overlayfs/export.c
> @@ -301,7 +301,7 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
>         struct inode *inode;
>         struct ovl_entry *oe;
>         struct ovl_inode_params oip = {sb, NULL, lowerpath, index, !!lower,
> -                                      NULL};
> +                                      NULL, NULL};

You should convert to designated initializers for structs, i.e.:
{
    .index = index,
    ...

and no need to initialize NULL members.
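
To spell the suggestion out, here is a small standalone sketch. The
struct below is a hypothetical stand-in for struct ovl_inode_params:
the field names follow what is visible in the quoted hunks (sb, index,
lowerpath, redirect, lowerdata), while upperdentry, numlower and all
the types are assumptions. Members that are not named in a designated
initializer are zero-initialized, so the NULL entries can be dropped.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in; the real struct lives in fs/overlayfs. */
struct inode_params {
	void *sb;
	void *upperdentry;
	void *lowerpath;
	void *index;
	unsigned int numlower;
	char *redirect;
	void *lowerdata;
};

/*
 * Builds the parameters with designated initializers. upperdentry,
 * redirect and lowerdata are not mentioned and therefore end up NULL.
 */
static struct inode_params make_params(void *sb, void *lowerpath,
				       void *index, bool has_lower)
{
	struct inode_params oip = {
		.sb = sb,
		.lowerpath = lowerpath,
		.index = index,
		.numlower = has_lower,
	};

	return oip;
}
```

Adding another member to the struct later then needs no change at call
sites that do not use it.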


>
>         /* We get overlay directory dentries with ovl_lookup_real() */
>         if (d_is_dir(upper ?: lower))
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 5d461cd57b48..949ddc7c6f59 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -855,7 +855,7 @@ struct inode *ovl_get_inode(struct ovl_inode_params *oip)
>                 }
>         }
>         ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev, ino, fsid);
> -       ovl_inode_init(inode, upperdentry, lowerdentry);
> +       ovl_inode_init(inode, upperdentry, lowerdentry, oip->lowerdata);

It would make sense to pass oip to ovl_inode_init() as well.

>
>         if (upperdentry && ovl_is_impuredir(upperdentry))
>                 ovl_set_flag(OVL_IMPURE, inode);
> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> index b2ff08985e29..a2556f981d3e 100644
> --- a/fs/overlayfs/namei.c
> +++ b/fs/overlayfs/namei.c
> @@ -1076,10 +1076,13 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                 upperdentry = dget(index);
>
>         if (upperdentry || ctr) {
> +               struct dentry *lowerdata = NULL;

You don't seem to need this helper var.

>                 struct ovl_inode_params oip = {dentry->d_sb, upperdentry,
>                                                stack, index, ctr,
> -                                              upperredirect};
> -
> +                                              upperredirect, NULL};
> +               if (ctr > 1 && !d.is_dir)
> +                       lowerdata = stack[ctr - 1].dentry;
> +               oip.lowerdata = lowerdata;
>                 inode = ovl_get_inode(&oip);
>                 err = PTR_ERR(inode);
>                 if (IS_ERR(inode))
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 6d64796e0060..8c68387efe87 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -234,6 +234,7 @@ struct dentry *ovl_dentry_realdata(struct dentry *dentry);
>  struct dentry *ovl_i_dentry_upper(struct inode *inode);
>  struct inode *ovl_inode_upper(struct inode *inode);
>  struct inode *ovl_inode_lower(struct inode *inode);
> +struct inode *ovl_inode_lowerdata(struct inode *inode);
>  struct inode *ovl_inode_real(struct inode *inode);
>  struct ovl_dir_cache *ovl_dir_cache(struct inode *inode);
>  void ovl_set_dir_cache(struct inode *inode, struct ovl_dir_cache *cache);
> @@ -253,7 +254,7 @@ bool ovl_redirect_dir(struct super_block *sb);
>  const char *ovl_dentry_get_redirect(struct dentry *dentry);
>  void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
>  void ovl_inode_init(struct inode *inode, struct dentry *upperdentry,
> -                   struct dentry *lowerdentry);
> +                   struct dentry *lowerdentry, struct dentry *lowerdata);
>  void ovl_inode_update(struct inode *inode, struct dentry *upperdentry);
>  void ovl_dir_modified(struct dentry *dentry, bool impurity);
>  u64 ovl_dentry_version_get(struct dentry *dentry);
> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> index 422896406048..f72d6191357e 100644
> --- a/fs/overlayfs/ovl_entry.h
> +++ b/fs/overlayfs/ovl_entry.h
> @@ -90,7 +90,10 @@ static inline struct ovl_entry *OVL_E(struct dentry *dentry)
>  }
>
>  struct ovl_inode {
> -       struct ovl_dir_cache *cache;
> +       union {
> +               struct ovl_dir_cache *cache;
> +               struct inode *lowerdata;


Please leave comments for union members
... cache; /* directory */
... lowerdata; /* regular file */

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-07 17:40 ` [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry Vivek Goyal
@ 2018-05-07 19:14   ` Amir Goldstein
  2018-05-10  9:19   ` Miklos Szeredi
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 77+ messages in thread
From: Amir Goldstein @ 2018-05-07 19:14 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Mon, May 7, 2018 at 8:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> This patch modifies ovl_lookup() and friends to lookup metacopy dentries.
> It also allows for presence of metacopy dentries in lower layer.
>
> During lookup, check for presence of OVL_XATTR_METACOPY and if not present,
> set OVL_UPPERDATA bit in flags.
>
> We don't support metacopy feature with nfs_export. So in nfs_export code,
> we set OVL_UPPERDATA flag set unconditionally if upper inode exists.
>
> Do not follow metacopy origin if we find a metacopy only inode and metacopy
> feature is not enabled for that mount. Like redirect, this can have security
> implications where an attacker could hand craft upper and try to gain
> access to file on lower which it should not have to begin with.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>

OK, you may add

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

> ---
>  fs/overlayfs/export.c    |   3 ++
>  fs/overlayfs/inode.c     |  11 ++++-
>  fs/overlayfs/namei.c     | 108 +++++++++++++++++++++++++++++++++++++++++------
>  fs/overlayfs/overlayfs.h |   1 +
>  fs/overlayfs/util.c      |  22 ++++++++++
>  5 files changed, 130 insertions(+), 15 deletions(-)
>
> diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
> index 0549286cc55e..52a09a9f74b7 100644
> --- a/fs/overlayfs/export.c
> +++ b/fs/overlayfs/export.c
> @@ -314,6 +314,9 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
>                 return ERR_CAST(inode);
>         }
>
> +       if (upper)
> +               ovl_set_flag(OVL_UPPERDATA, inode);
> +
>         dentry = d_find_any_alias(inode);
>         if (!dentry) {
>                 dentry = d_alloc_anon(inode->i_sb);
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index c128d5d54d0f..83b276ce0240 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -770,7 +770,7 @@ struct inode *ovl_get_inode(struct ovl_inode_params *oip)
>         bool bylower = ovl_hash_bylower(oip->sb, upperdentry, lowerdentry,
>                                         oip->index);
>         int fsid = bylower ? oip->lowerpath->layer->fsid : 0;
> -       bool is_dir;
> +       bool is_dir, metacopy = false;
>         unsigned long ino = 0;
>         int err = -ENOMEM;
>
> @@ -830,6 +830,15 @@ struct inode *ovl_get_inode(struct ovl_inode_params *oip)
>         if (oip->index)
>                 ovl_set_flag(OVL_INDEX, inode);
>
> +       if (upperdentry) {
> +               err = ovl_check_metacopy_xattr(upperdentry);
> +               if (err < 0)
> +                       goto out_err;
> +               metacopy = err;
> +               if (!metacopy)
> +                       ovl_set_flag(OVL_UPPERDATA, inode);
> +       }
> +
>         OVL_I(inode)->redirect = oip->redirect;
>
>         /* Check for non-merge dir that may have whiteouts */
> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> index 8fd817bf5529..b2ff08985e29 100644
> --- a/fs/overlayfs/namei.c
> +++ b/fs/overlayfs/namei.c
> @@ -24,6 +24,7 @@ struct ovl_lookup_data {
>         bool stop;
>         bool last;
>         char *redirect;
> +       bool metacopy;
>  };
>
>  static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
> @@ -253,19 +254,29 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
>                 goto put_and_out;
>         }
>         if (!d_can_lookup(this)) {
> -               d->stop = true;
> -               if (d->is_dir)
> +               if (d->is_dir) {
> +                       d->stop = true;
>                         goto put_and_out;
> -
> +               }
>                 /*
>                  * NB: handle failure to lookup non-last element when non-dir
>                  * redirects become possible
>                  */
>                 WARN_ON(!last_element);
> +               err = ovl_check_metacopy_xattr(this);
> +               if (err < 0)
> +                       goto out_err;
> +               d->stop = !err;
> +               d->metacopy = !!err;
>                 goto out;
>         }
> -       if (last_element)
> +       if (last_element) {
> +               if (d->metacopy) {
> +                       err = -ESTALE;
> +                       goto out_err;
> +               }
>                 d->is_dir = true;
> +       }
>         if (d->last)
>                 goto out;
>
> @@ -823,7 +834,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>         struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
>         struct ovl_entry *poe = dentry->d_parent->d_fsdata;
>         struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata;
> -       struct ovl_path *stack = NULL;
> +       struct ovl_path *stack = NULL, *origin_path = NULL;
>         struct dentry *upperdir, *upperdentry = NULL;
>         struct dentry *origin = NULL;
>         struct dentry *index = NULL;
> @@ -834,6 +845,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>         struct dentry *this;
>         unsigned int i;
>         int err;
> +       bool metacopy = false;
>         struct ovl_lookup_data d = {
>                 .name = dentry->d_name,
>                 .is_dir = false,
> @@ -841,6 +853,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                 .stop = false,
>                 .last = ofs->config.redirect_follow ? false : !poe->numlower,
>                 .redirect = NULL,
> +               .metacopy = false,
>         };
>
>         if (dentry->d_name.len > ofs->namelen)
> @@ -859,7 +872,8 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                         goto out;
>                 }
>                 if (upperdentry && !d.is_dir) {
> -                       BUG_ON(!d.stop || d.redirect);
> +                       unsigned int origin_ctr = 0;
> +                       BUG_ON(d.redirect);
>                         /*
>                          * Lookup copy up origin by decoding origin file handle.
>                          * We may get a disconnected dentry, which is fine,
> @@ -870,9 +884,13 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                          * number - it's the same as if we held a reference
>                          * to a dentry in lower layer that was moved under us.
>                          */
> -                       err = ovl_check_origin(ofs, upperdentry, &stack, &ctr);
> +                       err = ovl_check_origin(ofs, upperdentry, &origin_path,
> +                                              &origin_ctr);
>                         if (err)
>                                 goto out_put_upper;
> +
> +                       if (d.metacopy)
> +                               metacopy = true;
>                 }
>
>                 if (d.redirect) {
> @@ -913,7 +931,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                  * If no origin fh is stored in upper of a merge dir, store fh
>                  * of lower dir and set upper parent "impure".
>                  */
> -               if (upperdentry && !ctr && !ofs->noxattr) {
> +               if (upperdentry && !ctr && !ofs->noxattr && d.is_dir) {
>                         err = ovl_fix_origin(dentry, this, upperdentry);
>                         if (err) {
>                                 dput(this);
> @@ -925,18 +943,36 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                  * When "verify_lower" feature is enabled, do not merge with a
>                  * lower dir that does not match a stored origin xattr. In any
>                  * case, only verified origin is used for index lookup.
> +                *
> +                * For non-dir dentry, make sure dentry found by lookup
> +                * matches the origin stored in upper. Otherwise its an
> +                * error.
>                  */
> -               if (upperdentry && !ctr && ovl_verify_lower(dentry->d_sb)) {
> +               if (upperdentry && !ctr &&
> +                   ((d.is_dir && ovl_verify_lower(dentry->d_sb)) ||
> +                    (!d.is_dir && origin_path))) {
>                         err = ovl_verify_origin(upperdentry, this, false);
>                         if (err) {
>                                 dput(this);
> -                               break;
> +                               if (d.is_dir)
> +                                       break;
> +                               goto out_put;
>                         }
> -
> -                       /* Bless lower dir as verified origin */
> +                       /* Bless lower as verified origin */
>                         origin = this;
>                 }
>
> +               if (d.metacopy)
> +                       metacopy = true;
> +               /*
> +                * Do not store intermediate metacopy dentries in chain,
> +                * except top most lower metacopy dentry
> +                */
> +               if (d.metacopy && ctr) {
> +                       dput(this);
> +                       continue;
> +               }
> +
>                 stack[ctr].dentry = this;
>                 stack[ctr].layer = lower.layer;
>                 ctr++;
> @@ -968,13 +1004,49 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                 }
>         }
>
> +       if (metacopy) {
> +               /*
> +                * Found a metacopy dentry but did not find corresponding
> +                * data dentry
> +                */
> +               if (d.metacopy) {
> +                       err = -ESTALE;
> +                       goto out_put;
> +               }
> +
> +               err = -EPERM;
> +               if (!ofs->config.metacopy) {
> +                       pr_warn_ratelimited("overlay: refusing to follow"
> +                                           " metacopy origin for (%pd2)\n",
> +                                           dentry);
> +                       goto out_put;
> +               }
> +       } else if (!d.is_dir && upperdentry && !ctr && origin_path) {
> +               if (WARN_ON(stack != NULL)) {
> +                       err = -EIO;
> +                       goto out_put;
> +               }
> +               stack = origin_path;
> +               ctr = 1;
> +               origin_path = NULL;
> +       }
> +
>         /*
>          * Lookup index by lower inode and verify it matches upper inode.
>          * We only trust dir index if we verified that lower dir matches
>          * origin, otherwise dir index entries may be inconsistent and we
> -        * ignore them. Always lookup index of non-dir and non-upper.
> +        * ignore them.
> +        *
> +        * For non-dir upper metacopy dentry, we already set "origin" if we
> +        * verified that lower matched upper origin. If upper origin was
> +        * not present (because lower layer did not support fh encode/decode),
> +        * do not set "origin" and skip looking up index. This case should
> +        * be handled in same way as a non-dir upper without ORIGIN is
> +        * handled.
> +        *
> +        * Always lookup index of non-dir non-metacopy and non-upper.
>          */
> -       if (ctr && (!upperdentry || !d.is_dir))
> +       if (ctr && (!upperdentry || (!d.is_dir && !metacopy)))
>                 origin = stack[0].dentry;
>
>         if (origin && ovl_indexdir(dentry->d_sb) &&
> @@ -1015,6 +1087,10 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>         }
>
>         revert_creds(old_cred);
> +       if (origin_path) {
> +               dput(origin_path->dentry);
> +               kfree(origin_path);
> +       }
>         dput(index);
>         kfree(stack);
>         kfree(d.redirect);
> @@ -1029,6 +1105,10 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                 dput(stack[i].dentry);
>         kfree(stack);
>  out_put_upper:
> +       if (origin_path) {
> +               dput(origin_path->dentry);
> +               kfree(origin_path);
> +       }
>         dput(upperdentry);
>         kfree(upperredirect);
>  out:
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 2daea529b7eb..e8954fff1c45 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -274,6 +274,7 @@ bool ovl_need_index(struct dentry *dentry);
>  int ovl_nlink_start(struct dentry *dentry, bool *locked);
>  void ovl_nlink_end(struct dentry *dentry, bool locked);
>  int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir);
> +int ovl_check_metacopy_xattr(struct dentry *dentry);
>
>  static inline bool ovl_is_impuredir(struct dentry *dentry)
>  {
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index f8e3c95711b8..ab9a8fae0f99 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -778,3 +778,25 @@ int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir)
>         pr_err("overlayfs: failed to lock workdir+upperdir\n");
>         return -EIO;
>  }
> +
> +/* err < 0, 0 if no metacopy xattr, 1 if metacopy xattr found */
> +int ovl_check_metacopy_xattr(struct dentry *dentry)
> +{
> +       int res;
> +
> +       /* Only regular files can have metacopy xattr */
> +       if (!S_ISREG(d_inode(dentry)->i_mode))
> +               return 0;
> +
> +       res = vfs_getxattr(dentry, OVL_XATTR_METACOPY, NULL, 0);
> +       if (res < 0) {
> +               if (res == -ENODATA || res == -EOPNOTSUPP)
> +                       return 0;
> +               goto out;
> +       }
> +
> +       return 1;
> +out:
> +       pr_warn_ratelimited("overlayfs: failed to get metacopy (%i)\n", res);
> +       return res;
> +}
> --
> 2.13.6
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH v15 00/30] overlayfs: Delayed copy up of data
  2018-05-07 18:33     ` Amir Goldstein
@ 2018-05-07 19:14       ` Vivek Goyal
  0 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 19:14 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Mon, May 07, 2018 at 09:33:32PM +0300, Amir Goldstein wrote:
> On Mon, May 7, 2018 at 9:24 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > On Mon, May 07, 2018 at 09:10:25PM +0300, Amir Goldstein wrote:
> >> On Mon, May 7, 2018 at 8:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> >> > Hi,
> >> >
> >> > This is V15 of overlayfs metadata only copy-up feature. These patches I
> >> > have rebased on top of Miklos overlayfs-next tree's branch overlayfs-rorw.
> >> >
> >> > git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git overlayfs-rorw
> >> >
> >> > Patches are also available here.
> >> >
> >> > https://github.com/rhvgoyal/linux/commits/metacopy-v15
> >> >
> >> > I have run unionmount-testsuite and "./check -overlay -g quick" and that
> >> > works. Only 4 overlay tests fail, which fail on vanilla kernel too.
> >>
> >> I wonder which tests failed? My -g quick as well as -s auto runs got
> >> no fails.
> >
> > Hi Amir,
> >
> > overlay/16, overlay/41, overlay/43, overlay/44 fail for me, even with
> > vanilla kernel. I never debugged these to figure out what's happening.
> >
> > Vivek
> >
> >
> > overlay/016      - output mismatch (see /root/git/xfstests-dev/results//overlay/016.out.bad)
> >     --- tests/overlay/016.out   2017-04-19 08:18:17.658511331 -0400
> >     +++ /root/git/xfstests-dev/results//overlay/016.out.bad     2018-05-07 13:31:09.137091967 -0400
> >     @@ -8,4 +8,4 @@
> >      XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> >      wrote 16/16 bytes at offset 0
> >      XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> >     -00000000:  61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61  aaaaaaaaaaaaaaaa
> >     +00000000:  54 68 69 73 20 69 73 20 6f 6c 64 20 6e 65 77 73  This.is.old.news
> >     ...
> >     (Run 'diff -u tests/overlay/016.out /root/git/xfstests-dev/results//overlay/016.out.bad'  to see the entire diff)
> >
> >
> 
> This will pass if you set OVERLAY_MOUNT_OPTIONS="-o copy_up_shared=on".
> After the rorw patches are merged, we can make the test enable this option.

Thanks for the suggestions. All 4 failing tests are passing now (both
with vanilla kernel and with metacopy patches).

Vivek


* Re: [PATCH v15 01/30] ovl: Pass argument to ovl_get_inode() in a structure
  2018-05-07 17:40 ` [PATCH v15 01/30] ovl: Pass argument to ovl_get_inode() in a structure Vivek Goyal
@ 2018-05-07 19:26   ` Amir Goldstein
  2018-05-07 20:37     ` Vivek Goyal
  2018-05-08 13:45     ` Vivek Goyal
  0 siblings, 2 replies; 77+ messages in thread
From: Amir Goldstein @ 2018-05-07 19:26 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Mon, May 7, 2018 at 8:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> ovl_get_inode() right now has 5 parameters. Soon this patch series will
> add 2 more, and the argument list starts looking too long.
>
> Hence pass arguments to ovl_get_inode() in a structure, which looks
> a little cleaner.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>

Reviewed-by: Amir Goldstein <amir73il@gmail.com>

After fixing some nits

> ---
>  fs/overlayfs/export.c    |  4 +++-
>  fs/overlayfs/inode.c     | 19 ++++++++++---------
>  fs/overlayfs/namei.c     |  6 ++++--
>  fs/overlayfs/overlayfs.h |  4 +---
>  fs/overlayfs/ovl_entry.h |  8 ++++++++
>  5 files changed, 26 insertions(+), 15 deletions(-)
>
> diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
> index 425a94672300..867946adcbc5 100644
> --- a/fs/overlayfs/export.c
> +++ b/fs/overlayfs/export.c
> @@ -300,12 +300,14 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
>         struct dentry *dentry;
>         struct inode *inode;
>         struct ovl_entry *oe;
> +       struct ovl_inode_params oip = {sb, NULL, lowerpath, index, !!lower};

{
   .index = index,
...
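
As an illustration of the designated-initializer style being requested, here
is a small standalone sketch. The struct toy_inode_params and the helper
make_params() are hypothetical stand-ins; the field names are guessed from
the old ovl_get_inode() argument list and may not match the real
struct ovl_inode_params.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the parameter structure discussed above. */
struct toy_inode_params {
	void *upperdentry;
	void *lowerpath;
	void *index;
	unsigned int numlower;
};

static struct toy_inode_params make_params(void *lowerpath, void *index,
					    bool have_lower)
{
	/*
	 * Naming each field keeps the initializer correct even if members
	 * are later added or reordered; unnamed members are zero-initialized.
	 */
	struct toy_inode_params p = {
		.upperdentry = NULL,
		.lowerpath = lowerpath,
		.index = index,
		.numlower = have_lower ? 1 : 0,
	};

	return p;
}
```

With a positional initializer, adding a member in the middle of the structure
would silently shift every later value; the named form avoids that.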

>
>         /* We get overlay directory dentries with ovl_lookup_real() */
>         if (d_is_dir(upper ?: lower))
>                 return ERR_PTR(-EIO);
>
> -       inode = ovl_get_inode(sb, dget(upper), lowerpath, index, !!lower);
> +       oip.upperdentry = dget(upper);
> +       inode = ovl_get_inode(&oip);
>         if (IS_ERR(inode)) {
>                 dput(upper);
>                 return ERR_CAST(inode);
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 7abcf96e94fc..2fe9538fffc9 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -792,15 +792,16 @@ static bool ovl_hash_bylower(struct super_block *sb, struct dentry *upper,
>         return true;
>  }
>
> -struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
> -                           struct ovl_path *lowerpath, struct dentry *index,
> -                           unsigned int numlower)
> +struct inode *ovl_get_inode(struct ovl_inode_params *oip)

IMO, sb should not be part of ovl_inode_params; it should be passed as a
separate arg, and ovl_inode_params should be passed to ovl_inode_init()
as well.

Thanks,
Amir.


* Re: [PATCH v15 18/30] ovl: Do not expose metacopy only dentry from d_real()
  2018-05-07 17:40 ` [PATCH v15 18/30] ovl: Do not expose metacopy only dentry from d_real() Vivek Goyal
@ 2018-05-07 19:39   ` Amir Goldstein
  0 siblings, 0 replies; 77+ messages in thread
From: Amir Goldstein @ 2018-05-07 19:39 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Mon, May 7, 2018 at 8:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> Metacopy dentry/inode is internal to overlay and is never exposed
> outside of it. Modify d_real() to look for only dentries/inode which
> have data and which are not metacopy.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>

> ---
>  fs/overlayfs/super.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index ce90b6e3ce76..c97b5abda954 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -101,10 +101,11 @@ static struct dentry *ovl_d_real(struct dentry *dentry,
>         }
>
>         real = ovl_dentry_upper(dentry);
> -       if (real && (!inode || inode == d_inode(real)))
> +       if (real && ovl_has_upperdata(d_inode(dentry)) &&
> +           (!inode || inode == d_inode(real)))
>                 return real;
>
> -       real = ovl_dentry_lower(dentry);
> +       real = ovl_dentry_lowerdata(dentry);
>         if (!real)
>                 goto bug;
>
> --
> 2.13.6
>


* Re: [PATCH v15 17/30] ovl: Open file with data except for the case of fsync
  2018-05-07 17:40 ` [PATCH v15 17/30] ovl: Open file with data except for the case of fsync Vivek Goyal
@ 2018-05-07 19:47   ` Amir Goldstein
  2018-05-07 20:59     ` Vivek Goyal
  0 siblings, 1 reply; 77+ messages in thread
From: Amir Goldstein @ 2018-05-07 19:47 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Mon, May 7, 2018 at 8:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> ovl_open() should open the file which contains data, not the metacopy
> inode. With the introduction of metacopy inodes, the current implementation
> would end up opening the metacopy inode as well.
>
> But there can be certain circumstances, like ovl_fsync(), where we
> want to allow opening a metacopy inode instead.
>
> Hence, change ovl_open_realfile() and ovl_real_fdget() and add an extra
> parameter which specifies whether to allow opening a metacopy inode or not.
> If this parameter is false, we look for the data inode and open that.
>
> This should cover both cases.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>

except one nit
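
The selection rule described in the commit message can be sketched in
userspace C as follows. This is a toy model based on the ovl_open_realfile()
hunk quoted below; struct toy_ovl_inode and toy_pick_realinode() are
hypothetical names, and strings stand in for the real inodes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an overlay inode; all names here are hypothetical. */
struct toy_ovl_inode {
	const char *upper;     /* upper inode, possibly metadata-only; NULL if none */
	const char *lower;     /* topmost lower inode, may itself be a metacopy */
	const char *lowerdata; /* lower inode that actually holds the data */
	bool has_upperdata;    /* true once the data has been copied up */
};

/* Pick the inode to open, following the same branches as the quoted hunk. */
static const char *toy_pick_realinode(const struct toy_ovl_inode *oi,
				      bool allow_metacopy)
{
	/* Upper is usable if it has data, or if a metacopy is acceptable. */
	if (oi->upper && (allow_metacopy || oi->has_upperdata))
		return oi->upper;

	/* Otherwise fall back to lower, insisting on data unless told otherwise. */
	return allow_metacopy ? oi->lower : oi->lowerdata;
}
```

In this model a regular open passes allow_metacopy = false and therefore
never lands on a metadata-only inode, while an fsync-style caller may pass
true and get the upper metacopy inode.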

> ---
>  fs/overlayfs/file.c | 49 +++++++++++++++++++++++++++++++++----------------
>  1 file changed, 33 insertions(+), 16 deletions(-)
>
> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> index a60734ec89ec..885151e8d0cb 100644
> --- a/fs/overlayfs/file.c
> +++ b/fs/overlayfs/file.c
> @@ -14,22 +14,32 @@
>  #include <linux/uio.h>
>  #include "overlayfs.h"
>
> -static struct file *ovl_open_realfile(const struct file *file)
> +static struct file *ovl_open_realfile(const struct file *file,
> +                                     bool allow_metacopy)
>  {
>         struct inode *inode = file_inode(file);
>         struct inode *upperinode = ovl_inode_upper(inode);
> -       struct inode *realinode = upperinode ?: ovl_inode_lower(inode);
> +       struct inode *realinode;
>         struct file *realfile;
> +       bool upperopen = false;
>         const struct cred *old_cred;
>
> +       if (upperinode && (allow_metacopy || ovl_has_upperdata(inode))) {
> +               realinode = upperinode;
> +               upperopen = true;
> +       } else {
> +               realinode = allow_metacopy ? ovl_inode_lower(inode) :
> +                                ovl_inode_lowerdata(inode);
> +       }
>         old_cred = ovl_override_creds(inode->i_sb);
>         realfile = path_open(&file->f_path, file->f_flags | O_NOATIME,
>                              realinode, current_cred(), false);
>         revert_creds(old_cred);
>
>         pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n",
> -                file, file, upperinode ? 'u' : 'l', file->f_flags,
> -                realfile, IS_ERR(realfile) ? 0 : realfile->f_flags);
> +                file, file, upperopen ? 'u' : 'l',
> +                file->f_flags, realfile,
> +                IS_ERR(realfile) ? 0 : realfile->f_flags);
>
>         return realfile;
>  }
> @@ -72,17 +82,24 @@ static int ovl_change_flags(struct file *file, unsigned int flags)
>         return 0;
>  }
>
> -static int ovl_real_fdget(const struct file *file, struct fd *real)
> +static int ovl_real_fdget(const struct file *file, struct fd *real,
> +                         bool allow_metacopy)
>  {
>         struct inode *inode = file_inode(file);
> +       struct inode *realinode;
>
>         real->flags = 0;
>         real->file = file->private_data;
>
> +       if (allow_metacopy)
> +               realinode = ovl_inode_real(inode);
> +       else
> +               realinode = ovl_inode_realdata(inode);
> +
>         /* Has it been copied up since we'd opened it? */
> -       if (unlikely(file_inode(real->file) != ovl_inode_real(inode))) {
> +       if (unlikely(file_inode(real->file) != realinode)) {
>                 real->flags = FDPUT_FPUT;
> -               real->file = ovl_open_realfile(file);
> +               real->file = ovl_open_realfile(file, allow_metacopy);
>
>                 return PTR_ERR_OR_ZERO(real->file);
>         }
> @@ -107,7 +124,7 @@ static int ovl_open(struct inode *inode, struct file *file)
>         /* No longer need these flags, so don't pass them on to underlying fs */
>         file->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);
>
> -       realfile = ovl_open_realfile(file);
> +       realfile = ovl_open_realfile(file, false);
>         if (IS_ERR(realfile))
>                 return PTR_ERR(realfile);
>
> @@ -184,7 +201,7 @@ static ssize_t ovl_read_iter(struct kiocb *iocb, struct iov_iter *iter)
>         if (!iov_iter_count(iter))
>                 return 0;
>
> -       ret = ovl_real_fdget(file, &real);
> +       ret = ovl_real_fdget(file, &real, false);


Instead of changing all those call sites, use a wrapper

ovl_real_fdget(file, real) => ovl_real_fdget_metacopy(file, real, false)

and use ovl_real_fdget_metacopy() directly only when you want
the metacopy file.
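
A minimal sketch of that wrapper pattern, in standalone C. The toy_ names and
the integer "inode ids" are hypothetical and only illustrate the shape of the
suggestion, not the eventual kernel code.

```c
#include <stdbool.h>

/* Toy file: ids stand in for the real inode and the real data inode. */
struct toy_file {
	int real;     /* may be a metadata-only (metacopy) inode */
	int realdata; /* inode that holds the data */
};

/* Full-control variant: the caller says whether a metacopy inode is acceptable. */
static int toy_real_fdget_meta(const struct toy_file *file, int *real,
			       bool allow_meta)
{
	*real = allow_meta ? file->real : file->realdata;
	return 0;
}

/* Common-case wrapper: existing call sites keep their two-argument form. */
static int toy_real_fdget(const struct toy_file *file, int *real)
{
	return toy_real_fdget_meta(file, real, false);
}
```

Only the few callers that want the metacopy file would call the longer
variant directly, so the diff does not need to touch every existing call site.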

Thanks,
Amir.


* Re: [PATCH v15 01/30] ovl: Pass argument to ovl_get_inode() in a structure
  2018-05-07 19:26   ` Amir Goldstein
@ 2018-05-07 20:37     ` Vivek Goyal
  2018-05-08  4:45       ` Amir Goldstein
  2018-05-08 13:45     ` Vivek Goyal
  1 sibling, 1 reply; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 20:37 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Mon, May 07, 2018 at 10:26:15PM +0300, Amir Goldstein wrote:
> On Mon, May 7, 2018 at 8:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > ovl_get_inode() right now has 5 parameters. Soon this patch series will
> > add 2 more, and the argument list starts looking too long.
> >
> > Hence pass arguments to ovl_get_inode() in a structure, which looks
> > a little cleaner.
> >
> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> 
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> 
> After fixing some nits
> 
> > ---
> >  fs/overlayfs/export.c    |  4 +++-
> >  fs/overlayfs/inode.c     | 19 ++++++++++---------
> >  fs/overlayfs/namei.c     |  6 ++++--
> >  fs/overlayfs/overlayfs.h |  4 +---
> >  fs/overlayfs/ovl_entry.h |  8 ++++++++
> >  5 files changed, 26 insertions(+), 15 deletions(-)
> >
> > diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
> > index 425a94672300..867946adcbc5 100644
> > --- a/fs/overlayfs/export.c
> > +++ b/fs/overlayfs/export.c
> > @@ -300,12 +300,14 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
> >         struct dentry *dentry;
> >         struct inode *inode;
> >         struct ovl_entry *oe;
> > +       struct ovl_inode_params oip = {sb, NULL, lowerpath, index, !!lower};
> 
> {
>    .index = index,
> ...

Will fix this.

> 
> >
> >         /* We get overlay directory dentries with ovl_lookup_real() */
> >         if (d_is_dir(upper ?: lower))
> >                 return ERR_PTR(-EIO);
> >
> > -       inode = ovl_get_inode(sb, dget(upper), lowerpath, index, !!lower);
> > +       oip.upperdentry = dget(upper);
> > +       inode = ovl_get_inode(&oip);
> >         if (IS_ERR(inode)) {
> >                 dput(upper);
> >                 return ERR_CAST(inode);
> > diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> > index 7abcf96e94fc..2fe9538fffc9 100644
> > --- a/fs/overlayfs/inode.c
> > +++ b/fs/overlayfs/inode.c
> > @@ -792,15 +792,16 @@ static bool ovl_hash_bylower(struct super_block *sb, struct dentry *upper,
> >         return true;
> >  }
> >
> > -struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
> > -                           struct ovl_path *lowerpath, struct dentry *index,
> > -                           unsigned int numlower)
> > +struct inode *ovl_get_inode(struct ovl_inode_params *oip)
> 
> IMO, sb should not be part of ovl_inode_params
> it should be passed as a separate arg

Will do.

> and ovl_inode_params
> should be passed to ovl_inode_init() as well.

I want to avoid making this change as part of this series. Right now
ovl_inode_init() has 3 args and my patches add one more. Four arguments are
not a lot. And even if we pass oip, only 3 will go in it; the inode param
remains outside of it.

It is also called from super.c as well as inode.c. So while this is fixable,
it increases the code further. At this point I am trying to limit the changes
to those which are really needed and are easy to do.

Vivek


* Re: [PATCH v15 17/30] ovl: Open file with data except for the case of fsync
  2018-05-07 19:47   ` Amir Goldstein
@ 2018-05-07 20:59     ` Vivek Goyal
  2018-05-08  5:26       ` Amir Goldstein
  0 siblings, 1 reply; 77+ messages in thread
From: Vivek Goyal @ 2018-05-07 20:59 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Mon, May 07, 2018 at 10:47:28PM +0300, Amir Goldstein wrote:
> On Mon, May 7, 2018 at 8:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > ovl_open() should open the file which contains data, not the metacopy
> > inode. With the introduction of metacopy inodes, the current implementation
> > would end up opening the metacopy inode as well.
> >
> > But there can be certain circumstances, like ovl_fsync(), where we
> > want to allow opening a metacopy inode instead.
> >
> > Hence, change ovl_open_realfile() and ovl_real_fdget() and add an extra
> > parameter which specifies whether to allow opening a metacopy inode or not.
> > If this parameter is false, we look for the data inode and open that.
> >
> > This should cover both cases.
> >
> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> 
> except one nit
> 
> > ---
> >  fs/overlayfs/file.c | 49 +++++++++++++++++++++++++++++++++----------------
> >  1 file changed, 33 insertions(+), 16 deletions(-)
> >
> > diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> > index a60734ec89ec..885151e8d0cb 100644
> > --- a/fs/overlayfs/file.c
> > +++ b/fs/overlayfs/file.c
> > @@ -14,22 +14,32 @@
> >  #include <linux/uio.h>
> >  #include "overlayfs.h"
> >
> > -static struct file *ovl_open_realfile(const struct file *file)
> > +static struct file *ovl_open_realfile(const struct file *file,
> > +                                     bool allow_metacopy)
> >  {
> >         struct inode *inode = file_inode(file);
> >         struct inode *upperinode = ovl_inode_upper(inode);
> > -       struct inode *realinode = upperinode ?: ovl_inode_lower(inode);
> > +       struct inode *realinode;
> >         struct file *realfile;
> > +       bool upperopen = false;
> >         const struct cred *old_cred;
> >
> > +       if (upperinode && (allow_metacopy || ovl_has_upperdata(inode))) {
> > +               realinode = upperinode;
> > +               upperopen = true;
> > +       } else {
> > +               realinode = allow_metacopy ? ovl_inode_lower(inode) :
> > +                                ovl_inode_lowerdata(inode);
> > +       }
> >         old_cred = ovl_override_creds(inode->i_sb);
> >         realfile = path_open(&file->f_path, file->f_flags | O_NOATIME,
> >                              realinode, current_cred(), false);
> >         revert_creds(old_cred);
> >
> >         pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n",
> > -                file, file, upperinode ? 'u' : 'l', file->f_flags,
> > -                realfile, IS_ERR(realfile) ? 0 : realfile->f_flags);
> > +                file, file, upperopen ? 'u' : 'l',
> > +                file->f_flags, realfile,
> > +                IS_ERR(realfile) ? 0 : realfile->f_flags);
> >
> >         return realfile;
> >  }
> > @@ -72,17 +82,24 @@ static int ovl_change_flags(struct file *file, unsigned int flags)
> >         return 0;
> >  }
> >
> > -static int ovl_real_fdget(const struct file *file, struct fd *real)
> > +static int ovl_real_fdget(const struct file *file, struct fd *real,
> > +                         bool allow_metacopy)
> >  {
> >         struct inode *inode = file_inode(file);
> > +       struct inode *realinode;
> >
> >         real->flags = 0;
> >         real->file = file->private_data;
> >
> > +       if (allow_metacopy)
> > +               realinode = ovl_inode_real(inode);
> > +       else
> > +               realinode = ovl_inode_realdata(inode);
> > +
> >         /* Has it been copied up since we'd opened it? */
> > -       if (unlikely(file_inode(real->file) != ovl_inode_real(inode))) {
> > +       if (unlikely(file_inode(real->file) != realinode)) {
> >                 real->flags = FDPUT_FPUT;
> > -               real->file = ovl_open_realfile(file);
> > +               real->file = ovl_open_realfile(file, allow_metacopy);
> >
> >                 return PTR_ERR_OR_ZERO(real->file);
> >         }
> > @@ -107,7 +124,7 @@ static int ovl_open(struct inode *inode, struct file *file)
> >         /* No longer need these flags, so don't pass them on to underlying fs */
> >         file->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);
> >
> > -       realfile = ovl_open_realfile(file);
> > +       realfile = ovl_open_realfile(file, false);
> >         if (IS_ERR(realfile))
> >                 return PTR_ERR(realfile);
> >
> > @@ -184,7 +201,7 @@ static ssize_t ovl_read_iter(struct kiocb *iocb, struct iov_iter *iter)
> >         if (!iov_iter_count(iter))
> >                 return 0;
> >
> > -       ret = ovl_real_fdget(file, &real);
> > +       ret = ovl_real_fdget(file, &real, false);
> 
> 
> Instead of changing all those call sites, use a wrapper
> 
> ovl_real_fdget(file, real) => ovl_real_fdget_metacopy(file, real, false)
> 
> and use ovl_real_fdget_metacopy() directly only when you want
> the metacopy file.

You mean define another helper, ovl_real_fdget_metacopy(file, real), and
use that when the metacopy file is desired/allowed?

That's what I had done in the previous series: I defined a separate
wrapper for metacopy, you did not like it, and that's why I did it
this way.

Anyway, I will change it again.

Vivek

* Re: [PATCH v15 01/30] ovl: Pass argument to ovl_get_inode() in a structure
  2018-05-07 20:37     ` Vivek Goyal
@ 2018-05-08  4:45       ` Amir Goldstein
  0 siblings, 0 replies; 77+ messages in thread
From: Amir Goldstein @ 2018-05-08  4:45 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Mon, May 7, 2018 at 11:37 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Mon, May 07, 2018 at 10:26:15PM +0300, Amir Goldstein wrote:
>> On Mon, May 7, 2018 at 8:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> > ovl_get_inode() right now has 5 parameters. Soon this patch series will
>> > add 2 more and suddenly argument list starts looking too long.
>> >
>> > Hence pass arguments to ovl_get_inode() in a structure and it looks
>> > little cleaner.
>> >
>> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
>>
>> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
>>
>> After fixing some nits
>>
>> > ---
>> >  fs/overlayfs/export.c    |  4 +++-
>> >  fs/overlayfs/inode.c     | 19 ++++++++++---------
>> >  fs/overlayfs/namei.c     |  6 ++++--
>> >  fs/overlayfs/overlayfs.h |  4 +---
>> >  fs/overlayfs/ovl_entry.h |  8 ++++++++
>> >  5 files changed, 26 insertions(+), 15 deletions(-)
>> >
>> > diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
>> > index 425a94672300..867946adcbc5 100644
>> > --- a/fs/overlayfs/export.c
>> > +++ b/fs/overlayfs/export.c
>> > @@ -300,12 +300,14 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
>> >         struct dentry *dentry;
>> >         struct inode *inode;
>> >         struct ovl_entry *oe;
>> > +       struct ovl_inode_params oip = {sb, NULL, lowerpath, index, !!lower};
>>
>> {
>>    .index = index,
>> ...
>
> Will fix this.
>
>>
>> >
>> >         /* We get overlay directory dentries with ovl_lookup_real() */
>> >         if (d_is_dir(upper ?: lower))
>> >                 return ERR_PTR(-EIO);
>> >
>> > -       inode = ovl_get_inode(sb, dget(upper), lowerpath, index, !!lower);
>> > +       oip.upperdentry = dget(upper);
>> > +       inode = ovl_get_inode(&oip);
>> >         if (IS_ERR(inode)) {
>> >                 dput(upper);
>> >                 return ERR_CAST(inode);
>> > diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
>> > index 7abcf96e94fc..2fe9538fffc9 100644
>> > --- a/fs/overlayfs/inode.c
>> > +++ b/fs/overlayfs/inode.c
>> > @@ -792,15 +792,16 @@ static bool ovl_hash_bylower(struct super_block *sb, struct dentry *upper,
>> >         return true;
>> >  }
>> >
>> > -struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
>> > -                           struct ovl_path *lowerpath, struct dentry *index,
>> > -                           unsigned int numlower)
>> > +struct inode *ovl_get_inode(struct ovl_inode_params *oip)
>>
>> IMO, sb should not be part of ovl_inode_params
>> it should be passed as a separate arg
>
> Will do.
>
>> and ovl_inode_params
>> should be passed to ovl_inode_init() as well.
>
> I want to avoid making this change as part of this series. Right now it
> has 3 args and my patches add one more. Four arguments are not a lot. And
> even if we pass oip, only 3 will go in it. inode param remains outside
> of it.
>
> And it is being called from super.c as well as inode.c. So while fixable,
> but it increases the code further. At this point of time, trying to limit
> the changes to which are really needed and are easy to do.
>

OK.

Thanks,
Amir.

* Re: [PATCH v15 17/30] ovl: Open file with data except for the case of fsync
  2018-05-07 20:59     ` Vivek Goyal
@ 2018-05-08  5:26       ` Amir Goldstein
  2018-05-08 12:50         ` Vivek Goyal
  0 siblings, 1 reply; 77+ messages in thread
From: Amir Goldstein @ 2018-05-08  5:26 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Mon, May 7, 2018 at 11:59 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Mon, May 07, 2018 at 10:47:28PM +0300, Amir Goldstein wrote:
>> On Mon, May 7, 2018 at 8:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> > ovl_open() should open file which contains data and not open metacopy
>> > inode. With the introduction of metacopy inodes, with current implementaion
>> > we will end up opening metacopy inode as well.
>> >
>> > But there can be certain circumstances like ovl_fsync() where we
>> > want to allow opening a metacopy inode instead.
>> >
>> > Hence, change ovl_open_realfile() and ovl_open_real() and add extra
>> > parameter which specifies whether to allow opening metacopy inode or not.
>> > If this parameter is false, we look for data inode and open that.
>> >
>> > This should allow covering both the cases.
>> >
>> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
>> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
>>
>> except one nit
>>
>> > ---
>> >  fs/overlayfs/file.c | 49 +++++++++++++++++++++++++++++++++----------------
>> >  1 file changed, 33 insertions(+), 16 deletions(-)
>> >
>> > diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
>> > index a60734ec89ec..885151e8d0cb 100644
>> > --- a/fs/overlayfs/file.c
>> > +++ b/fs/overlayfs/file.c
>> > @@ -14,22 +14,32 @@
>> >  #include <linux/uio.h>
>> >  #include "overlayfs.h"
>> >
>> > -static struct file *ovl_open_realfile(const struct file *file)
>> > +static struct file *ovl_open_realfile(const struct file *file,
>> > +                                     bool allow_metacopy)
>> >  {
>> >         struct inode *inode = file_inode(file);
>> >         struct inode *upperinode = ovl_inode_upper(inode);
>> > -       struct inode *realinode = upperinode ?: ovl_inode_lower(inode);
>> > +       struct inode *realinode;
>> >         struct file *realfile;
>> > +       bool upperopen = false;
>> >         const struct cred *old_cred;
>> >
>> > +       if (upperinode && (allow_metacopy || ovl_has_upperdata(inode))) {
>> > +               realinode = upperinode;
>> > +               upperopen = true;
>> > +       } else {
>> > +               realinode = allow_metacopy ? ovl_inode_lower(inode) :
>> > +                                ovl_inode_lowerdata(inode);
>> > +       }
>> >         old_cred = ovl_override_creds(inode->i_sb);
>> >         realfile = path_open(&file->f_path, file->f_flags | O_NOATIME,
>> >                              realinode, current_cred(), false);
>> >         revert_creds(old_cred);
>> >
>> >         pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n",
>> > -                file, file, upperinode ? 'u' : 'l', file->f_flags,
>> > -                realfile, IS_ERR(realfile) ? 0 : realfile->f_flags);
>> > +                file, file, upperopen ? 'u' : 'l',
>> > +                file->f_flags, realfile,
>> > +                IS_ERR(realfile) ? 0 : realfile->f_flags);
>> >
>> >         return realfile;
>> >  }
>> > @@ -72,17 +82,24 @@ static int ovl_change_flags(struct file *file, unsigned int flags)
>> >         return 0;
>> >  }
>> >
>> > -static int ovl_real_fdget(const struct file *file, struct fd *real)
>> > +static int ovl_real_fdget(const struct file *file, struct fd *real,
>> > +                         bool allow_metacopy)
>> >  {
>> >         struct inode *inode = file_inode(file);
>> > +       struct inode *realinode;
>> >
>> >         real->flags = 0;
>> >         real->file = file->private_data;
>> >
>> > +       if (allow_metacopy)
>> > +               realinode = ovl_inode_real(inode);
>> > +       else
>> > +               realinode = ovl_inode_realdata(inode);
>> > +
>> >         /* Has it been copied up since we'd opened it? */
>> > -       if (unlikely(file_inode(real->file) != ovl_inode_real(inode))) {
>> > +       if (unlikely(file_inode(real->file) != realinode)) {
>> >                 real->flags = FDPUT_FPUT;
>> > -               real->file = ovl_open_realfile(file);
>> > +               real->file = ovl_open_realfile(file, allow_metacopy);
>> >
>> >                 return PTR_ERR_OR_ZERO(real->file);
>> >         }
>> > @@ -107,7 +124,7 @@ static int ovl_open(struct inode *inode, struct file *file)
>> >         /* No longer need these flags, so don't pass them on to underlying fs */
>> >         file->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);
>> >
>> > -       realfile = ovl_open_realfile(file);
>> > +       realfile = ovl_open_realfile(file, false);
>> >         if (IS_ERR(realfile))
>> >                 return PTR_ERR(realfile);
>> >
>> > @@ -184,7 +201,7 @@ static ssize_t ovl_read_iter(struct kiocb *iocb, struct iov_iter *iter)
>> >         if (!iov_iter_count(iter))
>> >                 return 0;
>> >
>> > -       ret = ovl_real_fdget(file, &real);
>> > +       ret = ovl_real_fdget(file, &real, false);
>>
>>
>> Instead of changing all those call sites, use a wrapper
>>
>> ovl_real_fdget(file, real) => ovl_real_fdget_metacopy(file, real, false)
>>
>> and use ovl_real_fdget_metacopy() directly only when you want
>> the metacopy file.
>
> You mean define another helper ovl_real_fdget_metacopy(file, real) and
> use that when metacopy file is desired/allowed?
>
> That's what I had done in previous series. That is define a separate
> wrapper for metacopy and you had not liked it and that's why I did
> it this way.

We miscommunicated. I did like it, but I thought it was not needed
because of the fsync implementation bug (which I forgot to report to Miklos).
I was wrong, though. I understand now why allow_metacopy is needed
for fsync.

What you actually need to call from ovl_fsync() is:
_ovl_real_file(file, real, !datasync);

So you don't actually have a user for ovl_real_meta_file(), but that's
fine to have the helper anyway.
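
To spell out why fsync wants !datasync, here is a minimal userspace sketch
of the selection logic. It is only a model under stated assumptions: struct
model_inode and the model_* functions are hypothetical names, and the string
fields stand in for the real upper/lower inodes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; this is not the kernel's struct ovl_inode. */
struct model_inode {
	const char *upper;     /* upper inode, NULL if never copied up */
	bool has_upperdata;    /* false while the upper is a metacopy */
	const char *lower;     /* topmost lower inode */
	const char *lowerdata; /* lower inode that actually holds the data */
};

/* Same branch structure as the hunk added to ovl_open_realfile(). */
static const char *model_pick_real(const struct model_inode *inode,
				   bool allow_metacopy)
{
	if (inode->upper && (allow_metacopy || inode->has_upperdata))
		return inode->upper;

	return allow_metacopy ? inode->lower : inode->lowerdata;
}

/* fsync() may use the metacopy inode; fdatasync() needs the data inode. */
static const char *model_pick_for_fsync(const struct model_inode *inode,
					int datasync)
{
	return model_pick_real(inode, !datasync);
}
```

For an inode whose upper is still a metacopy, datasync == 0 selects the
upper (metadata) inode, while datasync != 0 falls through to the lower
inode that holds the data.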

Thanks,
Amir.

* Re: [PATCH v15 17/30] ovl: Open file with data except for the case of fsync
  2018-05-08  5:26       ` Amir Goldstein
@ 2018-05-08 12:50         ` Vivek Goyal
  2018-05-08 14:14           ` Amir Goldstein
  0 siblings, 1 reply; 77+ messages in thread
From: Vivek Goyal @ 2018-05-08 12:50 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Tue, May 08, 2018 at 08:26:28AM +0300, Amir Goldstein wrote:
> On Mon, May 7, 2018 at 11:59 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > On Mon, May 07, 2018 at 10:47:28PM +0300, Amir Goldstein wrote:
> >> On Mon, May 7, 2018 at 8:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> >> > ovl_open() should open file which contains data and not open metacopy
> >> > inode. With the introduction of metacopy inodes, with current implementaion
> >> > we will end up opening metacopy inode as well.
> >> >
> >> > But there can be certain circumstances like ovl_fsync() where we
> >> > want to allow opening a metacopy inode instead.
> >> >
> >> > Hence, change ovl_open_realfile() and ovl_open_real() and add extra
> >> > parameter which specifies whether to allow opening metacopy inode or not.
> >> > If this parameter is false, we look for data inode and open that.
> >> >
> >> > This should allow covering both the cases.
> >> >
> >> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> >> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> >>
> >> except one nit
> >>
> >> > ---
> >> >  fs/overlayfs/file.c | 49 +++++++++++++++++++++++++++++++++----------------
> >> >  1 file changed, 33 insertions(+), 16 deletions(-)
> >> >
> >> > diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> >> > index a60734ec89ec..885151e8d0cb 100644
> >> > --- a/fs/overlayfs/file.c
> >> > +++ b/fs/overlayfs/file.c
> >> > @@ -14,22 +14,32 @@
> >> >  #include <linux/uio.h>
> >> >  #include "overlayfs.h"
> >> >
> >> > -static struct file *ovl_open_realfile(const struct file *file)
> >> > +static struct file *ovl_open_realfile(const struct file *file,
> >> > +                                     bool allow_metacopy)
> >> >  {
> >> >         struct inode *inode = file_inode(file);
> >> >         struct inode *upperinode = ovl_inode_upper(inode);
> >> > -       struct inode *realinode = upperinode ?: ovl_inode_lower(inode);
> >> > +       struct inode *realinode;
> >> >         struct file *realfile;
> >> > +       bool upperopen = false;
> >> >         const struct cred *old_cred;
> >> >
> >> > +       if (upperinode && (allow_metacopy || ovl_has_upperdata(inode))) {
> >> > +               realinode = upperinode;
> >> > +               upperopen = true;
> >> > +       } else {
> >> > +               realinode = allow_metacopy ? ovl_inode_lower(inode) :
> >> > +                                ovl_inode_lowerdata(inode);
> >> > +       }
> >> >         old_cred = ovl_override_creds(inode->i_sb);
> >> >         realfile = path_open(&file->f_path, file->f_flags | O_NOATIME,
> >> >                              realinode, current_cred(), false);
> >> >         revert_creds(old_cred);
> >> >
> >> >         pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n",
> >> > -                file, file, upperinode ? 'u' : 'l', file->f_flags,
> >> > -                realfile, IS_ERR(realfile) ? 0 : realfile->f_flags);
> >> > +                file, file, upperopen ? 'u' : 'l',
> >> > +                file->f_flags, realfile,
> >> > +                IS_ERR(realfile) ? 0 : realfile->f_flags);
> >> >
> >> >         return realfile;
> >> >  }
> >> > @@ -72,17 +82,24 @@ static int ovl_change_flags(struct file *file, unsigned int flags)
> >> >         return 0;
> >> >  }
> >> >
> >> > -static int ovl_real_fdget(const struct file *file, struct fd *real)
> >> > +static int ovl_real_fdget(const struct file *file, struct fd *real,
> >> > +                         bool allow_metacopy)
> >> >  {
> >> >         struct inode *inode = file_inode(file);
> >> > +       struct inode *realinode;
> >> >
> >> >         real->flags = 0;
> >> >         real->file = file->private_data;
> >> >
> >> > +       if (allow_metacopy)
> >> > +               realinode = ovl_inode_real(inode);
> >> > +       else
> >> > +               realinode = ovl_inode_realdata(inode);
> >> > +
> >> >         /* Has it been copied up since we'd opened it? */
> >> > -       if (unlikely(file_inode(real->file) != ovl_inode_real(inode))) {
> >> > +       if (unlikely(file_inode(real->file) != realinode)) {
> >> >                 real->flags = FDPUT_FPUT;
> >> > -               real->file = ovl_open_realfile(file);
> >> > +               real->file = ovl_open_realfile(file, allow_metacopy);
> >> >
> >> >                 return PTR_ERR_OR_ZERO(real->file);
> >> >         }
> >> > @@ -107,7 +124,7 @@ static int ovl_open(struct inode *inode, struct file *file)
> >> >         /* No longer need these flags, so don't pass them on to underlying fs */
> >> >         file->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);
> >> >
> >> > -       realfile = ovl_open_realfile(file);
> >> > +       realfile = ovl_open_realfile(file, false);
> >> >         if (IS_ERR(realfile))
> >> >                 return PTR_ERR(realfile);
> >> >
> >> > @@ -184,7 +201,7 @@ static ssize_t ovl_read_iter(struct kiocb *iocb, struct iov_iter *iter)
> >> >         if (!iov_iter_count(iter))
> >> >                 return 0;
> >> >
> >> > -       ret = ovl_real_fdget(file, &real);
> >> > +       ret = ovl_real_fdget(file, &real, false);
> >>
> >>
> >> Instead of changing all those call sites, use a wrapper
> >>
> >> ovl_real_fdget(file, real) => ovl_real_fdget_metacopy(file, real, false)
> >>
> >> and use ovl_real_fdget_metacopy() directly only when you want
> >> the metacopy file.
> >
> > You mean define another helper ovl_real_fdget_metacopy(file, real) and
> > use that when metacopy file is desired/allowed?
> >
> > That's what I had done in previous series. That is define a separate
> > wrapper for metacopy and you had not liked it and that's why I did
> > it this way.
> 
> We miss-communicated. I did like it, but I though it was not needed
> because of fsync implementation bug (which I forgot to report to Miklos).
> I was wrong though. I understand now why allow_metacopy is needed
> for fsync.
> 
> What you actually need to call from ovl_fsync() is:
> _ovl_real_file(file, real, !datasync);
> 
> So you don't actually have a user for ovl_real_meta_file(), but that's
> fine to have the helper anyway.

Ok, here is the updated patch. I have not defined an equivalent of
ovl_real_meta_file() as there are no users.


Subject: ovl: Open file with data except for the case of fsync

ovl_open() should open the file which contains data and not the metacopy
inode. With the introduction of metacopy inodes, the current implementation
will end up opening the metacopy inode as well.

But there can be certain circumstances like ovl_fsync() where we
want to allow opening a metacopy inode instead.

Hence, add an extra parameter to ovl_open_realfile() and _ovl_real_fdget()
which specifies whether to allow opening the metacopy inode or not.
If this parameter is false, we look for the data inode and open that.

This should allow covering both the cases.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/file.c |   40 +++++++++++++++++++++++++++++++---------
 1 file changed, 31 insertions(+), 9 deletions(-)

Index: rhvgoyal-linux/fs/overlayfs/file.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/file.c	2018-05-07 16:55:02.562350785 -0400
+++ rhvgoyal-linux/fs/overlayfs/file.c	2018-05-08 08:46:49.994350785 -0400
@@ -14,22 +14,32 @@
 #include <linux/uio.h>
 #include "overlayfs.h"
 
-static struct file *ovl_open_realfile(const struct file *file)
+static struct file *ovl_open_realfile(const struct file *file,
+				      bool allow_metacopy)
 {
 	struct inode *inode = file_inode(file);
 	struct inode *upperinode = ovl_inode_upper(inode);
-	struct inode *realinode = upperinode ?: ovl_inode_lower(inode);
+	struct inode *realinode;
 	struct file *realfile;
+	bool upperopen = false;
 	const struct cred *old_cred;
 
+	if (upperinode && (allow_metacopy || ovl_has_upperdata(inode))) {
+		realinode = upperinode;
+		upperopen = true;
+	} else {
+		realinode = allow_metacopy ? ovl_inode_lower(inode) :
+				 ovl_inode_lowerdata(inode);
+	}
 	old_cred = ovl_override_creds(inode->i_sb);
 	realfile = path_open(&file->f_path, file->f_flags | O_NOATIME,
 			     realinode, current_cred(), false);
 	revert_creds(old_cred);
 
 	pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n",
-		 file, file, upperinode ? 'u' : 'l', file->f_flags,
-		 realfile, IS_ERR(realfile) ? 0 : realfile->f_flags);
+		 file, file, upperopen ? 'u' : 'l',
+		 file->f_flags, realfile,
+		 IS_ERR(realfile) ? 0 : realfile->f_flags);
 
 	return realfile;
 }
@@ -72,17 +82,24 @@ static int ovl_change_flags(struct file
 	return 0;
 }
 
-static int ovl_real_fdget(const struct file *file, struct fd *real)
+static int _ovl_real_fdget(const struct file *file, struct fd *real,
+			  bool allow_metacopy)
 {
 	struct inode *inode = file_inode(file);
+	struct inode *realinode;
 
 	real->flags = 0;
 	real->file = file->private_data;
 
+	if (allow_metacopy)
+		realinode = ovl_inode_real(inode);
+	else
+		realinode = ovl_inode_realdata(inode);
+
 	/* Has it been copied up since we'd opened it? */
-	if (unlikely(file_inode(real->file) != ovl_inode_real(inode))) {
+	if (unlikely(file_inode(real->file) != realinode)) {
 		real->flags = FDPUT_FPUT;
-		real->file = ovl_open_realfile(file);
+		real->file = ovl_open_realfile(file, allow_metacopy);
 
 		return PTR_ERR_OR_ZERO(real->file);
 	}
@@ -94,6 +111,11 @@ static int ovl_real_fdget(const struct f
 	return 0;
 }
 
+static int ovl_real_fdget(const struct file *file, struct fd *real)
+{
+	return _ovl_real_fdget(file, real, false);
+}
+
 static int ovl_open(struct inode *inode, struct file *file)
 {
 	struct dentry *dentry = file_dentry(file);
@@ -107,7 +129,7 @@ static int ovl_open(struct inode *inode,
 	/* No longer need these flags, so don't pass them on to underlying fs */
 	file->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);
 
-	realfile = ovl_open_realfile(file);
+	realfile = ovl_open_realfile(file, false);
 	if (IS_ERR(realfile))
 		return PTR_ERR(realfile);
 
@@ -244,7 +266,7 @@ static int ovl_fsync(struct file *file,
 	const struct cred *old_cred;
 	int ret;
 
-	ret = ovl_real_fdget(file, &real);
+	ret = _ovl_real_fdget(file, &real, !datasync);
 	if (ret)
 		return ret;
 

* Re: [PATCH v15 26/30] ovl: Check redirect on index as well
  2018-05-07 18:43   ` Amir Goldstein
@ 2018-05-08 12:58     ` Vivek Goyal
  0 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-08 12:58 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Mon, May 07, 2018 at 09:43:13PM +0300, Amir Goldstein wrote:

[..]
> > diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> > index cd06e7ff9fd1..8d5beed3876b 100644
> > --- a/fs/overlayfs/namei.c
> > +++ b/fs/overlayfs/namei.c
> > @@ -31,32 +31,20 @@ static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
> >                               size_t prelen, const char *post)
> >  {
> >         int res;
> > -       char *s, *next, *buf = NULL;
> > +       char *buf;
> >
> > -       res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, NULL, 0);
> > -       if (res < 0) {
> > -               if (res == -ENODATA || res == -EOPNOTSUPP)
> > -                       return 0;
> > -               goto fail;
> > -       }
> > -       buf = kzalloc(prelen + res + strlen(post) + 1, GFP_KERNEL);
> > +       buf = ovl_get_redirect_xattr(dentry, prelen + strlen(post));
> >         if (!buf)
> > -               return -ENOMEM;
> > +               return 0;
> >
> > -       if (res == 0)
> > -               goto invalid;
> > +       if (IS_ERR(buf)) {
> > +               res = PTR_ERR(buf);
> > +               pr_warn_ratelimited("overlayfs: failed to get redirect (%i)\n",
> > +                                   res);
> > +               return res;
> > +       }
> 
> Don't see why you didn't leave this in the helper.

I don't have any good reason to do it that way. Here is the updated patch.
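
For reference, the calling convention the helper ends up with (NULL for
"no redirect", an error pointer for failure, a buffer otherwise) can be
sketched in userspace as below. This is an illustration only: the model_*
names are hypothetical stand-ins for ERR_PTR()/PTR_ERR()/IS_ERR_OR_NULL(),
and the xattr lookup is replaced by plain function arguments.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-ins for the kernel's ERR_PTR()/PTR_ERR()/IS_ERR() helpers. */
#define MODEL_MAX_ERRNO 4095

static inline void *model_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline long model_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int model_is_err_or_null(const void *ptr)
{
	return !ptr || (uintptr_t)ptr >= (uintptr_t)(-MODEL_MAX_ERRNO);
}

/*
 * Hypothetical helper with the same three-way contract as the patch's
 * ovl_get_redirect_xattr(): NULL means "no redirect", an encoded errno
 * means failure, anything else is a heap buffer the caller must free.
 */
static char *model_get_redirect(const char *xattr, int simulated_err)
{
	char *buf;

	if (simulated_err)
		return model_err_ptr(simulated_err);
	if (!xattr)
		return NULL;

	buf = malloc(strlen(xattr) + 1);
	if (!buf)
		return model_err_ptr(-ENOMEM);
	strcpy(buf, xattr);
	return buf;
}

/* Caller: one check covers both "nothing to do" (0) and failure (-errno). */
static int model_check_redirect(const char *xattr, int simulated_err,
				char **redirect)
{
	char *buf = model_get_redirect(xattr, simulated_err);

	if (model_is_err_or_null(buf))
		return (int)model_ptr_err(buf); /* NULL decodes to 0 */

	*redirect = buf;
	return 0;
}
```

With the warning printed inside the helper, the caller needs only the single
combined check, because decoding a NULL pointer yields 0.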

Subject: ovl: Check redirect on index as well

Right now we seem to check redirect only if upperdentry is found. But it
is possible that there is no upperdentry but later we find an index.

We need to check redirect on index as well and set it in ovl_inode->redirect.
Otherwise the link code can assume that the dentry does not have a redirect
and place a new one, which breaks things. In my testing, the overlay/033 test
started failing in xfstests. Following are the details.

For example, do the following.

$ mkdir lower upper work merged

- Make lower dir with 4 links.
  $ echo "foo" > lower/l0.txt
  $ ln  lower/l0.txt lower/l1.txt 
  $ ln  lower/l0.txt lower/l2.txt 
  $ ln  lower/l0.txt lower/l3.txt 

- Mount with index on and metacopy on.

  $ mount -t overlay -o lowerdir=lower,upperdir=upper,workdir=work,index=on,metacopy=on none merged

- Link lower

  $ ln merged/l0.txt merged/l4.txt
    (This will do a metadata copy up of l0.txt and put an absolute redirect
     /l0.txt.)

  $ echo 2 > /proc/sys/vm/drop_caches

  $ ls merged/l1.txt
  (Now l1.txt will be looked up. There is no upper dentry but there is
   lower dentry and index will be found. We don't check for redirect on
   index, hence ovl_inode->redirect will be NULL.)

- Link Upper

  $ ln merged/l4.txt merged/l5.txt
  (Lookup of l4.txt will use the inode from the l1.txt lookup, which is still
   in cache. It has ovl_inode->redirect NULL, hence link will put a new
   redirect and replace /l0.txt with /l4.txt.)

- Drop caches.
  $ echo 2 > /proc/sys/vm/drop_caches

- List l0.txt and it returns -ESTALE

  $ ls merged/l0.txt

  (It returns stale because we found a metacopy of l0.txt in upper
   and it has redirect l4.txt, but there is no file named l4.txt in the
   lower layer. So the lower data copy is not found and -ESTALE is returned.)

So the problem here is that we did not process the redirect on index. Check
redirect on index as well and the problem is fixed.
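
As a rough illustration of the lookup change, here is a small userspace
sketch. struct model_dentry and model_upper_redirect() are hypothetical
names used only for this example; the real code reads the redirect xattr
from the dentry instead of a struct field.

```c
#include <stddef.h>

/* Hypothetical model of a dentry that may carry a redirect xattr. */
struct model_dentry {
	const char *redirect; /* NULL when no redirect xattr is set */
};

/*
 * Pick the redirect to record in the overlay inode. Before the patch the
 * index branch was missing, so an index-only lookup always yielded NULL.
 */
static const char *model_upper_redirect(const struct model_dentry *upper,
					const struct model_dentry *index)
{
	if (upper)
		return upper->redirect;
	if (index)
		return index->redirect; /* the case this patch adds */
	return NULL;
}
```

In the l1.txt lookup above there is no upper dentry but there is an index,
so the second branch is the one that keeps the existing /l0.txt redirect
visible to the later link operation.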

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/namei.c     |   50 ++++++++++++-----------------------------------
 fs/overlayfs/overlayfs.h |    1 
 fs/overlayfs/util.c      |   50 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+), 37 deletions(-)

Index: rhvgoyal-linux/fs/overlayfs/util.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/util.c	2018-05-08 08:50:50.896350785 -0400
+++ rhvgoyal-linux/fs/overlayfs/util.c	2018-05-08 08:55:03.440350785 -0400
@@ -878,3 +878,53 @@ bool ovl_is_metacopy_dentry(struct dentr
 
 	return (oe->numlower > 1);
 }
+
+char *ovl_get_redirect_xattr(struct dentry *dentry, int padding)
+{
+	int res;
+	char *s, *next, *buf = NULL;
+
+	res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, NULL, 0);
+	if (res < 0) {
+		if (res == -ENODATA || res == -EOPNOTSUPP)
+			return NULL;
+		goto fail;
+	}
+
+	buf = kzalloc(res + padding + 1, GFP_KERNEL);
+	if (!buf)
+		return ERR_PTR(-ENOMEM);
+
+	if (res == 0)
+		goto invalid;
+
+	res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, buf, res);
+	if (res < 0)
+		goto fail;
+	if (res == 0)
+		goto invalid;
+
+	if (buf[0] == '/') {
+		for (s = buf; *s++ == '/'; s = next) {
+			next = strchrnul(s, '/');
+			if (s == next)
+				goto invalid;
+		}
+	} else {
+		if (strchr(buf, '/') != NULL)
+			goto invalid;
+	}
+
+	return buf;
+
+err_free:
+	kfree(buf);
+	return ERR_PTR(res);
+fail:
+	pr_warn_ratelimited("overlayfs: failed to get redirect (%i)\n", res);
+	goto err_free;
+invalid:
+	pr_warn_ratelimited("overlayfs: invalid redirect (%s)\n", buf);
+	res = -EINVAL;
+	goto err_free;
+}
Index: rhvgoyal-linux/fs/overlayfs/overlayfs.h
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/overlayfs.h	2018-05-08 08:50:50.871350785 -0400
+++ rhvgoyal-linux/fs/overlayfs/overlayfs.h	2018-05-08 08:54:53.820350785 -0400
@@ -283,6 +283,7 @@ void ovl_nlink_end(struct dentry *dentry
 int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir);
 int ovl_check_metacopy_xattr(struct dentry *dentry);
 bool ovl_is_metacopy_dentry(struct dentry *dentry);
+char *ovl_get_redirect_xattr(struct dentry *dentry, int padding);
 
 static inline bool ovl_is_impuredir(struct dentry *dentry)
 {
Index: rhvgoyal-linux/fs/overlayfs/namei.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/namei.c	2018-05-08 08:50:49.444350785 -0400
+++ rhvgoyal-linux/fs/overlayfs/namei.c	2018-05-08 08:55:03.441350785 -0400
@@ -31,32 +31,13 @@ static int ovl_check_redirect(struct den
 			      size_t prelen, const char *post)
 {
 	int res;
-	char *s, *next, *buf = NULL;
+	char *buf;
 
-	res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, NULL, 0);
-	if (res < 0) {
-		if (res == -ENODATA || res == -EOPNOTSUPP)
-			return 0;
-		goto fail;
-	}
-	buf = kzalloc(prelen + res + strlen(post) + 1, GFP_KERNEL);
-	if (!buf)
-		return -ENOMEM;
+	buf = ovl_get_redirect_xattr(dentry, prelen + strlen(post));
+	if (IS_ERR_OR_NULL(buf))
+		return PTR_ERR(buf);
 
-	if (res == 0)
-		goto invalid;
-
-	res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, buf, res);
-	if (res < 0)
-		goto fail;
-	if (res == 0)
-		goto invalid;
 	if (buf[0] == '/') {
-		for (s = buf; *s++ == '/'; s = next) {
-			next = strchrnul(s, '/');
-			if (s == next)
-				goto invalid;
-		}
 		/*
 		 * One of the ancestor path elements in an absolute path
 		 * lookup in ovl_lookup_layer() could have been opaque and
@@ -67,9 +48,7 @@ static int ovl_check_redirect(struct den
 		 */
 		d->stop = false;
 	} else {
-		if (strchr(buf, '/') != NULL)
-			goto invalid;
-
+		res = strlen(buf) + 1;
 		memmove(buf + prelen, buf, res);
 		memcpy(buf, d->name.name, prelen);
 	}
@@ -81,16 +60,6 @@ static int ovl_check_redirect(struct den
 	d->name.len = strlen(d->redirect);
 
 	return 0;
-
-err_free:
-	kfree(buf);
-	return 0;
-fail:
-	pr_warn_ratelimited("overlayfs: failed to get redirect (%i)\n", res);
-	goto err_free;
-invalid:
-	pr_warn_ratelimited("overlayfs: invalid redirect (%s)\n", buf);
-	goto err_free;
 }
 
 static int ovl_acceptable(void *ctx, struct dentry *dentry)
@@ -1073,8 +1042,15 @@ struct dentry *ovl_lookup(struct inode *
 
 	if (upperdentry)
 		ovl_dentry_set_upper_alias(dentry);
-	else if (index)
+	else if (index) {
 		upperdentry = dget(index);
+		upperredirect = ovl_get_redirect_xattr(upperdentry, 0);
+		if (IS_ERR(upperredirect)) {
+			err = PTR_ERR(upperredirect);
+			upperredirect = NULL;
+			goto out_free_oe;
+		}
+	}
 
 	if (upperdentry || ctr) {
 		struct ovl_inode_params oip = {

* Re: [PATCH v15 00/30] overlayfs: Delayed copy up of data
  2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
                   ` (30 preceding siblings ...)
  2018-05-07 18:10 ` [PATCH v15 00/30] overlayfs: Delayed copy up of data Amir Goldstein
@ 2018-05-08 13:42 ` Vivek Goyal
  2018-05-08 14:16   ` Amir Goldstein
  31 siblings, 1 reply; 77+ messages in thread
From: Vivek Goyal @ 2018-05-08 13:42 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il

On Mon, May 07, 2018 at 01:40:32PM -0400, Vivek Goyal wrote:
> Hi,
> 
> This is V15 of overlayfs metadata only copy-up feature. These patches I
> have rebased on top of Miklos overlayfs-next tree's branch overlayfs-rorw.
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git overlayfs-rorw
> 
> Patches are also available here.
> 
> https://github.com/rhvgoyal/linux/commits/metacopy-v15
> 
> I have run unionmount-testsuite and "./check -overlay -g quick" and that
> works. Only 4 overlay tests fail, which fail on vanilla kernel too.
> 

Hi Amir,

I have taken care of your review comments and pushed new patches to the
"metacopy-next" branch.

https://github.com/rhvgoyal/linux/commits/metacopy-next

Changes are small and I am not sure if I should patch-bomb the mailing
list again.

Amir, Miklos, do let me know if I should post V16 patches to the mailing
list.

Thanks
Vivek

* Re: [PATCH v15 01/30] ovl: Pass argument to ovl_get_inode() in a structure
  2018-05-07 19:26   ` Amir Goldstein
  2018-05-07 20:37     ` Vivek Goyal
@ 2018-05-08 13:45     ` Vivek Goyal
  1 sibling, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-08 13:45 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Mon, May 07, 2018 at 10:26:15PM +0300, Amir Goldstein wrote:
> On Mon, May 7, 2018 at 8:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > ovl_get_inode() right now has 5 parameters. Soon this patch series will
> > add 2 more and suddenly the argument list starts looking too long.
> >
> > Hence pass arguments to ovl_get_inode() in a structure, which looks
> > a little cleaner.
> >
> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> 
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> 
> After fixing some nits

Here is the new version of patch.

Subject: ovl: Pass argument to ovl_get_inode() in a structure

ovl_get_inode() right now has 5 parameters. Soon this patch series will
add 2 more and suddenly the argument list starts looking too long.

Hence pass arguments to ovl_get_inode() in a structure, which looks
a little cleaner.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/export.c    |    8 +++++++-
 fs/overlayfs/inode.c     |   20 +++++++++++---------
 fs/overlayfs/namei.c     |   10 ++++++++--
 fs/overlayfs/overlayfs.h |    5 ++---
 fs/overlayfs/ovl_entry.h |    7 +++++++
 5 files changed, 35 insertions(+), 15 deletions(-)

Index: rhvgoyal-linux/fs/overlayfs/ovl_entry.h
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/ovl_entry.h	2018-05-07 16:30:06.874350785 -0400
+++ rhvgoyal-linux/fs/overlayfs/ovl_entry.h	2018-05-07 16:42:22.396350785 -0400
@@ -101,6 +101,13 @@ struct ovl_inode {
 	struct mutex lock;
 };
 
+struct ovl_inode_params {
+	struct dentry *upperdentry;
+	struct ovl_path *lowerpath;
+	struct dentry *index;
+	unsigned int numlower;
+};
+
 static inline struct ovl_inode *OVL_I(struct inode *inode)
 {
 	return container_of(inode, struct ovl_inode, vfs_inode);
Index: rhvgoyal-linux/fs/overlayfs/namei.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/namei.c	2018-05-07 16:30:06.874350785 -0400
+++ rhvgoyal-linux/fs/overlayfs/namei.c	2018-05-07 16:42:22.398350785 -0400
@@ -1004,8 +1004,14 @@ struct dentry *ovl_lookup(struct inode *
 		upperdentry = dget(index);
 
 	if (upperdentry || ctr) {
-		inode = ovl_get_inode(dentry->d_sb, upperdentry, stack, index,
-				      ctr);
+		struct ovl_inode_params oip = {
+			.upperdentry = upperdentry,
+			.lowerpath = stack,
+			.index = index,
+			.numlower = ctr,
+		};
+
+		inode = ovl_get_inode(dentry->d_sb, &oip);
 		err = PTR_ERR(inode);
 		if (IS_ERR(inode))
 			goto out_free_oe;
Index: rhvgoyal-linux/fs/overlayfs/inode.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/inode.c	2018-05-07 16:30:06.874350785 -0400
+++ rhvgoyal-linux/fs/overlayfs/inode.c	2018-05-07 16:42:22.395350785 -0400
@@ -792,15 +792,17 @@ static bool ovl_hash_bylower(struct supe
 	return true;
 }
 
-struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
-			    struct ovl_path *lowerpath, struct dentry *index,
-			    unsigned int numlower)
+struct inode *ovl_get_inode(struct super_block *sb,
+			    struct ovl_inode_params *oip)
 {
+	struct dentry *upperdentry = oip->upperdentry;
+	struct ovl_path *lowerpath = oip->lowerpath;
 	struct inode *realinode = upperdentry ? d_inode(upperdentry) : NULL;
 	struct inode *inode;
 	struct dentry *lowerdentry = lowerpath ? lowerpath->dentry : NULL;
-	bool bylower = ovl_hash_bylower(sb, upperdentry, lowerdentry, index);
-	int fsid = bylower ? lowerpath->layer->fsid : 0;
+	bool bylower = ovl_hash_bylower(sb, upperdentry, lowerdentry,
+					oip->index);
+	int fsid = bylower ? oip->lowerpath->layer->fsid : 0;
 	bool is_dir;
 	unsigned long ino = 0;
 
@@ -817,8 +819,8 @@ struct inode *ovl_get_inode(struct super
 						      upperdentry);
 		unsigned int nlink = is_dir ? 1 : realinode->i_nlink;
 
-		inode = iget5_locked(sb, (unsigned long) key,
-				     ovl_inode_test, ovl_inode_set, key);
+		inode = iget5_locked(sb, (unsigned long) key, ovl_inode_test,
+				     ovl_inode_set, key);
 		if (!inode)
 			goto out_nomem;
 		if (!(inode->i_state & I_NEW)) {
@@ -854,12 +856,12 @@ struct inode *ovl_get_inode(struct super
 	if (upperdentry && ovl_is_impuredir(upperdentry))
 		ovl_set_flag(OVL_IMPURE, inode);
 
-	if (index)
+	if (oip->index)
 		ovl_set_flag(OVL_INDEX, inode);
 
 	/* Check for non-merge dir that may have whiteouts */
 	if (is_dir) {
-		if (((upperdentry && lowerdentry) || numlower > 1) ||
+		if (((upperdentry && lowerdentry) || oip->numlower > 1) ||
 		    ovl_check_origin_xattr(upperdentry ?: lowerdentry)) {
 			ovl_set_flag(OVL_WHITEOUTS, inode);
 		}
Index: rhvgoyal-linux/fs/overlayfs/overlayfs.h
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/overlayfs.h	2018-05-07 16:30:06.874350785 -0400
+++ rhvgoyal-linux/fs/overlayfs/overlayfs.h	2018-05-07 16:37:45.222350785 -0400
@@ -345,9 +345,8 @@ bool ovl_is_private_xattr(const char *na
 struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev);
 struct inode *ovl_lookup_inode(struct super_block *sb, struct dentry *real,
 			       bool is_upper);
-struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
-			    struct ovl_path *lowerpath, struct dentry *index,
-			    unsigned int numlower);
+struct inode *ovl_get_inode(struct super_block *sb,
+			    struct ovl_inode_params *oip);
 static inline void ovl_copyattr(struct inode *from, struct inode *to)
 {
 	to->i_uid = from->i_uid;
Index: rhvgoyal-linux/fs/overlayfs/export.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/export.c	2018-05-07 16:30:06.874350785 -0400
+++ rhvgoyal-linux/fs/overlayfs/export.c	2018-05-07 16:41:57.905350785 -0400
@@ -300,12 +300,18 @@ static struct dentry *ovl_obtain_alias(s
 	struct dentry *dentry;
 	struct inode *inode;
 	struct ovl_entry *oe;
+	struct ovl_inode_params oip = {
+		.lowerpath = lowerpath,
+		.index = index,
+		.numlower = !!lower
+	};
 
 	/* We get overlay directory dentries with ovl_lookup_real() */
 	if (d_is_dir(upper ?: lower))
 		return ERR_PTR(-EIO);
 
-	inode = ovl_get_inode(sb, dget(upper), lowerpath, index, !!lower);
+	oip.upperdentry = dget(upper);
+	inode = ovl_get_inode(sb, &oip);
 	if (IS_ERR(inode)) {
 		dput(upper);
 		return ERR_CAST(inode);
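
For readers following along, the calling convention this patch introduces
can be tried outside the kernel. The block below is only a minimal sketch:
struct dentry and struct ovl_path are empty stand-ins, and
count_set_fields() and make_lower_only() are hypothetical helpers that do
not exist in overlayfs. It shows that fields left out of a designated
initializer are zeroed, which is what lets callers such as
ovl_obtain_alias() fill in only the members they care about.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
struct dentry { int id; };
struct ovl_path { struct dentry *dentry; };

/* Same field list as struct ovl_inode_params in the patch above. */
struct ovl_inode_params {
	struct dentry *upperdentry;
	struct ovl_path *lowerpath;
	struct dentry *index;
	unsigned int numlower;
};

/*
 * Toy consumer (hypothetical): counts how many pointer fields are set.
 * The real ovl_get_inode() does far more; this only demonstrates
 * reading arguments through a single pointer.
 */
static unsigned int count_set_fields(const struct ovl_inode_params *oip)
{
	unsigned int n = 0;

	assert(oip != NULL);
	if (oip->upperdentry)
		n++;
	if (oip->lowerpath)
		n++;
	if (oip->index)
		n++;
	return n;
}

/*
 * Hypothetical builder for a lower-only lookup. Members that are not
 * named in the initializer (upperdentry, index) are zero-initialized.
 */
static struct ovl_inode_params make_lower_only(struct ovl_path *lowerpath)
{
	struct ovl_inode_params oip = {
		.lowerpath = lowerpath,
		.numlower = lowerpath != NULL,
	};

	return oip;
}
```

Adding a new member later (as patches 02 and 15 do with redirect and
lowerdata) then needs no change at call sites that do not use it.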

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 15/30] ovl: Store lower data inode in ovl_inode
  2018-05-07 18:59   ` Amir Goldstein
@ 2018-05-08 13:47     ` Vivek Goyal
  0 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-08 13:47 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Mon, May 07, 2018 at 09:59:00PM +0300, Amir Goldstein wrote:
> On Mon, May 7, 2018 at 8:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > Right now ovl_inode stores an inode pointer for the lower inode. This helps
> > with quickly getting the lower inode given an overlay inode (ovl_inode_lower()).
> >
> > Now with metadata only copy-up, we can have a metacopy inode in a middle
> > layer as well, and the inode containing data can be different from ->lower.
> > I need to be able to open the real file in ovl_open_realfile() and
> > for that I need to quickly find the lower data inode.
> >
> > Hence store the lower data inode also in ovl_inode. Also provide a
> > helper ovl_inode_lowerdata() to access this field.
> >
> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> 
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> 
> After fixing some nits below

Here is the new version of patch.

Subject: ovl: Store lower data inode in ovl_inode

Right now ovl_inode stores an inode pointer for the lower inode. This helps
with quickly getting the lower inode given an overlay inode (ovl_inode_lower()).

Now with metadata only copy-up, we can have a metacopy inode in a middle
layer as well, and the inode containing data can be different from ->lower.
I need to be able to open the real file in ovl_open_realfile() and
for that I need to quickly find the lower data inode.

Hence store the lower data inode also in ovl_inode. Also provide a
helper ovl_inode_lowerdata() to access this field.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c     |    2 +-
 fs/overlayfs/namei.c     |    2 ++
 fs/overlayfs/overlayfs.h |    3 ++-
 fs/overlayfs/ovl_entry.h |    6 +++++-
 fs/overlayfs/super.c     |    8 ++++++--
 fs/overlayfs/util.c      |   10 +++++++++-
 6 files changed, 25 insertions(+), 6 deletions(-)

Index: rhvgoyal-linux/fs/overlayfs/ovl_entry.h
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/ovl_entry.h	2018-05-07 16:48:20.399350785 -0400
+++ rhvgoyal-linux/fs/overlayfs/ovl_entry.h	2018-05-07 16:55:48.911350785 -0400
@@ -90,7 +90,10 @@ static inline struct ovl_entry *OVL_E(st
 }
 
 struct ovl_inode {
-	struct ovl_dir_cache *cache;
+	union {
+		struct ovl_dir_cache *cache;	/* directory */
+		struct inode *lowerdata;	/* regular file */
+	};
 	const char *redirect;
 	u64 version;
 	unsigned long flags;
@@ -108,6 +111,7 @@ struct ovl_inode_params {
 	struct dentry *index;
 	unsigned int numlower;
 	char *redirect;
+	struct dentry *lowerdata;
 };
 
 static inline struct ovl_inode *OVL_I(struct inode *inode)
Index: rhvgoyal-linux/fs/overlayfs/inode.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/inode.c	2018-05-07 16:48:20.400350785 -0400
+++ rhvgoyal-linux/fs/overlayfs/inode.c	2018-05-07 16:55:01.997350785 -0400
@@ -856,7 +856,7 @@ struct inode *ovl_get_inode(struct super
 		}
 	}
 	ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev, ino, fsid);
-	ovl_inode_init(inode, upperdentry, lowerdentry);
+	ovl_inode_init(inode, upperdentry, lowerdentry, oip->lowerdata);
 
 	if (upperdentry && ovl_is_impuredir(upperdentry))
 		ovl_set_flag(OVL_IMPURE, inode);
Index: rhvgoyal-linux/fs/overlayfs/util.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/util.c	2018-05-07 16:48:20.402350785 -0400
+++ rhvgoyal-linux/fs/overlayfs/util.c	2018-05-07 16:55:02.683350785 -0400
@@ -262,6 +262,12 @@ struct inode *ovl_inode_real(struct inod
 	return ovl_inode_upper(inode) ?: ovl_inode_lower(inode);
 }
 
+/* Return inode which contains lower data. Do not return metacopy */
+struct inode *ovl_inode_lowerdata(struct inode *inode)
+{
+	return OVL_I(inode)->lowerdata ?: ovl_inode_lower(inode);
+}
+
 
 struct ovl_dir_cache *ovl_dir_cache(struct inode *inode)
 {
@@ -396,7 +402,7 @@ void ovl_dentry_set_redirect(struct dent
 }
 
 void ovl_inode_init(struct inode *inode, struct dentry *upperdentry,
-		    struct dentry *lowerdentry)
+		    struct dentry *lowerdentry, struct dentry *lowerdata)
 {
 	struct inode *realinode = d_inode(upperdentry ?: lowerdentry);
 
@@ -404,6 +410,8 @@ void ovl_inode_init(struct inode *inode,
 		OVL_I(inode)->__upperdentry = upperdentry;
 	if (lowerdentry)
 		OVL_I(inode)->lower = igrab(d_inode(lowerdentry));
+	if (lowerdata)
+		OVL_I(inode)->lowerdata = igrab(d_inode(lowerdata));
 
 	ovl_copyattr(realinode, inode);
 	ovl_copyflags(realinode, inode);
Index: rhvgoyal-linux/fs/overlayfs/overlayfs.h
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/overlayfs.h	2018-05-07 16:48:20.403350785 -0400
+++ rhvgoyal-linux/fs/overlayfs/overlayfs.h	2018-05-07 16:55:02.683350785 -0400
@@ -234,6 +234,7 @@ struct dentry *ovl_dentry_realdata(struc
 struct dentry *ovl_i_dentry_upper(struct inode *inode);
 struct inode *ovl_inode_upper(struct inode *inode);
 struct inode *ovl_inode_lower(struct inode *inode);
+struct inode *ovl_inode_lowerdata(struct inode *inode);
 struct inode *ovl_inode_real(struct inode *inode);
 struct ovl_dir_cache *ovl_dir_cache(struct inode *inode);
 void ovl_set_dir_cache(struct inode *inode, struct ovl_dir_cache *cache);
@@ -253,7 +254,7 @@ bool ovl_redirect_dir(struct super_block
 const char *ovl_dentry_get_redirect(struct dentry *dentry);
 void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
 void ovl_inode_init(struct inode *inode, struct dentry *upperdentry,
-		    struct dentry *lowerdentry);
+		    struct dentry *lowerdentry, struct dentry *lowerdata);
 void ovl_inode_update(struct inode *inode, struct dentry *upperdentry);
 void ovl_dir_modified(struct dentry *dentry, bool impurity);
 u64 ovl_dentry_version_get(struct dentry *dentry);
Index: rhvgoyal-linux/fs/overlayfs/super.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/super.c	2018-05-07 16:48:20.405350785 -0400
+++ rhvgoyal-linux/fs/overlayfs/super.c	2018-05-07 16:55:02.443350785 -0400
@@ -187,6 +187,7 @@ static struct inode *ovl_alloc_inode(str
 	oi->flags = 0;
 	oi->__upperdentry = NULL;
 	oi->lower = NULL;
+	oi->lowerdata = NULL;
 	mutex_init(&oi->lock);
 
 	return &oi->vfs_inode;
@@ -205,8 +206,11 @@ static void ovl_destroy_inode(struct ino
 
 	dput(oi->__upperdentry);
 	iput(oi->lower);
+	if (S_ISDIR(inode->i_mode))
+		ovl_dir_cache_free(inode);
+	else
+		iput(oi->lowerdata);
 	kfree(oi->redirect);
-	ovl_dir_cache_free(inode);
 	mutex_destroy(&oi->lock);
 
 	call_rcu(&inode->i_rcu, ovl_i_callback);
@@ -1510,7 +1514,7 @@ static int ovl_fill_super(struct super_b
 	ovl_dentry_set_flag(OVL_E_CONNECTED, root_dentry);
 	ovl_set_upperdata(d_inode(root_dentry));
 	ovl_inode_init(d_inode(root_dentry), upperpath.dentry,
-		       ovl_dentry_lower(root_dentry));
+		       ovl_dentry_lower(root_dentry), NULL);
 
 	sb->s_root = root_dentry;
 
Index: rhvgoyal-linux/fs/overlayfs/namei.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/namei.c	2018-05-07 16:48:23.302350785 -0400
+++ rhvgoyal-linux/fs/overlayfs/namei.c	2018-05-07 16:55:02.379350785 -0400
@@ -1082,6 +1082,8 @@ struct dentry *ovl_lookup(struct inode *
 			.index = index,
 			.numlower = ctr,
 			.redirect = upperredirect,
+			.lowerdata = (ctr > 1 && !d.is_dir) ?
+				      stack[ctr - 1].dentry : NULL,
 		};
 
 		inode = ovl_get_inode(dentry->d_sb, &oip);
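
The union of ->cache and ->lowerdata means every reader and the teardown
path must pick the member by file type. The block below is a userspace
sketch of that rule only. All toy_* names are hypothetical, is_dir stands
in for S_ISDIR(inode->i_mode), a refcount decrement stands in for iput(),
and the assert in toy_inode_lowerdata() is an extra check added for the
sketch that the patch itself does not have.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the kernel types are far richer. */
struct toy_inode { int refcount; };
struct toy_dir_cache { int entries; };

struct toy_ovl_inode {
	bool is_dir;				/* stands in for S_ISDIR(i_mode) */
	union {
		struct toy_dir_cache *cache;	/* directory */
		struct toy_inode *lowerdata;	/* regular file */
	};
	struct toy_inode *lower;
};

/* Data-holding lower inode: prefer lowerdata, else fall back to lower. */
static struct toy_inode *toy_inode_lowerdata(const struct toy_ovl_inode *oi)
{
	/* On a directory this storage holds the dir cache pointer. */
	assert(!oi->is_dir);
	return oi->lowerdata ? oi->lowerdata : oi->lower;
}

/* Teardown picks the union member by file type before releasing it. */
static void toy_destroy(struct toy_ovl_inode *oi)
{
	if (oi->is_dir) {
		if (oi->cache)
			oi->cache->entries = 0;	/* "free" the dir cache */
		oi->cache = NULL;
	} else {
		if (oi->lowerdata)
			oi->lowerdata->refcount--;	/* stands in for iput() */
		oi->lowerdata = NULL;
	}
}
```

This is the same shape as the ovl_destroy_inode() hunk above: the
S_ISDIR() test decides whether the shared storage is freed as a dir cache
or released as an inode reference.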

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 02/30] ovl: Initialize ovl_inode->redirect in ovl_get_inode()
  2018-05-07 17:40 ` [PATCH v15 02/30] ovl: Initialize ovl_inode->redirect in ovl_get_inode() Vivek Goyal
@ 2018-05-08 13:56   ` Vivek Goyal
  0 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-08 13:56 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il

Subject: ovl: Initialize ovl_inode->redirect in ovl_get_inode()

ovl_inode->redirect is an inode property and should be initialized
in ovl_get_inode() only when we are adding a new inode to cache. If
inode is already in cache, it is already initialized and we should
not be touching ovl_inode->redirect field.

As of now this is not a problem as redirects are used only for directories,
which don't share inodes. But soon I want to use redirects for regular files
also and there it can become an issue.

Hence, move ->redirect initialization into ovl_get_inode().

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/overlayfs/inode.c     |    3 +++
 fs/overlayfs/namei.c     |    8 +-------
 fs/overlayfs/ovl_entry.h |    1 +
 3 files changed, 5 insertions(+), 7 deletions(-)

Index: rhvgoyal-linux/fs/overlayfs/inode.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/inode.c	2018-05-07 16:42:22.395350785 -0400
+++ rhvgoyal-linux/fs/overlayfs/inode.c	2018-05-07 16:42:24.825350785 -0400
@@ -836,6 +836,7 @@ struct inode *ovl_get_inode(struct super
 			}
 
 			dput(upperdentry);
+			kfree(oip->redirect);
 			goto out;
 		}
 
@@ -859,6 +860,8 @@ struct inode *ovl_get_inode(struct super
 	if (oip->index)
 		ovl_set_flag(OVL_INDEX, inode);
 
+	OVL_I(inode)->redirect = oip->redirect;
+
 	/* Check for non-merge dir that may have whiteouts */
 	if (is_dir) {
 		if (((upperdentry && lowerdentry) || oip->numlower > 1) ||
Index: rhvgoyal-linux/fs/overlayfs/ovl_entry.h
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/ovl_entry.h	2018-05-07 16:42:22.396350785 -0400
+++ rhvgoyal-linux/fs/overlayfs/ovl_entry.h	2018-05-07 16:42:24.826350785 -0400
@@ -106,6 +106,7 @@ struct ovl_inode_params {
 	struct ovl_path *lowerpath;
 	struct dentry *index;
 	unsigned int numlower;
+	char *redirect;
 };
 
 static inline struct ovl_inode *OVL_I(struct inode *inode)
Index: rhvgoyal-linux/fs/overlayfs/namei.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/namei.c	2018-05-07 16:42:24.827350785 -0400
+++ rhvgoyal-linux/fs/overlayfs/namei.c	2018-05-07 16:42:56.609350785 -0400
@@ -1009,19 +1009,13 @@ struct dentry *ovl_lookup(struct inode *
 			.lowerpath = stack,
 			.index = index,
 			.numlower = ctr,
+			.redirect = upperredirect,
 		};
 
 		inode = ovl_get_inode(dentry->d_sb, &oip);
 		err = PTR_ERR(inode);
 		if (IS_ERR(inode))
 			goto out_free_oe;
-
-		/*
-		 * NB: handle redirected hard links when non-dir redirects
-		 * become possible
-		 */
-		WARN_ON(OVL_I(inode)->redirect);
-		OVL_I(inode)->redirect = upperredirect;
 	}
 
 	revert_creds(old_cred);
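
The ownership rule in this patch (a new inode keeps the redirect string,
a cached inode causes the caller's copy to be freed) can be sketched in
plain C. This is an illustration under assumptions, not the kernel code:
toy_inode, toy_attach_redirect() and toy_strdup() are hypothetical, is_new
stands in for the I_NEW bit in i_state, and free() stands in for kfree().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_inode {
	int is_new;	/* stands in for I_NEW in i_state */
	char *redirect;	/* owned by the inode once set */
};

/*
 * Takes ownership of 'redirect'. A new inode stores it. A cached inode
 * already carries its own value, so the caller's copy is freed and the
 * existing field is left untouched.
 */
static void toy_attach_redirect(struct toy_inode *inode, char *redirect)
{
	assert(inode != NULL);
	if (!inode->is_new) {
		free(redirect);
		return;
	}
	inode->redirect = redirect;
	inode->is_new = 0;
}

/* Small helper so the sketch does not depend on POSIX strdup(). */
static char *toy_strdup(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}
```

Either way the caller no longer owns the string after the call, which is
why ovl_lookup() can drop its own assignment of ->redirect.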

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 17/30] ovl: Open file with data except for the case of fsync
  2018-05-08 12:50         ` Vivek Goyal
@ 2018-05-08 14:14           ` Amir Goldstein
  2018-05-08 14:26             ` Vivek Goyal
  0 siblings, 1 reply; 77+ messages in thread
From: Amir Goldstein @ 2018-05-08 14:14 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Tue, May 8, 2018 at 3:50 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>
[...]
>
> Ok, here is the updated patch. I have not defined an equivalent of
> ovl_real_meta_file() as there are no users.
>
>
> Subject: ovl: Open file with data except for the case of fsync
>
> ovl_open() should open the file which contains data and not open the
> metacopy inode. With the introduction of metacopy inodes, the current
> implementation would end up opening the metacopy inode as well.
>
> But there can be certain circumstances like ovl_fsync() where we
> want to allow opening a metacopy inode instead.
>
> Hence, change ovl_open_realfile() and ovl_real_fdget() and add an extra
> parameter which specifies whether to allow opening a metacopy inode or not.
> If this parameter is false, we look for the data inode and open that.
>
> This should allow covering both the cases.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>

Nice! much shorter. You can add
Reviewed-by: Amir Goldstein <amir73il@gmail.com>

BTW, I have never given that much thought, but I see that you
added Reviewed-by before Signed-off-by in some of the patches
and after SOB in other patches.
Of course it doesn't really matter and there isn't a single convention
in the kernel, but the way I always thought about it is:
You sign off at the bottom on the complete work, including all the tags
you may have added.

Feel free to ignore this OCD comment ;-)

> ---
>  fs/overlayfs/file.c |   40 +++++++++++++++++++++++++++++++---------
>  1 file changed, 31 insertions(+), 9 deletions(-)
>
> Index: rhvgoyal-linux/fs/overlayfs/file.c
> ===================================================================
> --- rhvgoyal-linux.orig/fs/overlayfs/file.c     2018-05-07 16:55:02.562350785 -0400
> +++ rhvgoyal-linux/fs/overlayfs/file.c  2018-05-08 08:46:49.994350785 -0400
> @@ -14,22 +14,32 @@
>  #include <linux/uio.h>
>  #include "overlayfs.h"
>
> -static struct file *ovl_open_realfile(const struct file *file)
> +static struct file *ovl_open_realfile(const struct file *file,
> +                                     bool allow_metacopy)
>  {
>         struct inode *inode = file_inode(file);
>         struct inode *upperinode = ovl_inode_upper(inode);
> -       struct inode *realinode = upperinode ?: ovl_inode_lower(inode);
> +       struct inode *realinode;
>         struct file *realfile;
> +       bool upperopen = false;
>         const struct cred *old_cred;
>
> +       if (upperinode && (allow_metacopy || ovl_has_upperdata(inode))) {
> +               realinode = upperinode;
> +               upperopen = true;
> +       } else {
> +               realinode = allow_metacopy ? ovl_inode_lower(inode) :
> +                                ovl_inode_lowerdata(inode);
> +       }
>         old_cred = ovl_override_creds(inode->i_sb);
>         realfile = path_open(&file->f_path, file->f_flags | O_NOATIME,
>                              realinode, current_cred(), false);
>         revert_creds(old_cred);
>
>         pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n",
> -                file, file, upperinode ? 'u' : 'l', file->f_flags,
> -                realfile, IS_ERR(realfile) ? 0 : realfile->f_flags);
> +                file, file, upperopen ? 'u' : 'l',
> +                file->f_flags, realfile,
> +                IS_ERR(realfile) ? 0 : realfile->f_flags);
>
>         return realfile;
>  }
> @@ -72,17 +82,24 @@ static int ovl_change_flags(struct file
>         return 0;
>  }
>
> -static int ovl_real_fdget(const struct file *file, struct fd *real)
> +static int _ovl_real_fdget(const struct file *file, struct fd *real,
> +                         bool allow_metacopy)
>  {
>         struct inode *inode = file_inode(file);
> +       struct inode *realinode;
>
>         real->flags = 0;
>         real->file = file->private_data;
>
> +       if (allow_metacopy)
> +               realinode = ovl_inode_real(inode);
> +       else
> +               realinode = ovl_inode_realdata(inode);
> +
>         /* Has it been copied up since we'd opened it? */
> -       if (unlikely(file_inode(real->file) != ovl_inode_real(inode))) {
> +       if (unlikely(file_inode(real->file) != realinode)) {
>                 real->flags = FDPUT_FPUT;
> -               real->file = ovl_open_realfile(file);
> +               real->file = ovl_open_realfile(file, allow_metacopy);
>
>                 return PTR_ERR_OR_ZERO(real->file);
>         }
> @@ -94,6 +111,11 @@ static int ovl_real_fdget(const struct f
>         return 0;
>  }
>
> +static int ovl_real_fdget(const struct file *file, struct fd *real)
> +{
> +       return _ovl_real_fdget(file, real, false);
> +}
> +
>  static int ovl_open(struct inode *inode, struct file *file)
>  {
>         struct dentry *dentry = file_dentry(file);
> @@ -107,7 +129,7 @@ static int ovl_open(struct inode *inode,
>         /* No longer need these flags, so don't pass them on to underlying fs */
>         file->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);
>
> -       realfile = ovl_open_realfile(file);
> +       realfile = ovl_open_realfile(file, false);
>         if (IS_ERR(realfile))
>                 return PTR_ERR(realfile);
>
> @@ -244,7 +266,7 @@ static int ovl_fsync(struct file *file,
>         const struct cred *old_cred;
>         int ret;
>
> -       ret = ovl_real_fdget(file, &real);
> +       ret = _ovl_real_fdget(file, &real, !datasync);
>         if (ret)
>                 return ret;
>
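
As a reading aid for the selection logic in ovl_open_realfile() above,
here is a small standalone sketch of the decision as a pure function. The
enum and both toy_* functions are hypothetical names made up for this
illustration. The booleans stand in for "upperinode != NULL" and
ovl_has_upperdata(inode), and the last helper mirrors the "!datasync"
argument passed from ovl_fsync().

```c
#include <assert.h>
#include <stdbool.h>

/* Which real inode the open would target (hypothetical labels). */
enum toy_target { TOY_UPPER, TOY_LOWER, TOY_LOWERDATA };

/*
 * Same branch structure as the patched ovl_open_realfile():
 * - the upper inode is used if it exists and either a metacopy is
 *   acceptable or the upper already holds data;
 * - otherwise the lower inode is used when a metacopy is acceptable,
 *   else the lower inode that actually contains data.
 */
static enum toy_target toy_pick_realinode(bool has_upper,
					  bool has_upperdata,
					  bool allow_metacopy)
{
	if (has_upper && (allow_metacopy || has_upperdata))
		return TOY_UPPER;
	return allow_metacopy ? TOY_LOWER : TOY_LOWERDATA;
}

/* ovl_fsync() passes !datasync: only a full fsync may open a metacopy. */
static bool toy_fsync_allows_metacopy(bool datasync)
{
	return !datasync;
}
```

So a regular open of a metacopy-only upper resolves to the lower data
inode, while a full fsync on the same file resolves to the upper metacopy
inode.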

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 00/30] overlayfs: Delayed copy up of data
  2018-05-08 13:42 ` Vivek Goyal
@ 2018-05-08 14:16   ` Amir Goldstein
  2018-05-23 20:00     ` Vivek Goyal
  0 siblings, 1 reply; 77+ messages in thread
From: Amir Goldstein @ 2018-05-08 14:16 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Tue, May 8, 2018 at 4:42 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Mon, May 07, 2018 at 01:40:32PM -0400, Vivek Goyal wrote:
>> Hi,
>>
>> This is V15 of overlayfs metadata only copy-up feature. These patches I
>> have rebased on top of Miklos overlayfs-next tree's branch overlayfs-rorw.
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git overlayfs-rorw
>>
>> Patches are also available here.
>>
>> https://github.com/rhvgoyal/linux/commits/metacopy-v15
>>
>> I have run unionmount-testsuite and "./check -overlay -g quick" and that
>> works. Only 4 overlay tests fail, which fail on vanilla kernel too.
>>
>
> Hi Amir,
>
> I have taken care of your review comments and pushed new patches to the
> "metacopy-next" branch.
>
> https://github.com/rhvgoyal/linux/commits/metacopy-next

Looks good.

>
> Changes are small and I am not sure if I should patch-bomb the mailing
> list again.
>
> Amir, Miklos, do let me know if I should post V16 patches on the mailing
> list.
>

I don't need the patch bomb.
All patches on updated branch are fine.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 17/30] ovl: Open file with data except for the case of fsync
  2018-05-08 14:14           ` Amir Goldstein
@ 2018-05-08 14:26             ` Vivek Goyal
  2018-05-08 15:04               ` Amir Goldstein
  0 siblings, 1 reply; 77+ messages in thread
From: Vivek Goyal @ 2018-05-08 14:26 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Tue, May 08, 2018 at 05:14:42PM +0300, Amir Goldstein wrote:
> On Tue, May 8, 2018 at 3:50 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> >
> [...]
> >
> > Ok, here is the updated patch. I have not defined an equivalent of
> > ovl_real_meta_file() as there are no users.
> >
> >
> > Subject: ovl: Open file with data except for the case of fsync
> >
> > ovl_open() should open the file which contains data and not open the
> > metacopy inode. With the introduction of metacopy inodes, the current
> > implementation would end up opening the metacopy inode as well.
> >
> > But there can be certain circumstances like ovl_fsync() where we
> > want to allow opening a metacopy inode instead.
> >
> > Hence, change ovl_open_realfile() and ovl_real_fdget() and add an extra
> > parameter which specifies whether to allow opening a metacopy inode or not.
> > If this parameter is false, we look for the data inode and open that.
> >
> > This should allow covering both the cases.
> >
> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> 
> Nice! much shorter. You can add
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> 
> BTW, I have never given that much thought, but I see that you
> added Reviewed-by before Signed-off-by in some of the patches
> and after SOB in other patches.
> Of course it doesn't really matter and there isn't a single convention
> in the kernel, but the way I always thought about it is:
> You sign off at the bottom on the complete work, including all the tags
> you may have added.

I was thinking about it and looked at some of the kernel commits and
there seems to be a mix. There does not seem to be a fixed convention
on the order of these tags.

My takeaway was that they seem to be in FIFO order. So I add my
Signed-off-by first, then you add your Reviewed-by, and when Miklos
merges these, he will add his Signed-off-by.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

If you do not disagree with the above, I will make the order of Reviewed-by
consistent across the whole series and push the metacopy-next branch again.

Vivek

> 
> Feel free to ignore this OCD comment ;-)
> 
> > ---
> >  fs/overlayfs/file.c |   40 +++++++++++++++++++++++++++++++---------
> >  1 file changed, 31 insertions(+), 9 deletions(-)
> >
> > Index: rhvgoyal-linux/fs/overlayfs/file.c
> > ===================================================================
> > --- rhvgoyal-linux.orig/fs/overlayfs/file.c     2018-05-07 16:55:02.562350785 -0400
> > +++ rhvgoyal-linux/fs/overlayfs/file.c  2018-05-08 08:46:49.994350785 -0400
> > @@ -14,22 +14,32 @@
> >  #include <linux/uio.h>
> >  #include "overlayfs.h"
> >
> > -static struct file *ovl_open_realfile(const struct file *file)
> > +static struct file *ovl_open_realfile(const struct file *file,
> > +                                     bool allow_metacopy)
> >  {
> >         struct inode *inode = file_inode(file);
> >         struct inode *upperinode = ovl_inode_upper(inode);
> > -       struct inode *realinode = upperinode ?: ovl_inode_lower(inode);
> > +       struct inode *realinode;
> >         struct file *realfile;
> > +       bool upperopen = false;
> >         const struct cred *old_cred;
> >
> > +       if (upperinode && (allow_metacopy || ovl_has_upperdata(inode))) {
> > +               realinode = upperinode;
> > +               upperopen = true;
> > +       } else {
> > +               realinode = allow_metacopy ? ovl_inode_lower(inode) :
> > +                                ovl_inode_lowerdata(inode);
> > +       }
> >         old_cred = ovl_override_creds(inode->i_sb);
> >         realfile = path_open(&file->f_path, file->f_flags | O_NOATIME,
> >                              realinode, current_cred(), false);
> >         revert_creds(old_cred);
> >
> >         pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n",
> > -                file, file, upperinode ? 'u' : 'l', file->f_flags,
> > -                realfile, IS_ERR(realfile) ? 0 : realfile->f_flags);
> > +                file, file, upperopen ? 'u' : 'l',
> > +                file->f_flags, realfile,
> > +                IS_ERR(realfile) ? 0 : realfile->f_flags);
> >
> >         return realfile;
> >  }
> > @@ -72,17 +82,24 @@ static int ovl_change_flags(struct file
> >         return 0;
> >  }
> >
> > -static int ovl_real_fdget(const struct file *file, struct fd *real)
> > +static int _ovl_real_fdget(const struct file *file, struct fd *real,
> > +                         bool allow_metacopy)
> >  {
> >         struct inode *inode = file_inode(file);
> > +       struct inode *realinode;
> >
> >         real->flags = 0;
> >         real->file = file->private_data;
> >
> > +       if (allow_metacopy)
> > +               realinode = ovl_inode_real(inode);
> > +       else
> > +               realinode = ovl_inode_realdata(inode);
> > +
> >         /* Has it been copied up since we'd opened it? */
> > -       if (unlikely(file_inode(real->file) != ovl_inode_real(inode))) {
> > +       if (unlikely(file_inode(real->file) != realinode)) {
> >                 real->flags = FDPUT_FPUT;
> > -               real->file = ovl_open_realfile(file);
> > +               real->file = ovl_open_realfile(file, allow_metacopy);
> >
> >                 return PTR_ERR_OR_ZERO(real->file);
> >         }
> > @@ -94,6 +111,11 @@ static int ovl_real_fdget(const struct f
> >         return 0;
> >  }
> >
> > +static int ovl_real_fdget(const struct file *file, struct fd *real)
> > +{
> > +       return _ovl_real_fdget(file, real, false);
> > +}
> > +
> >  static int ovl_open(struct inode *inode, struct file *file)
> >  {
> >         struct dentry *dentry = file_dentry(file);
> > @@ -107,7 +129,7 @@ static int ovl_open(struct inode *inode,
> >         /* No longer need these flags, so don't pass them on to underlying fs */
> >         file->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);
> >
> > -       realfile = ovl_open_realfile(file);
> > +       realfile = ovl_open_realfile(file, false);
> >         if (IS_ERR(realfile))
> >                 return PTR_ERR(realfile);
> >
> > @@ -244,7 +266,7 @@ static int ovl_fsync(struct file *file,
> >         const struct cred *old_cred;
> >         int ret;
> >
> > -       ret = ovl_real_fdget(file, &real);
> > +       ret = _ovl_real_fdget(file, &real, !datasync);
> >         if (ret)
> >                 return ret;
> >

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 17/30] ovl: Open file with data except for the case of fsync
  2018-05-08 14:26             ` Vivek Goyal
@ 2018-05-08 15:04               ` Amir Goldstein
  0 siblings, 0 replies; 77+ messages in thread
From: Amir Goldstein @ 2018-05-08 15:04 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Tue, May 8, 2018 at 5:26 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Tue, May 08, 2018 at 05:14:42PM +0300, Amir Goldstein wrote:
>> On Tue, May 8, 2018 at 3:50 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> >
>> [...]
>> >
>> > Ok, here is the updated patch. I have not defined an equivalent of
>> > ovl_real_meta_file() as there are no users.
>> >
>> >
>> > Subject: ovl: Open file with data except for the case of fsync
>> >
>> > ovl_open() should open file which contains data and not open metacopy
>> > inode. With the introduction of metacopy inodes, with current implementation
>> > we will end up opening metacopy inode as well.
>> >
>> > But there can be certain circumstances like ovl_fsync() where we
>> > want to allow opening a metacopy inode instead.
>> >
>> > Hence, change ovl_open_realfile() and ovl_open_real() and add extra
>> > parameter which specifies whether to allow opening metacopy inode or not.
>> > If this parameter is false, we look for data inode and open that.
>> >
>> > This should allow covering both the cases.
>> >
>> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
>>
>> Nice! much shorter. You can add
>> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
>>
>> BTW, I have never given that much thought, but I see that you
>> added Reviewed-by before Signed-off-by in some of the patches
>> and after SOB in other patches.
>> Of course it doesn't really matter and there isn't a single convention
>> in the kernel, but the way I always thought about it is:
>> You sign-off at the bottom on the complete work including all the tags
>> you may have added.
>
> I was thinking about it and looked at some of the kernel commits and
> there seem to be a mix. There does not seem to be a fixed convention
> on the order of these tags.
>
> My takeaway was that they seem to be in FIFO order. So I
> do my Signed-off-by:, then you have your Reviewed-by, and when
> Miklos merges these, he will put his Signed-off-by.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
>
> If you do not disagree with the above, I will make the order of Reviewed-by
> consistent in the whole series and push the metacopy-next branch again.
>

I am fine with the above.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-07 17:40 ` [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry Vivek Goyal
  2018-05-07 19:14   ` Amir Goldstein
@ 2018-05-10  9:19   ` Miklos Szeredi
  2018-05-10  9:36     ` Miklos Szeredi
                       ` (2 more replies)
  2018-05-11 14:30   ` Vivek Goyal
  2018-05-11 15:52   ` Vivek Goyal
  3 siblings, 3 replies; 77+ messages in thread
From: Miklos Szeredi @ 2018-05-10  9:19 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Amir Goldstein

On Mon, May 7, 2018 at 7:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> This patch modifies ovl_lookup() and friends to lookup metacopy dentries.
> It also allows for presence of metacopy dentries in lower layer.
>
> During lookup, check for presence of OVL_XATTR_METACOPY and if not present,
> set OVL_UPPERDATA bit in flags.
>
> We don't support metacopy feature with nfs_export. So in nfs_export code,
> we set OVL_UPPERDATA flag unconditionally if upper inode exists.
>
> Do not follow metacopy origin if we find a metacopy only inode and metacopy
> feature is not enabled for that mount. Like redirect, this can have security
> implications where an attacker could hand craft upper and try to gain
> access to file on lower which it should not have to begin with.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> ---
>  fs/overlayfs/export.c    |   3 ++
>  fs/overlayfs/inode.c     |  11 ++++-
>  fs/overlayfs/namei.c     | 108 +++++++++++++++++++++++++++++++++++++++++------
>  fs/overlayfs/overlayfs.h |   1 +
>  fs/overlayfs/util.c      |  22 ++++++++++
>  5 files changed, 130 insertions(+), 15 deletions(-)
>
> diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
> index 0549286cc55e..52a09a9f74b7 100644
> --- a/fs/overlayfs/export.c
> +++ b/fs/overlayfs/export.c
> @@ -314,6 +314,9 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
>                 return ERR_CAST(inode);
>         }
>
> +       if (upper)
> +               ovl_set_flag(OVL_UPPERDATA, inode);
> +
>         dentry = d_find_any_alias(inode);
>         if (!dentry) {
>                 dentry = d_alloc_anon(inode->i_sb);
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index c128d5d54d0f..83b276ce0240 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -770,7 +770,7 @@ struct inode *ovl_get_inode(struct ovl_inode_params *oip)
>         bool bylower = ovl_hash_bylower(oip->sb, upperdentry, lowerdentry,
>                                         oip->index);
>         int fsid = bylower ? oip->lowerpath->layer->fsid : 0;
> -       bool is_dir;
> +       bool is_dir, metacopy = false;
>         unsigned long ino = 0;
>         int err = -ENOMEM;
>
> @@ -830,6 +830,15 @@ struct inode *ovl_get_inode(struct ovl_inode_params *oip)
>         if (oip->index)
>                 ovl_set_flag(OVL_INDEX, inode);
>
> +       if (upperdentry) {
> +               err = ovl_check_metacopy_xattr(upperdentry);
> +               if (err < 0)
> +                       goto out_err;
> +               metacopy = err;
> +               if (!metacopy)
> +                       ovl_set_flag(OVL_UPPERDATA, inode);
> +       }
> +
>         OVL_I(inode)->redirect = oip->redirect;
>
>         /* Check for non-merge dir that may have whiteouts */
> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> index 8fd817bf5529..b2ff08985e29 100644
> --- a/fs/overlayfs/namei.c
> +++ b/fs/overlayfs/namei.c
> @@ -24,6 +24,7 @@ struct ovl_lookup_data {
>         bool stop;
>         bool last;
>         char *redirect;
> +       bool metacopy;
>  };
>
>  static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
> @@ -253,19 +254,29 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
>                 goto put_and_out;
>         }
>         if (!d_can_lookup(this)) {
> -               d->stop = true;
> -               if (d->is_dir)
> +               if (d->is_dir) {
> +                       d->stop = true;
>                         goto put_and_out;
> -
> +               }
>                 /*
>                  * NB: handle failure to lookup non-last element when non-dir
>                  * redirects become possible
>                  */
>                 WARN_ON(!last_element);

This warning now triggers if we have a metacopy inode on upper
(d->is_dir == false) but lookup on lower layer fails in the middle
because a path element is not a directory (contrary to the comment,
this is not related to redirects, but to non-dir "merge" objects,
which we call metacopy).

I think we should handle this case together with the d->is_dir case
above: stop and get out.   Lookup should handle the case where we
failed to find the data dentry.   BTW, I think EIO is more appropriate
for this than ESTALE.  The former means (among other things)
filesystem image is corrupted.  The latter means some inconsistencies
were found when performing an operation.   They are similar, but while
EIO is permanent (until the source of the error is corrected) ESTALE
can go away (e.g. if redoing the lookup or mounting the filesystem
fresh).
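
To make the distinction above concrete, here is a minimal userspace sketch.
The enum and the function name are hypothetical and are not part of
overlayfs; only the EIO and ESTALE errno values are real. It assumes the
caller has already decided whether the inconsistency is permanent or may
clear up on a retry.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical classification of a lookup problem (not an overlayfs type). */
enum ovl_fault {
	OVL_FAULT_NONE,       /* nothing wrong */
	OVL_FAULT_PERSISTENT, /* layout is inconsistent, e.g. metacopy without data */
	OVL_FAULT_TRANSIENT,  /* may go away on a redone lookup or a fresh mount */
};

/* Map a fault class to a negative errno, following the reasoning above. */
int ovl_fault_to_errno(enum ovl_fault fault)
{
	switch (fault) {
	case OVL_FAULT_PERSISTENT:
		return -EIO;    /* permanent until the source of the error is fixed */
	case OVL_FAULT_TRANSIENT:
		return -ESTALE; /* redoing the operation may succeed */
	default:
		assert(fault == OVL_FAULT_NONE);
		return 0;
	}
}
```

With this split, a metacopy dentry whose data dentry cannot be found would
fall into the persistent class.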

> +               err = ovl_check_metacopy_xattr(this);
> +               if (err < 0)
> +                       goto out_err;
> +               d->stop = !err;
> +               d->metacopy = !!err;
>                 goto out;
>         }
> -       if (last_element)
> +       if (last_element) {
> +               if (d->metacopy) {
> +                       err = -ESTALE;

Again, just stop and get out and caller will detect the error.

> +                       goto out_err;
> +               }
>                 d->is_dir = true;
> +       }
>         if (d->last)
>                 goto out;
>
> @@ -823,7 +834,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>         struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
>         struct ovl_entry *poe = dentry->d_parent->d_fsdata;
>         struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata;
> -       struct ovl_path *stack = NULL;
> +       struct ovl_path *stack = NULL, *origin_path = NULL;
>         struct dentry *upperdir, *upperdentry = NULL;
>         struct dentry *origin = NULL;
>         struct dentry *index = NULL;
> @@ -834,6 +845,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>         struct dentry *this;
>         unsigned int i;
>         int err;
> +       bool metacopy = false;
>         struct ovl_lookup_data d = {
>                 .name = dentry->d_name,
>                 .is_dir = false,
> @@ -841,6 +853,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                 .stop = false,
>                 .last = ofs->config.redirect_follow ? false : !poe->numlower,
>                 .redirect = NULL,
> +               .metacopy = false,
>         };
>
>         if (dentry->d_name.len > ofs->namelen)
> @@ -859,7 +872,8 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                         goto out;
>                 }
>                 if (upperdentry && !d.is_dir) {
> -                       BUG_ON(!d.stop || d.redirect);
> +                       unsigned int origin_ctr = 0;
> +                       BUG_ON(d.redirect);
>                         /*
>                          * Lookup copy up origin by decoding origin file handle.
>                          * We may get a disconnected dentry, which is fine,
> @@ -870,9 +884,13 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                          * number - it's the same as if we held a reference
>                          * to a dentry in lower layer that was moved under us.
>                          */
> -                       err = ovl_check_origin(ofs, upperdentry, &stack, &ctr);
> +                       err = ovl_check_origin(ofs, upperdentry, &origin_path,
> +                                              &origin_ctr);
>                         if (err)
>                                 goto out_put_upper;
> +
> +                       if (d.metacopy)
> +                               metacopy = true;
>                 }
>
>                 if (d.redirect) {
> @@ -913,7 +931,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                  * If no origin fh is stored in upper of a merge dir, store fh
>                  * of lower dir and set upper parent "impure".
>                  */
> -               if (upperdentry && !ctr && !ofs->noxattr) {
> +               if (upperdentry && !ctr && !ofs->noxattr && d.is_dir) {
>                         err = ovl_fix_origin(dentry, this, upperdentry);
>                         if (err) {
>                                 dput(this);
> @@ -925,18 +943,36 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                  * When "verify_lower" feature is enabled, do not merge with a
>                  * lower dir that does not match a stored origin xattr. In any
>                  * case, only verified origin is used for index lookup.
> +                *
> +                * For non-dir dentry, make sure dentry found by lookup
> +                * matches the origin stored in upper. Otherwise its an
> +                * error.

Umm, why do we need to be so strict?  This would break the case where
the layers are copied with xattr intact, but the origin pointer will
obviously be "wrong", which shouldn't be a problem, since that's only
needed to get a unique st_ino, nothing else.

>                  */
> -               if (upperdentry && !ctr && ovl_verify_lower(dentry->d_sb)) {
> +               if (upperdentry && !ctr &&
> +                   ((d.is_dir && ovl_verify_lower(dentry->d_sb)) ||
> +                    (!d.is_dir && origin_path))) {
>                         err = ovl_verify_origin(upperdentry, this, false);
>                         if (err) {
>                                 dput(this);
> -                               break;
> +                               if (d.is_dir)
> +                                       break;
> +                               goto out_put;
>                         }
> -
> -                       /* Bless lower dir as verified origin */
> +                       /* Bless lower as verified origin */
>                         origin = this;
>                 }
>
> +               if (d.metacopy)
> +                       metacopy = true;
> +               /*
> +                * Do not store intermediate metacopy dentries in chain,
> +                * except top most lower metacopy dentry

I don't get it.  We need the bottom most metacopy dentry, not the
topmost.  Am I missing something?

We also need to check file type here, only regular file makes sense as
metacopy, so if it's something else, then get out with EIO.

> +                */
> +               if (d.metacopy && ctr) {
> +                       dput(this);
> +                       continue;
> +               }
> +
>                 stack[ctr].dentry = this;
>                 stack[ctr].layer = lower.layer;
>                 ctr++;
> @@ -968,13 +1004,49 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                 }
>         }
>
> +       if (metacopy) {
> +               /*
> +                * Found a metacopy dentry but did not find corresponding
> +                * data dentry
> +                */
> +               if (d.metacopy) {
> +                       err = -ESTALE;

-EIO.

> +                       goto out_put;
> +               }
> +
> +               err = -EPERM;
> +               if (!ofs->config.metacopy) {
> +                       pr_warn_ratelimited("overlay: refusing to follow"
> +                                           " metacopy origin for (%pd2)\n",
> +                                           dentry);
> +                       goto out_put;
> +               }
> +       } else if (!d.is_dir && upperdentry && !ctr && origin_path) {
> +               if (WARN_ON(stack != NULL)) {
> +                       err = -EIO;
> +                       goto out_put;
> +               }
> +               stack = origin_path;
> +               ctr = 1;
> +               origin_path = NULL;
> +       }
> +
>         /*
>          * Lookup index by lower inode and verify it matches upper inode.
>          * We only trust dir index if we verified that lower dir matches
>          * origin, otherwise dir index entries may be inconsistent and we
> -        * ignore them. Always lookup index of non-dir and non-upper.
> +        * ignore them.
> +        *
> +        * For non-dir upper metacopy dentry, we already set "origin" if we
> +        * verified that lower matched upper origin. If upper origin was
> +        * not present (because lower layer did not support fh encode/decode),
> +        * do not set "origin" and skip looking up index. This case should
> +        * be handled in same way as a non-dir upper without ORIGIN is
> +        * handled.
> +        *
> +        * Always lookup index of non-dir non-metacopy and non-upper.
>          */
> -       if (ctr && (!upperdentry || !d.is_dir))
> +       if (ctr && (!upperdentry || (!d.is_dir && !metacopy)))
>                 origin = stack[0].dentry;
>
>         if (origin && ovl_indexdir(dentry->d_sb) &&
> @@ -1015,6 +1087,10 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>         }
>
>         revert_creds(old_cred);
> +       if (origin_path) {
> +               dput(origin_path->dentry);
> +               kfree(origin_path);
> +       }
>         dput(index);
>         kfree(stack);
>         kfree(d.redirect);
> @@ -1029,6 +1105,10 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                 dput(stack[i].dentry);
>         kfree(stack);
>  out_put_upper:
> +       if (origin_path) {
> +               dput(origin_path->dentry);
> +               kfree(origin_path);
> +       }
>         dput(upperdentry);
>         kfree(upperredirect);
>  out:
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 2daea529b7eb..e8954fff1c45 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -274,6 +274,7 @@ bool ovl_need_index(struct dentry *dentry);
>  int ovl_nlink_start(struct dentry *dentry, bool *locked);
>  void ovl_nlink_end(struct dentry *dentry, bool locked);
>  int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir);
> +int ovl_check_metacopy_xattr(struct dentry *dentry);
>
>  static inline bool ovl_is_impuredir(struct dentry *dentry)
>  {
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index f8e3c95711b8..ab9a8fae0f99 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -778,3 +778,25 @@ int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir)
>         pr_err("overlayfs: failed to lock workdir+upperdir\n");
>         return -EIO;
>  }
> +
> +/* err < 0, 0 if no metacopy xattr, 1 if metacopy xattr found */
> +int ovl_check_metacopy_xattr(struct dentry *dentry)
> +{
> +       int res;
> +
> +       /* Only regular files can have metacopy xattr */
> +       if (!S_ISREG(d_inode(dentry)->i_mode))
> +               return 0;
> +
> +       res = vfs_getxattr(dentry, OVL_XATTR_METACOPY, NULL, 0);
> +       if (res < 0) {
> +               if (res == -ENODATA || res == -EOPNOTSUPP)
> +                       return 0;
> +               goto out;
> +       }
> +
> +       return 1;
> +out:
> +       pr_warn_ratelimited("overlayfs: failed to get metacopy (%i)\n", res);
> +       return res;
> +}
> --
> 2.13.6
>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-10  9:19   ` Miklos Szeredi
@ 2018-05-10  9:36     ` Miklos Szeredi
  2018-05-10  9:52       ` Miklos Szeredi
                         ` (2 more replies)
  2018-05-10 13:14     ` Vivek Goyal
  2018-05-10 19:39     ` Vivek Goyal
  2 siblings, 3 replies; 77+ messages in thread
From: Miklos Szeredi @ 2018-05-10  9:36 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Amir Goldstein

On Thu, May 10, 2018 at 11:19 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Mon, May 7, 2018 at 7:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>>
>> +               if (d.metacopy)
>> +                       metacopy = true;
>> +               /*
>> +                * Do not store intermediate metacopy dentries in chain,
>> +                * except top most lower metacopy dentry
>
> I don't get it.  We need the bottom most metacopy dentry, not the
> topmost.  Am I missing something?

Okay, it's more complicated.

1) there is an upper metacopy dentry:

 - store origin (pointed to by ORIGIN or topmost lower dentry) in stack[0]
 - store data dentry (lowest in metacopy chain) in stack[1], unless
it's the same as origin

2) there is no upper dentry, but a lower metacopy dentry

 - store metacopy dentry in stack[0]
 - store data dentry in stack[1]

Does that make more sense?
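
A rough userspace sketch of the two cases above, assuming strings as
stand-ins for dentries. The typedef and the function name are made up for
illustration and do not exist in overlayfs. In case 1 "first" is the origin,
in case 2 it is the lower metacopy dentry; "data" is the lowest dentry in the
metacopy chain in both cases.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a dentry pointer; identity is compared, as with dentries. */
typedef const char *fake_dentry;

/*
 * Fill stack[] and return the number of entries used.
 *
 * first: case 1: origin (pointed to by ORIGIN or the topmost lower dentry)
 *        case 2: the lower metacopy dentry
 * data:  the data dentry (lowest in the metacopy chain)
 */
size_t fill_lower_stack(fake_dentry first, fake_dentry data,
			fake_dentry stack[2])
{
	size_t n = 0;

	assert(first && data && stack);

	stack[n++] = first;
	/* Store the data dentry separately unless it is the same object. */
	if (data != first)
		stack[n++] = data;

	return n;
}
```

In case 2 the two entries always differ, because a metacopy dentry carries no
data of its own, so the comparison only matters for case 1.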

> We also need to check file type here, only regular file makes sense as
> metacopy, so if it's something else, then get out with EIO.

I meant file type of *data* inode.  Type of metacopy inode is already
checked by ovl_check_metacopy_xattr().

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-10  9:36     ` Miklos Szeredi
@ 2018-05-10  9:52       ` Miklos Szeredi
  2018-05-10 13:17       ` Vivek Goyal
  2018-05-10 15:32       ` Vivek Goyal
  2 siblings, 0 replies; 77+ messages in thread
From: Miklos Szeredi @ 2018-05-10  9:52 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Amir Goldstein

On Thu, May 10, 2018 at 11:36 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Thu, May 10, 2018 at 11:19 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Mon, May 7, 2018 at 7:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>>>
>>> +               if (d.metacopy)
>>> +                       metacopy = true;
>>> +               /*
>>> +                * Do not store intermediate metacopy dentries in chain,
>>> +                * except top most lower metacopy dentry
>>
>> I don't get it.  We need the bottom most metacopy dentry, not the
>> topmost.  Am I missing something?
>
> Okay, it's more complicated.
>
> 1) there is an upper metacopy dentry:
>
>  - store origin (pointed to by ORIGIN or topmost lower dentry) in stack[0]

Ugh.  So if there's no ORIGIN, we don't want to store anything in
stack[0].   Wondering if we really need the data dentry in the
stack...  Ah, you add OVL_CONST_INO flag later in the series.  Good;
then we can just store data dentry in stack[0] if there's no origin.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-10  9:19   ` Miklos Szeredi
  2018-05-10  9:36     ` Miklos Szeredi
@ 2018-05-10 13:14     ` Vivek Goyal
  2018-05-10 14:43       ` Amir Goldstein
  2018-05-10 19:39     ` Vivek Goyal
  2 siblings, 1 reply; 77+ messages in thread
From: Vivek Goyal @ 2018-05-10 13:14 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: overlayfs, Amir Goldstein

On Thu, May 10, 2018 at 11:19:23AM +0200, Miklos Szeredi wrote:
> On Mon, May 7, 2018 at 7:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > This patch modifies ovl_lookup() and friends to lookup metacopy dentries.
> > It also allows for presence of metacopy dentries in lower layer.
> >
> > During lookup, check for presence of OVL_XATTR_METACOPY and if not present,
> > set OVL_UPPERDATA bit in flags.
> >
> > We don't support metacopy feature with nfs_export. So in nfs_export code,
> > we set OVL_UPPERDATA flag unconditionally if upper inode exists.
> >
> > Do not follow metacopy origin if we find a metacopy only inode and metacopy
> > feature is not enabled for that mount. Like redirect, this can have security
> > implications where an attacker could hand craft upper and try to gain
> > access to file on lower which it should not have to begin with.
> >
> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> > ---
> >  fs/overlayfs/export.c    |   3 ++
> >  fs/overlayfs/inode.c     |  11 ++++-
> >  fs/overlayfs/namei.c     | 108 +++++++++++++++++++++++++++++++++++++++++------
> >  fs/overlayfs/overlayfs.h |   1 +
> >  fs/overlayfs/util.c      |  22 ++++++++++
> >  5 files changed, 130 insertions(+), 15 deletions(-)
> >
> > diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
> > index 0549286cc55e..52a09a9f74b7 100644
> > --- a/fs/overlayfs/export.c
> > +++ b/fs/overlayfs/export.c
> > @@ -314,6 +314,9 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
> >                 return ERR_CAST(inode);
> >         }
> >
> > +       if (upper)
> > +               ovl_set_flag(OVL_UPPERDATA, inode);
> > +
> >         dentry = d_find_any_alias(inode);
> >         if (!dentry) {
> >                 dentry = d_alloc_anon(inode->i_sb);
> > diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> > index c128d5d54d0f..83b276ce0240 100644
> > --- a/fs/overlayfs/inode.c
> > +++ b/fs/overlayfs/inode.c
> > @@ -770,7 +770,7 @@ struct inode *ovl_get_inode(struct ovl_inode_params *oip)
> >         bool bylower = ovl_hash_bylower(oip->sb, upperdentry, lowerdentry,
> >                                         oip->index);
> >         int fsid = bylower ? oip->lowerpath->layer->fsid : 0;
> > -       bool is_dir;
> > +       bool is_dir, metacopy = false;
> >         unsigned long ino = 0;
> >         int err = -ENOMEM;
> >
> > @@ -830,6 +830,15 @@ struct inode *ovl_get_inode(struct ovl_inode_params *oip)
> >         if (oip->index)
> >                 ovl_set_flag(OVL_INDEX, inode);
> >
> > +       if (upperdentry) {
> > +               err = ovl_check_metacopy_xattr(upperdentry);
> > +               if (err < 0)
> > +                       goto out_err;
> > +               metacopy = err;
> > +               if (!metacopy)
> > +                       ovl_set_flag(OVL_UPPERDATA, inode);
> > +       }
> > +
> >         OVL_I(inode)->redirect = oip->redirect;
> >
> >         /* Check for non-merge dir that may have whiteouts */
> > diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> > index 8fd817bf5529..b2ff08985e29 100644
> > --- a/fs/overlayfs/namei.c
> > +++ b/fs/overlayfs/namei.c
> > @@ -24,6 +24,7 @@ struct ovl_lookup_data {
> >         bool stop;
> >         bool last;
> >         char *redirect;
> > +       bool metacopy;
> >  };
> >
> >  static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
> > @@ -253,19 +254,29 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
> >                 goto put_and_out;
> >         }
> >         if (!d_can_lookup(this)) {
> > -               d->stop = true;
> > -               if (d->is_dir)
> > +               if (d->is_dir) {
> > +                       d->stop = true;
> >                         goto put_and_out;
> > -
> > +               }
> >                 /*
> >                  * NB: handle failure to lookup non-last element when non-dir
> >                  * redirects become possible
> >                  */
> >                 WARN_ON(!last_element);
> 
> This warning now triggers if we have a metacopy inode on upper
> (d->is_dir == false) but lookup on lower layer fails in the middle
> because a path element is not a directory (contrary to the comment,
> this is not related to redirects, but to non-dir "merge" objects,
> which we call metacopy).

Hi Miklos,

Ok, got it. I hand crafted a redirect xattr on upper and could trigger
this warning.

> 
> I think we should handle this case together with the d->is_dir case
> above: stop and get out.   Lookup should handle the case where we
> failed to find the data dentry.  

Agreed. Will change it.

if (d->is_dir || !last_element) {
	d->stop = true;
	goto put_and_out;
}

> BTW, I think EIO is more appropriate
> for this than ESTALE.  The former means (among other things)
> filesystem image is corrupted.  The latter means some inconsistencies
> were found when performing an operation.   They are similar, but while
> EIO is permanent (until the source of the error is corrected) ESTALE
> can go away (e.g. if redoing the lookup or mounting the filesystem
> fresh).

Ok. Thanks for explaining this subtle difference between -EIO and -ESTALE.
I was wondering what to return in what situation.

> 
> > +               err = ovl_check_metacopy_xattr(this);
> > +               if (err < 0)
> > +                       goto out_err;
> > +               d->stop = !err;
> > +               d->metacopy = !!err;
> >                 goto out;
> >         }
> > -       if (last_element)
> > +       if (last_element) {
> > +               if (d->metacopy) {
> > +                       err = -ESTALE;
> 
> Again, just stop and get out and caller will detect the error.

Will do.

> 
> > +                       goto out_err;
> > +               }
> >                 d->is_dir = true;
> > +       }
> >         if (d->last)
> >                 goto out;
> >
> > @@ -823,7 +834,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
> >         struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
> >         struct ovl_entry *poe = dentry->d_parent->d_fsdata;
> >         struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata;
> > -       struct ovl_path *stack = NULL;
> > +       struct ovl_path *stack = NULL, *origin_path = NULL;
> >         struct dentry *upperdir, *upperdentry = NULL;
> >         struct dentry *origin = NULL;
> >         struct dentry *index = NULL;
> > @@ -834,6 +845,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
> >         struct dentry *this;
> >         unsigned int i;
> >         int err;
> > +       bool metacopy = false;
> >         struct ovl_lookup_data d = {
> >                 .name = dentry->d_name,
> >                 .is_dir = false,
> > @@ -841,6 +853,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
> >                 .stop = false,
> >                 .last = ofs->config.redirect_follow ? false : !poe->numlower,
> >                 .redirect = NULL,
> > +               .metacopy = false,
> >         };
> >
> >         if (dentry->d_name.len > ofs->namelen)
> > @@ -859,7 +872,8 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
> >                         goto out;
> >                 }
> >                 if (upperdentry && !d.is_dir) {
> > -                       BUG_ON(!d.stop || d.redirect);
> > +                       unsigned int origin_ctr = 0;
> > +                       BUG_ON(d.redirect);
> >                         /*
> >                          * Lookup copy up origin by decoding origin file handle.
> >                          * We may get a disconnected dentry, which is fine,
> > @@ -870,9 +884,13 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
> >                          * number - it's the same as if we held a reference
> >                          * to a dentry in lower layer that was moved under us.
> >                          */
> > -                       err = ovl_check_origin(ofs, upperdentry, &stack, &ctr);
> > +                       err = ovl_check_origin(ofs, upperdentry, &origin_path,
> > +                                              &origin_ctr);
> >                         if (err)
> >                                 goto out_put_upper;
> > +
> > +                       if (d.metacopy)
> > +                               metacopy = true;
> >                 }
> >
> >                 if (d.redirect) {
> > @@ -913,7 +931,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
> >                  * If no origin fh is stored in upper of a merge dir, store fh
> >                  * of lower dir and set upper parent "impure".
> >                  */
> > -               if (upperdentry && !ctr && !ofs->noxattr) {
> > +               if (upperdentry && !ctr && !ofs->noxattr && d.is_dir) {
> >                         err = ovl_fix_origin(dentry, this, upperdentry);
> >                         if (err) {
> >                                 dput(this);
> > @@ -925,18 +943,36 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
> >                  * When "verify_lower" feature is enabled, do not merge with a
> >                  * lower dir that does not match a stored origin xattr. In any
> >                  * case, only verified origin is used for index lookup.
> > +                *
> > +                * For non-dir dentry, make sure dentry found by lookup
> > +                * matches the origin stored in upper. Otherwise its an
> > +                * error.
> 
> Umm, why do we need to be so strict?  This would break the case where
> the layers are copied with xattr intact, but the origin pointer will
> obviously be "wrong", which shouldn't be a problem, since that's only
> needed to get a unique st_ino, nothing else.

Hmm, right, this breaks the case of copied layers, which is the very reason
we moved to path based lookup for the metacopy data dentry.

So if we have an origin on upper for a metacopy file which does not match
the lower dentry found using path based lookup, we can ignore the origin
information and not look up the index either. That also means that the
inode number of upper will be reported. Given we will not use the index, that
probably will mean broken hardlinks and some strange corner cases. I will
make this change, run the tests on copied layers, and see what breaks.


> 
> >                  */
> > -               if (upperdentry && !ctr && ovl_verify_lower(dentry->d_sb)) {
> > +               if (upperdentry && !ctr &&
> > +                   ((d.is_dir && ovl_verify_lower(dentry->d_sb)) ||
> > +                    (!d.is_dir && origin_path))) {
> >                         err = ovl_verify_origin(upperdentry, this, false);
> >                         if (err) {
> >                                 dput(this);
> > -                               break;
> > +                               if (d.is_dir)
> > +                                       break;
> > +                               goto out_put;
> >                         }
> > -
> > -                       /* Bless lower dir as verified origin */
> > +                       /* Bless lower as verified origin */
> >                         origin = this;
> >                 }
> >
> > +               if (d.metacopy)
> > +                       metacopy = true;
> > +               /*
> > +                * Do not store intermediate metacopy dentries in chain,
> > +                * except top most lower metacopy dentry
> 
> I don't get it.  We need the bottom most metacopy dentry, not the
> topmost.  Am I missing something?

We are supporting mid layer metacopy dentries. That means there can be
a chain of metacopy dentries in lower layers and then data dentry at the
end (non-metacopy). We store both the ends of chain in lowerstack.

So if upper is metacopy but lower is not, then lowerstack[0] is data
dentry.

But if upper is metacopy and lower is metacopy as well, then
lowerstack[0] is topmost metacopy dentry in chain and lowerstack[1] is bottom
most data dentry (non-metacopy).

Storing topmost metacopy dentry in lower layers helps with verifying
origin logic and looking up index and gels with rest of the logic.
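
To make the layout concrete, here is a small userspace sketch of which
entries end up in lowerstack when the lower layers are walked top to
bottom. It is not the kernel code: struct layer_entry and
build_lowerstack() are made-up names for illustration, and a dentry is
reduced to an integer id.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the lookup result in one lower layer. */
struct layer_entry {
	int id;        /* identifies the dentry found in this layer */
	bool metacopy; /* true if the dentry carries the metacopy xattr */
};

/*
 * Walk the lower layers from top to bottom and fill stack[] as
 * described above: keep the topmost lower dentry, skip intermediate
 * metacopy dentries, and stop at the first non-metacopy (data) dentry.
 *
 * Returns the number of entries stored, or -1 if the chain ends in a
 * metacopy dentry with no data dentry below it.
 */
static int build_lowerstack(const struct layer_entry *layers, size_t nr,
			    int stack[2])
{
	int ctr = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		/* Intermediate metacopy dentry: do not store it. */
		if (layers[i].metacopy && ctr)
			continue;

		/* At most the top of the chain and the data dentry fit. */
		assert(ctr < 2);
		stack[ctr++] = layers[i].id;

		/* A data dentry terminates the chain. */
		if (!layers[i].metacopy)
			return ctr;
	}
	return nr ? -1 : 0;
}
```

For a chain metacopy(1) -> metacopy(2) -> data(3) this stores 1 in
stack[0] and 3 in stack[1]. A single data dentry fills only stack[0].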

> 
> We also need to check file type here, only regular file makes sense as
> metacopy, so if it's something else, then get out with EIO.

Ok, will do.

> 
> > +                */
> > +               if (d.metacopy && ctr) {
> > +                       dput(this);
> > +                       continue;
> > +               }
> > +
> >                 stack[ctr].dentry = this;
> >                 stack[ctr].layer = lower.layer;
> >                 ctr++;
> > @@ -968,13 +1004,49 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
> >                 }
> >         }
> >
> > +       if (metacopy) {
> > +               /*
> > +                * Found a metacopy dentry but did not find corresponding
> > +                * data dentry
> > +                */
> > +               if (d.metacopy) {
> > +                       err = -ESTALE;
> 
> -EIO.

Will do.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-10  9:36     ` Miklos Szeredi
  2018-05-10  9:52       ` Miklos Szeredi
@ 2018-05-10 13:17       ` Vivek Goyal
  2018-05-10 15:32       ` Vivek Goyal
  2 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-10 13:17 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: overlayfs, Amir Goldstein

On Thu, May 10, 2018 at 11:36:16AM +0200, Miklos Szeredi wrote:
> On Thu, May 10, 2018 at 11:19 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > On Mon, May 7, 2018 at 7:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> >>
> >> +               if (d.metacopy)
> >> +                       metacopy = true;
> >> +               /*
> >> +                * Do not store intermediate metacopy dentries in chain,
> >> +                * except top most lower metacopy dentry
> >
> > I don't get it.  We need the bottom most metacopy dentry, not the
> > topmost.  Am I missing something?
> 
> Okay, it's more complicated.
> 
> 1) there is an upper metacopy dentry:
> 
>  - store origin (pointed to by ORIGIN or topmost lower dentry) in stack[0]
>  - store data dentry (lowest in metacopy chain) in stack[1], unless
> it's the same as origin
> 
> 2) there is no upper dentry, but a lower metacopy dentry
> 
>  - store metacopy dentry in stack[0]
>  - store data dentry in stack[1]
> 
> Does that make more sense?

Right. That's what I am doing. lower layers can be a metacopy chain
itself (with data dentry at the end). So lowerstack[0] stores topmost
metacopy dentry and lowerstack[1] stores bottom most data dentry.

If upper metacopy dentry is present, then lowerstack[0] will be ORIGIN
too (in the case of non-copied layers). Now I will relax the check and
will have to keep some state to figure out if lowerstack[0] is a verified
ORIGIN or not.

> 
> > We also need to check file type here, only regular file makes sense as
> > metacopy, so if it's something else, then get out with EIO.
> 
> I meant file type of *data* inode.  Type of metacopy inode is already
> checked by ovl_check_metacopy_xattr().

Ok.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-10 13:14     ` Vivek Goyal
@ 2018-05-10 14:43       ` Amir Goldstein
  2018-05-10 19:42         ` Vivek Goyal
  0 siblings, 1 reply; 77+ messages in thread
From: Amir Goldstein @ 2018-05-10 14:43 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Miklos Szeredi, overlayfs

On Thu, May 10, 2018 at 4:14 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Thu, May 10, 2018 at 11:19:23AM +0200, Miklos Szeredi wrote:
>> On Mon, May 7, 2018 at 7:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> > This patch modifies ovl_lookup() and friends to lookup metacopy dentries.
>> > It also allows for presence of metacopy dentries in lower layer.
>> >
>> > During lookup, check for presence of OVL_XATTR_METACOPY and if not present,
>> > set OVL_UPPERDATA bit in flags.
>> >
>> > We don't support metacopy feature with nfs_export. So in nfs_export code,
>> > we set OVL_UPPERDATA flag set unconditionally if upper inode exists.
>> >
>> > Do not follow metacopy origin if we find a metacopy only inode and metacopy
>> > feature is not enabled for that mount. Like redirect, this can have security
>> > implications where an attacker could hand craft upper and try to gain
>> > access to file on lower which it should not have to begin with.
>> >
>> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
>> > ---
[...]

>> > @@ -925,18 +943,36 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>> >                  * When "verify_lower" feature is enabled, do not merge with a
>> >                  * lower dir that does not match a stored origin xattr. In any
>> >                  * case, only verified origin is used for index lookup.
>> > +                *
>> > +                * For non-dir dentry, make sure dentry found by lookup
>> > +                * matches the origin stored in upper. Otherwise its an
>> > +                * error.
>>
>> Umm, why we need to be so strict?  This would  break the case where
>> the layers are copied with xattr intact, but the origin pointer will
>> obviously be "wrong", which shouldn't be a problem, since that's only
>> needed to get a unique st_ino, nothing else.
>
> Hmm...., right this breaks the case of copied up layer. The very reason
> we moved to using path based lookup for metacopy data dentry.
>
> So if we have a origin on upper for metacopy file which does not match
> lower dentry found using path based lookup, we can ignore the origin
> information and don't lookup for index either. That also means that
> inode will be reported of upper. Given we will not use index, that
> probably will mean broken hardlinks and some strange corner cases. I will
> make this change and run the tests on copied layers and see what breaks.
>
>

OK, so maybe just relax below to:

>>
>> >                  */
>> > -               if (upperdentry && !ctr && ovl_verify_lower(dentry->d_sb)) {
>> > +               if (upperdentry && !ctr &&
>> > +                   ((d.is_dir && ovl_verify_lower(dentry->d_sb)) ||
>> > +                    (!d.is_dir && origin_path))) {
>> >                         err = ovl_verify_origin(upperdentry, this, false);
>> >                         if (err) {
>> >                                 dput(this);
>> > -                               break;
>> > +                               if (d.is_dir)
>> > +                                       break;

+                                       else if (ovl_verify_lower(dentry->d_sb))
+                                             goto out_put;

>> >                         }
+                           } else {
>> > -
>> > -                       /* Bless lower dir as verified origin */
>> > +                       /* Bless lower as verified origin */
>> >                         origin = this;

+                           }
>> >                 }

So at least we have the correct logic in place w.r.t
ovl_verify_lower() (i.e. don't follow unverified origin)
even though this feature is only enabled for nfs_export
and metacopy does not yet mix with nfs_export.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-10  9:36     ` Miklos Szeredi
  2018-05-10  9:52       ` Miklos Szeredi
  2018-05-10 13:17       ` Vivek Goyal
@ 2018-05-10 15:32       ` Vivek Goyal
  2018-05-10 20:21         ` Miklos Szeredi
  2 siblings, 1 reply; 77+ messages in thread
From: Vivek Goyal @ 2018-05-10 15:32 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: overlayfs, Amir Goldstein

On Thu, May 10, 2018 at 11:36:16AM +0200, Miklos Szeredi wrote:
[..]
> > We also need to check file type here, only regular file makes sense as
> > metacopy, so if it's something else, then get out with EIO.
> 
> I meant file type of *data* inode.  Type of metacopy inode is already
> checked by ovl_check_metacopy_xattr().

Hi Miklos,

IIUC, ovl_lookup_single() will make sure we don't return directory dentry
for a metacopy upper (After your suggested changes).

        if (last_element) {
                if (d->metacopy) {
                        d->stop = true;
                        goto put_and_out;
                }
                d->is_dir = true;
        }

If that's the case, then we probably don't need additional check in
ovl_lookup().

Thanks
Vivek

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-10  9:19   ` Miklos Szeredi
  2018-05-10  9:36     ` Miklos Szeredi
  2018-05-10 13:14     ` Vivek Goyal
@ 2018-05-10 19:39     ` Vivek Goyal
  2018-05-10 20:13       ` Miklos Szeredi
  2 siblings, 1 reply; 77+ messages in thread
From: Vivek Goyal @ 2018-05-10 19:39 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: overlayfs, Amir Goldstein

On Thu, May 10, 2018 at 11:19:23AM +0200, Miklos Szeredi wrote:

[..]
> > @@ -925,18 +943,36 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
> >                  * When "verify_lower" feature is enabled, do not merge with a
> >                  * lower dir that does not match a stored origin xattr. In any
> >                  * case, only verified origin is used for index lookup.
> > +                *
> > +                * For non-dir dentry, make sure dentry found by lookup
> > +                * matches the origin stored in upper. Otherwise its an
> > +                * error.
> 
> Umm, why we need to be so strict?  This would  break the case where
> the layers are copied with xattr intact, but the origin pointer will
> obviously be "wrong", which shouldn't be a problem, since that's only
> needed to get a unique st_ino, nothing else.

So I have a few questions at this point; I am not clear about what the
behavior should be.

- Layer copying does not work with index=on. So if index is on, it is fair
  to assume that layer copying will not be allowed and that configuration
  is not supported. If yes, then we could enforce this check with index=on
  and not in other cases.

- If we do not enforce ORIGIN verification for non-dir metacopy and
  overlay is mounted again after layer copy, then before data copy up we
  will report inode number of lowerstack[0]. And after data copy up it
  could be inode number of any of the following.

  A. lower inode (if previous overlay dentry has not been flushed, it will
     have CONST_INO set).
  B. upper inode (if previous overlay dentry got flushed out, and new
     lookup could not decode ORIGIN).

  C. older lower inode (if previous overlay dentry got flushed out, and
     new lookup decoded previous inode before layer copy up.)

I guess both B and C will happen with existing code even with normal copy
up. So this behavior with metacopy files might not be a huge concern.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-10 14:43       ` Amir Goldstein
@ 2018-05-10 19:42         ` Vivek Goyal
  0 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-10 19:42 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, overlayfs

On Thu, May 10, 2018 at 05:43:10PM +0300, Amir Goldstein wrote:
> On Thu, May 10, 2018 at 4:14 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > On Thu, May 10, 2018 at 11:19:23AM +0200, Miklos Szeredi wrote:
> >> On Mon, May 7, 2018 at 7:40 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> >> > This patch modifies ovl_lookup() and friends to lookup metacopy dentries.
> >> > It also allows for presence of metacopy dentries in lower layer.
> >> >
> >> > During lookup, check for presence of OVL_XATTR_METACOPY and if not present,
> >> > set OVL_UPPERDATA bit in flags.
> >> >
> >> > We don't support metacopy feature with nfs_export. So in nfs_export code,
> >> > we set OVL_UPPERDATA flag set unconditionally if upper inode exists.
> >> >
> >> > Do not follow metacopy origin if we find a metacopy only inode and metacopy
> >> > feature is not enabled for that mount. Like redirect, this can have security
> >> > implications where an attacker could hand craft upper and try to gain
> >> > access to file on lower which it should not have to begin with.
> >> >
> >> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> >> > ---
> [...]
> 
> >> > @@ -925,18 +943,36 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
> >> >                  * When "verify_lower" feature is enabled, do not merge with a
> >> >                  * lower dir that does not match a stored origin xattr. In any
> >> >                  * case, only verified origin is used for index lookup.
> >> > +                *
> >> > +                * For non-dir dentry, make sure dentry found by lookup
> >> > +                * matches the origin stored in upper. Otherwise its an
> >> > +                * error.
> >>
> >> Umm, why we need to be so strict?  This would  break the case where
> >> the layers are copied with xattr intact, but the origin pointer will
> >> obviously be "wrong", which shouldn't be a problem, since that's only
> >> needed to get a unique st_ino, nothing else.
> >
> > Hmm...., right this breaks the case of copied up layer. The very reason
> > we moved to using path based lookup for metacopy data dentry.
> >
> > So if we have a origin on upper for metacopy file which does not match
> > lower dentry found using path based lookup, we can ignore the origin
> > information and don't lookup for index either. That also means that
> > inode will be reported of upper. Given we will not use index, that
> > probably will mean broken hardlinks and some strange corner cases. I will
> > make this change and run the tests on copied layers and see what breaks.
> >
> >
> 
> OK, so maybe just relax below to:
> 
> >>
> >> >                  */
> >> > -               if (upperdentry && !ctr && ovl_verify_lower(dentry->d_sb)) {
> >> > +               if (upperdentry && !ctr &&
> >> > +                   ((d.is_dir && ovl_verify_lower(dentry->d_sb)) ||
> >> > +                    (!d.is_dir && origin_path))) {
> >> >                         err = ovl_verify_origin(upperdentry, this, false);
> >> >                         if (err) {
> >> >                                 dput(this);
> >> > -                               break;
> >> > +                               if (d.is_dir)
> >> > +                                       break;
> 
> +                                       else if (ovl_verify_lower(dentry->d_sb))

Amir, 

As I asked in the other email, should we make it conditional on
config.index instead?  IOW, if indexing is enabled, we will have ORIGIN on
upper and we need to make sure it matches the lower found by path based
lookup. Layer copying will not work in this case. Anyway, IIUC, with
index=on, layer copying does not work (at least the lower layer can't be
copied).

Layer copying will work for the cases of index=off. And in that case we
will not enforce ORIGIN verification of non-dir metacopy. Given index
is off, we don't have to worry about using this lower to lookup for
index. We can use it to report inode number of lower.

And this means we will have broken hard links with layer copy use case.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-10 19:39     ` Vivek Goyal
@ 2018-05-10 20:13       ` Miklos Szeredi
  2018-05-11  7:29         ` Miklos Szeredi
  0 siblings, 1 reply; 77+ messages in thread
From: Miklos Szeredi @ 2018-05-10 20:13 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Amir Goldstein

On Thu, May 10, 2018 at 9:39 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Thu, May 10, 2018 at 11:19:23AM +0200, Miklos Szeredi wrote:
>
> [..]
>> > @@ -925,18 +943,36 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>> >                  * When "verify_lower" feature is enabled, do not merge with a
>> >                  * lower dir that does not match a stored origin xattr. In any
>> >                  * case, only verified origin is used for index lookup.
>> > +                *
>> > +                * For non-dir dentry, make sure dentry found by lookup
>> > +                * matches the origin stored in upper. Otherwise its an
>> > +                * error.
>>
>> Umm, why we need to be so strict?  This would  break the case where
>> the layers are copied with xattr intact, but the origin pointer will
>> obviously be "wrong", which shouldn't be a problem, since that's only
>> needed to get a unique st_ino, nothing else.
>
> So I have few questions at this point of time, I am not clear about what
> should be the behavior.
>
> - Layer copying does not work with index=on. So if index is on, it is fair
>   to assume that layer copying will not be allowed and that configuration
>   is not supported. If yes, then we could enforce this check with index=on
>   and not in other cases.

It does work, except for hard links.  If we unconditionally add
redirects to hard links, then fsck.overlay can recreate the index on
the copied version.


>
> - If we do not enforce ORIGIN verification for non-dir metacopy and
>   overlay is mounted again after layer copy, then before data copy up we
>   will report inode number of lowerstack[0]. And after data copy up it
>   could be inode number of any of the following.
>
>   A. lower inode (if previous overlay dentry has not been flushed, it will
>                   have CONST_INO set.).
>   B. upper inode (if previous overlay dentry got flushed out, and new
>      lookup could not decode ORIGIN).
>
>   C. older lower inode (if previous overlay dentry got flushed out, and
>      new lookup decoded previous inode before layer copy up.)
>
> I guess both B and C will happen with existing code even with normal copy
> up. So this behavior with metacopy files might not be a huge concern.

How about this:

 - if upper has REDIRECT, then use that to determine origin (just like
directories)
 - else if upper has METACOPY, then go down one level to determine
origin (just like directories)
 - else if upper has ORIGIN, then use that as origin (not applicable
to directories (*))
 - else no origin (just like directories)

Haven't yet thought about how that works out in various layer copy scenarios...

Thanks,
Miklos

(*) There's a similar case for directories caused by
ovl_clear_empty().  If the opaque empty dir created in place of the
directory with whiteouts manages to survive due to failure to remove,
then it will have ORIGIN but layers won't be followed because it's
opaque.  We do not correctly handle this corner case, though, so we'd
see the inode number of the directory change in this case.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-10 15:32       ` Vivek Goyal
@ 2018-05-10 20:21         ` Miklos Szeredi
  0 siblings, 0 replies; 77+ messages in thread
From: Miklos Szeredi @ 2018-05-10 20:21 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Amir Goldstein

On Thu, May 10, 2018 at 5:32 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Thu, May 10, 2018 at 11:36:16AM +0200, Miklos Szeredi wrote:
> [..]
>> > We also need to check file type here, only regular file makes sense as
>> > metacopy, so if it's something else, then get out with EIO.
>>
>> I meant file type of *data* inode.  Type of metacopy inode is already
>> checked by ovl_check_metacopy_xattr().
>
> Hi Miklos,
>
> IIUC, ovl_lookup_single() will make sure we don't return directory dentry
> for a metacopy upper (After your suggested changes).
>
>         if (last_element) {
>                 if (d->metacopy) {
>                         d->stop = true;
>                         goto put_and_out;
>                 }
>                 d->is_dir = true;
>         }
>
> If that's the case, then we probably don't need additional check in
> ovl_lookup().

It should check d_is_reg() explicitly.  Being non-dir doesn't mean
it's a regular file.
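
A userspace analogy of the distinction, using the POSIX S_ISREG() mode
macro rather than the kernel's d_is_reg() dentry helper.
valid_data_mode() is a hypothetical name, not something in the patch.

```c
/* Ask <sys/stat.h> for the full set of S_IF* constants in strict modes. */
#define _XOPEN_SOURCE 700

#include <stdbool.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: only a regular file is acceptable as the data
 * inode of a metacopy file.  Testing !S_ISDIR() alone would also let
 * symlinks, fifos, sockets and device nodes through.
 */
static bool valid_data_mode(mode_t mode)
{
	return S_ISREG(mode);
}
```

A fifo or a character device is "not a directory", yet
valid_data_mode() rejects both.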

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-10 20:13       ` Miklos Szeredi
@ 2018-05-11  7:29         ` Miklos Szeredi
  2018-05-11  7:52           ` Amir Goldstein
  0 siblings, 1 reply; 77+ messages in thread
From: Miklos Szeredi @ 2018-05-11  7:29 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Amir Goldstein

On Thu, May 10, 2018 at 10:13 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:

>  - if upper has REDIRECT, then use that to determine origin (just like
> directories)
>  - else if upper has METACOPY, then go down one level to determine
> origin (just like directories)
>  - else if upper has ORIGIN, then use that as origin (not applicable
> to directories (*))
>  - else no origin (just like directories)

Desirable way to find origin summarized in a table:

lookup by   dir            reg       special
--------------------------------------------
redirect    REDIRECT       REDIRECT  n/a
next layer  default        METACOPY  n/a
origin fh   OPAQUE+ORIGIN  ORIGIN    ORIGIN
stop        OPAQUE         default   default
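
The same table as a rough C sketch, in case that is easier to argue
about. All names here (struct upper_xattrs, find_origin_by(), the enums)
are invented for illustration. The table does not say what happens when
a directory has both OPAQUE and REDIRECT, so testing OPAQUE first is an
assumption.

```c
#include <stdbool.h>

enum ftype { FT_DIR, FT_REG, FT_SPECIAL };
enum lookup_by { BY_REDIRECT, BY_NEXT_LAYER, BY_ORIGIN_FH, BY_STOP };

/* Which overlay xattrs are present on the upper dentry (hypothetical). */
struct upper_xattrs {
	bool redirect;
	bool metacopy;
	bool origin;
	bool opaque;
};

/* Pick the method used to find the origin, one case per table column. */
static enum lookup_by find_origin_by(enum ftype type,
				     const struct upper_xattrs *x)
{
	switch (type) {
	case FT_DIR:
		/* Assumption: OPAQUE is tested before REDIRECT. */
		if (x->opaque)
			return x->origin ? BY_ORIGIN_FH : BY_STOP;
		return x->redirect ? BY_REDIRECT : BY_NEXT_LAYER;
	case FT_REG:
		if (x->redirect)
			return BY_REDIRECT;
		if (x->metacopy)
			return BY_NEXT_LAYER;
		return x->origin ? BY_ORIGIN_FH : BY_STOP;
	case FT_SPECIAL:
		return x->origin ? BY_ORIGIN_FH : BY_STOP;
	}
	return BY_STOP;
}
```

Note that a regular file with both METACOPY and ORIGIN resolves to the
next layer, so ORIGIN is only consulted when neither REDIRECT nor
METACOPY is set.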

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-11  7:29         ` Miklos Szeredi
@ 2018-05-11  7:52           ` Amir Goldstein
  2018-05-11  8:13             ` Miklos Szeredi
  0 siblings, 1 reply; 77+ messages in thread
From: Amir Goldstein @ 2018-05-11  7:52 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Vivek Goyal, overlayfs

On Fri, May 11, 2018 at 10:29 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Thu, May 10, 2018 at 10:13 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>
>>  - if upper has REDIRECT, then use that to determine origin (just like
>> directories)
>>  - else if upper has METACOPY, then go down one level to determine
>> origin (just like directories)
>>  - else if upper has ORIGIN, then use that as origin (not applicable
>> to directories (*))
>>  - else no origin (just like directories)
>
> Desirable way to find origin summarized in a table:
>
> lookup by   dir            reg       special
> --------------------------------------------
> redirect    REDIRECT       REDIRECT  n/a
> next layer  default        METACOPY  n/a
> origin fh   OPAQUE+ORIGIN  ORIGIN    ORIGIN
> stop        OPAQUE         default   default
>

OK, but how to determine the value of indexed_by and hashed_by
when next_layer and origin_fh do not agree?

Between the lines I read that your answer is next_layer
overrules origin_fh (as we have for directories).

What I am concerned about is an inconsistency where the
index entry name does not match the ORIGIN xattr of that
index entry.

First of all this will fail index verification on mount and may
need to be fixed.

I am also concerned that there are bugs lurking in the form
of hardlink inode that can be hashed in two different ways
depending on how it is looked up.

I am not saying this is not solvable, just that it is more complex
than the table above and I wouldn't want Vivek to have to untangle
all this mess with current patch series.

So for compromise, I suggest to follow Vivek's suggestion to
enforce next_layer == origin_fh if index=on,metacopy=on.

This does not break the copied layer use case, because with
copied layers, lower layer origin_fh verification on mount will
fail and won't allow to set index=on.

This is similar to the compromise that we have made with
nfs_export so we won't need to deal with ovl_fh generations
when index name does not match origin fh.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-11  7:52           ` Amir Goldstein
@ 2018-05-11  8:13             ` Miklos Szeredi
  2018-05-11 12:28               ` Vivek Goyal
  0 siblings, 1 reply; 77+ messages in thread
From: Miklos Szeredi @ 2018-05-11  8:13 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Vivek Goyal, overlayfs

On Fri, May 11, 2018 at 9:52 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Fri, May 11, 2018 at 10:29 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Thu, May 10, 2018 at 10:13 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>
>>>  - if upper has REDIRECT, then use that to determine origin (just like
>>> directories)
>>>  - else if upper has METACOPY, then go down one level to determine
>>> origin (just like directories)
>>>  - else if upper has ORIGIN, then use that as origin (not applicable
>>> to directories (*))
>>>  - else no origin (just like directories)
>>
>> Desirable way to find origin summarized in a table:
>>
>> lookup by   dir            reg       special
>> --------------------------------------------
>> redirect    REDIRECT       REDIRECT  n/a
>> next layer  default        METACOPY  n/a
>> origin fh   OPAQUE+ORIGIN  ORIGIN    ORIGIN
>> stop        OPAQUE         default   default
>>
>
> OK, but how to determine the value of indexed_by and hashed_by
> when next_layer and origin_fh do not agree?
>
> Between the lines I read that your answer is next_layer
> overrules origin_fh (as we have for directories).

Right.

>
> What I am concerned about is inconsistency of the
> form that index entry name, does not match the ORIGIN
> xattr of that index entry.
>
> First of all this will fail index verification on mount and may
> need to be fixed.
>
> I am also concerned that there are bugs lurking in the form
> of hardlink inode that can be hashed in two different ways
> depending on how is is looked up.

Hah, good point.

Basically what you are saying is that an overlay created with an older
version (always using ORIGIN) and modified with a newer version (maybe
using REDIRECT or METACOPY, maybe ORIGIN) is not going to work
consistently.

I suspect the proper way out of that is to never use ORIGIN, just do
what we do for directories.  Lookup using dcache isn't going to be all
that much slower than lookup using file handles.   Looking up absolute
redirects may be slower, but I think mass renaming of files is not a
very common use case.

> I am not saying this is not solvable, just that it is more complex
> than the table above and I wouldn't want Vivek to have to untangle
> all this mess with current patch series.
>
> So for compromise, I suggest to follow Vivek's suggestion to
> enforce next_layer == origin_fh if index=on,metacopy=on.
>
> This does not break the copied layer use case, because with
> copied layers, lower layer origin_fh verification on mount will
> fail and won't allow to set index=on.
>
> This is similar to the compromise that we have made with
> nfs_export so we won't need to deal with ovl_fh generations
> when index name does not match origin fh.

Okay, let's do that for now.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-11  8:13             ` Miklos Szeredi
@ 2018-05-11 12:28               ` Vivek Goyal
  0 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-11 12:28 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Amir Goldstein, overlayfs

On Fri, May 11, 2018 at 10:13:34AM +0200, Miklos Szeredi wrote:

[..]
> > I am not saying this is not solvable, just that it is more complex
> > than the table above and I wouldn't want Vivek to have to untangle
> > all this mess with current patch series.
> >
> > So for compromise, I suggest to follow Vivek's suggestion to
> > enforce next_layer == origin_fh if index=on,metacopy=on.
> >
> > This does not break the copied layer use case, because with
> > copied layers, lower layer origin_fh verification on mount will
> > fail and won't allow to set index=on.
> >
> > This is similar to the compromise that we have made with
> > nfs_export so we won't need to deal with ovl_fh generations
> > when index name does not match origin fh.
> 
> Okay, lets do that for now.

Cool. Let me modify the patch.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-07 17:40 ` [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry Vivek Goyal
  2018-05-07 19:14   ` Amir Goldstein
  2018-05-10  9:19   ` Miklos Szeredi
@ 2018-05-11 14:30   ` Vivek Goyal
  2018-05-11 15:05     ` Amir Goldstein
  2018-05-11 15:52   ` Vivek Goyal
  3 siblings, 1 reply; 77+ messages in thread
From: Vivek Goyal @ 2018-05-11 14:30 UTC (permalink / raw)
  To: linux-unionfs, Miklos Szeredi, Amir Goldstein

Hi Miklos, Amir,

Please find attached V2 of the patch. I have taken care of the comments and
have again pushed the latest patches to the metacopy-next branch.

https://github.com/rhvgoyal/linux/commits/metacopy-next

Changes:
- Use EIO instead of ESTALE at a few places.
- Make sure metacopy dentry or data dentry is a regular file.
- Modified ovl_lookup_single() for better error handling of bad
  redirects (Miklos's feedback).
- In ovl_lookup(), for non-dir, if lower does not match against ORIGIN,
  then we error out only if index=on. Otherwise we continue to use the
  dentry found by path based lookup.
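
The last item, reduced to a decision function. This is only a sketch of
the policy for non-dir dentries and not the code in the patch; the enum
and nondir_origin_policy() are hypothetical names.

```c
#include <stdbool.h>

enum origin_result {
	ORIGIN_VERIFIED, /* lower matches ORIGIN, usable for index lookup */
	ORIGIN_IGNORED,  /* mismatch tolerated, use path based lookup result */
	ORIGIN_ERROR,    /* mismatch with index=on, fail the lookup */
};

/*
 * Policy for a non-dir upper that carries ORIGIN: a mismatch between
 * ORIGIN and the lower dentry found by path is fatal only with index=on.
 */
static enum origin_result nondir_origin_policy(bool origin_matches,
					       bool index_on)
{
	if (origin_matches)
		return ORIGIN_VERIFIED;
	return index_on ? ORIGIN_ERROR : ORIGIN_IGNORED;
}
```

With index=off a copied layer (where ORIGIN no longer decodes to the
lower dentry) therefore still mounts and looks up fine, at the cost of
an unverified origin.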

Thanks
Vivek

Subject: ovl: Modify ovl_lookup() and friends to lookup metacopy dentry

This patch modifies ovl_lookup() and friends to look up metacopy dentries.
It also allows for the presence of metacopy dentries in a lower layer.

During lookup, check for the presence of OVL_XATTR_METACOPY and, if it is
not present, set the OVL_UPPERDATA bit in flags.

We don't support the metacopy feature with nfs_export. So in the nfs_export
code, we set the OVL_UPPERDATA flag unconditionally if an upper inode exists.

Do not follow the metacopy origin if we find a metacopy-only inode and the
metacopy feature is not enabled for that mount. Like redirect, this can have
security implications: an attacker could hand-craft an upper layer and try
to gain access to a file on lower that it should not be able to access.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/export.c    |    3 +
 fs/overlayfs/inode.c     |   11 +++-
 fs/overlayfs/namei.c     |  125 ++++++++++++++++++++++++++++++++++++++++-------
 fs/overlayfs/overlayfs.h |    1 
 fs/overlayfs/util.c      |   22 ++++++++
 5 files changed, 143 insertions(+), 19 deletions(-)

Index: rhvgoyal-linux/fs/overlayfs/namei.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/namei.c	2018-05-10 15:53:00.858878145 -0400
+++ rhvgoyal-linux/fs/overlayfs/namei.c	2018-05-11 09:17:38.751646662 -0400
@@ -24,6 +24,7 @@ struct ovl_lookup_data {
 	bool stop;
 	bool last;
 	char *redirect;
+	bool metacopy;
 };
 
 static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
@@ -253,19 +254,33 @@ static int ovl_lookup_single(struct dent
 		goto put_and_out;
 	}
 	if (!d_can_lookup(this)) {
-		d->stop = true;
-		if (d->is_dir)
+		if (d->is_dir || !last_element) {
+			d->stop = true;
 			goto put_and_out;
-
+		}
+		err = ovl_check_metacopy_xattr(this);
+		if (err < 0)
+			goto out_err;
 		/*
-		 * NB: handle failure to lookup non-last element when non-dir
-		 * redirects become possible
+		 * This dentry should be a regular file if this is
+		 * a metacopy dentry or previous layer lookup found
+		 * a metacopy dentry.
 		 */
-		WARN_ON(!last_element);
+		if ((err || d->metacopy) && !d_is_reg(this)) {
+			d->stop = true;
+			goto put_and_out;
+		}
+		d->stop = !err;
+		d->metacopy = !!err;
 		goto out;
 	}
-	if (last_element)
+	if (last_element) {
+		if (d->metacopy) {
+			d->stop = true;
+			goto put_and_out;
+		}
 		d->is_dir = true;
+	}
 	if (d->last)
 		goto out;
 
@@ -823,7 +838,7 @@ struct dentry *ovl_lookup(struct inode *
 	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
 	struct ovl_entry *poe = dentry->d_parent->d_fsdata;
 	struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata;
-	struct ovl_path *stack = NULL;
+	struct ovl_path *stack = NULL, *origin_path = NULL;
 	struct dentry *upperdir, *upperdentry = NULL;
 	struct dentry *origin = NULL;
 	struct dentry *index = NULL;
@@ -834,6 +849,7 @@ struct dentry *ovl_lookup(struct inode *
 	struct dentry *this;
 	unsigned int i;
 	int err;
+	bool metacopy = false;
 	struct ovl_lookup_data d = {
 		.name = dentry->d_name,
 		.is_dir = false,
@@ -841,6 +857,7 @@ struct dentry *ovl_lookup(struct inode *
 		.stop = false,
 		.last = ofs->config.redirect_follow ? false : !poe->numlower,
 		.redirect = NULL,
+		.metacopy = false,
 	};
 
 	if (dentry->d_name.len > ofs->namelen)
@@ -859,7 +876,8 @@ struct dentry *ovl_lookup(struct inode *
 			goto out;
 		}
 		if (upperdentry && !d.is_dir) {
-			BUG_ON(!d.stop || d.redirect);
+			unsigned int origin_ctr = 0;
+			BUG_ON(d.redirect);
 			/*
 			 * Lookup copy up origin by decoding origin file handle.
 			 * We may get a disconnected dentry, which is fine,
@@ -870,9 +888,13 @@ struct dentry *ovl_lookup(struct inode *
 			 * number - it's the same as if we held a reference
 			 * to a dentry in lower layer that was moved under us.
 			 */
-			err = ovl_check_origin(ofs, upperdentry, &stack, &ctr);
+			err = ovl_check_origin(ofs, upperdentry, &origin_path,
+					       &origin_ctr);
 			if (err)
 				goto out_put_upper;
+
+			if (d.metacopy)
+				metacopy = true;
 		}
 
 		if (d.redirect) {
@@ -913,7 +935,7 @@ struct dentry *ovl_lookup(struct inode *
 		 * If no origin fh is stored in upper of a merge dir, store fh
 		 * of lower dir and set upper parent "impure".
 		 */
-		if (upperdentry && !ctr && !ofs->noxattr) {
+		if (upperdentry && !ctr && !ofs->noxattr && d.is_dir) {
 			err = ovl_fix_origin(dentry, this, upperdentry);
 			if (err) {
 				dput(this);
@@ -925,16 +947,39 @@ struct dentry *ovl_lookup(struct inode *
 		 * When "verify_lower" feature is enabled, do not merge with a
 		 * lower dir that does not match a stored origin xattr. In any
 		 * case, only verified origin is used for index lookup.
+		 *
+		 * For non-dir dentry, if index=on, then ensure origin
+		 * matches the dentry found using path based lookup,
+		 * otherwise error out. If index=off, then do not error
+		 * out if origin does not match.
 		 */
-		if (upperdentry && !ctr && ovl_verify_lower(dentry->d_sb)) {
+		if (upperdentry && !ctr &&
+		    ((d.is_dir && ovl_verify_lower(dentry->d_sb)) ||
+		     (!d.is_dir && origin_path))) {
 			err = ovl_verify_origin(upperdentry, this, false);
 			if (err) {
-				dput(this);
-				break;
+				if (d.is_dir) {
+					dput(this);
+					break;
+				} else if (ofs->config.index) {
+					dput(this);
+					goto out_put;
+				}
+			} else {
+				/* Bless lower as verified origin */
+				origin = this;
 			}
+		}
 
-			/* Bless lower dir as verified origin */
-			origin = this;
+		if (d.metacopy)
+			metacopy = true;
+		/*
+		 * Do not store intermediate metacopy dentries in chain,
+		 * except top most lower metacopy dentry
+		 */
+		if (d.metacopy && ctr) {
+			dput(this);
+			continue;
 		}
 
 		stack[ctr].dentry = this;
@@ -968,13 +1013,49 @@ struct dentry *ovl_lookup(struct inode *
 		}
 	}
 
+	if (metacopy) {
+		/*
+		 * Found a metacopy dentry but did not find corresponding
+		 * data dentry
+		 */
+		if (d.metacopy) {
+			err = -EIO;
+			goto out_put;
+		}
+
+		err = -EPERM;
+		if (!ofs->config.metacopy) {
+			pr_warn_ratelimited("overlay: refusing to follow"
+					    " metacopy origin for (%pd2)\n",
+					    dentry);
+			goto out_put;
+		}
+	} else if (!d.is_dir && upperdentry && !ctr && origin_path) {
+		if (WARN_ON(stack != NULL)) {
+			err = -EIO;
+			goto out_put;
+		}
+		stack = origin_path;
+		ctr = 1;
+		origin_path = NULL;
+	}
+
 	/*
 	 * Lookup index by lower inode and verify it matches upper inode.
 	 * We only trust dir index if we verified that lower dir matches
 	 * origin, otherwise dir index entries may be inconsistent and we
-	 * ignore them. Always lookup index of non-dir and non-upper.
+	 * ignore them.
+	 *
+	 * For non-dir upper metacopy dentry, we already set "origin" if we
+	 * verified that lower matched upper origin. If upper origin was
+	 * not present (because lower layer did not support fh encode/decode),
+	 * or origin did not match, do not set "origin" and skip looking up
+	 * index. This case should be handled in same way as a non-dir upper
+	 * without ORIGIN is handled.
+	 *
+	 * Always lookup index of non-dir non-metacopy and non-upper.
 	 */
-	if (ctr && (!upperdentry || !d.is_dir))
+	if (ctr && (!upperdentry || (!d.is_dir && !metacopy)))
 		origin = stack[0].dentry;
 
 	if (origin && ovl_indexdir(dentry->d_sb) &&
@@ -1019,6 +1100,10 @@ struct dentry *ovl_lookup(struct inode *
 	}
 
 	revert_creds(old_cred);
+	if (origin_path) {
+		dput(origin_path->dentry);
+		kfree(origin_path);
+	}
 	dput(index);
 	kfree(stack);
 	kfree(d.redirect);
@@ -1033,6 +1118,10 @@ out_put:
 		dput(stack[i].dentry);
 	kfree(stack);
 out_put_upper:
+	if (origin_path) {
+		dput(origin_path->dentry);
+		kfree(origin_path);
+	}
 	dput(upperdentry);
 	kfree(upperredirect);
 out:
Index: rhvgoyal-linux/fs/overlayfs/export.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/export.c	2018-05-10 15:53:00.836878145 -0400
+++ rhvgoyal-linux/fs/overlayfs/export.c	2018-05-10 15:53:01.054878145 -0400
@@ -317,6 +317,9 @@ static struct dentry *ovl_obtain_alias(s
 		return ERR_CAST(inode);
 	}
 
+	if (upper)
+		ovl_set_flag(OVL_UPPERDATA, inode);
+
 	dentry = d_find_any_alias(inode);
 	if (!dentry) {
 		dentry = d_alloc_anon(inode->i_sb);
Index: rhvgoyal-linux/fs/overlayfs/overlayfs.h
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/overlayfs.h	2018-05-10 15:53:01.007878145 -0400
+++ rhvgoyal-linux/fs/overlayfs/overlayfs.h	2018-05-11 09:17:02.037646662 -0400
@@ -274,6 +274,7 @@ bool ovl_need_index(struct dentry *dentr
 int ovl_nlink_start(struct dentry *dentry, bool *locked);
 void ovl_nlink_end(struct dentry *dentry, bool locked);
 int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir);
+int ovl_check_metacopy_xattr(struct dentry *dentry);
 
 static inline bool ovl_is_impuredir(struct dentry *dentry)
 {
Index: rhvgoyal-linux/fs/overlayfs/inode.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/inode.c	2018-05-10 15:53:01.031878145 -0400
+++ rhvgoyal-linux/fs/overlayfs/inode.c	2018-05-11 09:17:01.656646662 -0400
@@ -771,7 +771,7 @@ struct inode *ovl_get_inode(struct super
 	bool bylower = ovl_hash_bylower(sb, upperdentry, lowerdentry,
 					oip->index);
 	int fsid = bylower ? oip->lowerpath->layer->fsid : 0;
-	bool is_dir;
+	bool is_dir, metacopy = false;
 	unsigned long ino = 0;
 	int err = -ENOMEM;
 
@@ -831,6 +831,15 @@ struct inode *ovl_get_inode(struct super
 	if (oip->index)
 		ovl_set_flag(OVL_INDEX, inode);
 
+	if (upperdentry) {
+		err = ovl_check_metacopy_xattr(upperdentry);
+		if (err < 0)
+			goto out_err;
+		metacopy = err;
+		if (!metacopy)
+			ovl_set_flag(OVL_UPPERDATA, inode);
+	}
+
 	OVL_I(inode)->redirect = oip->redirect;
 
 	/* Check for non-merge dir that may have whiteouts */
Index: rhvgoyal-linux/fs/overlayfs/util.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/util.c	2018-05-10 15:53:01.008878145 -0400
+++ rhvgoyal-linux/fs/overlayfs/util.c	2018-05-11 09:17:02.037646662 -0400
@@ -778,3 +778,25 @@ err:
 	pr_err("overlayfs: failed to lock workdir+upperdir\n");
 	return -EIO;
 }
+
+/* err < 0, 0 if no metacopy xattr, 1 if metacopy xattr found */
+int ovl_check_metacopy_xattr(struct dentry *dentry)
+{
+	int res;
+
+	/* Only regular files can have metacopy xattr */
+	if (!S_ISREG(d_inode(dentry)->i_mode))
+		return 0;
+
+	res = vfs_getxattr(dentry, OVL_XATTR_METACOPY, NULL, 0);
+	if (res < 0) {
+		if (res == -ENODATA || res == -EOPNOTSUPP)
+			return 0;
+		goto out;
+	}
+
+	return 1;
+out:
+	pr_warn_ratelimited("overlayfs: failed to get metacopy (%i)\n", res);
+	return res;
+}

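The non-directory branch that this patch adds to ovl_lookup_single() can be
read as a small state machine over d->stop and d->metacopy. The following is
a minimal userspace sketch of that branch only, not kernel code. The names
struct lookup_state, enum lookup_result and model_lookup_nondir() are
hypothetical and exist only for illustration; xattr_res is assumed to follow
the return convention of ovl_check_metacopy_xattr() in the patch (negative on
error, 0 if no metacopy xattr, 1 if the xattr is present).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the struct ovl_lookup_data fields used here. */
struct lookup_state {
	bool is_dir;	/* a higher layer already resolved this name to a dir */
	bool stop;	/* do not look into further lower layers */
	bool metacopy;	/* the last accepted dentry was metadata-only */
};

enum lookup_result { LOOKUP_KEEP, LOOKUP_DROP, LOOKUP_ERROR };

/*
 * Models the !d_can_lookup(this) branch of ovl_lookup_single().
 * is_reg says whether the dentry found in this layer is a regular file.
 */
static enum lookup_result
model_lookup_nondir(struct lookup_state *d, bool last_element,
		    int xattr_res, bool is_reg)
{
	assert(d != NULL);

	/* A non-dir under a dir, or in the middle of a path: stop here. */
	if (d->is_dir || !last_element) {
		d->stop = true;
		return LOOKUP_DROP;
	}
	if (xattr_res < 0)
		return LOOKUP_ERROR;

	/* Metacopy dentries and their data dentries must be regular files. */
	if ((xattr_res || d->metacopy) && !is_reg) {
		d->stop = true;
		return LOOKUP_DROP;
	}

	/* Keep descending only while we are still looking for the data. */
	d->stop = !xattr_res;
	d->metacopy = !!xattr_res;
	return LOOKUP_KEEP;
}
```

In this model a metacopy dentry leaves stop clear so that the next lower
layer is searched for data, and the first dentry without the xattr ends the
walk and clears metacopy.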

* Re: [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-11 14:30   ` Vivek Goyal
@ 2018-05-11 15:05     ` Amir Goldstein
  2018-05-11 15:14       ` Vivek Goyal
  0 siblings, 1 reply; 77+ messages in thread
From: Amir Goldstein @ 2018-05-11 15:05 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: overlayfs, Miklos Szeredi

On Fri, May 11, 2018 at 5:30 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> Hi Miklos, Amir,
>
> Please find attached V2 of the patch. I have taken care of comments. I
> have again pushed latest patches on metacopy-next branch.
>
> https://github.com/rhvgoyal/linux/commits/metacopy-next
>
> Changes:
> - Use EIO instead of ESTALE at few places.
> - Make sure metacopy dentry or data dentry is a regular file.
> - Modified ovl_lookup_single() for better error handling of bad
>   redirects (Miklos's feedback).
> - In ovl_lookup(), for non-dir, if lower does not match against ORIGIN,
>   then we error out only if index=on. Otherwise we continue to use the
>   dentry found by path based lookup.
>
> Thanks
> Vivek
>
> Subject: ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
>
> This patch modifies ovl_lookup() and friends to lookup metacopy dentries.
> It also allows for presence of metacopy dentries in lower layer.
>
> During lookup, check for presence of OVL_XATTR_METACOPY and if not present,
> set OVL_UPPERDATA bit in flags.
>
> We don't support metacopy feature with nfs_export. So in nfs_export code,
> we set OVL_UPPERDATA flag unconditionally if upper inode exists.
>
> Do not follow metacopy origin if we find a metacopy only inode and metacopy
> feature is not enabled for that mount. Like redirect, this can have security
> implications where an attacker could hand craft upper and try to gain
> access to file on lower which it should not have to begin with.
>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/overlayfs/export.c    |    3 +
>  fs/overlayfs/inode.c     |   11 +++-
>  fs/overlayfs/namei.c     |  125 ++++++++++++++++++++++++++++++++++++++++-------
>  fs/overlayfs/overlayfs.h |    1
>  fs/overlayfs/util.c      |   22 ++++++++
>  5 files changed, 143 insertions(+), 19 deletions(-)
>
> Index: rhvgoyal-linux/fs/overlayfs/namei.c
> ===================================================================
> --- rhvgoyal-linux.orig/fs/overlayfs/namei.c    2018-05-10 15:53:00.858878145 -0400
> +++ rhvgoyal-linux/fs/overlayfs/namei.c 2018-05-11 09:17:38.751646662 -0400
> @@ -24,6 +24,7 @@ struct ovl_lookup_data {
>         bool stop;
>         bool last;
>         char *redirect;
> +       bool metacopy;
>  };
>
>  static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
> @@ -253,19 +254,33 @@ static int ovl_lookup_single(struct dent
>                 goto put_and_out;
>         }
>         if (!d_can_lookup(this)) {
> -               d->stop = true;
> -               if (d->is_dir)
> +               if (d->is_dir || !last_element) {
> +                       d->stop = true;
>                         goto put_and_out;
> -
> +               }
> +               err = ovl_check_metacopy_xattr(this);
> +               if (err < 0)
> +                       goto out_err;
>                 /*
> -                * NB: handle failure to lookup non-last element when non-dir
> -                * redirects become possible
> +                * This dentry should be a regular file if this is
> +                * a metacopy dentry or previous layer lookup found
> +                * a metacopy dentry.
>                  */
> -               WARN_ON(!last_element);
> +               if ((err || d->metacopy) && !d_is_reg(this)) {
> +                       d->stop = true;
> +                       goto put_and_out;
> +               }
> +               d->stop = !err;
> +               d->metacopy = !!err;
>                 goto out;
>         }
> -       if (last_element)
> +       if (last_element) {
> +               if (d->metacopy) {
> +                       d->stop = true;
> +                       goto put_and_out;
> +               }
>                 d->is_dir = true;
> +       }
>         if (d->last)
>                 goto out;
>
> @@ -823,7 +838,7 @@ struct dentry *ovl_lookup(struct inode *
>         struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
>         struct ovl_entry *poe = dentry->d_parent->d_fsdata;
>         struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata;
> -       struct ovl_path *stack = NULL;
> +       struct ovl_path *stack = NULL, *origin_path = NULL;
>         struct dentry *upperdir, *upperdentry = NULL;
>         struct dentry *origin = NULL;
>         struct dentry *index = NULL;
> @@ -834,6 +849,7 @@ struct dentry *ovl_lookup(struct inode *
>         struct dentry *this;
>         unsigned int i;
>         int err;
> +       bool metacopy = false;
>         struct ovl_lookup_data d = {
>                 .name = dentry->d_name,
>                 .is_dir = false,
> @@ -841,6 +857,7 @@ struct dentry *ovl_lookup(struct inode *
>                 .stop = false,
>                 .last = ofs->config.redirect_follow ? false : !poe->numlower,
>                 .redirect = NULL,
> +               .metacopy = false,
>         };
>
>         if (dentry->d_name.len > ofs->namelen)
> @@ -859,7 +876,8 @@ struct dentry *ovl_lookup(struct inode *
>                         goto out;
>                 }
>                 if (upperdentry && !d.is_dir) {
> -                       BUG_ON(!d.stop || d.redirect);
> +                       unsigned int origin_ctr = 0;
> +                       BUG_ON(d.redirect);
>                         /*
>                          * Lookup copy up origin by decoding origin file handle.
>                          * We may get a disconnected dentry, which is fine,
> @@ -870,9 +888,13 @@ struct dentry *ovl_lookup(struct inode *
>                          * number - it's the same as if we held a reference
>                          * to a dentry in lower layer that was moved under us.
>                          */
> -                       err = ovl_check_origin(ofs, upperdentry, &stack, &ctr);
> +                       err = ovl_check_origin(ofs, upperdentry, &origin_path,
> +                                              &origin_ctr);
>                         if (err)
>                                 goto out_put_upper;
> +
> +                       if (d.metacopy)
> +                               metacopy = true;
>                 }
>
>                 if (d.redirect) {
> @@ -913,7 +935,7 @@ struct dentry *ovl_lookup(struct inode *
>                  * If no origin fh is stored in upper of a merge dir, store fh
>                  * of lower dir and set upper parent "impure".
>                  */
> -               if (upperdentry && !ctr && !ofs->noxattr) {
> +               if (upperdentry && !ctr && !ofs->noxattr && d.is_dir) {
>                         err = ovl_fix_origin(dentry, this, upperdentry);
>                         if (err) {
>                                 dput(this);
> @@ -925,16 +947,39 @@ struct dentry *ovl_lookup(struct inode *
>                  * When "verify_lower" feature is enabled, do not merge with a
>                  * lower dir that does not match a stored origin xattr. In any
>                  * case, only verified origin is used for index lookup.
> +                *
> +                * For non-dir dentry, if index=on, then ensure origin
> +                * matches the dentry found using path based lookup,
> +                * otherwise error out. If index=off, then do not error
> +                * out if origin does not match.
>                  */
> -               if (upperdentry && !ctr && ovl_verify_lower(dentry->d_sb)) {
> +               if (upperdentry && !ctr &&
> +                   ((d.is_dir && ovl_verify_lower(dentry->d_sb)) ||
> +                    (!d.is_dir && origin_path))) {
>                         err = ovl_verify_origin(upperdentry, this, false);
>                         if (err) {
> -                               dput(this);
> -                               break;
> +                               if (d.is_dir) {
> +                                       dput(this);
> +                                       break;
> +                               } else if (ofs->config.index) {
> +                                       dput(this);
> +                                       goto out_put;
> +                               }
> +                       } else {
> +                               /* Bless lower as verified origin */
> +                               origin = this;
>                         }
> +               }
>

Logically, this looks correct, but if we don't need the blessing for index=off
and we won't fail lookup, then there is no need to call ovl_verify_origin() at
all if index=off. That shrinks the code by a few lines and saves the extra
check.

Thanks,
Amir.


* Re: [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-11 15:05     ` Amir Goldstein
@ 2018-05-11 15:14       ` Vivek Goyal
  0 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-11 15:14 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Fri, May 11, 2018 at 06:05:11PM +0300, Amir Goldstein wrote:

[..]
> > +                *
> > +                * For non-dir dentry, if index=on, then ensure origin
> > +                * matches the dentry found using path based lookup,
> > +                * otherwise error out. If index=off, then do not error
> > +                * out if origin does not match.
> >                  */
> > -               if (upperdentry && !ctr && ovl_verify_lower(dentry->d_sb)) {
> > +               if (upperdentry && !ctr &&
> > +                   ((d.is_dir && ovl_verify_lower(dentry->d_sb)) ||
> > +                    (!d.is_dir && origin_path))) {
> >                         err = ovl_verify_origin(upperdentry, this, false);
> >                         if (err) {
> > -                               dput(this);
> > -                               break;
> > +                               if (d.is_dir) {
> > +                                       dput(this);
> > +                                       break;
> > +                               } else if (ofs->config.index) {
> > +                                       dput(this);
> > +                                       goto out_put;
> > +                               }
> > +                       } else {
> > +                               /* Bless lower as verified origin */
> > +                               origin = this;
> >                         }
> > +               }
> >
> 
> Logically, this looks correct, but if we don't need the blessing for index=off
> and we won't fail lookup, then there is no need to ovl_verify_origin() at
> all if index=off, so this shrinks a few lines of code and saves the extra
> check.

Makes sense, will change. For non-dir files, ovl_verify_origin() will be
called only if index is on.

Vivek
> 
> Thanks,
> Amir.


* Re: [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
  2018-05-07 17:40 ` [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry Vivek Goyal
                     ` (2 preceding siblings ...)
  2018-05-11 14:30   ` Vivek Goyal
@ 2018-05-11 15:52   ` Vivek Goyal
  3 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-11 15:52 UTC (permalink / raw)
  To: linux-unionfs; +Cc: miklos, amir73il

Hi,

Here is V3 of the patch. A non-dir dentry found in lower using path based
lookup is now verified against ORIGIN only if index is enabled.

Subject: ovl: Modify ovl_lookup() and friends to lookup metacopy dentry

This patch modifies ovl_lookup() and friends to look up metacopy dentries.
It also allows for the presence of metacopy dentries in a lower layer.

During lookup, check for the presence of OVL_XATTR_METACOPY and, if it is
not present, set the OVL_UPPERDATA bit in flags.

We don't support the metacopy feature with nfs_export. So in the nfs_export
code, we set the OVL_UPPERDATA flag unconditionally if an upper inode exists.

Do not follow the metacopy origin if we find a metacopy-only inode and the
metacopy feature is not enabled for that mount. Like redirect, this can have
security implications: an attacker could hand-craft an upper layer and try
to gain access to a file on lower that it should not be able to access.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/export.c    |    3 +
 fs/overlayfs/inode.c     |   11 ++++
 fs/overlayfs/namei.c     |  117 ++++++++++++++++++++++++++++++++++++++++-------
 fs/overlayfs/overlayfs.h |    1 
 fs/overlayfs/util.c      |   22 ++++++++
 5 files changed, 136 insertions(+), 18 deletions(-)

Index: rhvgoyal-linux/fs/overlayfs/namei.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/namei.c	2018-05-11 11:16:41.345039217 -0400
+++ rhvgoyal-linux/fs/overlayfs/namei.c	2018-05-11 11:29:32.245039217 -0400
@@ -24,6 +24,7 @@ struct ovl_lookup_data {
 	bool stop;
 	bool last;
 	char *redirect;
+	bool metacopy;
 };
 
 static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
@@ -253,19 +254,33 @@ static int ovl_lookup_single(struct dent
 		goto put_and_out;
 	}
 	if (!d_can_lookup(this)) {
-		d->stop = true;
-		if (d->is_dir)
+		if (d->is_dir || !last_element) {
+			d->stop = true;
 			goto put_and_out;
-
+		}
+		err = ovl_check_metacopy_xattr(this);
+		if (err < 0)
+			goto out_err;
 		/*
-		 * NB: handle failure to lookup non-last element when non-dir
-		 * redirects become possible
+		 * This dentry should be a regular file if this is
+		 * a metacopy dentry or previous layer lookup found
+		 * a metacopy dentry.
 		 */
-		WARN_ON(!last_element);
+		if ((err || d->metacopy) && !d_is_reg(this)) {
+			d->stop = true;
+			goto put_and_out;
+		}
+		d->stop = !err;
+		d->metacopy = !!err;
 		goto out;
 	}
-	if (last_element)
+	if (last_element) {
+		if (d->metacopy) {
+			d->stop = true;
+			goto put_and_out;
+		}
 		d->is_dir = true;
+	}
 	if (d->last)
 		goto out;
 
@@ -823,7 +838,7 @@ struct dentry *ovl_lookup(struct inode *
 	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
 	struct ovl_entry *poe = dentry->d_parent->d_fsdata;
 	struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata;
-	struct ovl_path *stack = NULL;
+	struct ovl_path *stack = NULL, *origin_path = NULL;
 	struct dentry *upperdir, *upperdentry = NULL;
 	struct dentry *origin = NULL;
 	struct dentry *index = NULL;
@@ -834,6 +849,7 @@ struct dentry *ovl_lookup(struct inode *
 	struct dentry *this;
 	unsigned int i;
 	int err;
+	bool metacopy = false;
 	struct ovl_lookup_data d = {
 		.name = dentry->d_name,
 		.is_dir = false,
@@ -841,6 +857,7 @@ struct dentry *ovl_lookup(struct inode *
 		.stop = false,
 		.last = ofs->config.redirect_follow ? false : !poe->numlower,
 		.redirect = NULL,
+		.metacopy = false,
 	};
 
 	if (dentry->d_name.len > ofs->namelen)
@@ -859,7 +876,8 @@ struct dentry *ovl_lookup(struct inode *
 			goto out;
 		}
 		if (upperdentry && !d.is_dir) {
-			BUG_ON(!d.stop || d.redirect);
+			unsigned int origin_ctr = 0;
+			BUG_ON(d.redirect);
 			/*
 			 * Lookup copy up origin by decoding origin file handle.
 			 * We may get a disconnected dentry, which is fine,
@@ -870,9 +888,13 @@ struct dentry *ovl_lookup(struct inode *
 			 * number - it's the same as if we held a reference
 			 * to a dentry in lower layer that was moved under us.
 			 */
-			err = ovl_check_origin(ofs, upperdentry, &stack, &ctr);
+			err = ovl_check_origin(ofs, upperdentry, &origin_path,
+					       &origin_ctr);
 			if (err)
 				goto out_put_upper;
+
+			if (d.metacopy)
+				metacopy = true;
 		}
 
 		if (d.redirect) {
@@ -913,7 +935,7 @@ struct dentry *ovl_lookup(struct inode *
 		 * If no origin fh is stored in upper of a merge dir, store fh
 		 * of lower dir and set upper parent "impure".
 		 */
-		if (upperdentry && !ctr && !ofs->noxattr) {
+		if (upperdentry && !ctr && !ofs->noxattr && d.is_dir) {
 			err = ovl_fix_origin(dentry, this, upperdentry);
 			if (err) {
 				dput(this);
@@ -925,18 +947,35 @@ struct dentry *ovl_lookup(struct inode *
 		 * When "verify_lower" feature is enabled, do not merge with a
 		 * lower dir that does not match a stored origin xattr. In any
 		 * case, only verified origin is used for index lookup.
+		 *
+		 * For non-dir dentry, if index=on, then ensure origin
+		 * matches the dentry found using path based lookup,
+		 * otherwise error out.
 		 */
-		if (upperdentry && !ctr && ovl_verify_lower(dentry->d_sb)) {
+		if (upperdentry && !ctr &&
+		    ((d.is_dir && ovl_verify_lower(dentry->d_sb)) ||
+		     (!d.is_dir && ofs->config.index && origin_path))) {
 			err = ovl_verify_origin(upperdentry, this, false);
 			if (err) {
 				dput(this);
-				break;
+				if (d.is_dir)
+					break;
+				goto out_put;
 			}
-
-			/* Bless lower dir as verified origin */
 			origin = this;
 		}
 
+		if (d.metacopy)
+			metacopy = true;
+		/*
+		 * Do not store intermediate metacopy dentries in chain,
+		 * except top most lower metacopy dentry
+		 */
+		if (d.metacopy && ctr) {
+			dput(this);
+			continue;
+		}
+
 		stack[ctr].dentry = this;
 		stack[ctr].layer = lower.layer;
 		ctr++;
@@ -968,13 +1007,49 @@ struct dentry *ovl_lookup(struct inode *
 		}
 	}
 
+	if (metacopy) {
+		/*
+		 * Found a metacopy dentry but did not find corresponding
+		 * data dentry
+		 */
+		if (d.metacopy) {
+			err = -EIO;
+			goto out_put;
+		}
+
+		err = -EPERM;
+		if (!ofs->config.metacopy) {
+			pr_warn_ratelimited("overlay: refusing to follow"
+					    " metacopy origin for (%pd2)\n",
+					    dentry);
+			goto out_put;
+		}
+	} else if (!d.is_dir && upperdentry && !ctr && origin_path) {
+		if (WARN_ON(stack != NULL)) {
+			err = -EIO;
+			goto out_put;
+		}
+		stack = origin_path;
+		ctr = 1;
+		origin_path = NULL;
+	}
+
 	/*
 	 * Lookup index by lower inode and verify it matches upper inode.
 	 * We only trust dir index if we verified that lower dir matches
 	 * origin, otherwise dir index entries may be inconsistent and we
-	 * ignore them. Always lookup index of non-dir and non-upper.
+	 * ignore them.
+	 *
+	 * For non-dir upper metacopy dentry, we already set "origin" if we
+	 * verified that lower matched upper origin. If upper origin was
+	 * not present (because lower layer did not support fh encode/decode),
+	 * or indexing is not enabled, do not set "origin" and skip looking up
+	 * index. This case should be handled in same way as a non-dir upper
+	 * without ORIGIN is handled.
+	 *
+	 * Always lookup index of non-dir non-metacopy and non-upper.
 	 */
-	if (ctr && (!upperdentry || !d.is_dir))
+	if (ctr && (!upperdentry || (!d.is_dir && !metacopy)))
 		origin = stack[0].dentry;
 
 	if (origin && ovl_indexdir(dentry->d_sb) &&
@@ -1019,6 +1094,10 @@ struct dentry *ovl_lookup(struct inode *
 	}
 
 	revert_creds(old_cred);
+	if (origin_path) {
+		dput(origin_path->dentry);
+		kfree(origin_path);
+	}
 	dput(index);
 	kfree(stack);
 	kfree(d.redirect);
@@ -1033,6 +1112,10 @@ out_put:
 		dput(stack[i].dentry);
 	kfree(stack);
 out_put_upper:
+	if (origin_path) {
+		dput(origin_path->dentry);
+		kfree(origin_path);
+	}
 	dput(upperdentry);
 	kfree(upperredirect);
 out:
Index: rhvgoyal-linux/fs/overlayfs/export.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/export.c	2018-05-11 11:16:41.319039217 -0400
+++ rhvgoyal-linux/fs/overlayfs/export.c	2018-05-11 11:16:41.529039217 -0400
@@ -317,6 +317,9 @@ static struct dentry *ovl_obtain_alias(s
 		return ERR_CAST(inode);
 	}
 
+	if (upper)
+		ovl_set_flag(OVL_UPPERDATA, inode);
+
 	dentry = d_find_any_alias(inode);
 	if (!dentry) {
 		dentry = d_alloc_anon(inode->i_sb);
Index: rhvgoyal-linux/fs/overlayfs/overlayfs.h
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/overlayfs.h	2018-05-11 11:16:41.482039217 -0400
+++ rhvgoyal-linux/fs/overlayfs/overlayfs.h	2018-05-11 11:16:41.529039217 -0400
@@ -274,6 +274,7 @@ bool ovl_need_index(struct dentry *dentr
 int ovl_nlink_start(struct dentry *dentry, bool *locked);
 void ovl_nlink_end(struct dentry *dentry, bool locked);
 int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir);
+int ovl_check_metacopy_xattr(struct dentry *dentry);
 
 static inline bool ovl_is_impuredir(struct dentry *dentry)
 {
Index: rhvgoyal-linux/fs/overlayfs/inode.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/inode.c	2018-05-11 11:16:41.506039217 -0400
+++ rhvgoyal-linux/fs/overlayfs/inode.c	2018-05-11 11:16:41.530039217 -0400
@@ -771,7 +771,7 @@ struct inode *ovl_get_inode(struct super
 	bool bylower = ovl_hash_bylower(sb, upperdentry, lowerdentry,
 					oip->index);
 	int fsid = bylower ? oip->lowerpath->layer->fsid : 0;
-	bool is_dir;
+	bool is_dir, metacopy = false;
 	unsigned long ino = 0;
 	int err = -ENOMEM;
 
@@ -831,6 +831,15 @@ struct inode *ovl_get_inode(struct super
 	if (oip->index)
 		ovl_set_flag(OVL_INDEX, inode);
 
+	if (upperdentry) {
+		err = ovl_check_metacopy_xattr(upperdentry);
+		if (err < 0)
+			goto out_err;
+		metacopy = err;
+		if (!metacopy)
+			ovl_set_flag(OVL_UPPERDATA, inode);
+	}
+
 	OVL_I(inode)->redirect = oip->redirect;
 
 	/* Check for non-merge dir that may have whiteouts */
Index: rhvgoyal-linux/fs/overlayfs/util.c
===================================================================
--- rhvgoyal-linux.orig/fs/overlayfs/util.c	2018-05-11 11:16:41.483039217 -0400
+++ rhvgoyal-linux/fs/overlayfs/util.c	2018-05-11 11:16:41.531039217 -0400
@@ -778,3 +778,25 @@ err:
 	pr_err("overlayfs: failed to lock workdir+upperdir\n");
 	return -EIO;
 }
+
+/* err < 0, 0 if no metacopy xattr, 1 if metacopy xattr found */
+int ovl_check_metacopy_xattr(struct dentry *dentry)
+{
+	int res;
+
+	/* Only regular files can have metacopy xattr */
+	if (!S_ISREG(d_inode(dentry)->i_mode))
+		return 0;
+
+	res = vfs_getxattr(dentry, OVL_XATTR_METACOPY, NULL, 0);
+	if (res < 0) {
+		if (res == -ENODATA || res == -EOPNOTSUPP)
+			return 0;
+		goto out;
+	}
+
+	return 1;
+out:
+	pr_warn_ratelimited("overlayfs: failed to get metacopy (%i)\n", res);
+	return res;
+}


* Re: [PATCH v15 00/30] overlayfs: Delayed copy up of data
  2018-05-08 14:16   ` Amir Goldstein
@ 2018-05-23 20:00     ` Vivek Goyal
  0 siblings, 0 replies; 77+ messages in thread
From: Vivek Goyal @ 2018-05-23 20:00 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, Miklos Szeredi

On Tue, May 08, 2018 at 05:16:54PM +0300, Amir Goldstein wrote:
> On Tue, May 8, 2018 at 4:42 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > On Mon, May 07, 2018 at 01:40:32PM -0400, Vivek Goyal wrote:
> >> Hi,
> >>
> >> This is V15 of overlayfs metadata only copy-up feature. These patches I
> >> have rebased on top of Miklos overlayfs-next tree's branch overlayfs-rorw.
> >>
> >> git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git overlayfs-rorw
> >>
> >> Patches are also available here.
> >>
> >> https://github.com/rhvgoyal/linux/commits/metacopy-v15
> >>
> >> I have run unionmount-testsuite and "./check -overlay -g quick" and that
> >> works. Only 4 overlay tests fail, which fail on vanilla kernel too.
> >>
> >
> > Hi Amir,
> >
> > I have taken care of your review comments and pushed new patches at
> > "metacopy-next" branch.
> >
> > https://github.com/rhvgoyal/linux/commits/metacopy-next
> 
> Looks good.

Given all the work on fixing races and on using ovl_get_inode() in
ovl_instantiate(), I have rebased my patches on top of Amir's ovl-fixes
branch and pushed them to the "metacopy-next-ovl-fixes" branch.

https://github.com/rhvgoyal/linux/commits/metacopy-next-ovl-fixes

Now that ovl_instantiate() uses ovl_get_inode(), I realized that I no
longer need to call ovl_set_upperdata() explicitly, because
ovl_get_inode() takes care of that anyway. So I dropped it. I also
dropped the similar call from export.c, as it felt redundant.
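
To illustrate why the explicit call becomes redundant, here is a rough
userspace sketch of the logic in the patch above. It is only a model
under stated assumptions, not the kernel code: model_check_metacopy(),
model_get_inode(), struct model_inode and model_selfcheck() are
hypothetical names, and getxattr_res stands in for whatever
vfs_getxattr() would have returned for the metacopy xattr.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Models the return convention of ovl_check_metacopy_xattr():
 * negative errno on failure, 0 if no metacopy xattr, 1 if present.
 * getxattr_res is a hypothetical stand-in for the vfs_getxattr() result.
 */
int model_check_metacopy(bool is_regular, int getxattr_res)
{
	/* Only regular files can carry the metacopy xattr. */
	if (!is_regular)
		return 0;

	if (getxattr_res < 0) {
		/* Missing xattr or no xattr support: not a metacopy. */
		if (getxattr_res == -ENODATA || getxattr_res == -EOPNOTSUPP)
			return 0;
		return getxattr_res;
	}
	return 1;
}

/* Hypothetical, heavily reduced stand-in for the overlay inode. */
struct model_inode {
	bool upperdata;	/* models the OVL_UPPERDATA flag */
};

/*
 * Models the hunk added to ovl_get_inode(): if there is an upper
 * dentry and it is not a metacopy, mark the inode as having its data
 * in upper. Any caller that goes through this path gets the flag set
 * without a separate call.
 */
int model_get_inode(struct model_inode *inode, bool has_upper,
		    bool is_regular, int getxattr_res)
{
	inode->upperdata = false;
	if (has_upper) {
		int err = model_check_metacopy(is_regular, getxattr_res);

		if (err < 0)
			return err;
		if (!err)
			inode->upperdata = true;
	}
	return 0;
}

/* Small self-check exercising the three outcomes. */
void model_selfcheck(void)
{
	struct model_inode inode;

	/* Upper file without the xattr: data lives in upper. */
	assert(model_get_inode(&inode, true, true, -ENODATA) == 0);
	assert(inode.upperdata);

	/* Upper file with the xattr: metadata-only copy, flag stays clear. */
	assert(model_get_inode(&inode, true, true, 0) == 0);
	assert(!inode.upperdata);

	/* Any other error is passed back to the caller. */
	assert(model_get_inode(&inode, true, true, -EIO) == -EIO);
}
```

In this model a caller never sets the flag itself; it only has to obtain
the inode through model_get_inode(), which mirrors the reasoning for
dropping the explicit call.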

I might post the patches again once all the ovl_get_inode()-related
cleanup gets merged and shows up in Miklos's tree.

Thanks
Vivek


end of thread, other threads:[~2018-05-23 20:00 UTC | newest]

Thread overview: 77+ messages
2018-05-07 17:40 [PATCH v15 00/30] overlayfs: Delayed copy up of data Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 01/30] ovl: Pass argument to ovl_get_inode() in a structure Vivek Goyal
2018-05-07 19:26   ` Amir Goldstein
2018-05-07 20:37     ` Vivek Goyal
2018-05-08  4:45       ` Amir Goldstein
2018-05-08 13:45     ` Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 02/30] ovl: Initialize ovl_inode->redirect in ovl_get_inode() Vivek Goyal
2018-05-08 13:56   ` Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 03/30] ovl: Move the copy up helpers to copy_up.c Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 04/30] ovl: Provide a mount option metacopy=on/off for metadata copyup Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 05/30] ovl: During copy up, first copy up metadata and then data Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 06/30] ovl: Copy up only metadata during copy up where it makes sense Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 07/30] ovl: Add helper ovl_already_copied_up() Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 08/30] ovl: A new xattr OVL_XATTR_METACOPY for file on upper Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 09/30] ovl: Use out_err instead of out_nomem Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 10/30] ovl: Modify ovl_lookup() and friends to lookup metacopy dentry Vivek Goyal
2018-05-07 19:14   ` Amir Goldstein
2018-05-10  9:19   ` Miklos Szeredi
2018-05-10  9:36     ` Miklos Szeredi
2018-05-10  9:52       ` Miklos Szeredi
2018-05-10 13:17       ` Vivek Goyal
2018-05-10 15:32       ` Vivek Goyal
2018-05-10 20:21         ` Miklos Szeredi
2018-05-10 13:14     ` Vivek Goyal
2018-05-10 14:43       ` Amir Goldstein
2018-05-10 19:42         ` Vivek Goyal
2018-05-10 19:39     ` Vivek Goyal
2018-05-10 20:13       ` Miklos Szeredi
2018-05-11  7:29         ` Miklos Szeredi
2018-05-11  7:52           ` Amir Goldstein
2018-05-11  8:13             ` Miklos Szeredi
2018-05-11 12:28               ` Vivek Goyal
2018-05-11 14:30   ` Vivek Goyal
2018-05-11 15:05     ` Amir Goldstein
2018-05-11 15:14       ` Vivek Goyal
2018-05-11 15:52   ` Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 11/30] ovl: Copy up meta inode data from lowest data inode Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 12/30] ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 13/30] ovl: Add an helper to get real " Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 14/30] ovl: Fix ovl_getattr() to get number of blocks from lower Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 15/30] ovl: Store lower data inode in ovl_inode Vivek Goyal
2018-05-07 18:59   ` Amir Goldstein
2018-05-08 13:47     ` Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 16/30] ovl: Add helper ovl_inode_real_data() Vivek Goyal
2018-05-07 18:18   ` Amir Goldstein
2018-05-07 17:40 ` [PATCH v15 17/30] ovl: Open file with data except for the case of fsync Vivek Goyal
2018-05-07 19:47   ` Amir Goldstein
2018-05-07 20:59     ` Vivek Goyal
2018-05-08  5:26       ` Amir Goldstein
2018-05-08 12:50         ` Vivek Goyal
2018-05-08 14:14           ` Amir Goldstein
2018-05-08 14:26             ` Vivek Goyal
2018-05-08 15:04               ` Amir Goldstein
2018-05-07 17:40 ` [PATCH v15 18/30] ovl: Do not expose metacopy only dentry from d_real() Vivek Goyal
2018-05-07 19:39   ` Amir Goldstein
2018-05-07 17:40 ` [PATCH v15 19/30] ovl: Move some dir related ovl_lookup_single() code in else block Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 20/30] ovl: Check redirects for metacopy files Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 21/30] ovl: Treat metacopy dentries as type OVL_PATH_MERGE Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 22/30] ovl: Add an inode flag OVL_CONST_INO Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 23/30] ovl: Do not set dentry type ORIGIN for broken hardlinks Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 24/30] ovl: Set redirect on metacopy files upon rename Vivek Goyal
2018-05-07 18:21   ` Amir Goldstein
2018-05-07 17:40 ` [PATCH v15 25/30] ovl: Set redirect on upper inode when it is linked Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 26/30] ovl: Check redirect on index as well Vivek Goyal
2018-05-07 18:43   ` Amir Goldstein
2018-05-08 12:58     ` Vivek Goyal
2018-05-07 17:40 ` [PATCH v15 27/30] ovl: Disbale metacopy for MAP_SHARED mmap() Vivek Goyal
2018-05-07 17:41 ` [PATCH v15 28/30] ovl: Do not do metadata only copy-up for truncate operation Vivek Goyal
2018-05-07 17:41 ` [PATCH v15 29/30] ovl: Do not do metacopy only for ioctl modifying file attr Vivek Goyal
2018-05-07 17:41 ` [PATCH v15 30/30] ovl: Enable metadata only feature Vivek Goyal
2018-05-07 18:10 ` [PATCH v15 00/30] overlayfs: Delayed copy up of data Amir Goldstein
2018-05-07 18:24   ` Vivek Goyal
2018-05-07 18:33     ` Amir Goldstein
2018-05-07 19:14       ` Vivek Goyal
2018-05-08 13:42 ` Vivek Goyal
2018-05-08 14:16   ` Amir Goldstein
2018-05-23 20:00     ` Vivek Goyal
