linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] overlayfs: allow moving directory trees
@ 2016-10-25  7:34 Miklos Szeredi
  2016-10-25  7:34 ` [PATCH 1/3] ovl: check fs features Miklos Szeredi
                   ` (2 more replies)
  0 siblings, 3 replies; 44+ messages in thread
From: Miklos Szeredi @ 2016-10-25  7:34 UTC (permalink / raw)
  To: linux-unionfs; +Cc: Guillem Jover, Raphael Hertzog, linux-fsdevel, linux-kernel

This allows overlayfs to move directory trees (residing on lower layer)
without having to recursively copy up the whole tree first.

This series is available in git at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git#redirect

And is on top of the overlayfs-next branch.

---
Miklos Szeredi (3):
  ovl: check fs features
  vfs: export vfs_path_lookup()
  ovl: redirect on rename-dir

 Documentation/filesystems/overlayfs.txt | 33 ++++++++++-
 fs/internal.h                           |  2 -
 fs/overlayfs/copy_up.c                  | 20 ++-----
 fs/overlayfs/dir.c                      | 86 +++++++++++++++++++---------
 fs/overlayfs/namei.c                    | 99 ++++++++++++++++++++++++++++++---
 fs/overlayfs/overlayfs.h                |  5 ++
 fs/overlayfs/ovl_entry.h                |  4 ++
 fs/overlayfs/super.c                    | 56 +++++++++++++++++--
 fs/overlayfs/util.c                     | 19 +++++++
 include/linux/namei.h                   |  2 +
 10 files changed, 268 insertions(+), 58 deletions(-)

-- 
2.5.5

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH 1/3] ovl: check fs features
  2016-10-25  7:34 [PATCH 0/3] overlayfs: allow moving directory trees Miklos Szeredi
@ 2016-10-25  7:34 ` Miklos Szeredi
  2016-10-25 11:24   ` Amir Goldstein
  2016-10-25  7:34 ` [PATCH 2/3] vfs: export vfs_path_lookup() Miklos Szeredi
  2016-10-25  7:34 ` [PATCH 3/3] ovl: redirect on rename-dir Miklos Szeredi
  2 siblings, 1 reply; 44+ messages in thread
From: Miklos Szeredi @ 2016-10-25  7:34 UTC (permalink / raw)
  To: linux-unionfs
  Cc: Guillem Jover, Raphael Hertzog, linux-fsdevel, linux-kernel, stable

To allow adding new, backward incompatible features to overlayfs, we need a
way to store the list of features in the overlay.  This is done via
"trusted.overlay.features" xattr on the root of the upper layer (or one of
the lower layers, that previously acted as an upper layer).  It's a comma
separated list of case sensitive strings.

If an overlay has an unknown feature, mount shall return an error.  So
mechanism should only be used for backward incompatible features.

This patch doesn't add any features.  If the "trusted.overlay.features"
xattr contains a non-empty list, then return EINVAL error for the mount.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org>
---
 Documentation/filesystems/overlayfs.txt | 12 ++++++++++
 fs/overlayfs/overlayfs.h                |  1 +
 fs/overlayfs/super.c                    | 41 +++++++++++++++++++++++++++++++++
 3 files changed, 54 insertions(+)

diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
index 7aeb8e8d80cf..5108425157ac 100644
--- a/Documentation/filesystems/overlayfs.txt
+++ b/Documentation/filesystems/overlayfs.txt
@@ -175,6 +175,18 @@ The specified lower directories will be stacked beginning from the
 rightmost one and going left.  In the above example lower1 will be the
 top, lower2 the middle and lower3 the bottom layer.
 
+Filesystem features
+-------------------
+
+Features are enabled via "trusted.overlay.features" xattr on the root of the
+upper layer.  E.g. the following command can be used to enable features "foo"
+and "bar" on the overlay:
+
+  setfattr -n "trusted.overlay.features" -v "foo,bar" /upper
+  mount -t overlay overlay -olowerdir=/lower,upperdir=/upper,\
+workdir=/work /merged
+
+If an overlay has an unknown feature, mount shall return an error.
 
 Non-standard behavior
 ---------------------
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index f6e4d3539a25..d61d5b9d0d91 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -19,6 +19,7 @@ enum ovl_path_type {
 
 #define OVL_XATTR_PREFIX XATTR_TRUSTED_PREFIX "overlay."
 #define OVL_XATTR_OPAQUE OVL_XATTR_PREFIX "opaque"
+#define OVL_XATTR_FEATURES OVL_XATTR_PREFIX "features"
 
 #define OVL_ISUPPER_MASK 1UL
 
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 30263a541fd5..d6dc8d905d00 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -397,6 +397,39 @@ static struct dentry *ovl_workdir_create(struct vfsmount *mnt,
 	goto out_unlock;
 }
 
+static int ovl_check_features(struct dentry *root)
+{
+	int res;
+	char *buf, *tmp, *p;
+
+	res = vfs_getxattr(root, OVL_XATTR_FEATURES, NULL, 0);
+	if (res <= 0) {
+		if (res == -EOPNOTSUPP || res == -ENODATA)
+			res = 0;
+		return res;
+	}
+
+	buf = kmalloc(res + 1, GFP_TEMPORARY);
+	if (!buf)
+		return -ENOMEM;
+
+	res = vfs_getxattr(root, OVL_XATTR_FEATURES, buf, res);
+	if (res <= 0)
+		goto out_free;
+
+	buf[res] = '\0';
+	res = 0;
+	tmp = buf;
+	while ((p = strsep(&tmp, ",")) != NULL) {
+		res = -EINVAL;
+		pr_err("overlayfs: feature '%s' not supported\n", p);
+	}
+out_free:
+	kfree(buf);
+
+	return res;
+}
+
 static void ovl_unescape(char *s)
 {
 	char *d = s;
@@ -471,6 +504,10 @@ static int ovl_lower_dir(const char *name, struct path *path, long *namelen,
 	if (err)
 		goto out;
 
+	err = ovl_check_features(path->dentry);
+	if (err)
+		goto out_put;
+
 	err = vfs_statfs(path, &statfs);
 	if (err) {
 		pr_err("overlayfs: statfs failed on '%s'\n", name);
@@ -693,6 +730,10 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 			goto out_put_upperpath;
 		}
 
+		err = ovl_check_features(upperpath.dentry);
+		if (err)
+			goto out_put_upperpath;
+
 		err = ovl_mount_dir(ufs->config.workdir, &workpath);
 		if (err)
 			goto out_put_upperpath;
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 2/3] vfs: export vfs_path_lookup()
  2016-10-25  7:34 [PATCH 0/3] overlayfs: allow moving directory trees Miklos Szeredi
  2016-10-25  7:34 ` [PATCH 1/3] ovl: check fs features Miklos Szeredi
@ 2016-10-25  7:34 ` Miklos Szeredi
  2016-10-25  7:34 ` [PATCH 3/3] ovl: redirect on rename-dir Miklos Szeredi
  2 siblings, 0 replies; 44+ messages in thread
From: Miklos Szeredi @ 2016-10-25  7:34 UTC (permalink / raw)
  To: linux-unionfs; +Cc: Guillem Jover, Raphael Hertzog, linux-fsdevel, linux-kernel

Exported to modules, but currently residing in fs/internal.h due to commit
197df04c749a ("rename user_path_umountat() to user_path_mountpoint_at()").

Move back to <linux/namei.h> so that overlayfs can make use of it.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/internal.h         | 2 --
 include/linux/namei.h | 2 ++
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/internal.h b/fs/internal.h
index f4da3341b4a3..800319d11291 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -53,8 +53,6 @@ extern void __init chrdev_init(void);
  * namei.c
  */
 extern int user_path_mountpoint_at(int, const char __user *, unsigned int, struct path *);
-extern int vfs_path_lookup(struct dentry *, struct vfsmount *,
-			   const char *, unsigned int, struct path *);
 
 /*
  * namespace.c
diff --git a/include/linux/namei.h b/include/linux/namei.h
index f29abda31e6d..2375bc50a9d3 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -78,6 +78,8 @@ extern struct dentry *user_path_create(int, const char __user *, struct path *,
 extern void done_path_create(struct path *, struct dentry *);
 extern struct dentry *kern_path_locked(const char *, struct path *);
 extern int kern_path_mountpoint(int, const char *, struct path *, unsigned int);
+extern int vfs_path_lookup(struct dentry *, struct vfsmount *,
+			   const char *, unsigned int, struct path *);
 
 extern struct dentry *lookup_one_len(const char *, struct dentry *, int);
 extern struct dentry *lookup_one_len_unlocked(const char *, struct dentry *, int);
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 3/3] ovl: redirect on rename-dir
  2016-10-25  7:34 [PATCH 0/3] overlayfs: allow moving directory trees Miklos Szeredi
  2016-10-25  7:34 ` [PATCH 1/3] ovl: check fs features Miklos Szeredi
  2016-10-25  7:34 ` [PATCH 2/3] vfs: export vfs_path_lookup() Miklos Szeredi
@ 2016-10-25  7:34 ` Miklos Szeredi
  2016-10-25 11:57   ` Raphael Hertzog
                     ` (2 more replies)
  2 siblings, 3 replies; 44+ messages in thread
From: Miklos Szeredi @ 2016-10-25  7:34 UTC (permalink / raw)
  To: linux-unionfs; +Cc: Guillem Jover, Raphael Hertzog, linux-fsdevel, linux-kernel

Current code returns EXDEV when a directory would need to be copied up to
move.  We could copy up the directory tree in this case, but there's
another solution: point to old lower directory from moved upper directory.

This is achieved with a "trusted.overlay.redirect" xattr storing the path
relative to the root of the overlay.  After such attribute has been set,
the directory can be moved without further actions required.

This is a backward incompatible feature, old kernels won't be able to
correctly mount an overlay containing redirected directories.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 Documentation/filesystems/overlayfs.txt | 21 ++++++-
 fs/overlayfs/copy_up.c                  | 20 ++-----
 fs/overlayfs/dir.c                      | 86 +++++++++++++++++++---------
 fs/overlayfs/namei.c                    | 99 ++++++++++++++++++++++++++++++---
 fs/overlayfs/overlayfs.h                |  4 ++
 fs/overlayfs/ovl_entry.h                |  4 ++
 fs/overlayfs/super.c                    | 25 +++++----
 fs/overlayfs/util.c                     | 19 +++++++
 8 files changed, 217 insertions(+), 61 deletions(-)

diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
index 5108425157ac..fae48ab3b36b 100644
--- a/Documentation/filesystems/overlayfs.txt
+++ b/Documentation/filesystems/overlayfs.txt
@@ -130,6 +130,23 @@ directory.
 Readdir on directories that are not merged is simply handled by the
 underlying directory (upper or lower).
 
+renaming directories
+--------------------
+
+When renaming a directory that is on the lower layer or merged (i.e. the
+directory was not created on the upper layer to start with) overlayfs can
+handle it in two different ways:
+
+1) return EXDEV error: this error is returned by rename(2) when trying to
+   move a file or directory across filesystem boundaries.  Hence
+   applications are usually prepared to hande this error (mv(1) for example
+   recursively copies the directory tree).  This is the default behavior.
+
+2) If the "redirect_dir" feature is enabled, then the directory will be
+   copied up (but not the contents).  Then the "trusted.overlay.redirect"
+   extended attribute is set to the path of the original location from the
+   root of the overlay.  Finally the directory is moved to the new
+   location.
 
 Non-directories
 ---------------
@@ -201,8 +218,8 @@ If a file with multiple hard links is copied up, then this will
 "break" the link.  Changes will not be propagated to other names
 referring to the same inode.
 
-Directory trees are not copied up.  If rename(2) is performed on a directory
-which is on the lower layer or is merged, then -EXDEV will be returned.
+Unless "redirect_dir" feature is enabled, rename(2) on a lower or merged
+directory will fail with EXDEV.
 
 Changes to underlying filesystems
 ---------------------------------
diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 49f6158bb04c..0d7075208099 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -323,17 +323,11 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 /*
  * Copy up a single dentry
  *
- * Directory renames only allowed on "pure upper" (already created on
- * upper filesystem, never copied up).  Directories which are on lower or
- * are merged may not be renamed.  For these -EXDEV is returned and
- * userspace has to deal with it.  This means, when copying up a
- * directory we can rely on it and ancestors being stable.
- *
- * Non-directory renames start with copy up of source if necessary.  The
- * actual rename will only proceed once the copy up was successful.  Copy
- * up uses upper parent i_mutex for exclusion.  Since rename can change
- * d_parent it is possible that the copy up will lock the old parent.  At
- * that point the file will have already been copied up anyway.
+ * All renames start with copy up of source if necessary.  The actual
+ * rename will only proceed once the copy up was successful.  Copy up uses
+ * upper parent i_mutex for exclusion.  Since rename can change d_parent it
+ * is possible that the copy up will lock the old parent.  At that point
+ * the file will have already been copied up anyway.
  */
 int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 		    struct path *lowerpath, struct kstat *stat)
@@ -345,7 +339,6 @@ int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 	struct path parentpath;
 	struct dentry *lowerdentry = lowerpath->dentry;
 	struct dentry *upperdir;
-	struct dentry *upperdentry;
 	const char *link = NULL;
 
 	if (WARN_ON(!workdir))
@@ -371,8 +364,7 @@ int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 		pr_err("overlayfs: failed to lock workdir+upperdir\n");
 		goto out_unlock;
 	}
-	upperdentry = ovl_dentry_upper(dentry);
-	if (upperdentry) {
+	if (ovl_dentry_upper(dentry)) {
 		/* Raced with another copy-up?  Nothing to do, then... */
 		err = 0;
 		goto out_unlock;
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 2c1057d747cb..065e0211f9b6 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -757,6 +757,40 @@ static bool ovl_type_merge_or_lower(struct dentry *dentry)
 	return OVL_TYPE_MERGE(type) || !OVL_TYPE_UPPER(type);
 }
 
+static bool ovl_can_move(struct dentry *dentry)
+{
+	return ovl_redirect_dir(dentry->d_sb) ||
+		!d_is_dir(dentry) || !ovl_type_merge_or_lower(dentry);
+}
+
+static int ovl_set_redirect(struct dentry *dentry)
+{
+	char *buf;
+	char *path;
+	int err;
+
+	if (ovl_dentry_is_redirect(dentry))
+		return 0;
+
+	buf = __getname();
+	if (!buf)
+		return -ENOMEM;
+
+	path = dentry_path_raw(dentry, buf, PATH_MAX);
+	err = PTR_ERR(path);
+	if (IS_ERR(path))
+		goto putname;
+
+	err = ovl_do_setxattr(ovl_dentry_upper(dentry), OVL_XATTR_REDIRECT,
+			      path, strlen(path), 0);
+putname:
+	__putname(buf);
+	if (!err)
+		ovl_dentry_set_redirect(dentry);
+
+	return err;
+}
+
 static int ovl_rename(struct inode *olddir, struct dentry *old,
 		      struct inode *newdir, struct dentry *new,
 		      unsigned int flags)
@@ -784,9 +818,9 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
 
 	/* Don't copy up directory trees */
 	err = -EXDEV;
-	if (is_dir && ovl_type_merge_or_lower(old))
+	if (!ovl_can_move(old))
 		goto out;
-	if (!overwrite && new_is_dir && ovl_type_merge_or_lower(new))
+	if (!overwrite && !ovl_can_move(new))
 		goto out;
 
 	err = ovl_want_write(old);
@@ -837,7 +871,6 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
 
 	trap = lock_rename(new_upperdir, old_upperdir);
 
-
 	olddentry = lookup_one_len(old->d_name.name, old_upperdir,
 				   old->d_name.len);
 	err = PTR_ERR(olddentry);
@@ -880,31 +913,34 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
 	if (WARN_ON(olddentry->d_inode == newdentry->d_inode))
 		goto out_dput;
 
-	if (is_dir && !old_opaque && ovl_lower_positive(new)) {
-		err = ovl_set_opaque(olddentry);
-		if (err)
-			goto out_dput;
-		ovl_dentry_set_opaque(old, true);
+	if (is_dir) {
+		if (ovl_type_merge_or_lower(old)) {
+			err = ovl_set_redirect(old);
+			if (err)
+				goto out_dput;
+		} else if (!old_opaque && ovl_lower_positive(new)) {
+			err = ovl_set_opaque(olddentry);
+			if (err)
+				goto out_dput;
+			ovl_dentry_set_opaque(old, true);
+		}
 	}
-	if (!overwrite &&
-	    new_is_dir && !new_opaque && ovl_lower_positive(old)) {
-		err = ovl_set_opaque(newdentry);
-		if (err)
-			goto out_dput;
-		ovl_dentry_set_opaque(new, true);
+	if (!overwrite && new_is_dir) {
+		if (ovl_type_merge_or_lower(new)) {
+			err = ovl_set_redirect(new);
+			if (err)
+				goto out_dput;
+		} else if (!new_opaque && ovl_lower_positive(old)) {
+			err = ovl_set_opaque(newdentry);
+			if (err)
+				goto out_dput;
+			ovl_dentry_set_opaque(new, true);
+		}
 	}
 
-	if (old_opaque || new_opaque) {
-		err = ovl_do_rename(old_upperdir->d_inode, olddentry,
-				    new_upperdir->d_inode, newdentry,
-				    flags);
-	} else {
-		/* No debug for the plain case */
-		BUG_ON(flags & ~RENAME_EXCHANGE);
-		err = vfs_rename(old_upperdir->d_inode, olddentry,
-				 new_upperdir->d_inode, newdentry,
-				 NULL, flags);
-	}
+	err = ovl_do_rename(old_upperdir->d_inode, olddentry,
+			    new_upperdir->d_inode, newdentry,
+			    flags);
 	if (err)
 		goto out_dput;
 
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index f4057fcb0246..c7cacbb8ce09 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -8,6 +8,7 @@
  */
 
 #include <linux/fs.h>
+#include <linux/mount.h>
 #include <linux/namei.h>
 #include <linux/xattr.h>
 #include "overlayfs.h"
@@ -48,6 +49,28 @@ static bool ovl_is_opaquedir(struct dentry *dentry)
 	return false;
 }
 
+static int ovl_check_redirect(struct dentry *dentry, char **bufp)
+{
+	int res;
+
+	res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, NULL, 0);
+	if (res > 0) {
+		char *buf = kzalloc(res + 1, GFP_KERNEL);
+
+		if (!buf)
+			return -ENOMEM;
+
+		res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, buf, res);
+		if (res < 0)
+			return res;
+
+		kfree(*bufp);
+		*bufp = buf;
+	}
+
+	return 0;
+}
+
 /*
  * Returns next layer in stack starting from top.
  * Returns -1 if this is the last layer.
@@ -80,11 +103,13 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	unsigned int ctr = 0;
 	struct inode *inode = NULL;
 	bool upperopaque = false;
+	bool upperredirect = false;
 	bool stop = false;
 	bool isdir = false;
 	struct dentry *this;
 	unsigned int i;
 	int err;
+	char *redirect = NULL;
 
 	old_cred = ovl_override_creds(dentry->d_sb);
 	upperdir = ovl_upperdentry_dereference(poe);
@@ -110,11 +135,23 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 				isdir = true;
 				if (poe->numlower && ovl_is_opaquedir(this))
 					stop = upperopaque = true;
+				else if (ovl_redirect_dir(dentry->d_sb)) {
+					err = ovl_check_redirect(this,
+								 &redirect);
+					if (err)
+						goto out;
+
+					if (redirect)
+						upperredirect = true;
+				}
 			}
 		}
 		upperdentry = this;
 	}
 
+	if (redirect)
+		poe = dentry->d_sb->s_root->d_fsdata;
+
 	if (!stop && poe->numlower) {
 		err = -ENOMEM;
 		stack = kcalloc(poe->numlower, sizeof(struct path), GFP_KERNEL);
@@ -125,15 +162,39 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	for (i = 0; !stop && i < poe->numlower; i++) {
 		struct path lowerpath = poe->lowerstack[i];
 
-		this = ovl_lookup_real(lowerpath.dentry, &dentry->d_name);
-		err = PTR_ERR(this);
-		if (IS_ERR(this)) {
-			/*
-			 * If it's positive, then treat ENAMETOOLONG as ENOENT.
-			 */
-			if (err == -ENAMETOOLONG && (upperdentry || ctr))
-				continue;
-			goto out_put;
+		if (!redirect) {
+			this = ovl_lookup_real(lowerpath.dentry, &dentry->d_name);
+			err = PTR_ERR(this);
+			if (IS_ERR(this)) {
+				/*
+				 * If it's positive, then treat ENAMETOOLONG as ENOENT.
+				 */
+				if (err == -ENAMETOOLONG && (upperdentry || ctr))
+					continue;
+				goto out_put;
+			}
+		} else {
+			struct path thispath;
+
+			err = vfs_path_lookup(lowerpath.dentry, lowerpath.mnt,
+					      redirect, 0, &thispath);
+
+			if (err) {
+				if (err == -ENOENT || err == -ENAMETOOLONG)
+					this = NULL;
+			} else {
+				this = thispath.dentry;
+				mntput(thispath.mnt);
+				if (!this->d_inode) {
+					dput(this);
+					this = NULL;
+				} else if (ovl_dentry_weird(this)) {
+					dput(this);
+					err = -EREMOTE;
+				}
+			}
+			if (err)
+				goto out_put;
 		}
 		if (!this)
 			continue;
@@ -162,6 +223,23 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		stack[ctr].dentry = this;
 		stack[ctr].mnt = lowerpath.mnt;
 		ctr++;
+
+		if (!stop && i != poe->numlower - 1 &&
+		    d_is_dir(this) && ovl_redirect_dir(dentry->d_sb)) {
+			err = ovl_check_redirect(this, &redirect);
+			if (err)
+				goto out_put;
+
+			if (redirect && poe != dentry->d_sb->s_root->d_fsdata) {
+				poe = dentry->d_sb->s_root->d_fsdata;
+
+				for (i = 0; i < poe->numlower; i++)
+					if (poe->lowerstack[i].mnt == lowerpath.mnt)
+						break;
+				if (WARN_ON(i == poe->numlower))
+					break;
+			}
+		}
 	}
 
 	oe = ovl_alloc_entry(ctr);
@@ -192,9 +270,11 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 
 	revert_creds(old_cred);
 	oe->opaque = upperopaque;
+	oe->redirect = upperredirect;
 	oe->__upperdentry = upperdentry;
 	memcpy(oe->lowerstack, stack, sizeof(struct path) * ctr);
 	kfree(stack);
+	kfree(redirect);
 	dentry->d_fsdata = oe;
 	d_add(dentry, inode);
 
@@ -209,6 +289,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 out_put_upper:
 	dput(upperdentry);
 out:
+	kfree(redirect);
 	revert_creds(old_cred);
 	return ERR_PTR(err);
 }
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index d61d5b9d0d91..9d80ce367ad8 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -20,6 +20,7 @@ enum ovl_path_type {
 #define OVL_XATTR_PREFIX XATTR_TRUSTED_PREFIX "overlay."
 #define OVL_XATTR_OPAQUE OVL_XATTR_PREFIX "opaque"
 #define OVL_XATTR_FEATURES OVL_XATTR_PREFIX "features"
+#define OVL_XATTR_REDIRECT OVL_XATTR_PREFIX "redirect"
 
 #define OVL_ISUPPER_MASK 1UL
 
@@ -157,6 +158,9 @@ void ovl_set_dir_cache(struct dentry *dentry, struct ovl_dir_cache *cache);
 bool ovl_dentry_is_opaque(struct dentry *dentry);
 bool ovl_dentry_is_whiteout(struct dentry *dentry);
 void ovl_dentry_set_opaque(struct dentry *dentry, bool opaque);
+bool ovl_redirect_dir(struct super_block *sb);
+bool ovl_dentry_is_redirect(struct dentry *dentry);
+void ovl_dentry_set_redirect(struct dentry *dentry);
 void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry);
 void ovl_inode_init(struct inode *inode, struct inode *realinode,
 		    bool is_upper);
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 3b7ba59ad27e..2b22645535ff 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -26,6 +26,9 @@ struct ovl_fs {
 	struct ovl_config config;
 	/* creds of process who forced instantiation of super block */
 	const struct cred *creator_cred;
+
+	/* Check for "redirect" on directories */
+	bool redirect_dir;
 };
 
 /* private information held for every overlayfs dentry */
@@ -36,6 +39,7 @@ struct ovl_entry {
 		struct {
 			u64 version;
 			bool opaque;
+			bool redirect;
 		};
 		struct rcu_head rcu;
 	};
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index d6dc8d905d00..fc22a8a2e0d0 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -397,7 +397,7 @@ static struct dentry *ovl_workdir_create(struct vfsmount *mnt,
 	goto out_unlock;
 }
 
-static int ovl_check_features(struct dentry *root)
+static int ovl_check_features(struct ovl_fs *ufs, struct dentry *root)
 {
 	int res;
 	char *buf, *tmp, *p;
@@ -421,8 +421,12 @@ static int ovl_check_features(struct dentry *root)
 	res = 0;
 	tmp = buf;
 	while ((p = strsep(&tmp, ",")) != NULL) {
-		res = -EINVAL;
-		pr_err("overlayfs: feature '%s' not supported\n", p);
+		if (strcmp(p, "redirect_dir") == 0) {
+			ufs->redirect_dir = true;
+		} else {
+			res = -EINVAL;
+			pr_err("overlayfs: feature '%s' not supported\n", p);
+		}
 	}
 out_free:
 	kfree(buf);
@@ -494,8 +498,8 @@ static int ovl_mount_dir(const char *name, struct path *path)
 	return err;
 }
 
-static int ovl_lower_dir(const char *name, struct path *path, long *namelen,
-			 int *stack_depth, bool *remote)
+static int ovl_lower_dir(const char *name, struct path *path,
+			 struct ovl_fs *ufs, int *stack_depth, bool *remote)
 {
 	int err;
 	struct kstatfs statfs;
@@ -504,7 +508,7 @@ static int ovl_lower_dir(const char *name, struct path *path, long *namelen,
 	if (err)
 		goto out;
 
-	err = ovl_check_features(path->dentry);
+	err = ovl_check_features(ufs, path->dentry);
 	if (err)
 		goto out_put;
 
@@ -513,7 +517,7 @@ static int ovl_lower_dir(const char *name, struct path *path, long *namelen,
 		pr_err("overlayfs: statfs failed on '%s'\n", name);
 		goto out_put;
 	}
-	*namelen = max(*namelen, statfs.f_namelen);
+	ufs->lower_namelen = max(ufs->lower_namelen, statfs.f_namelen);
 	*stack_depth = max(*stack_depth, path->mnt->mnt_sb->s_stack_depth);
 
 	if (ovl_dentry_remote(path->dentry))
@@ -730,7 +734,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 			goto out_put_upperpath;
 		}
 
-		err = ovl_check_features(upperpath.dentry);
+		err = ovl_check_features(ufs, upperpath.dentry);
 		if (err)
 			goto out_put_upperpath;
 
@@ -771,9 +775,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 
 	lower = lowertmp;
 	for (numlower = 0; numlower < stacklen; numlower++) {
-		err = ovl_lower_dir(lower, &stack[numlower],
-				    &ufs->lower_namelen, &sb->s_stack_depth,
-				    &remote);
+		err = ovl_lower_dir(lower, &stack[numlower], ufs,
+				    &sb->s_stack_depth, &remote);
 		if (err)
 			goto out_put_lowerpath;
 
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 0d45a84468d2..06dae4cabd53 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -176,6 +176,25 @@ void ovl_dentry_set_opaque(struct dentry *dentry, bool opaque)
 	oe->opaque = opaque;
 }
 
+bool ovl_redirect_dir(struct super_block *sb)
+{
+	struct ovl_fs *ofs = sb->s_fs_info;
+
+	return ofs->redirect_dir;
+}
+
+bool ovl_dentry_is_redirect(struct dentry *dentry)
+{
+	struct ovl_entry *oe = dentry->d_fsdata;
+	return oe->redirect;
+}
+
+void ovl_dentry_set_redirect(struct dentry *dentry)
+{
+	struct ovl_entry *oe = dentry->d_fsdata;
+	oe->redirect = true;
+}
+
 void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry)
 {
 	struct ovl_entry *oe = dentry->d_fsdata;
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH 1/3] ovl: check fs features
  2016-10-25  7:34 ` [PATCH 1/3] ovl: check fs features Miklos Szeredi
@ 2016-10-25 11:24   ` Amir Goldstein
  2016-11-05 20:40     ` Amir Goldstein
  0 siblings, 1 reply; 44+ messages in thread
From: Amir Goldstein @ 2016-10-25 11:24 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: linux-unionfs, Guillem Jover, Raphael Hertzog, linux-fsdevel,
	linux-kernel, stable

On Tue, Oct 25, 2016 at 10:34 AM, Miklos Szeredi <mszeredi@redhat.com> wrote:
> To allow adding new, backward incompatible features to overlayfs, we need a
> way to store the list of features in the overlay.  This is done via
> "trusted.overlay.features" xattr on the root of the upper layer (or one of
> the lower layers, that previously acted as an upper layer).  It's a comma
> separated list of case sensitive strings.
>
> If an overlay has an unknown feature, mount shall return an error.  So
> mechanism should only be used for backward incompatible features.

So maybe be explicit and call the attribute trusted.overlay.incompat_features,
to allow future addition of compat and rocompat feature sets?

>
> This patch doesn't add any features.  If the "trusted.overlay.features"
> xattr contains a non-empty list, then return EINVAL error for the mount.
>
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> Cc: <stable@vger.kernel.org>
> ---
>  Documentation/filesystems/overlayfs.txt | 12 ++++++++++
>  fs/overlayfs/overlayfs.h                |  1 +
>  fs/overlayfs/super.c                    | 41 +++++++++++++++++++++++++++++++++
>  3 files changed, 54 insertions(+)
>
> diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
> index 7aeb8e8d80cf..5108425157ac 100644
> --- a/Documentation/filesystems/overlayfs.txt
> +++ b/Documentation/filesystems/overlayfs.txt
> @@ -175,6 +175,18 @@ The specified lower directories will be stacked beginning from the
>  rightmost one and going left.  In the above example lower1 will be the
>  top, lower2 the middle and lower3 the bottom layer.
>
> +Filesystem features
> +-------------------
> +
> +Features are enabled via "trusted.overlay.features" xattr on the root of the
> +upper layer.  E.g. the following command can be used to enable features "foo"
> +and "bar" on the overlay:
> +
> +  setfattr -n "trusted.overlay.features" -v "foo,bar" /upper
> +  mount -t overlay overlay -olowerdir=/lower,upperdir=/upper,\
> +workdir=/work /merged
> +
> +If an overlay has an unknown feature, mount shall return an error.
>
>  Non-standard behavior
>  ---------------------
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index f6e4d3539a25..d61d5b9d0d91 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -19,6 +19,7 @@ enum ovl_path_type {
>
>  #define OVL_XATTR_PREFIX XATTR_TRUSTED_PREFIX "overlay."
>  #define OVL_XATTR_OPAQUE OVL_XATTR_PREFIX "opaque"
> +#define OVL_XATTR_FEATURES OVL_XATTR_PREFIX "features"
>
>  #define OVL_ISUPPER_MASK 1UL
>
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 30263a541fd5..d6dc8d905d00 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -397,6 +397,39 @@ static struct dentry *ovl_workdir_create(struct vfsmount *mnt,
>         goto out_unlock;
>  }
>
> +static int ovl_check_features(struct dentry *root)
> +{
> +       int res;
> +       char *buf, *tmp, *p;
> +
> +       res = vfs_getxattr(root, OVL_XATTR_FEATURES, NULL, 0);
> +       if (res <= 0) {
> +               if (res == -EOPNOTSUPP || res == -ENODATA)
> +                       res = 0;
> +               return res;
> +       }
> +
> +       buf = kmalloc(res + 1, GFP_TEMPORARY);
> +       if (!buf)
> +               return -ENOMEM;
> +
> +       res = vfs_getxattr(root, OVL_XATTR_FEATURES, buf, res);
> +       if (res <= 0)
> +               goto out_free;
> +
> +       buf[res] = '\0';
> +       res = 0;
> +       tmp = buf;
> +       while ((p = strsep(&tmp, ",")) != NULL) {
> +               res = -EINVAL;
> +               pr_err("overlayfs: feature '%s' not supported\n", p);
> +       }
> +out_free:
> +       kfree(buf);
> +
> +       return res;
> +}
> +
>  static void ovl_unescape(char *s)
>  {
>         char *d = s;
> @@ -471,6 +504,10 @@ static int ovl_lower_dir(const char *name, struct path *path, long *namelen,
>         if (err)
>                 goto out;
>
> +       err = ovl_check_features(path->dentry);
> +       if (err)
> +               goto out_put;
> +
>         err = vfs_statfs(path, &statfs);
>         if (err) {
>                 pr_err("overlayfs: statfs failed on '%s'\n", name);
> @@ -693,6 +730,10 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>                         goto out_put_upperpath;
>                 }
>
> +               err = ovl_check_features(upperpath.dentry);
> +               if (err)
> +                       goto out_put_upperpath;
> +
>                 err = ovl_mount_dir(ufs->config.workdir, &workpath);
>                 if (err)
>                         goto out_put_upperpath;
> --
> 2.5.5
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-10-25  7:34 ` [PATCH 3/3] ovl: redirect on rename-dir Miklos Szeredi
@ 2016-10-25 11:57   ` Raphael Hertzog
  2016-10-26 11:12     ` Miklos Szeredi
  2016-10-25 12:49   ` Amir Goldstein
  2016-10-28 16:15   ` Al Viro
  2 siblings, 1 reply; 44+ messages in thread
From: Raphael Hertzog @ 2016-10-25 11:57 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs, Guillem Jover, linux-fsdevel, linux-kernel

Hello Miklos,

thanks for your work on this patch set!

On Tue, 25 Oct 2016, Miklos Szeredi wrote:
> +renaming directories
> +--------------------
> +
> +When renaming a directory that is on the lower layer or merged (i.e. the
> +directory was not created on the upper layer to start with) overlayfs can
> +handle it in two different ways:
> +
> +1) return EXDEV error: this error is returned by rename(2) when trying to
> +   move a file or directory across filesystem boundaries.  Hence
> +   applications are usually prepared to hande this error (mv(1) for example
> +   recursively copies the directory tree).  This is the default behavior.
> +
> +2) If the "redirect_dir" feature is enabled, then the directory will be
> +   copied up (but not the contents).  Then the "trusted.overlay.redirect"
> +   extended attribute is set to the path of the original location from the
> +   root of the overlay.  Finally the directory is moved to the new
> +   location.

>From this, I understand that we will have to add "redirect_dir" to the
mount options for this feature to work. Is there any reason why you
put this as an opt-in feature?

Do you plan to make it the default in the future when it has been
available for a while?

Barring any regression introduced by your patch, it seems that the feature
is best available by default since it allows legitimate operations to
succeed that are otherwise refused. I understand that it makes it
impossible to mount the overlay filesystem with an older kernel but is
that problem more widespread than the one we're fixing here?  On my side,
overlayfs is only used in scenarios where the kernel is always the same
(or newer compared to what created the initial filesystem).

Cheers,
-- 
Raphaël Hertzog ◈ Debian Developer

Support Debian LTS: http://www.freexian.com/services/debian-lts.html
Learn to master Debian: http://debian-handbook.info/get/

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-10-25  7:34 ` [PATCH 3/3] ovl: redirect on rename-dir Miklos Szeredi
  2016-10-25 11:57   ` Raphael Hertzog
@ 2016-10-25 12:49   ` Amir Goldstein
  2016-10-26 11:26     ` Miklos Szeredi
  2016-10-28 16:15   ` Al Viro
  2 siblings, 1 reply; 44+ messages in thread
From: Amir Goldstein @ 2016-10-25 12:49 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: linux-unionfs, Guillem Jover, Raphael Hertzog, linux-fsdevel,
	linux-kernel

On Tue, Oct 25, 2016 at 10:34 AM, Miklos Szeredi <mszeredi@redhat.com> wrote:
> Current code returns EXDEV when a directory would need to be copied up to
> move.  We could copy up the directory tree in this case, but there's
> another solution: point to old lower directory from moved upper directory.
>
> This is achieved with a "trusted.overlay.redirect" xattr storing the path
> relative to the root of the overlay.  After such attribute has been set,
> the directory can be moved without further actions required.
>

Nice! I played with a similar, but simpler idea last month:
https://github.com/amir73il/linux/commit/a3907e732984fb4fd340544ca27c45cac8be8060
it only handle renames within same parent.
Down the road, you may want to consider differentiating this simpler case
of "name redirect", as it incurs less path lookups. An entry can always be
promoted from "name redirect" to "full path redirect"

> This is a backward incompatible feature, old kernels won't be able to
> correctly mount an overlay containing redirected directories.
>
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> ---
>  Documentation/filesystems/overlayfs.txt | 21 ++++++-
>  fs/overlayfs/copy_up.c                  | 20 ++-----
>  fs/overlayfs/dir.c                      | 86 +++++++++++++++++++---------
>  fs/overlayfs/namei.c                    | 99 ++++++++++++++++++++++++++++++---
>  fs/overlayfs/overlayfs.h                |  4 ++
>  fs/overlayfs/ovl_entry.h                |  4 ++
>  fs/overlayfs/super.c                    | 25 +++++----
>  fs/overlayfs/util.c                     | 19 +++++++
>  8 files changed, 217 insertions(+), 61 deletions(-)
>
> diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
> index 5108425157ac..fae48ab3b36b 100644
> --- a/Documentation/filesystems/overlayfs.txt
> +++ b/Documentation/filesystems/overlayfs.txt
> @@ -130,6 +130,23 @@ directory.
>  Readdir on directories that are not merged is simply handled by the
>  underlying directory (upper or lower).
>
> +renaming directories
> +--------------------
> +
> +When renaming a directory that is on the lower layer or merged (i.e. the
> +directory was not created on the upper layer to start with) overlayfs can
> +handle it in two different ways:
> +
> +1) return EXDEV error: this error is returned by rename(2) when trying to
> +   move a file or directory across filesystem boundaries.  Hence
> +   applications are usually prepared to hande this error (mv(1) for example
> +   recursively copies the directory tree).  This is the default behavior.
> +
> +2) If the "redirect_dir" feature is enabled, then the directory will be
> +   copied up (but not the contents).  Then the "trusted.overlay.redirect"
> +   extended attribute is set to the path of the original location from the
> +   root of the overlay.  Finally the directory is moved to the new
> +   location.
>
>  Non-directories
>  ---------------
> @@ -201,8 +218,8 @@ If a file with multiple hard links is copied up, then this will
>  "break" the link.  Changes will not be propagated to other names
>  referring to the same inode.
>
> -Directory trees are not copied up.  If rename(2) is performed on a directory
> -which is on the lower layer or is merged, then -EXDEV will be returned.
> +Unless "redirect_dir" feature is enabled, rename(2) on a lower or merged
> +directory will fail with EXDEV.
>
>  Changes to underlying filesystems
>  ---------------------------------
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index 49f6158bb04c..0d7075208099 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -323,17 +323,11 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
>  /*
>   * Copy up a single dentry
>   *
> - * Directory renames only allowed on "pure upper" (already created on
> - * upper filesystem, never copied up).  Directories which are on lower or
> - * are merged may not be renamed.  For these -EXDEV is returned and
> - * userspace has to deal with it.  This means, when copying up a
> - * directory we can rely on it and ancestors being stable.
> - *
> - * Non-directory renames start with copy up of source if necessary.  The
> - * actual rename will only proceed once the copy up was successful.  Copy
> - * up uses upper parent i_mutex for exclusion.  Since rename can change
> - * d_parent it is possible that the copy up will lock the old parent.  At
> - * that point the file will have already been copied up anyway.
> + * All renames start with copy up of source if necessary.  The actual
> + * rename will only proceed once the copy up was successful.  Copy up uses
> + * upper parent i_mutex for exclusion.  Since rename can change d_parent it
> + * is possible that the copy up will lock the old parent.  At that point
> + * the file will have already been copied up anyway.
>   */
>  int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
>                     struct path *lowerpath, struct kstat *stat)
> @@ -345,7 +339,6 @@ int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
>         struct path parentpath;
>         struct dentry *lowerdentry = lowerpath->dentry;
>         struct dentry *upperdir;
> -       struct dentry *upperdentry;
>         const char *link = NULL;
>
>         if (WARN_ON(!workdir))
> @@ -371,8 +364,7 @@ int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
>                 pr_err("overlayfs: failed to lock workdir+upperdir\n");
>                 goto out_unlock;
>         }
> -       upperdentry = ovl_dentry_upper(dentry);
> -       if (upperdentry) {
> +       if (ovl_dentry_upper(dentry)) {
>                 /* Raced with another copy-up?  Nothing to do, then... */
>                 err = 0;
>                 goto out_unlock;
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index 2c1057d747cb..065e0211f9b6 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -757,6 +757,40 @@ static bool ovl_type_merge_or_lower(struct dentry *dentry)
>         return OVL_TYPE_MERGE(type) || !OVL_TYPE_UPPER(type);
>  }
>
> +static bool ovl_can_move(struct dentry *dentry)
> +{
> +       return ovl_redirect_dir(dentry->d_sb) ||
> +               !d_is_dir(dentry) || !ovl_type_merge_or_lower(dentry);
> +}
> +
> +static int ovl_set_redirect(struct dentry *dentry)
> +{
> +       char *buf;
> +       char *path;
> +       int err;
> +
> +       if (ovl_dentry_is_redirect(dentry))
> +               return 0;
> +
> +       buf = __getname();
> +       if (!buf)
> +               return -ENOMEM;
> +
> +       path = dentry_path_raw(dentry, buf, PATH_MAX);
> +       err = PTR_ERR(path);
> +       if (IS_ERR(path))
> +               goto putname;
> +
> +       err = ovl_do_setxattr(ovl_dentry_upper(dentry), OVL_XATTR_REDIRECT,
> +                             path, strlen(path), 0);
> +putname:
> +       __putname(buf);
> +       if (!err)
> +               ovl_dentry_set_redirect(dentry);
> +
> +       return err;
> +}
> +
>  static int ovl_rename(struct inode *olddir, struct dentry *old,
>                       struct inode *newdir, struct dentry *new,
>                       unsigned int flags)
> @@ -784,9 +818,9 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
>
>         /* Don't copy up directory trees */
>         err = -EXDEV;
> -       if (is_dir && ovl_type_merge_or_lower(old))
> +       if (!ovl_can_move(old))
>                 goto out;
> -       if (!overwrite && new_is_dir && ovl_type_merge_or_lower(new))
> +       if (!overwrite && !ovl_can_move(new))
>                 goto out;
>
>         err = ovl_want_write(old);
> @@ -837,7 +871,6 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
>
>         trap = lock_rename(new_upperdir, old_upperdir);
>
> -
>         olddentry = lookup_one_len(old->d_name.name, old_upperdir,
>                                    old->d_name.len);
>         err = PTR_ERR(olddentry);
> @@ -880,31 +913,34 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
>         if (WARN_ON(olddentry->d_inode == newdentry->d_inode))
>                 goto out_dput;
>
> -       if (is_dir && !old_opaque && ovl_lower_positive(new)) {
> -               err = ovl_set_opaque(olddentry);
> -               if (err)
> -                       goto out_dput;
> -               ovl_dentry_set_opaque(old, true);
> +       if (is_dir) {
> +               if (ovl_type_merge_or_lower(old)) {
> +                       err = ovl_set_redirect(old);

There is a fair chance of getting ENOSPC/EDQUOT here and confuse user space.
Would it be better to convert these non fatal errors with EXDEV, so
user space will
gracefully fallback to recursive rename/clone/copy?

> +                       if (err)
> +                               goto out_dput;
> +               } else if (!old_opaque && ovl_lower_positive(new)) {
> +                       err = ovl_set_opaque(olddentry);
> +                       if (err)
> +                               goto out_dput;
> +                       ovl_dentry_set_opaque(old, true);
> +               }
>         }
> -       if (!overwrite &&
> -           new_is_dir && !new_opaque && ovl_lower_positive(old)) {
> -               err = ovl_set_opaque(newdentry);
> -               if (err)
> -                       goto out_dput;
> -               ovl_dentry_set_opaque(new, true);
> +       if (!overwrite && new_is_dir) {
> +               if (ovl_type_merge_or_lower(new)) {
> +                       err = ovl_set_redirect(new);
> +                       if (err)
> +                               goto out_dput;
> +               } else if (!new_opaque && ovl_lower_positive(old)) {
> +                       err = ovl_set_opaque(newdentry);
> +                       if (err)
> +                               goto out_dput;
> +                       ovl_dentry_set_opaque(new, true);
> +               }
>         }
>
> -       if (old_opaque || new_opaque) {
> -               err = ovl_do_rename(old_upperdir->d_inode, olddentry,
> -                                   new_upperdir->d_inode, newdentry,
> -                                   flags);
> -       } else {
> -               /* No debug for the plain case */
> -               BUG_ON(flags & ~RENAME_EXCHANGE);
> -               err = vfs_rename(old_upperdir->d_inode, olddentry,
> -                                new_upperdir->d_inode, newdentry,
> -                                NULL, flags);
> -       }
> +       err = ovl_do_rename(old_upperdir->d_inode, olddentry,
> +                           new_upperdir->d_inode, newdentry,
> +                           flags);
>         if (err)
>                 goto out_dput;
>
> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> index f4057fcb0246..c7cacbb8ce09 100644
> --- a/fs/overlayfs/namei.c
> +++ b/fs/overlayfs/namei.c
> @@ -8,6 +8,7 @@
>   */
>
>  #include <linux/fs.h>
> +#include <linux/mount.h>
>  #include <linux/namei.h>
>  #include <linux/xattr.h>
>  #include "overlayfs.h"
> @@ -48,6 +49,28 @@ static bool ovl_is_opaquedir(struct dentry *dentry)
>         return false;
>  }
>
> +static int ovl_check_redirect(struct dentry *dentry, char **bufp)
> +{
> +       int res;
> +
> +       res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, NULL, 0);
> +       if (res > 0) {
> +               char *buf = kzalloc(res + 1, GFP_KERNEL);
> +
> +               if (!buf)
> +                       return -ENOMEM;
> +
> +               res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, buf, res);
> +               if (res < 0)
> +                       return res;
> +
> +               kfree(*bufp);
> +               *bufp = buf;
> +       }
> +
> +       return 0;
> +}
> +
>  /*
>   * Returns next layer in stack starting from top.
>   * Returns -1 if this is the last layer.
> @@ -80,11 +103,13 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>         unsigned int ctr = 0;
>         struct inode *inode = NULL;
>         bool upperopaque = false;
> +       bool upperredirect = false;
>         bool stop = false;
>         bool isdir = false;
>         struct dentry *this;
>         unsigned int i;
>         int err;
> +       char *redirect = NULL;
>
>         old_cred = ovl_override_creds(dentry->d_sb);
>         upperdir = ovl_upperdentry_dereference(poe);
> @@ -110,11 +135,23 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                                 isdir = true;
>                                 if (poe->numlower && ovl_is_opaquedir(this))
>                                         stop = upperopaque = true;
> +                               else if (ovl_redirect_dir(dentry->d_sb)) {
> +                                       err = ovl_check_redirect(this,
> +                                                                &redirect);
> +                                       if (err)
> +                                               goto out;
> +
> +                                       if (redirect)
> +                                               upperredirect = true;
> +                               }
>                         }
>                 }
>                 upperdentry = this;
>         }
>
> +       if (redirect)
> +               poe = dentry->d_sb->s_root->d_fsdata;
> +
>         if (!stop && poe->numlower) {
>                 err = -ENOMEM;
>                 stack = kcalloc(poe->numlower, sizeof(struct path), GFP_KERNEL);
> @@ -125,15 +162,39 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>         for (i = 0; !stop && i < poe->numlower; i++) {
>                 struct path lowerpath = poe->lowerstack[i];
>
> -               this = ovl_lookup_real(lowerpath.dentry, &dentry->d_name);
> -               err = PTR_ERR(this);
> -               if (IS_ERR(this)) {
> -                       /*
> -                        * If it's positive, then treat ENAMETOOLONG as ENOENT.
> -                        */
> -                       if (err == -ENAMETOOLONG && (upperdentry || ctr))
> -                               continue;
> -                       goto out_put;
> +               if (!redirect) {
> +                       this = ovl_lookup_real(lowerpath.dentry, &dentry->d_name);
> +                       err = PTR_ERR(this);
> +                       if (IS_ERR(this)) {
> +                               /*
> +                                * If it's positive, then treat ENAMETOOLONG as ENOENT.
> +                                */
> +                               if (err == -ENAMETOOLONG && (upperdentry || ctr))
> +                                       continue;
> +                               goto out_put;
> +                       }
> +               } else {
> +                       struct path thispath;
> +
> +                       err = vfs_path_lookup(lowerpath.dentry, lowerpath.mnt,
> +                                             redirect, 0, &thispath);
> +
> +                       if (err) {
> +                               if (err == -ENOENT || err == -ENAMETOOLONG)
> +                                       this = NULL;
> +                       } else {
> +                               this = thispath.dentry;
> +                               mntput(thispath.mnt);
> +                               if (!this->d_inode) {
> +                                       dput(this);
> +                                       this = NULL;
> +                               } else if (ovl_dentry_weird(this)) {
> +                                       dput(this);
> +                                       err = -EREMOTE;
> +                               }
> +                       }
> +                       if (err)
> +                               goto out_put;
>                 }
>                 if (!this)
>                         continue;
> @@ -162,6 +223,23 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                 stack[ctr].dentry = this;
>                 stack[ctr].mnt = lowerpath.mnt;
>                 ctr++;
> +
> +               if (!stop && i != poe->numlower - 1 &&
> +                   d_is_dir(this) && ovl_redirect_dir(dentry->d_sb)) {
> +                       err = ovl_check_redirect(this, &redirect);
> +                       if (err)
> +                               goto out_put;
> +
> +                       if (redirect && poe != dentry->d_sb->s_root->d_fsdata) {
> +                               poe = dentry->d_sb->s_root->d_fsdata;
> +

Now you are about to continue looping until new value of poe->numlower,
which is >= then olf value of poe->numlower, but 'stack' was allocated
according to old value of poe->numlower, so aren't you in danger of
overflowing it?

Please add a comment to explain the purpose of this loop rewind.


> +                               for (i = 0; i < poe->numlower; i++)
> +                                       if (poe->lowerstack[i].mnt == lowerpath.mnt)
> +                                               break;
> +                               if (WARN_ON(i == poe->numlower))
> +                                       break;
> +                       }
> +               }
>         }
>
>         oe = ovl_alloc_entry(ctr);
> @@ -192,9 +270,11 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>
>         revert_creds(old_cred);
>         oe->opaque = upperopaque;
> +       oe->redirect = upperredirect;
>         oe->__upperdentry = upperdentry;
>         memcpy(oe->lowerstack, stack, sizeof(struct path) * ctr);
>         kfree(stack);
> +       kfree(redirect);
>         dentry->d_fsdata = oe;
>         d_add(dentry, inode);
>
> @@ -209,6 +289,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  out_put_upper:
>         dput(upperdentry);
>  out:
> +       kfree(redirect);
>         revert_creds(old_cred);
>         return ERR_PTR(err);
>  }
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index d61d5b9d0d91..9d80ce367ad8 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -20,6 +20,7 @@ enum ovl_path_type {
>  #define OVL_XATTR_PREFIX XATTR_TRUSTED_PREFIX "overlay."
>  #define OVL_XATTR_OPAQUE OVL_XATTR_PREFIX "opaque"
>  #define OVL_XATTR_FEATURES OVL_XATTR_PREFIX "features"
> +#define OVL_XATTR_REDIRECT OVL_XATTR_PREFIX "redirect"
>
>  #define OVL_ISUPPER_MASK 1UL
>
> @@ -157,6 +158,9 @@ void ovl_set_dir_cache(struct dentry *dentry, struct ovl_dir_cache *cache);
>  bool ovl_dentry_is_opaque(struct dentry *dentry);
>  bool ovl_dentry_is_whiteout(struct dentry *dentry);
>  void ovl_dentry_set_opaque(struct dentry *dentry, bool opaque);
> +bool ovl_redirect_dir(struct super_block *sb);
> +bool ovl_dentry_is_redirect(struct dentry *dentry);
> +void ovl_dentry_set_redirect(struct dentry *dentry);
>  void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry);
>  void ovl_inode_init(struct inode *inode, struct inode *realinode,
>                     bool is_upper);
> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> index 3b7ba59ad27e..2b22645535ff 100644
> --- a/fs/overlayfs/ovl_entry.h
> +++ b/fs/overlayfs/ovl_entry.h
> @@ -26,6 +26,9 @@ struct ovl_fs {
>         struct ovl_config config;
>         /* creds of process who forced instantiation of super block */
>         const struct cred *creator_cred;
> +
> +       /* Check for "redirect" on directories */
> +       bool redirect_dir;
>  };
>
>  /* private information held for every overlayfs dentry */
> @@ -36,6 +39,7 @@ struct ovl_entry {
>                 struct {
>                         u64 version;
>                         bool opaque;
> +                       bool redirect;
>                 };
>                 struct rcu_head rcu;
>         };
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index d6dc8d905d00..fc22a8a2e0d0 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -397,7 +397,7 @@ static struct dentry *ovl_workdir_create(struct vfsmount *mnt,
>         goto out_unlock;
>  }
>
> -static int ovl_check_features(struct dentry *root)
> +static int ovl_check_features(struct ovl_fs *ufs, struct dentry *root)
>  {
>         int res;
>         char *buf, *tmp, *p;
> @@ -421,8 +421,12 @@ static int ovl_check_features(struct dentry *root)
>         res = 0;
>         tmp = buf;
>         while ((p = strsep(&tmp, ",")) != NULL) {
> -               res = -EINVAL;
> -               pr_err("overlayfs: feature '%s' not supported\n", p);
> +               if (strcmp(p, "redirect_dir") == 0) {
> +                       ufs->redirect_dir = true;
> +               } else {
> +                       res = -EINVAL;
> +                       pr_err("overlayfs: feature '%s' not supported\n", p);
> +               }
>         }
>  out_free:
>         kfree(buf);
> @@ -494,8 +498,8 @@ static int ovl_mount_dir(const char *name, struct path *path)
>         return err;
>  }
>
> -static int ovl_lower_dir(const char *name, struct path *path, long *namelen,
> -                        int *stack_depth, bool *remote)
> +static int ovl_lower_dir(const char *name, struct path *path,
> +                        struct ovl_fs *ufs, int *stack_depth, bool *remote)
>  {
>         int err;
>         struct kstatfs statfs;
> @@ -504,7 +508,7 @@ static int ovl_lower_dir(const char *name, struct path *path, long *namelen,
>         if (err)
>                 goto out;
>
> -       err = ovl_check_features(path->dentry);
> +       err = ovl_check_features(ufs, path->dentry);
>         if (err)
>                 goto out_put;
>
> @@ -513,7 +517,7 @@ static int ovl_lower_dir(const char *name, struct path *path, long *namelen,
>                 pr_err("overlayfs: statfs failed on '%s'\n", name);
>                 goto out_put;
>         }
> -       *namelen = max(*namelen, statfs.f_namelen);
> +       ufs->lower_namelen = max(ufs->lower_namelen, statfs.f_namelen);
>         *stack_depth = max(*stack_depth, path->mnt->mnt_sb->s_stack_depth);
>
>         if (ovl_dentry_remote(path->dentry))
> @@ -730,7 +734,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>                         goto out_put_upperpath;
>                 }
>
> -               err = ovl_check_features(upperpath.dentry);
> +               err = ovl_check_features(ufs, upperpath.dentry);
>                 if (err)
>                         goto out_put_upperpath;
>
> @@ -771,9 +775,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>
>         lower = lowertmp;
>         for (numlower = 0; numlower < stacklen; numlower++) {
> -               err = ovl_lower_dir(lower, &stack[numlower],
> -                                   &ufs->lower_namelen, &sb->s_stack_depth,
> -                                   &remote);
> +               err = ovl_lower_dir(lower, &stack[numlower], ufs,
> +                                   &sb->s_stack_depth, &remote);
>                 if (err)
>                         goto out_put_lowerpath;
>
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 0d45a84468d2..06dae4cabd53 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -176,6 +176,25 @@ void ovl_dentry_set_opaque(struct dentry *dentry, bool opaque)
>         oe->opaque = opaque;
>  }
>
> +bool ovl_redirect_dir(struct super_block *sb)
> +{
> +       struct ovl_fs *ofs = sb->s_fs_info;
> +
> +       return ofs->redirect_dir;
> +}
> +
> +bool ovl_dentry_is_redirect(struct dentry *dentry)
> +{
> +       struct ovl_entry *oe = dentry->d_fsdata;
> +       return oe->redirect;
> +}
> +
> +void ovl_dentry_set_redirect(struct dentry *dentry)
> +{
> +       struct ovl_entry *oe = dentry->d_fsdata;
> +       oe->redirect = true;
> +}
> +
>  void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry)
>  {
>         struct ovl_entry *oe = dentry->d_fsdata;
> --
> 2.5.5
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-10-25 11:57   ` Raphael Hertzog
@ 2016-10-26 11:12     ` Miklos Szeredi
  2016-10-28 12:56       ` Raphael Hertzog
  2016-11-06 19:14       ` Konstantin Khlebnikov
  0 siblings, 2 replies; 44+ messages in thread
From: Miklos Szeredi @ 2016-10-26 11:12 UTC (permalink / raw)
  To: Raphael Hertzog
  Cc: Miklos Szeredi, linux-unionfs, Guillem Jover, linux-fsdevel,
	linux-kernel

On Tue, Oct 25, 2016 at 1:57 PM, Raphael Hertzog <hertzog@debian.org> wrote:

> Do you plan to make it the default in the future when it has been
> available for a while?
>
> Barring any regression introduced by your patch, it seems that the feature
> is best available by default since it allows legitimate operations to
> succeed that are otherwise refused. I understand that it makes it
> impossible to mount the overlay filesystem with an older kernel but is
> that problem more widespread than the one we're fixing here?  On my side,
> overlayfs is only used in scenarios where the kernel is always the same
> (or newer compared to what created the initial filesystem).

I think it would be safe to make it the default if upperdir is empty.
Nonempty implies that it was created with old kernel (or it was
crafted by hand).   But there should be a way to explicitly turn it
off;  either because of the need for backward compatibility or because
the old format is simply easier to work with for humans.

How about:

- If upper is nonempty, then leave redirect feature alone except when
mount option "-oredirect=on" is used to force enabling it.
- If upper is empty, then enable redirect feature except when mount
option "-oredirect=off" is used to force disabling it.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-10-25 12:49   ` Amir Goldstein
@ 2016-10-26 11:26     ` Miklos Szeredi
  2016-10-26 12:11       ` Amir Goldstein
  2016-10-26 19:56       ` Amir Goldstein
  0 siblings, 2 replies; 44+ messages in thread
From: Miklos Szeredi @ 2016-10-26 11:26 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Miklos Szeredi, linux-unionfs, Guillem Jover, Raphael Hertzog,
	linux-fsdevel, linux-kernel

On Tue, Oct 25, 2016 at 2:49 PM, Amir Goldstein <amir73il@gmail.com> wrote:

>> @@ -880,31 +913,34 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
>>         if (WARN_ON(olddentry->d_inode == newdentry->d_inode))
>>                 goto out_dput;
>>
>> -       if (is_dir && !old_opaque && ovl_lower_positive(new)) {
>> -               err = ovl_set_opaque(olddentry);
>> -               if (err)
>> -                       goto out_dput;
>> -               ovl_dentry_set_opaque(old, true);
>> +       if (is_dir) {
>> +               if (ovl_type_merge_or_lower(old)) {
>> +                       err = ovl_set_redirect(old);
>
> There is a fair chance of getting ENOSPC/EDQUOT here and confuse user space.
> Would it be better to convert these non fatal errors with EXDEV, so
> user space will
> gracefully fallback to recursive rename/clone/copy?

Recursive copy up will surely consume more space than an xattr?

>> @@ -162,6 +223,23 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>>                 stack[ctr].dentry = this;
>>                 stack[ctr].mnt = lowerpath.mnt;
>>                 ctr++;
>> +
>> +               if (!stop && i != poe->numlower - 1 &&
>> +                   d_is_dir(this) && ovl_redirect_dir(dentry->d_sb)) {
>> +                       err = ovl_check_redirect(this, &redirect);
>> +                       if (err)
>> +                               goto out_put;
>> +
>> +                       if (redirect && poe != dentry->d_sb->s_root->d_fsdata) {
>> +                               poe = dentry->d_sb->s_root->d_fsdata;
>> +
>
> Now you are about to continue looping until new value of poe->numlower,
> which is >= then olf value of poe->numlower, but 'stack' was allocated
> according to old value of poe->numlower, so aren't you in danger of
> overflowing it?
>
> Please add a comment to explain the purpose of this loop rewind.

We are jumping to a stack possibly wider than the current one and need
to find the layer where to continue the downward traversal.  I'll add
the comment.

BTW I don't remember having tested this, so it might possibly be
buggy.  Automatic multi-layer testing would really be good.  What we
basically need is:

 - create normal (two layer) overlay (with interesting constructs,
whiteout, opaque dir, redirect)
 - umount
 - create three layer overlay where the two lower layers come from the
previous upper/lower layers
 - do more interesting things

There's one such test in xfstests but it would be good to have more.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-10-26 11:26     ` Miklos Szeredi
@ 2016-10-26 12:11       ` Amir Goldstein
  2016-10-26 12:51         ` Miklos Szeredi
  2016-10-26 19:56       ` Amir Goldstein
  1 sibling, 1 reply; 44+ messages in thread
From: Amir Goldstein @ 2016-10-26 12:11 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Miklos Szeredi, linux-unionfs, Guillem Jover, Raphael Hertzog,
	linux-fsdevel, linux-kernel

On Wed, Oct 26, 2016 at 2:26 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Tue, Oct 25, 2016 at 2:49 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>
>>> @@ -880,31 +913,34 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
>>>         if (WARN_ON(olddentry->d_inode == newdentry->d_inode))
>>>                 goto out_dput;
>>>
>>> -       if (is_dir && !old_opaque && ovl_lower_positive(new)) {
>>> -               err = ovl_set_opaque(olddentry);
>>> -               if (err)
>>> -                       goto out_dput;
>>> -               ovl_dentry_set_opaque(old, true);
>>> +       if (is_dir) {
>>> +               if (ovl_type_merge_or_lower(old)) {
>>> +                       err = ovl_set_redirect(old);
>>
>> There is a fair chance of getting ENOSPC/EDQUOT here and confuse user space.
>> Would it be better to convert these non fatal errors with EXDEV, so
>> user space will
>> gracefully fallback to recursive rename/clone/copy?
>
> Recursive copy up will surely consume more space than an xattr?

Surely. But that is not what I meant.
In Ext4, for example, the total size of extended attributes for an
inode is limited to a single block,
so you can get ENOSPC with alot of free space in the file system.

You can also get ERANGE if the redirect path length > 256.
I'm just saying that instead of failing the move with ERANGE or
uncalled for ENOSPC,
it is better for user space to get EXDEV, from which there is a sane fallback.
BTW, mv (from linux-utils) falls back from -EXDEV to recursive move,
which is quite
efficient with clone up.

>
>>> @@ -162,6 +223,23 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>>>                 stack[ctr].dentry = this;
>>>                 stack[ctr].mnt = lowerpath.mnt;
>>>                 ctr++;
>>> +
>>> +               if (!stop && i != poe->numlower - 1 &&
>>> +                   d_is_dir(this) && ovl_redirect_dir(dentry->d_sb)) {
>>> +                       err = ovl_check_redirect(this, &redirect);
>>> +                       if (err)
>>> +                               goto out_put;
>>> +
>>> +                       if (redirect && poe != dentry->d_sb->s_root->d_fsdata) {
>>> +                               poe = dentry->d_sb->s_root->d_fsdata;
>>> +
>>
>> Now you are about to continue looping until new value of poe->numlower,
>> which is >= then olf value of poe->numlower, but 'stack' was allocated
>> according to old value of poe->numlower, so aren't you in danger of
>> overflowing it?
>>
>> Please add a comment to explain the purpose of this loop rewind.
>
> We are jumping to a stack possibly wider than the current one and need
> to find the layer where to continue the downward traversal.  I'll add
> the comment.
>
> BTW I don't remember having tested this, so it might possibly be
> buggy.  Automatic multi-layer testing would really be good.  What we
> basically need is:
>
>  - create normal (two layer) overlay (with interesting constructs,
> whiteout, opaque dir, redirect)
>  - umount
>  - create three layer overlay where the two lower layers come from the
> previous upper/lower layers
>  - do more interesting things
>
> There's one such test in xfstests but it would be good to have more.
>
> Thanks,
> Miklos

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-10-26 12:11       ` Amir Goldstein
@ 2016-10-26 12:51         ` Miklos Szeredi
  0 siblings, 0 replies; 44+ messages in thread
From: Miklos Szeredi @ 2016-10-26 12:51 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Miklos Szeredi, linux-unionfs, Guillem Jover, Raphael Hertzog,
	linux-fsdevel, linux-kernel

On Wed, Oct 26, 2016 at 2:11 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Wed, Oct 26, 2016 at 2:26 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Tue, Oct 25, 2016 at 2:49 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>>
>>>> @@ -880,31 +913,34 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
>>>>         if (WARN_ON(olddentry->d_inode == newdentry->d_inode))
>>>>                 goto out_dput;
>>>>
>>>> -       if (is_dir && !old_opaque && ovl_lower_positive(new)) {
>>>> -               err = ovl_set_opaque(olddentry);
>>>> -               if (err)
>>>> -                       goto out_dput;
>>>> -               ovl_dentry_set_opaque(old, true);
>>>> +       if (is_dir) {
>>>> +               if (ovl_type_merge_or_lower(old)) {
>>>> +                       err = ovl_set_redirect(old);
>>>
>>> There is a fair chance of getting ENOSPC/EDQUOT here and confuse user space.
>>> Would it be better to convert these non fatal errors with EXDEV, so
>>> user space will
>>> gracefully fallback to recursive rename/clone/copy?
>>
>> Recursive copy up will surely consume more space than an xattr?
>
> Surely. But that is not what I meant.
> In Ext4, for example, the total size of extended attributes for an
> inode is limited to a single block,
> so you can get ENOSPC with alot of free space in the file system.

Ah.

>
> You can also get ERANGE if the redirect path length > 256.

The 256 limit is for the name of the xattr, not the value.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-10-26 11:26     ` Miklos Szeredi
  2016-10-26 12:11       ` Amir Goldstein
@ 2016-10-26 19:56       ` Amir Goldstein
  1 sibling, 0 replies; 44+ messages in thread
From: Amir Goldstein @ 2016-10-26 19:56 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Miklos Szeredi, linux-unionfs, Guillem Jover, Raphael Hertzog,
	linux-fsdevel, linux-kernel

On Wed, Oct 26, 2016 at 2:26 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Tue, Oct 25, 2016 at 2:49 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>
>>> @@ -880,31 +913,34 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
>>>         if (WARN_ON(olddentry->d_inode == newdentry->d_inode))
>>>                 goto out_dput;
>>>
>>> -       if (is_dir && !old_opaque && ovl_lower_positive(new)) {
>>> -               err = ovl_set_opaque(olddentry);
>>> -               if (err)
>>> -                       goto out_dput;
>>> -               ovl_dentry_set_opaque(old, true);
>>> +       if (is_dir) {
>>> +               if (ovl_type_merge_or_lower(old)) {
>>> +                       err = ovl_set_redirect(old);
>>
>> There is a fair chance of getting ENOSPC/EDQUOT here and confuse user space.
>> Would it be better to convert these non fatal errors with EXDEV, so
>> user space will
>> gracefully fallback to recursive rename/clone/copy?
>
> Recursive copy up will surely consume more space than an xattr?
>
>>> @@ -162,6 +223,23 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>>>                 stack[ctr].dentry = this;
>>>                 stack[ctr].mnt = lowerpath.mnt;
>>>                 ctr++;
>>> +
>>> +               if (!stop && i != poe->numlower - 1 &&
>>> +                   d_is_dir(this) && ovl_redirect_dir(dentry->d_sb)) {
>>> +                       err = ovl_check_redirect(this, &redirect);
>>> +                       if (err)
>>> +                               goto out_put;
>>> +
>>> +                       if (redirect && poe != dentry->d_sb->s_root->d_fsdata) {
>>> +                               poe = dentry->d_sb->s_root->d_fsdata;
>>> +
>>
>> Now you are about to continue looping until new value of poe->numlower,
>> which is >= then olf value of poe->numlower, but 'stack' was allocated
>> according to old value of poe->numlower, so aren't you in danger of
>> overflowing it?
>>
>> Please add a comment to explain the purpose of this loop rewind.
>
> We are jumping to a stack possibly wider than the current one and need
> to find the layer where to continue the downward traversal.  I'll add
> the comment.
>

OK. my point was that you need to allocate an sb max depth stack in advance,
in case you need to jump to a wider stack.

> BTW I don't remember having tested this, so it might possibly be
> buggy.  Automatic multi-layer testing would really be good.  What we
> basically need is:
>
>  - create normal (two layer) overlay (with interesting constructs,
> whiteout, opaque dir, redirect)
>  - umount
>  - create three layer overlay where the two lower layers come from the
> previous upper/lower layers
>  - do more interesting things
>
> There's one such test in xfstests but it would be good to have more.
>

I just sent 2 patches to fix 2 overlayfs xfstests failures.
With these 2 changes, the entire quick test group passes on my setup
(short of one test that also fails on ext4 and xfs).

Now I will start to think about instrumenting generic xfstests with lower/upper
files and then with rotating upper (i.e. layer stack).

Amir.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-10-26 11:12     ` Miklos Szeredi
@ 2016-10-28 12:56       ` Raphael Hertzog
  2016-10-28 12:59         ` Miklos Szeredi
  2016-11-06 19:14       ` Konstantin Khlebnikov
  1 sibling, 1 reply; 44+ messages in thread
From: Raphael Hertzog @ 2016-10-28 12:56 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Miklos Szeredi, linux-unionfs, Guillem Jover, linux-fsdevel,
	linux-kernel

Hi,

On Wed, 26 Oct 2016, Miklos Szeredi wrote:
> On Tue, Oct 25, 2016 at 1:57 PM, Raphael Hertzog <hertzog@debian.org> wrote:

[ redirect feature enabled by default ]

> I think it would be safe to make it the default if upperdir is empty.
> Nonempty implies that it was created with old kernel (or it was
> crafted by hand).   But there should be a way to explicitly turn it
> off;  either because of the need for backward compatibility or because
> the old format is simply easier to work with for humans.
> 
> How about:
> 
> - If upper is nonempty, then leave redirect feature alone except when
> mount option "-oredirect=on" is used to force enabling it.
> - If upper is empty, then enable redirect feature except when mount
> option "-oredirect=off" is used to force disabling it.

Looks good, but is there some stickyness of the setting? My use case is
schroot and it always starts with an empty upperdir when I create a
temporary/throw-away chroot but if I reboot the machine, then it's possible
that schroot will remount the overlayfs with a non-empty upperdir
(the temporary chroot is preserved until I explicitly end the chroot).

Cheers,
-- 
Raphaël Hertzog ◈ Debian Developer

Support Debian LTS: http://www.freexian.com/services/debian-lts.html
Learn to master Debian: http://debian-handbook.info/get/

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-10-28 12:56       ` Raphael Hertzog
@ 2016-10-28 12:59         ` Miklos Szeredi
  0 siblings, 0 replies; 44+ messages in thread
From: Miklos Szeredi @ 2016-10-28 12:59 UTC (permalink / raw)
  To: Raphael Hertzog
  Cc: Miklos Szeredi, linux-unionfs, Guillem Jover, linux-fsdevel, lkml

On Fri, Oct 28, 2016 at 2:56 PM, Raphael Hertzog <hertzog@debian.org> wrote:
> Hi,
>
> On Wed, 26 Oct 2016, Miklos Szeredi wrote:
>> On Tue, Oct 25, 2016 at 1:57 PM, Raphael Hertzog <hertzog@debian.org> wrote:
>
> [ redirect feature enabled by default ]
>
>> I think it would be safe to make it the default if upperdir is empty.
>> Nonempty implies that it was created with old kernel (or it was
>> crafted by hand).   But there should be a way to explicitly turn it
>> off;  either because of the need for backward compatibility or because
>> the old format is simply easier to work with for humans.
>>
>> How about:
>>
>> - If upper is nonempty, then leave redirect feature alone except when
>> mount option "-oredirect=on" is used to force enabling it.
>> - If upper is empty, then enable redirect feature except when mount
>> option "-oredirect=off" is used to force disabling it.
>
> Looks good, but is there some stickyness of the setting? My use case is
> schroot and it always starts with an empty upperdir when I create a
> temporary/throw-away chroot but if I reboot the machine, then it's possible
> that schroot will remount the overlayfs with a non-empty upperdir
> (the temporary chroot is preserved until I explicitly end the chroot).

Yes, there is stickiness (it stores the "redirect_dir" feature in an
extended attribute on upperdir).

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-10-25  7:34 ` [PATCH 3/3] ovl: redirect on rename-dir Miklos Szeredi
  2016-10-25 11:57   ` Raphael Hertzog
  2016-10-25 12:49   ` Amir Goldstein
@ 2016-10-28 16:15   ` Al Viro
  2016-11-03 15:50     ` Miklos Szeredi
  2 siblings, 1 reply; 44+ messages in thread
From: Al Viro @ 2016-10-28 16:15 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: linux-unionfs, Guillem Jover, Raphael Hertzog, linux-fsdevel,
	linux-kernel

On Tue, Oct 25, 2016 at 09:34:47AM +0200, Miklos Szeredi wrote:
> Current code returns EXDEV when a directory would need to be copied up to
> move.  We could copy up the directory tree in this case, but there's
> another solution: point to old lower directory from moved upper directory.
> 
> This is achieved with a "trusted.overlay.redirect" xattr storing the path
> relative to the root of the overlay.  After such attribute has been set,
> the directory can be moved without further actions required.
> 
> This is a backward incompatible feature, old kernels won't be able to
> correctly mount an overlay containing redirected directories.

> +			err = vfs_path_lookup(lowerpath.dentry, lowerpath.mnt,
> +					      redirect, 0, &thispath);
> +
> +			if (err) {
> +				if (err == -ENOENT || err == -ENAMETOOLONG)
> +					this = NULL;
> +			} else {
> +				this = thispath.dentry;
> +				mntput(thispath.mnt);
> +				if (!this->d_inode) {
> +					dput(this);
> +					this = NULL;
> +				} else if (ovl_dentry_weird(this)) {
> +					dput(this);
> +					err = -EREMOTE;
> +				}
> +			}

I'm not happy with that one - you are relying upon the fairly subtle
assertions here.
	1)  Had lowerpath.mnt *not* been a privately cloned one with nothing
mounted on it, you would've been screwed.
	2) Had that thing contained a "jumper" symlink (a-la procfs ones),
you would've been screwed.  Currently only procfs has those, and it would've
been rejected before getting there, but this is brittle and non-obvious.
	3) Any automount point in there (nfs4 referrals, etc.) can
break the assumption that nothing could've been mounted on it.  And _that_
might have not been stepped onto; back when the path had been stored, there'd
been no automount point at all, so we have avoided ovl_dentry_weird() rejects,
and by now nothing on the path had been visited yet, so ovl_dentry_weird()
didn't have a chance to trigger.  Note that calling it on the last dentry
is no good - we might have crossed the automount point in the middle of that
path, so this last dentry might be nice and shiny - and on another filesystem.
So unlike (1) and (2) it's not just a fishy-looking thing that happens to
work for non-local reasons; AFAICS, it's actually a bug.

I'm not sure if vfs_path_lookup() is the right tool here.  It might be
usable for making such a tool, but as it is you are setting one hell of
a trap for yourself...

It might be made to work, if we figure out the right semantics for disabling
symlinks on per-vfsmount basis (and no, the posted nolinks patches are not
it) and mark these private clones with that and with similar "disable
automount traversals" flag (again, needs the right semantics; the area is
convoluted as it is).  But in that case I would strongly recommend adding
an exported wrapper around vfs_path_lookup() that would verify that these
flags *are* set.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-10-28 16:15   ` Al Viro
@ 2016-11-03 15:50     ` Miklos Szeredi
  2016-11-04  9:29       ` Amir Goldstein
  0 siblings, 1 reply; 44+ messages in thread
From: Miklos Szeredi @ 2016-11-03 15:50 UTC (permalink / raw)
  To: Al Viro
  Cc: Miklos Szeredi, linux-unionfs, Guillem Jover, Raphael Hertzog,
	linux-fsdevel, linux-kernel

On Fri, Oct 28, 2016 at 6:15 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Tue, Oct 25, 2016 at 09:34:47AM +0200, Miklos Szeredi wrote:
>> Current code returns EXDEV when a directory would need to be copied up to
>> move.  We could copy up the directory tree in this case, but there's
>> another solution: point to old lower directory from moved upper directory.
>>
>> This is achieved with a "trusted.overlay.redirect" xattr storing the path
>> relative to the root of the overlay.  After such attribute has been set,
>> the directory can be moved without further actions required.
>>
>> This is a backward incompatible feature, old kernels won't be able to
>> correctly mount an overlay containing redirected directories.
>
>> +                     err = vfs_path_lookup(lowerpath.dentry, lowerpath.mnt,
>> +                                           redirect, 0, &thispath);
>> +
>> +                     if (err) {
>> +                             if (err == -ENOENT || err == -ENAMETOOLONG)
>> +                                     this = NULL;
>> +                     } else {
>> +                             this = thispath.dentry;
>> +                             mntput(thispath.mnt);
>> +                             if (!this->d_inode) {
>> +                                     dput(this);
>> +                                     this = NULL;
>> +                             } else if (ovl_dentry_weird(this)) {
>> +                                     dput(this);
>> +                                     err = -EREMOTE;
>> +                             }
>> +                     }
>
> I'm not happy with that one - you are relying upon the fairly subtle
> assertions here.
>         1)  Had lowerpath.mnt *not* been a privately cloned one with nothing
> mounted on it, you would've been screwed.
>         2) Had that thing contained a "jumper" symlink (a-la procfs ones),
> you would've been screwed.  Currently only procfs has those, and it would've
> been rejected before getting there, but this is brittle and non-obvious.
>         3) Any automount point in there (nfs4 referrals, etc.) can
> break the assumption that nothing could've been mounted on it.  And _that_
> might have not been stepped onto; back when the path had been stored, there'd
> been no automount point at all, so we have avoided ovl_dentry_weird() rejects,
> and by now nothing on the path had been visited yet, so ovl_dentry_weird()
> didn't have a chance to trigger.  Note that calling it on the last dentry
> is no good - we might have crossed the automount point in the middle of that
> path, so this last dentry might be nice and shiny - and on another filesystem.
> So unlike (1) and (2) it's not just a fishy-looking thing that happens to
> work for non-local reasons; AFAICS, it's actually a bug.
>
> I'm not sure if vfs_path_lookup() is the right tool here.  It might be
> usable for making such a tool, but as it is you are setting one hell of
> a trap for yourself...

Agreed, it's not the right tool.   A custom loop of lookup_one_len's
should work much better and doesn't add all that much complexity.
Updated patch pushed to:

git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git #redirect

This version also passes the recycling tests by Amir and enables the
redirect feature by default on an empty upperdir.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-03 15:50     ` Miklos Szeredi
@ 2016-11-04  9:29       ` Amir Goldstein
  2016-11-04 13:48         ` Miklos Szeredi
  0 siblings, 1 reply; 44+ messages in thread
From: Amir Goldstein @ 2016-11-04  9:29 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Al Viro, Miklos Szeredi, linux-unionfs, Guillem Jover,
	Raphael Hertzog, linux-fsdevel, linux-kernel

On Thu, Nov 3, 2016 at 5:50 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Fri, Oct 28, 2016 at 6:15 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>> On Tue, Oct 25, 2016 at 09:34:47AM +0200, Miklos Szeredi wrote:
...
>>
>> I'm not sure if vfs_path_lookup() is the right tool here.  It might be
>> usable for making such a tool, but as it is you are setting one hell of
>> a trap for yourself...
>
> Agreed, it's not the right tool.   A custom loop of lookup_one_len's
> should work much better and doesn't add all that much complexity.
> Updated patch pushed to:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git #redirect
>
> This version also passes the recycling tests by Amir and enables the
> redirect feature by default on an empty upperdir.
>

Miklos,

You did not address my comment about the 'stack' allocation overflow
in ovl_lookup
I believe the (possible) overflow is demonstrated by the following debug patch:

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index c7cacbb..7171bfb 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -231,5 +231,7 @@ struct dentry *ovl_lookup(struct inode *dir,
struct dentry *dentry,
                                goto out_put;

                        if (redirect && poe != dentry->d_sb->s_root->d_fsdata) {
+                               int stackroom = poe->numlower - ctr;
+
                                poe = dentry->d_sb->s_root->d_fsdata;

@@ -238,6 +240,8 @@ struct dentry *ovl_lookup(struct inode *dir,
struct dentry *dentry,
                                                break;
                                if (WARN_ON(i == poe->numlower))
                                        break;
+                               if (WARN_ON(poe->numlower - i - 1 > stackroom))
+                                       break;
                        }
                }
        }
-- 
2.7.4

In cases where a directory is moved into another directory with merge history
shorter then total number of layers,
lookup will need to grow the 'stack' while redirecting.
Bug will be hit only after remount or dcache drop, which was the
reason I wrote the
recycle test in the first place...

I instrumented unionmount-tests with test name prints to kmsg (a la xfstests)
Pushed to https://github.com/amir73il/unionmount-testsuite.git #ovl_rename_dir

And as you can see, 5 subtests hit the overflow warning.

[ 1759.692281] TEST rename-new-dir.py:161: Rename empty dir over
removed empty lower dir
[ 1759.747217] WARNING: CPU: 0 PID: 9065 at
/home/amir/src/linux/fs/overlayfs/namei.c:271 ovl_lookup+0x81d/0x870
[overlay]

[ 1759.748887] TEST rename-new-dir.py:172: Rename empty dir over
removed populated lower dir
[ 1759.836195] WARNING: CPU: 2 PID: 9065 at
/home/amir/src/linux/fs/overlayfs/namei.c:271 ovl_lookup+0x81d/0x870
[overlay]

[ 1763.519285] TEST rename-new-pop-dir.py:170: Rename new dir over
removed unioned empty dir
[ 1763.592055] WARNING: CPU: 3 PID: 9065 at
/home/amir/src/linux/fs/overlayfs/namei.c:271 ovl_lookup+0x81d/0x870
[overlay]

[ 1763.592989] TEST rename-new-pop-dir.py:183: Rename new dir over
removed unioned dir, different files
[ 1763.658290] WARNING: CPU: 3 PID: 9065 at
/home/amir/src/linux/fs/overlayfs/namei.c:271 ovl_lookup+0x81d/0x870
[overlay]

[ 1763.660379] TEST rename-new-pop-dir.py:197: Rename new dir over
removed unioned dir, same files
[ 1763.731482] WARNING: CPU: 0 PID: 9065 at
/home/amir/src/linux/fs/overlayfs/namei.c:271 ovl_lookup+0x81d/0x870
[overlay]

I hope I am not missing anything.

Cheers,
Amir.

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-04  9:29       ` Amir Goldstein
@ 2016-11-04 13:48         ` Miklos Szeredi
  0 siblings, 0 replies; 44+ messages in thread
From: Miklos Szeredi @ 2016-11-04 13:48 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Al Viro, Miklos Szeredi, linux-unionfs, Guillem Jover,
	Raphael Hertzog, linux-fsdevel, linux-kernel

On Fri, Nov 4, 2016 at 10:29 AM, Amir Goldstein <amir73il@gmail.com> wrote:

> You did not address my comment about the 'stack' allocation overflow
> in ovl_lookup
> I believe the (possible) overflow is demonstrated by the following debug patch:

Oops, missed that.  Good spotting!

And there's more shit that unionfs-testsuite didn't discover (not even
involving multiple layers):

rm -rf /lower /upper /work
mkdir -p /lower/a/b/c /upper /work
mount -t overlay overlay -oupperdir=/upper,lowerdir=/lower,workdir=/work /mnt
mv /mnt/a /mnt/z
mv /mnt/z/b /mnt/q
ls /mnt/q
umount /mnt
mount -t overlay overlay -oupperdir=/upper,lowerdir=/lower,workdir=/work /mnt
ls /mnt/q
umount /mnt

Next update coming up...

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 1/3] ovl: check fs features
  2016-10-25 11:24   ` Amir Goldstein
@ 2016-11-05 20:40     ` Amir Goldstein
  0 siblings, 0 replies; 44+ messages in thread
From: Amir Goldstein @ 2016-11-05 20:40 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: linux-unionfs, Guillem Jover, Raphael Hertzog, linux-fsdevel,
	linux-kernel, stable

On Tue, Oct 25, 2016 at 2:24 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Tue, Oct 25, 2016 at 10:34 AM, Miklos Szeredi <mszeredi@redhat.com> wrote:
>> To allow adding new, backward incompatible features to overlayfs, we need a
>> way to store the list of features in the overlay.  This is done via
>> "trusted.overlay.features" xattr on the root of the upper layer (or one of
>> the lower layers, that previously acted as an upper layer).  It's a comma
>> separated list of case sensitive strings.
>>
>> If an overlay has an unknown feature, mount shall return an error.  So
>> mechanism should only be used for backward incompatible features.
>
> So maybe be explicit and call the attribute trusted.overlay.incompat_features,
> to allow future addition of compat and rocompat feature sets?
>

On top of the proposed features xattr,
for the sake of being backward compatible with old kernels,
how about creating a file 3 levels down from work dir (e.g. /work/work/a/b/c)
This would cause old kernels to mount overlay read-only,
which is sufficient to keep them from corrupting the redirect structure.

And once again, I suggest learning from the elders fs, who have
successfully gone through many on-disk format upgrades,
and copy the design of the feature trio (compat/incompat/rocompat)

Amir.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-10-26 11:12     ` Miklos Szeredi
  2016-10-28 12:56       ` Raphael Hertzog
@ 2016-11-06 19:14       ` Konstantin Khlebnikov
  2016-11-07  8:07         ` Miklos Szeredi
  2016-11-07 11:03         ` Raphael Hertzog
  1 sibling, 2 replies; 44+ messages in thread
From: Konstantin Khlebnikov @ 2016-11-06 19:14 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Raphael Hertzog, Miklos Szeredi, linux-unionfs, Guillem Jover,
	linux-fsdevel, Linux Kernel Mailing List

On Wed, Oct 26, 2016 at 2:12 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Tue, Oct 25, 2016 at 1:57 PM, Raphael Hertzog <hertzog@debian.org> wrote:
>
>> Do you plan to make it the default in the future when it has been
>> available for a while?
>>
>> Barring any regression introduced by your patch, it seems that the feature
>> is best available by default since it allows legitimate operations to
>> succeed that are otherwise refused. I understand that it makes it
>> impossible to mount the overlay filesystem with an older kernel but is
>> that problem more widespread than the one we're fixing here?  On my side,
>> overlayfs is only used in scenarios where the kernel is always the same
>> (or newer compared to what created the initial filesystem).
>
> I think it would be safe to make it the default if upperdir is empty.
> Nonempty implies that it was created with old kernel (or it was
> crafted by hand).   But there should be a way to explicitly turn it
> off;  either because of the need for backward compatibility or because
> the old format is simply easier to work with for humans.
>
> How about:
>
> - If upper is nonempty, then leave redirect feature alone except when
> mount option "-oredirect=on" is used to force enabling it.
> - If upper is empty, then enable redirect feature except when mount
> option "-oredirect=off" is used to force disabling it.

I don't like this empty-nonempty upper logic.

I think this feature should be off by default and be enabled
explicitly in mount option.
Available features could be listed in sysfs /sys/fs/overlay/..., like ext4 does.

Overlayfs mounting anyway is complicated operation.
User must know a lot about it and provide persistent state for each mount:
list layers in correct order, work and uppder directory on the same disk, etc.
Enabled features is a part of this state.

Probably this could be solved in userspace tool "mount.overlay" - it could load
features and layers from config file or xattr and set required mount
options automatically.

--
Konstantin

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-06 19:14       ` Konstantin Khlebnikov
@ 2016-11-07  8:07         ` Miklos Szeredi
  2016-11-07  9:58           ` Konstantin Khlebnikov
  2016-11-07 11:03         ` Raphael Hertzog
  1 sibling, 1 reply; 44+ messages in thread
From: Miklos Szeredi @ 2016-11-07  8:07 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Raphael Hertzog, Miklos Szeredi, linux-unionfs, Guillem Jover,
	linux-fsdevel, Linux Kernel Mailing List

On Sun, Nov 6, 2016 at 8:14 PM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
> On Wed, Oct 26, 2016 at 2:12 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Tue, Oct 25, 2016 at 1:57 PM, Raphael Hertzog <hertzog@debian.org> wrote:
>>
>>> Do you plan to make it the default in the future when it has been
>>> available for a while?
>>>
>>> Barring any regression introduced by your patch, it seems that the feature
>>> is best available by default since it allows legitimate operations to
>>> succeed that are otherwise refused. I understand that it makes it
>>> impossible to mount the overlay filesystem with an older kernel but is
>>> that problem more widespread than the one we're fixing here?  On my side,
>>> overlayfs is only used in scenarios where the kernel is always the same
>>> (or newer compared to what created the initial filesystem).
>>
>> I think it would be safe to make it the default if upperdir is empty.
>> Nonempty implies that it was created with old kernel (or it was
>> crafted by hand).   But there should be a way to explicitly turn it
>> off;  either because of the need for backward compatibility or because
>> the old format is simply easier to work with for humans.
>>
>> How about:
>>
>> - If upper is nonempty, then leave redirect feature alone except when
>> mount option "-oredirect=on" is used to force enabling it.
>> - If upper is empty, then enable redirect feature except when mount
>> option "-oredirect=off" is used to force disabling it.
>
> I don't like this empty-nonempty upper logic.
>
> I think this feature should be off by default and be enabled
> explicitly in mount option.
> Available features could be listed in sysfs /sys/fs/overlay/..., like ext4 does.
>
> Overlayfs mounting anyway is complicated operation.
> User must know a lot about it and provide persistent state for each mount:
> list layers in correct order, work and uppder directory on the same disk, etc.
> Enabled features is a part of this state.
>
> Probably this could be solved in userspace tool "mount.overlay" - it could load
> features and layers from config file or xattr and set required mount
> options automatically.

It seems there are some conflicting opinions here.  I have mine too,
but I'll let this simmer and concentrate on actual features for now.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-07  8:07         ` Miklos Szeredi
@ 2016-11-07  9:58           ` Konstantin Khlebnikov
  2016-11-07 10:04             ` Miklos Szeredi
  0 siblings, 1 reply; 44+ messages in thread
From: Konstantin Khlebnikov @ 2016-11-07  9:58 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Raphael Hertzog, Miklos Szeredi, linux-unionfs, Guillem Jover,
	linux-fsdevel, Linux Kernel Mailing List

On Mon, Nov 7, 2016 at 11:07 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Sun, Nov 6, 2016 at 8:14 PM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
>> On Wed, Oct 26, 2016 at 2:12 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>> On Tue, Oct 25, 2016 at 1:57 PM, Raphael Hertzog <hertzog@debian.org> wrote:
>>>
>>>> Do you plan to make it the default in the future when it has been
>>>> available for a while?
>>>>
>>>> Barring any regression introduced by your patch, it seems that the feature
>>>> is best available by default since it allows legitimate operations to
>>>> succeed that are otherwise refused. I understand that it makes it
>>>> impossible to mount the overlay filesystem with an older kernel but is
>>>> that problem more widespread than the one we're fixing here?  On my side,
>>>> overlayfs is only used in scenarios where the kernel is always the same
>>>> (or newer compared to what created the initial filesystem).
>>>
>>> I think it would be safe to make it the default if upperdir is empty.
>>> Nonempty implies that it was created with old kernel (or it was
>>> crafted by hand).   But there should be a way to explicitly turn it
>>> off;  either because of the need for backward compatibility or because
>>> the old format is simply easier to work with for humans.
>>>
>>> How about:
>>>
>>> - If upper is nonempty, then leave redirect feature alone except when
>>> mount option "-oredirect=on" is used to force enabling it.
>>> - If upper is empty, then enable redirect feature except when mount
>>> option "-oredirect=off" is used to force disabling it.
>>
>> I don't like this empty-nonempty upper logic.
>>
>> I think this feature should be off by default and be enabled
>> explicitly in mount option.
>> Available features could be listed in sysfs /sys/fs/overlay/..., like ext4 does.
>>
>> Overlayfs mounting anyway is complicated operation.
>> User must know a lot about it and provide persistent state for each mount:
>> list layers in correct order, work and uppder directory on the same disk, etc.
>> Enabled features is a part of this state.
>>
>> Probably this could be solved in userspace tool "mount.overlay" - it could load
>> features and layers from config file or xattr and set required mount
>> options automatically.
>
> It seems there are some conflicting opinions here.  I have mine too,
> but I'll let this simmer and concentrate on actual features for now.

Ok =) let's try keep as simple as possible.

I've stumbled on somehow related problem - concurrent copy-ups are
strictly serialized by rename locks.
Obviously, file copying could be done in parallel: locks are required
only for final rename.
Because of that overlay slower that aufs for some workloads.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-07  9:58           ` Konstantin Khlebnikov
@ 2016-11-07 10:04             ` Miklos Szeredi
  2016-11-07 10:08               ` Konstantin Khlebnikov
  0 siblings, 1 reply; 44+ messages in thread
From: Miklos Szeredi @ 2016-11-07 10:04 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Raphael Hertzog, Miklos Szeredi, linux-unionfs, Guillem Jover,
	linux-fsdevel, Linux Kernel Mailing List

On Mon, Nov 7, 2016 at 10:58 AM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:

> I've stumbled on somehow related problem - concurrent copy-ups are
> strictly serialized by rename locks.
> Obviously, file copying could be done in parallel: locks are required
> only for final rename.
> Because of that overlay slower that aufs for some workloads.

Easy to fix: for each copy up create a separate subdir of "work".
Then the contention is only for the time of creating the subdir, which
is very short.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-07 10:04             ` Miklos Szeredi
@ 2016-11-07 10:08               ` Konstantin Khlebnikov
  2016-11-07 13:38                 ` Amir Goldstein
  0 siblings, 1 reply; 44+ messages in thread
From: Konstantin Khlebnikov @ 2016-11-07 10:08 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Raphael Hertzog, Miklos Szeredi, linux-unionfs, Guillem Jover,
	linux-fsdevel, Linux Kernel Mailing List

On Mon, Nov 7, 2016 at 1:04 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Mon, Nov 7, 2016 at 10:58 AM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
>
>> I've stumbled on somehow related problem - concurrent copy-ups are
>> strictly serialized by rename locks.
>> Obviously, file copying could be done in parallel: locks are required
>> only for final rename.
>> Because of that overlay slower that aufs for some workloads.
>
> Easy to fix: for each copy up create a separate subdir of "work".
> Then the contention is only for the time of creating the subdir, which
> is very short.

Yeah, but lock_rename() also takes per-sb s_vfs_rename_mutex (kludge by Al Viro)
I think proper synchronization for concurrent copy-up (for example
round flag on ovl_entry) and  locking rename only for rename could be
better.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-06 19:14       ` Konstantin Khlebnikov
  2016-11-07  8:07         ` Miklos Szeredi
@ 2016-11-07 11:03         ` Raphael Hertzog
  2016-11-07 11:31           ` Konstantin Khlebnikov
  1 sibling, 1 reply; 44+ messages in thread
From: Raphael Hertzog @ 2016-11-07 11:03 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Miklos Szeredi, Miklos Szeredi, linux-unionfs, Guillem Jover,
	linux-fsdevel, Linux Kernel Mailing List

Hello,

On Sun, 06 Nov 2016, Konstantin Khlebnikov wrote:
> > - If upper is nonempty, then leave redirect feature alone except when
> > mount option "-oredirect=on" is used to force enabling it.
> > - If upper is empty, then enable redirect feature except when mount
> > option "-oredirect=off" is used to force disabling it.
> 
> I don't like this empty-nonempty upper logic.

Why? (I don't have the feeling that your subsequent paragraphs answer this
question... unless "overlayfs mounting is hard, let's complicate it even
more" is your answer)

> I think this feature should be off by default and be enabled
> explicitly in mount option.
> Available features could be listed in sysfs /sys/fs/overlay/..., like ext4 does.

TTBOMK in ext4, they are set at mkfs time and the default feature set
comes from /etc/mke2fs.conf. There's nothing like that for overlayfs.

> Overlayfs mounting anyway is complicated operation.
> User must know a lot about it and provide persistent state for each mount:
> list layers in correct order, work and uppder directory on the same disk, etc.
> Enabled features is a part of this state.

A large part of the users are not direct overlayfs users, they use
application (like schroot and live-build in my case) that rely on
overlayfs to offer some user-modifiable throw-away chroots or some
persistency on top of a read-only image. In both cases, the upper
directory start empty.

I find it highly disturbing to have to modify all those applications just
to get the correct semantics to rename a directory.

> Probably this could be solved in userspace tool "mount.overlay" - it could load
> features and layers from config file or xattr and set required mount
> options automatically.

I'm all for having a better API to mount overlayfs, but I don't think
blocking on the "redirect=on" by default is a good way to get this.

Cheers,
-- 
Raphaël Hertzog ◈ Debian Developer

Support Debian LTS: http://www.freexian.com/services/debian-lts.html
Learn to master Debian: http://debian-handbook.info/get/

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-07 11:03         ` Raphael Hertzog
@ 2016-11-07 11:31           ` Konstantin Khlebnikov
  2016-11-07 13:42             ` Raphael Hertzog
  0 siblings, 1 reply; 44+ messages in thread
From: Konstantin Khlebnikov @ 2016-11-07 11:31 UTC (permalink / raw)
  To: Raphael Hertzog
  Cc: Miklos Szeredi, Miklos Szeredi, linux-unionfs, Guillem Jover,
	linux-fsdevel, Linux Kernel Mailing List

On Mon, Nov 7, 2016 at 2:03 PM, Raphael Hertzog <hertzog@debian.org> wrote:
> Hello,
>
> On Sun, 06 Nov 2016, Konstantin Khlebnikov wrote:
>> > - If upper is nonempty, then leave redirect feature alone except when
>> > mount option "-oredirect=on" is used to force enabling it.
>> > - If upper is empty, then enable redirect feature except when mount
>> > option "-oredirect=off" is used to force disabling it.
>>
>> I don't like this empty-nonempty upper logic.
>
> Why? (I don't have the feeling that your subsequent paragraphs answer this
> question... unless "overlayfs mounting is hard, let's complicate it even
> more" is your answer)

Mixing flags from mount options, xattrs and emptiness of upper layer
doesn't make it simpler.

We have clear statement that options in /proc/mounts describes overlay
instance, let's keep feeding state in this way.

>
>> I think this feature should be off by default and be enabled
>> explicitly in mount option.
>> Available features could be listed in sysfs /sys/fs/overlay/..., like ext4 does.
>
> TTBOMK in ext4, they are set at mkfs time and the default feature set
> comes from /etc/mke2fs.conf. There's nothing like that for overlayfs.

/etc/mke2fs.conf is used by mkfs.ext4 - state is saved in super-block
inside filesystem.

overlayfs have no persistent super-block.

>
>> Overlayfs mounting anyway is complicated operation.
>> User must know a lot about it and provide persistent state for each mount:
>> list layers in correct order, work and uppder directory on the same disk, etc.
>> Enabled features is a part of this state.
>
> A large part of the users are not direct overlayfs users, they use
> application (like schroot and live-build in my case) that rely on
> overlayfs to offer some user-modifiable throw-away chroots or some
> persistency on top of a read-only image. In both cases, the upper
> directory start empty.
>
> I find it highly disturbing to have to modify all those applications just
> to get the correct semantics to rename a directory.

Other application are already aware about overlay layout and use it.
We cannot enable by default new backward incompatible features.

Returning -EXDEV is a completely correct semantics for rename,
most applications employ broken assumptions about this syscall =)

>
>> Probably this could be solved in userspace tool "mount.overlay" - it could load
>> features and layers from config file or xattr and set required mount
>> options automatically.
>
> I'm all for having a better API to mount overlayfs, but I don't think
> blocking on the "redirect=on" by default is a good way to get this.
>
> Cheers,
> --
> Raphaël Hertzog ◈ Debian Developer
>
> Support Debian LTS: http://www.freexian.com/services/debian-lts.html
> Learn to master Debian: http://debian-handbook.info/get/

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-07 10:08               ` Konstantin Khlebnikov
@ 2016-11-07 13:38                 ` Amir Goldstein
  2016-11-10 22:56                   ` Amir Goldstein
  0 siblings, 1 reply; 44+ messages in thread
From: Amir Goldstein @ 2016-11-07 13:38 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Miklos Szeredi, Raphael Hertzog, Miklos Szeredi, linux-unionfs,
	Guillem Jover, linux-fsdevel, Linux Kernel Mailing List

On Mon, Nov 7, 2016 at 12:08 PM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
> On Mon, Nov 7, 2016 at 1:04 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Mon, Nov 7, 2016 at 10:58 AM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
>>
>>> I've stumbled on somehow related problem - concurrent copy-ups are
>>> strictly serialized by rename locks.
>>> Obviously, file copying could be done in parallel: locks are required
>>> only for final rename.
>>> Because of that overlay slower that aufs for some workloads.
>>
>> Easy to fix: for each copy up create a separate subdir of "work".
>> Then the contention is only for the time of creating the subdir, which
>> is very short.
>
> Yeah, but lock_rename() also takes per-sb s_vfs_rename_mutex (kludge by Al Viro)
> I think proper synchronization for concurrent copy-up (for example
> round flag on ovl_entry) and  locking rename only for rename could be
> better.

Removing s_vfs_rename_mutex from copy-up path is something I have been
pondering about.
Assuming that I understand Al's comment above vfs_rename() correctly,
the sole purpose of per-sb serialization is to prevent loop creations.
However, how can one create a loop by moving a non-directory?
So it looks like at least for the non-dir copy up case, a much finer grained
lock is in order.

Anyway, it's on my todo list, as concurrent operation performance on overlayfs
is important to out use case.

Amir.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-07 11:31           ` Konstantin Khlebnikov
@ 2016-11-07 13:42             ` Raphael Hertzog
  2016-11-10 22:39               ` Miklos Szeredi
  0 siblings, 1 reply; 44+ messages in thread
From: Raphael Hertzog @ 2016-11-07 13:42 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Miklos Szeredi, Miklos Szeredi, linux-unionfs, Guillem Jover,
	linux-fsdevel, Linux Kernel Mailing List

On Mon, 07 Nov 2016, Konstantin Khlebnikov wrote:
> > Why? (I don't have the feeling that your subsequent paragraphs answer this
> > question... unless "overlayfs mounting is hard, let's complicate it even
> > more" is your answer)
> 
> Mixing flags from mount options, xattrs and emptiness of upper layer
> doesn't make it simpler.

It depends for whom. It does make it simpler for applications that just
want overlayfs to work like other normal filesystems.

> We have clear statement that options in /proc/mounts describes overlay
> instance, let's keep feeding state in this way.

Didn't you say above that xattrs provide flags too?

> > I find it highly disturbing to have to modify all those applications just
> > to get the correct semantics to rename a directory.
> 
> Other application are already aware about overlay layout and use it.
> We cannot enable by default new backward incompatible features.

On the opposite, if we have to modify those applications to add a new
mount option, then they will no longer work with older versions of
overlayfs... so you move the complexity down to applications if they want
to work with multiple kernel versions.

There's no technical problem to enable a new backward incompatible
feature. It's just that you don't want to do it in case the user
wants to mount it again with an older kernel. So this is just policy.

So what about a new mount option that defines a compatibility level?

0: initial feature set
1: with renamedir flag

It would default to 1 but the user can set it to "0" to keep compatibility
with older versions of overlayfs. In the future, as more backward
incompatible changes are added, you add new levels and define the values
of the various flags based on this setting.

> Returning -EXDEV is a completely correct semantics for rename,
> most applications employ broken assumptions about this syscall =)

Maybe (I don't know what standards say), but then what matters is
real-life. And in real-life, it's somewhat unexpected to get back -EXDEV
when the rename() happens in the same directory (and has therefore no
chance to cross any mount boundary).

Cheers,
-- 
Raphaël Hertzog ◈ Debian Developer

Support Debian LTS: http://www.freexian.com/services/debian-lts.html
Learn to master Debian: http://debian-handbook.info/get/

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-07 13:42             ` Raphael Hertzog
@ 2016-11-10 22:39               ` Miklos Szeredi
  2016-11-11  9:41                 ` Konstantin Khlebnikov
  2016-11-13 10:00                 ` Amir Goldstein
  0 siblings, 2 replies; 44+ messages in thread
From: Miklos Szeredi @ 2016-11-10 22:39 UTC (permalink / raw)
  To: Raphael Hertzog
  Cc: Konstantin Khlebnikov, Miklos Szeredi, linux-unionfs,
	Guillem Jover, linux-fsdevel, Linux Kernel Mailing List

New version is at:

git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git #redirect

News:
- it actually should work in all cases
- when rename is not cross directory, just store the new name instead
of a full path, as suggested by Amir
- when redirect path is too long fall back to EXDEV (the max length
should probably be a module param)

About turning the feature on or off.  Yes, maybe the empty checking is
too complicated.  Going one simpler:

 - default to old behavior, turn on with mount option
 - add module option and kernel compile option to turn on the feature by default

I guess distros wil simply enable this by default, since back
compatibility is basically a non-issue.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-07 13:38                 ` Amir Goldstein
@ 2016-11-10 22:56                   ` Amir Goldstein
  2016-11-11  9:46                     ` Konstantin Khlebnikov
  0 siblings, 1 reply; 44+ messages in thread
From: Amir Goldstein @ 2016-11-10 22:56 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Miklos Szeredi, Raphael Hertzog, Miklos Szeredi, linux-unionfs,
	Guillem Jover, linux-fsdevel, Linux Kernel Mailing List

On Mon, Nov 7, 2016 at 3:38 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Mon, Nov 7, 2016 at 12:08 PM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
>> On Mon, Nov 7, 2016 at 1:04 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>> On Mon, Nov 7, 2016 at 10:58 AM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
>>>
>>>> I've stumbled on somehow related problem - concurrent copy-ups are
>>>> strictly serialized by rename locks.
>>>> Obviously, file copying could be done in parallel: locks are required
>>>> only for final rename.
>>>> Because of that overlay slower that aufs for some workloads.
>>>
>>> Easy to fix: for each copy up create a separate subdir of "work".
>>> Then the contention is only for the time of creating the subdir, which
>>> is very short.
>>
>> Yeah, but lock_rename() also takes per-sb s_vfs_rename_mutex (kludge by Al Viro)
>> I think proper synchronization for concurrent copy-up (for example
>> round flag on ovl_entry) and  locking rename only for rename could be
>> better.
>
> Removing s_vfs_rename_mutex from copy-up path is something I have been
> pondering about.
> Assuming that I understand Al's comment above vfs_rename() correctly,
> the sole purpose of per-sb serialization is to prevent loop creations.
> However, how can one create a loop by moving a non-directory?
> So it looks like at least for the non-dir copy up case, a much finer grained
> lock is in order.
>


I posted patches to relax the s_vfs_rename_mutex for copy-up and
whiteout in some use cases.

Konstantin,

It would be useful to know if those patches help with your use case.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-10 22:39               ` Miklos Szeredi
@ 2016-11-11  9:41                 ` Konstantin Khlebnikov
  2016-11-13 10:00                 ` Amir Goldstein
  1 sibling, 0 replies; 44+ messages in thread
From: Konstantin Khlebnikov @ 2016-11-11  9:41 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Raphael Hertzog, Miklos Szeredi, linux-unionfs, Guillem Jover,
	linux-fsdevel, Linux Kernel Mailing List

On Fri, Nov 11, 2016 at 1:39 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> New version is at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git #redirect
>
> News:
> - it actually should work in all cases
> - when rename is not cross directory, just store the new name instead
> of a full path, as suggested by Amir
> - when redirect path is too long fall back to EXDEV (the max length
> should probably be a module param)
>
> About turning the feature on or off.  Yes, maybe the empty checking is
> too complicated.  Going one simpler:
>
>  - default to old behavior, turn on with mount option
>  - add module option and kernel compile option to turn on the feature by default
>
> I guess distros wil simply enable this by default, since back
> compatibility is basically a non-issue.

Looks good.

I suppose module parameter exposed in /sys/module/overlay/parameters/ could be
documented as recommended way for detecting presence of that feature.

--
Konstantin

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-10 22:56                   ` Amir Goldstein
@ 2016-11-11  9:46                     ` Konstantin Khlebnikov
  2016-11-11 10:06                       ` Miklos Szeredi
  0 siblings, 1 reply; 44+ messages in thread
From: Konstantin Khlebnikov @ 2016-11-11  9:46 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Miklos Szeredi, Raphael Hertzog, Miklos Szeredi, linux-unionfs,
	Guillem Jover, linux-fsdevel, Linux Kernel Mailing List

On Fri, Nov 11, 2016 at 1:56 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Mon, Nov 7, 2016 at 3:38 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>> On Mon, Nov 7, 2016 at 12:08 PM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
>>> On Mon, Nov 7, 2016 at 1:04 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>> On Mon, Nov 7, 2016 at 10:58 AM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
>>>>
>>>>> I've stumbled on somehow related problem - concurrent copy-ups are
>>>>> strictly serialized by rename locks.
>>>>> Obviously, file copying could be done in parallel: locks are required
>>>>> only for final rename.
>>>>> Because of that overlay slower that aufs for some workloads.
>>>>
>>>> Easy to fix: for each copy up create a separate subdir of "work".
>>>> Then the contention is only for the time of creating the subdir, which
>>>> is very short.
>>>
>>> Yeah, but lock_rename() also takes per-sb s_vfs_rename_mutex (kludge by Al Viro)
>>> I think proper synchronization for concurrent copy-up (for example
>>> round flag on ovl_entry) and  locking rename only for rename could be
>>> better.
>>
>> Removing s_vfs_rename_mutex from copy-up path is something I have been
>> pondering about.
>> Assuming that I understand Al's comment above vfs_rename() correctly,
>> the sole purpose of per-sb serialization is to prevent loop creations.
>> However, how can one create a loop by moving a non-directory?
>> So it looks like at least for the non-dir copy up case, a much finer grained
>> lock is in order.
>>
>
>
> I posted patches to relax the s_vfs_rename_mutex for copy-up and
> whiteout in some use cases.
>
> Konstantin,
>
> It would be useful to know if those patches help with your use case.
>

Well.. I think relaxing only s_vfs_rename_mutex wouldn't help much here.
Copying is still serialized by i_mutex on workdir?
Data copying should be done without rename locks at all.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-11  9:46                     ` Konstantin Khlebnikov
@ 2016-11-11 10:06                       ` Miklos Szeredi
  2016-11-11 12:42                         ` Amir Goldstein
  0 siblings, 1 reply; 44+ messages in thread
From: Miklos Szeredi @ 2016-11-11 10:06 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Amir Goldstein, Raphael Hertzog, Miklos Szeredi, linux-unionfs,
	Guillem Jover, linux-fsdevel, Linux Kernel Mailing List

On Fri, Nov 11, 2016 at 10:46 AM, Konstantin Khlebnikov
<koct9i@gmail.com> wrote:
> On Fri, Nov 11, 2016 at 1:56 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>> On Mon, Nov 7, 2016 at 3:38 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>>> On Mon, Nov 7, 2016 at 12:08 PM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
>>>> On Mon, Nov 7, 2016 at 1:04 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>>> On Mon, Nov 7, 2016 at 10:58 AM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
>>>>>
>>>>>> I've stumbled on somehow related problem - concurrent copy-ups are
>>>>>> strictly serialized by rename locks.
>>>>>> Obviously, file copying could be done in parallel: locks are required
>>>>>> only for final rename.
>>>>>> Because of that overlay slower that aufs for some workloads.
>>>>>
>>>>> Easy to fix: for each copy up create a separate subdir of "work".
>>>>> Then the contention is only for the time of creating the subdir, which
>>>>> is very short.
>>>>
>>>> Yeah, but lock_rename() also takes per-sb s_vfs_rename_mutex (kludge by Al Viro)
>>>> I think proper synchronization for concurrent copy-up (for example
>>>> round flag on ovl_entry) and  locking rename only for rename could be
>>>> better.
>>>
>>> Removing s_vfs_rename_mutex from copy-up path is something I have been
>>> pondering about.
>>> Assuming that I understand Al's comment above vfs_rename() correctly,
>>> the sole purpose of per-sb serialization is to prevent loop creations.
>>> However, how can one create a loop by moving a non-directory?
>>> So it looks like at least for the non-dir copy up case, a much finer grained
>>> lock is in order.
>>>
>>
>>
>> I posted patches to relax the s_vfs_rename_mutex for copy-up and
>> whiteout in some use cases.
>>
>> Konstantin,
>>
>> It would be useful to know if those patches help with your use case.
>>
>
> Well.. I think relaxing only s_vfs_rename_mutex wouldn't help much here.
> Copying is still serialized by i_mutex on workdir?
> Data copying should be done without rename locks at all.

We do need something to prevent multiple copy-ups starting up in
parallel on the same file, though.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-11 10:06                       ` Miklos Szeredi
@ 2016-11-11 12:42                         ` Amir Goldstein
  2016-11-13  9:11                           ` Amir Goldstein
  0 siblings, 1 reply; 44+ messages in thread
From: Amir Goldstein @ 2016-11-11 12:42 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Konstantin Khlebnikov, Raphael Hertzog, Miklos Szeredi,
	linux-unionfs, Guillem Jover, linux-fsdevel,
	Linux Kernel Mailing List

On Fri, Nov 11, 2016 at 12:06 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Fri, Nov 11, 2016 at 10:46 AM, Konstantin Khlebnikov
> <koct9i@gmail.com> wrote:
>> On Fri, Nov 11, 2016 at 1:56 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>>> On Mon, Nov 7, 2016 at 3:38 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>>>> On Mon, Nov 7, 2016 at 12:08 PM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
>>>>> On Mon, Nov 7, 2016 at 1:04 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>>>> On Mon, Nov 7, 2016 at 10:58 AM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
>>>>>>
>>>>>>> I've stumbled on somehow related problem - concurrent copy-ups are
>>>>>>> strictly serialized by rename locks.
>>>>>>> Obviously, file copying could be done in parallel: locks are required
>>>>>>> only for final rename.
>>>>>>> Because of that overlay slower that aufs for some workloads.
>>>>>>
>>>>>> Easy to fix: for each copy up create a separate subdir of "work".
>>>>>> Then the contention is only for the time of creating the subdir, which
>>>>>> is very short.
>>>>>
>>>>> Yeah, but lock_rename() also takes per-sb s_vfs_rename_mutex (kludge by Al Viro)
>>>>> I think proper synchronization for concurrent copy-up (for example
>>>>> round flag on ovl_entry) and  locking rename only for rename could be
>>>>> better.
>>>>
>>>> Removing s_vfs_rename_mutex from copy-up path is something I have been
>>>> pondering about.
>>>> Assuming that I understand Al's comment above vfs_rename() correctly,
>>>> the sole purpose of per-sb serialization is to prevent loop creations.
>>>> However, how can one create a loop by moving a non-directory?
>>>> So it looks like at least for the non-dir copy up case, a much finer grained
>>>> lock is in order.
>>>>
>>>
>>>
>>> I posted patches to relax the s_vfs_rename_mutex for copy-up and
>>> whiteout in some use cases.
>>>
>>> Konstantin,
>>>
>>> It would be useful to know if those patches help with your use case.
>>>
>>
>> Well.. I think relaxing only s_vfs_rename_mutex wouldn't help much here.
>> Copying is still serialized by i_mutex on workdir?
>> Data copying should be done without rename locks at all.
>
> We do need something to prevent multiple copy-ups starting up in
> parallel on the same file, though.
>

I guess an inode_lock on the copy-up victim should suffice?
I will look into it as soon as I am done with profiling.
So far I ran only 2 rm -rf threads on 2 different overlay mounts
on the same underlying fs
and s_vfs_rename_mutex was contended about ~4% of the time.

In this test, copy-up is not dominant - only ~2% for the directory
copy-ups, but vfs_whiteouts take 20% and the vfs_rename itself 10%,
both with s_vfs_rename_mutex held.

Amir.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-11 12:42                         ` Amir Goldstein
@ 2016-11-13  9:11                           ` Amir Goldstein
  0 siblings, 0 replies; 44+ messages in thread
From: Amir Goldstein @ 2016-11-13  9:11 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Konstantin Khlebnikov, Raphael Hertzog, Miklos Szeredi,
	linux-unionfs, Guillem Jover, linux-fsdevel,
	Linux Kernel Mailing List

On Fri, Nov 11, 2016 at 2:42 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Fri, Nov 11, 2016 at 12:06 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Fri, Nov 11, 2016 at 10:46 AM, Konstantin Khlebnikov
>> <koct9i@gmail.com> wrote:
>>> On Fri, Nov 11, 2016 at 1:56 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>>>> On Mon, Nov 7, 2016 at 3:38 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>>>>> On Mon, Nov 7, 2016 at 12:08 PM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
>>>>>> On Mon, Nov 7, 2016 at 1:04 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>>>>> On Mon, Nov 7, 2016 at 10:58 AM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
>>>>>>>
>>>>>>>> I've stumbled on somehow related problem - concurrent copy-ups are
>>>>>>>> strictly serialized by rename locks.
>>>>>>>> Obviously, file copying could be done in parallel: locks are required
>>>>>>>> only for final rename.
>>>>>>>> Because of that overlay slower that aufs for some workloads.
>>>>>>>
>>>>>>> Easy to fix: for each copy up create a separate subdir of "work".
>>>>>>> Then the contention is only for the time of creating the subdir, which
>>>>>>> is very short.
>>>>>>
>>>>>> Yeah, but lock_rename() also takes per-sb s_vfs_rename_mutex (kludge by Al Viro)
>>>>>> I think proper synchronization for concurrent copy-up (for example
>>>>>> round flag on ovl_entry) and  locking rename only for rename could be
>>>>>> better.
>>>>>
>>>>> Removing s_vfs_rename_mutex from copy-up path is something I have been
>>>>> pondering about.
>>>>> Assuming that I understand Al's comment above vfs_rename() correctly,
>>>>> the sole purpose of per-sb serialization is to prevent loop creations.
>>>>> However, how can one create a loop by moving a non-directory?
>>>>> So it looks like at least for the non-dir copy up case, a much finer grained
>>>>> lock is in order.
>>>>>
>>>>
>>>>
>>>> I posted patches to relax the s_vfs_rename_mutex for copy-up and
>>>> whiteout in some use cases.
>>>>
>>>> Konstantin,
>>>>
>>>> It would be useful to know if those patches help with your use case.
>>>>
>>>
>>> Well.. I think relaxing only s_vfs_rename_mutex wouldn't help much here.
>>> Copying is still serialized by i_mutex on workdir?
>>> Data copying should be done without rename locks at all.
>>
>> We do need something to prevent multiple copy-ups starting up in
>> parallel on the same file, though.
>>
>
> I guess an inode_lock on the copy-up victim should suffice?

Just to follow up on this hijacked thread.
I posted patches to lock the overlay inode of copied up file
and relaxed the lock_rename during data copy up.

Amir.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-10 22:39               ` Miklos Szeredi
  2016-11-11  9:41                 ` Konstantin Khlebnikov
@ 2016-11-13 10:00                 ` Amir Goldstein
  2016-11-14 16:25                   ` Amir Goldstein
  1 sibling, 1 reply; 44+ messages in thread
From: Amir Goldstein @ 2016-11-13 10:00 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Raphael Hertzog, Konstantin Khlebnikov, Miklos Szeredi,
	linux-unionfs, Guillem Jover, linux-fsdevel,
	Linux Kernel Mailing List

On Fri, Nov 11, 2016 at 12:39 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> New version is at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git #redirect
>
> News:
> - it actually should work in all cases
> - when rename is not cross directory, just store the new name instead
> of a full path, as suggested by Amir
> - when redirect path is too long fall back to EXDEV (the max length
> should probably be a module param)
>

Looks goods, except for the case of change from relative to absolute
redirect of the victim dentry. IIUC, ovl_set_redirect() will return immediately
because ovl_dentry_is_redirect() and will not get to setting the absolute
redirect.

It passed my sanity tests (including recycle test) and on top of my copy up
lock changes.

You can add Reviewed-by/Tested-by me to
  1380846 ovl: redirect on rename-dir

> About turning the feature on or off.  Yes, maybe the empty checking is
> too complicated.  Going one simpler:
>
>  - default to old behavior, turn on with mount option
>  - add module option and kernel compile option to turn on the feature by default
>

I don't see these changes on your branch.
Are these your plans for a future version?

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-13 10:00                 ` Amir Goldstein
@ 2016-11-14 16:25                   ` Amir Goldstein
  2016-11-16 22:00                     ` Miklos Szeredi
  0 siblings, 1 reply; 44+ messages in thread
From: Amir Goldstein @ 2016-11-14 16:25 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Raphael Hertzog, Konstantin Khlebnikov, Miklos Szeredi,
	linux-unionfs, Guillem Jover, linux-fsdevel,
	Linux Kernel Mailing List

On Sun, Nov 13, 2016 at 12:00 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Fri, Nov 11, 2016 at 12:39 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> New version is at:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git #redirect
>>
>> News:
>> - it actually should work in all cases
>> - when rename is not cross directory, just store the new name instead
>> of a full path, as suggested by Amir
>> - when redirect path is too long fall back to EXDEV (the max length
>> should probably be a module param)
>>
>
> Looks goods, except for the case of change from relative to absolute
> redirect of the victim dentry. IIUC, ovl_set_redirect() will return immediately
> because ovl_dentry_is_redirect() and will not get to setting the absolute
> redirect.
>

I added some more tests to catch this problem at:
https://github.com/amir73il/unionmount-testsuite.git #ovl_rename_dir

In the new version, upper recycling is optional and you can set a bound
to the number of layers. This is needed to catch the bug because the
scenario is:
- Rename populated dir in same dir
- Move the populated dir to another dir
- Re-mount
- Populated dir is empty

The following test run demonstrates it:

$ sudo ./run --ov=0 rename-move-dir
...
TEST rename-move-dir.py:37: Rename populated dir and move into another
 ./run --rename /mnt/a/dir102 /mnt/a/dir102x
 ./run --rename /mnt/a/dir102 /mnt/a/dir102x -E ENOENT
 ./run --rename /mnt/a/dir102x /mnt/a/empty102/dir102
 ./run --rename /mnt/a/dir102x /mnt/a/empty102/dir102 -E ENOENT
 ./run --open-file /mnt/a/dir102 -r -d -E ENOENT
 ./run --open-file /mnt/a/dir102x -r -d -E ENOENT
 ./run --open-file /mnt/a/empty102/dir102 -r -d
/mnt/a/empty102/dir102/a: File is missing


Amir.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-14 16:25                   ` Amir Goldstein
@ 2016-11-16 22:00                     ` Miklos Szeredi
  2016-11-18 15:37                       ` Amir Goldstein
  0 siblings, 1 reply; 44+ messages in thread
From: Miklos Szeredi @ 2016-11-16 22:00 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Raphael Hertzog, Konstantin Khlebnikov, Miklos Szeredi,
	linux-unionfs, Guillem Jover, linux-fsdevel,
	Linux Kernel Mailing List

On Mon, Nov 14, 2016 at 5:25 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Sun, Nov 13, 2016 at 12:00 PM, Amir Goldstein <amir73il@gmail.com> wrote:

>> Looks goods, except for the case of change from relative to absolute
>> redirect of the victim dentry. IIUC, ovl_set_redirect() will return immediately
>> because ovl_dentry_is_redirect() and will not get to setting the absolute
>> redirect.
>>
>
> I added some more tests to catch this problem at:
> https://github.com/amir73il/unionmount-testsuite.git #ovl_rename_dir

Thanks for testing.

Force pushed updated version to the usual place:

   git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git #redirect

This also has the xattr feature thing replaced with mount option,
module param and kernel config option.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-16 22:00                     ` Miklos Szeredi
@ 2016-11-18 15:37                       ` Amir Goldstein
  2016-11-20 11:39                         ` Amir Goldstein
  2016-11-21  9:54                         ` Miklos Szeredi
  0 siblings, 2 replies; 44+ messages in thread
From: Amir Goldstein @ 2016-11-18 15:37 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Raphael Hertzog, Konstantin Khlebnikov, Miklos Szeredi,
	linux-unionfs, Guillem Jover, linux-fsdevel,
	Linux Kernel Mailing List

On Thu, Nov 17, 2016 at 12:00 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Mon, Nov 14, 2016 at 5:25 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> > On Sun, Nov 13, 2016 at 12:00 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>
> >> Looks goods, except for the case of change from relative to absolute
> >> redirect of the victim dentry. IIUC, ovl_set_redirect() will return immediately
> >> because ovl_dentry_is_redirect() and will not get to setting the absolute
> >> redirect.
> >>
> >
> > I added some more tests to catch this problem at:
> > https://github.com/amir73il/unionmount-testsuite.git #ovl_rename_dir
>
> Thanks for testing.
>
> Force pushed updated version to the usual place:
>
>    git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git #redirect
>

Found one typo and one bug in error that can cause crash on dput(ERR_PTR(err)):

diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
index 21ddac7..0daac51 100644
--- a/fs/overlayfs/Kconfig
+++ b/fs/overlayfs/Kconfig
@@ -15,7 +15,7 @@ config OVERLAY_FS_REDIRECT_DIR
        help
          If this config option is enabled then overlay filesystems will use
          redirects when renaming directories by default.  In this case it is
-         still possible possible to turn off redirects globally with the
+         still possible to turn off redirects globally with the
          "redirect_dir=off" module option or on a filesystem instance basis
          with the "redirect_dir=off" mount option.

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 5eaa9f9..a19fc5c 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -105,11 +105,12 @@ static int ovl_lookup_single(struct dentry
*base, struct ovl_lookup_data *d,

        this = lookup_one_len_unlocked(name, base, namelen);
        if (IS_ERR(this)) {
-               if (PTR_ERR(this) == -ENOENT ||
-                   PTR_ERR(this) == -ENAMETOOLONG) {
+               err = PTR_ERR(this);
+               if (err == -ENOENT || err == -ENAMETOOLONG) {
                        this = NULL;
+                       goto out;
                }
-               goto out;
+               return err;
        }
        if (!this->d_inode)
                goto put_and_out;

> This also has the xattr feature thing replaced with mount option,
> module param and kernel config option.
>

I like the kernel config/module param/mount option for
enabling/disabling the feature.

But I still think that we should write the features xattr on the first
redirect rename.
The features xattr tell us what can be found on the layer, so we would
be wise to
keep it around for all sorts of backward compatibility aspect.

Amir.

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-18 15:37                       ` Amir Goldstein
@ 2016-11-20 11:39                         ` Amir Goldstein
  2016-11-21  9:54                         ` Miklos Szeredi
  1 sibling, 0 replies; 44+ messages in thread
From: Amir Goldstein @ 2016-11-20 11:39 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Raphael Hertzog, Konstantin Khlebnikov, Miklos Szeredi,
	linux-unionfs, Guillem Jover, linux-fsdevel,
	Linux Kernel Mailing List

On Fri, Nov 18, 2016 at 5:37 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Thu, Nov 17, 2016 at 12:00 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>
>> On Mon, Nov 14, 2016 at 5:25 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>> > On Sun, Nov 13, 2016 at 12:00 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>>
>> >> Looks goods, except for the case of change from relative to absolute
>> >> redirect of the victim dentry. IIUC, ovl_set_redirect() will return immediately
>> >> because ovl_dentry_is_redirect() and will not get to setting the absolute
>> >> redirect.
>> >>
>> >
>> > I added some more tests to catch this problem at:
>> > https://github.com/amir73il/unionmount-testsuite.git #ovl_rename_dir
>>
>> Thanks for testing.
>>
>> Force pushed updated version to the usual place:
>>
>>    git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git #redirect
>>
>
> Found one typo and one bug in error that can cause crash on dput(ERR_PTR(err)):
>
> diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
> index 21ddac7..0daac51 100644
> --- a/fs/overlayfs/Kconfig
> +++ b/fs/overlayfs/Kconfig
> @@ -15,7 +15,7 @@ config OVERLAY_FS_REDIRECT_DIR
>         help
>           If this config option is enabled then overlay filesystems will use
>           redirects when renaming directories by default.  In this case it is
> -         still possible possible to turn off redirects globally with the
> +         still possible to turn off redirects globally with the
>           "redirect_dir=off" module option or on a filesystem instance basis
>           with the "redirect_dir=off" mount option.
>
> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> index 5eaa9f9..a19fc5c 100644
> --- a/fs/overlayfs/namei.c
> +++ b/fs/overlayfs/namei.c
> @@ -105,11 +105,12 @@ static int ovl_lookup_single(struct dentry
> *base, struct ovl_lookup_data *d,
>
>         this = lookup_one_len_unlocked(name, base, namelen);
>         if (IS_ERR(this)) {
> -               if (PTR_ERR(this) == -ENOENT ||
> -                   PTR_ERR(this) == -ENAMETOOLONG) {
> +               err = PTR_ERR(this);
> +               if (err == -ENOENT || err == -ENAMETOOLONG) {
>                         this = NULL;
> +                       goto out;
>                 }
> -               goto out;
> +               return err;
>         }
>         if (!this->d_inode)
>                 goto put_and_out;
>

I just realized that this bug is already in overlayfs-next, so posted
a patch to fix it.

>> This also has the xattr feature thing replaced with mount option,
>> module param and kernel config option.
>>
>
> I like the kernel config/module param/mount option for
> enabling/disabling the feature.
>
> But I still think that we should write the features xattr on the first
> redirect rename.
> The features xattr tell us what can be found on the layer, so we would
> be wise to
> keep it around for all sorts of backward compatibility aspect.
>
> Amir.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-18 15:37                       ` Amir Goldstein
  2016-11-20 11:39                         ` Amir Goldstein
@ 2016-11-21  9:54                         ` Miklos Szeredi
  2016-11-21 10:13                           ` Amir Goldstein
  1 sibling, 1 reply; 44+ messages in thread
From: Miklos Szeredi @ 2016-11-21  9:54 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Raphael Hertzog, Konstantin Khlebnikov, Miklos Szeredi,
	linux-unionfs, Guillem Jover, linux-fsdevel,
	Linux Kernel Mailing List

On Fri, Nov 18, 2016 at 4:37 PM, Amir Goldstein <amir73il@gmail.com> wrote:

> Found one typo and one bug in error that can cause crash on dput(ERR_PTR(err)):

Thanks.

Fixes force pushed to overlayfs-next.

Also pushed the redirect patches to overlayfs-next, as they seem to
have matured enough.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-21  9:54                         ` Miklos Szeredi
@ 2016-11-21 10:13                           ` Amir Goldstein
  2016-11-21 10:16                             ` Miklos Szeredi
  0 siblings, 1 reply; 44+ messages in thread
From: Amir Goldstein @ 2016-11-21 10:13 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Raphael Hertzog, Konstantin Khlebnikov, Miklos Szeredi,
	linux-unionfs, Guillem Jover, linux-fsdevel,
	Linux Kernel Mailing List

On Mon, Nov 21, 2016 at 11:54 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Fri, Nov 18, 2016 at 4:37 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>
>> Found one typo and one bug in error that can cause crash on dput(ERR_PTR(err)):
>
> Thanks.
>
> Fixes force pushed to overlayfs-next.

All right. I had the (wrong) impression the next was not a rewindable branch.

>
> Also pushed the redirect patches to overlayfs-next, as they seem to
> have matured enough.
>

Agreed.

Amir.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-21 10:13                           ` Amir Goldstein
@ 2016-11-21 10:16                             ` Miklos Szeredi
  2016-11-22 13:42                               ` Amir Goldstein
  0 siblings, 1 reply; 44+ messages in thread
From: Miklos Szeredi @ 2016-11-21 10:16 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Miklos Szeredi, Raphael Hertzog, Konstantin Khlebnikov,
	linux-unionfs, Guillem Jover, linux-fsdevel,
	Linux Kernel Mailing List

On Mon, Nov 21, 2016 at 11:13 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Mon, Nov 21, 2016 at 11:54 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Fri, Nov 18, 2016 at 4:37 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>>
>>> Found one typo and one bug in error that can cause crash on dput(ERR_PTR(err)):
>>
>> Thanks.
>>
>> Fixes force pushed to overlayfs-next.
>
> All right. I had the (wrong) impression the next was not a rewindable branch.

That depends on how many devel branches are broken by such an action.
In case of overlayfs-next, there doesn't appear to be too much of
that, so for now I feel free to mess with it.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] ovl: redirect on rename-dir
  2016-11-21 10:16                             ` Miklos Szeredi
@ 2016-11-22 13:42                               ` Amir Goldstein
  0 siblings, 0 replies; 44+ messages in thread
From: Amir Goldstein @ 2016-11-22 13:42 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Miklos Szeredi, Raphael Hertzog, Konstantin Khlebnikov,
	linux-unionfs, Guillem Jover, linux-fsdevel,
	Linux Kernel Mailing List

On Mon, Nov 21, 2016 at 12:16 PM, Miklos Szeredi <mszeredi@redhat.com> wrote:
> On Mon, Nov 21, 2016 at 11:13 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>> On Mon, Nov 21, 2016 at 11:54 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>> On Fri, Nov 18, 2016 at 4:37 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>>>
>>>> Found one typo and one bug in error that can cause crash on dput(ERR_PTR(err)):
>>>
>>> Thanks.
>>>
>>> Fixes force pushed to overlayfs-next.
>>
>> All right. I had the (wrong) impression the next was not a rewindable branch.
>
> That depends on how many devel branches are broken by such an action.
> In case of overlayfs-next, there doesn't appear to be too much of
> that, so for now I feel free to mess with it.
>

All right. Just posted another minor fix to display redirect=xx on /proc/mounts
when it is due. Feel free to squash or apply or whatever.

*Strictly* FYI, here is something I have been working on, on top of
redirect_dir,
which I need for one of our use cases.
Since it is working and passed all the sanity tests, I am letting you all know
about it in case it's relevant for anyone else.
Not even going to post the patches yet, just a link and an abstract.

Amir.

https://github.com/amir73il/linux.git #redirect_fh

===============================
  ovl: redirect merged dir by file handle on copy up

When mounted with mount option redirect_dir=fh,
every copy up of lower directory stores the lower
dir file handle in redirect xattr of upper dir.

After the redirect at copy up, renaming upper merged
directory requires no further action.

This method has some advantages over absolute path redirect:
- it is more compact in stored xattr size
- it is not limited by lengths of full paths
- lookup redirect is more efficient for very nested directories

It also has some disadvantages over absolute path redirect:
- it requires setting the redirect xattr for all layers of
  merged dirs
- it requires that all lower layers are on the same file system,
  which support exportfs ops
- file handles will become stale if overlay lower directories
  where to be copied to another location

Signed-off-by: Amir Goldstein <amir73il@gmail.com>

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2016-11-22 13:42 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-25  7:34 [PATCH 0/3] overlayfs: allow moving directory trees Miklos Szeredi
2016-10-25  7:34 ` [PATCH 1/3] ovl: check fs features Miklos Szeredi
2016-10-25 11:24   ` Amir Goldstein
2016-11-05 20:40     ` Amir Goldstein
2016-10-25  7:34 ` [PATCH 2/3] vfs: export vfs_path_lookup() Miklos Szeredi
2016-10-25  7:34 ` [PATCH 3/3] ovl: redirect on rename-dir Miklos Szeredi
2016-10-25 11:57   ` Raphael Hertzog
2016-10-26 11:12     ` Miklos Szeredi
2016-10-28 12:56       ` Raphael Hertzog
2016-10-28 12:59         ` Miklos Szeredi
2016-11-06 19:14       ` Konstantin Khlebnikov
2016-11-07  8:07         ` Miklos Szeredi
2016-11-07  9:58           ` Konstantin Khlebnikov
2016-11-07 10:04             ` Miklos Szeredi
2016-11-07 10:08               ` Konstantin Khlebnikov
2016-11-07 13:38                 ` Amir Goldstein
2016-11-10 22:56                   ` Amir Goldstein
2016-11-11  9:46                     ` Konstantin Khlebnikov
2016-11-11 10:06                       ` Miklos Szeredi
2016-11-11 12:42                         ` Amir Goldstein
2016-11-13  9:11                           ` Amir Goldstein
2016-11-07 11:03         ` Raphael Hertzog
2016-11-07 11:31           ` Konstantin Khlebnikov
2016-11-07 13:42             ` Raphael Hertzog
2016-11-10 22:39               ` Miklos Szeredi
2016-11-11  9:41                 ` Konstantin Khlebnikov
2016-11-13 10:00                 ` Amir Goldstein
2016-11-14 16:25                   ` Amir Goldstein
2016-11-16 22:00                     ` Miklos Szeredi
2016-11-18 15:37                       ` Amir Goldstein
2016-11-20 11:39                         ` Amir Goldstein
2016-11-21  9:54                         ` Miklos Szeredi
2016-11-21 10:13                           ` Amir Goldstein
2016-11-21 10:16                             ` Miklos Szeredi
2016-11-22 13:42                               ` Amir Goldstein
2016-10-25 12:49   ` Amir Goldstein
2016-10-26 11:26     ` Miklos Szeredi
2016-10-26 12:11       ` Amir Goldstein
2016-10-26 12:51         ` Miklos Szeredi
2016-10-26 19:56       ` Amir Goldstein
2016-10-28 16:15   ` Al Viro
2016-11-03 15:50     ` Miklos Szeredi
2016-11-04  9:29       ` Amir Goldstein
2016-11-04 13:48         ` Miklos Szeredi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).