All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/5] ovl: constant inode numbers (cont.)
@ 2017-06-01  9:02 Amir Goldstein
  2017-06-01  9:02 ` [PATCH v3 1/5] ovl: relax same fs constrain for ovl_check_origin() Amir Goldstein
                   ` (4 more replies)
  0 siblings, 5 replies; 38+ messages in thread
From: Amir Goldstein @ 2017-06-01  9:02 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, linux-unionfs

Miklos,

This series covers many of the constant inode number leftovers from v4.12.

Current status is:
- For non-hardlinks, st_ino is constant and consistent with d_ino
- For samefs and for non-dir, st_ino/d_ino is also persistent
- For samefs, st_dev;st_ino is also system-wide unique

The remaining leftovers:
- Constant st_ino/d_ino for hardlinks (WIP by me)
- System-wide unique st_dev;st_ino for non-samefs (WIP by Chandan?)

Tested constant and consistent d_ino with improved xfstest overlay/017,
already upstream, which currently fails only on the hardlink copy up test.

v3:
- Relax same fs contrains from v4.12
- Non "impure" dir optimizations

v2:
- Lookup overlay entry with lookup_one_len_noperm()

v1:
- Resurect Miklos's constant d_ino POC
- Lookup overlay entry with lookup_one_len()

Amir Goldstein (5):
  ovl: relax same fs constrain for ovl_check_origin()
  ovl: relax same fs constrain for constant st_ino
  vfs: factor out lookup_one_len_init()
  vfs: add helper lookup_one_len_noperm()
  ovl: consistent st_ino/d_ino

 fs/namei.c               | 117 +++++++++++++++++++++++------------------
 fs/overlayfs/inode.c     |  32 +++++++-----
 fs/overlayfs/namei.c     |  50 +++++++++++-------
 fs/overlayfs/overlayfs.h |   2 +-
 fs/overlayfs/readdir.c   | 133 ++++++++++++++++++++++++++++++++++++++++++++---
 include/linux/namei.h    |   1 +
 6 files changed, 244 insertions(+), 91 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v3 1/5] ovl: relax same fs constrain for ovl_check_origin()
  2017-06-01  9:02 [PATCH v3 0/5] ovl: constant inode numbers (cont.) Amir Goldstein
@ 2017-06-01  9:02 ` Amir Goldstein
  2017-06-01  9:02 ` [PATCH v3 2/5] ovl: relax same fs constrain for constant st_ino Amir Goldstein
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 38+ messages in thread
From: Amir Goldstein @ 2017-06-01  9:02 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, linux-unionfs

For the case of all layers not on the same fs, try to decode the copy up
origin file handle on any of the lower layers.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/namei.c | 43 +++++++++++++++++++++++++------------------
 1 file changed, 25 insertions(+), 18 deletions(-)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index f3136c31e72a..04b3d6cf4148 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -272,28 +272,24 @@ static int ovl_lookup_layer(struct dentry *base, struct ovl_lookup_data *d,
 static int ovl_check_origin(struct dentry *dentry, struct dentry *upperdentry,
 			    struct path **stackp, unsigned int *ctrp)
 {
-	struct super_block *same_sb = ovl_same_sb(dentry->d_sb);
 	struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata;
 	struct vfsmount *mnt;
-	struct dentry *origin;
+	struct dentry *origin = NULL;
+	int i;
 
-	if (!same_sb || !roe->numlower)
-		return 0;
 
-       /*
-	* Since all layers are on the same fs, we use the first layer for
-	* decoding the file handle.  We may get a disconnected dentry,
-	* which is fine, because we only need to hold the origin inode in
-	* cache and use its inode number.  We may even get a connected dentry,
-	* that is not under the first layer's root.  That is also fine for
-	* using it's inode number - it's the same as if we held a reference
-	* to a dentry in first layer that was moved under us.
-	*/
-	mnt = roe->lowerstack[0].mnt;
-
-	origin = ovl_get_origin(upperdentry, mnt);
-	if (IS_ERR_OR_NULL(origin))
-		return PTR_ERR(origin);
+	for (i = 0; i < roe->numlower; i++) {
+		mnt = roe->lowerstack[i].mnt;
+		origin = ovl_get_origin(upperdentry, mnt);
+		if (IS_ERR(origin))
+			return PTR_ERR(origin);
+
+		if (origin)
+			break;
+	}
+
+	if (!origin)
+		return 0;
 
 	BUG_ON(*stackp || *ctrp);
 	*stackp = kmalloc(sizeof(struct path), GFP_TEMPORARY);
@@ -301,6 +297,7 @@ static int ovl_check_origin(struct dentry *dentry, struct dentry *upperdentry,
 		dput(origin);
 		return -ENOMEM;
 	}
+
 	**stackp = (struct path) { .dentry = origin, .mnt = mnt };
 	*ctrp = 1;
 
@@ -372,6 +369,16 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		}
 		if (upperdentry && !d.is_dir) {
 			BUG_ON(!d.stop || d.redirect);
+			/*
+			 * Lookup copy up origin by decoding origin file handle.
+			 * We may get a disconnected dentry, which is fine,
+			 * because we only need to hold the origin inode in
+			 * cache and use its inode number.  We may even get a
+			 * connected dentry, that is not under any of the lower
+			 * layers root.  That is also fine for using it's inode
+			 * number - it's the same as if we held a reference
+			 * to a dentry in lower layer that was moved under us.
+			 */
 			err = ovl_check_origin(dentry, upperdentry,
 					       &stack, &ctr);
 			if (err)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v3 2/5] ovl: relax same fs constrain for constant st_ino
  2017-06-01  9:02 [PATCH v3 0/5] ovl: constant inode numbers (cont.) Amir Goldstein
  2017-06-01  9:02 ` [PATCH v3 1/5] ovl: relax same fs constrain for ovl_check_origin() Amir Goldstein
@ 2017-06-01  9:02 ` Amir Goldstein
  2017-06-01 12:30   ` Chandan Rajendra
  2017-06-01 19:55   ` Miklos Szeredi
  2017-06-01  9:02 ` [PATCH v3 3/5] vfs: factor out lookup_one_len_init() Amir Goldstein
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 38+ messages in thread
From: Amir Goldstein @ 2017-06-01  9:02 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, linux-unionfs

For the case of all layers not on the same fs, use the copy up
origin st_ino/st_dev for non-dir, if we know it.

This guarantied constant and persistent st_ino/st_dev for non-dir,
but not system-wide uniqueness, like in the same fs case.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index d613e2c41242..5f285c179bbd 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -65,6 +65,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 	struct path realpath;
 	const struct cred *old_cred;
 	bool is_dir = S_ISDIR(dentry->d_inode->i_mode);
+	bool samefs = ovl_same_sb(dentry->d_sb);
 	int err;
 
 	type = ovl_path_real(dentry, &realpath);
@@ -74,16 +75,16 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 		goto out;
 
 	/*
-	 * When all layers are on the same fs, all real inode number are
-	 * unique, so we use the overlay st_dev, which is friendly to du -x.
-	 *
-	 * We also use st_ino of the copy up origin, if we know it.
-	 * This guaranties constant st_dev/st_ino across copy up.
+	 * For non-dir or same fs, we use st_ino of the copy up origin, if we
+	 * know it. This guaranties constant st_dev/st_ino across copy up.
 	 *
 	 * If filesystem supports NFS export ops, this also guaranties
 	 * persistent st_ino across mount cycle.
+	 *
+	 * When all layers are on the same fs, all real inode number are
+	 * unique, so we use the overlay st_dev, which is friendly to du -x.
 	 */
-	if (ovl_same_sb(dentry->d_sb)) {
+	if (!is_dir || samefs) {
 		if (OVL_TYPE_ORIGIN(type)) {
 			struct kstat lowerstat;
 			u32 lowermask = STATX_INO | (!is_dir ? STATX_NLINK : 0);
@@ -94,7 +95,6 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 			if (err)
 				goto out;
 
-			WARN_ON_ONCE(stat->dev != lowerstat.dev);
 			/*
 			 * Lower hardlinks are broken on copy up to different
 			 * upper files, so we cannot use the lower origin st_ino
@@ -102,17 +102,23 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 			 */
 			if (is_dir || lowerstat.nlink == 1)
 				stat->ino = lowerstat.ino;
+
+			if (samefs)
+				WARN_ON_ONCE(stat->dev != lowerstat.dev);
+			else
+				stat->dev = lowerstat.dev;
 		}
-		stat->dev = dentry->d_sb->s_dev;
-	} else if (is_dir) {
+		if (samefs)
+			stat->dev = dentry->d_sb->s_dev;
+	} else {
 		/*
-		 * If not all layers are on the same fs the pair {real st_ino;
-		 * overlay st_dev} is not unique, so use the non persistent
-		 * overlay st_ino.
-		 *
 		 * Always use the overlay st_dev for directories, so 'find
 		 * -xdev' will scan the entire overlay mount and won't cross the
 		 * overlay mount boundaries.
+		 *
+		 * If not all layers are on the same fs the pair {real st_ino;
+		 * overlay st_dev} is not unique, so use the non persistent
+		 * overlay st_ino for directories.
 		 */
 		stat->dev = dentry->d_sb->s_dev;
 		stat->ino = dentry->d_inode->i_ino;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v3 3/5] vfs: factor out lookup_one_len_init()
  2017-06-01  9:02 [PATCH v3 0/5] ovl: constant inode numbers (cont.) Amir Goldstein
  2017-06-01  9:02 ` [PATCH v3 1/5] ovl: relax same fs constrain for ovl_check_origin() Amir Goldstein
  2017-06-01  9:02 ` [PATCH v3 2/5] ovl: relax same fs constrain for constant st_ino Amir Goldstein
@ 2017-06-01  9:02 ` Amir Goldstein
  2017-06-01  9:02 ` [PATCH v3 4/5] vfs: add helper lookup_one_len_noperm() Amir Goldstein
  2017-06-01  9:02 ` [PATCH v3 5/5] ovl: consistent st_ino/d_ino Amir Goldstein
  4 siblings, 0 replies; 38+ messages in thread
From: Amir Goldstein @ 2017-06-01  9:02 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, linux-unionfs

In preparation of adding a new helper lookup_one_len_noperm(),
factor out the common bits of lookup_one_len{,_unlocked}().

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/namei.c | 88 +++++++++++++++++++++++++++-----------------------------------
 1 file changed, 38 insertions(+), 50 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 6571a5f5112e..263caf0efafe 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2446,50 +2446,61 @@ int vfs_path_lookup(struct dentry *dentry, struct vfsmount *mnt,
 }
 EXPORT_SYMBOL(vfs_path_lookup);
 
-/**
- * lookup_one_len - filesystem helper to lookup single pathname component
- * @name:	pathname component to lookup
- * @base:	base directory to lookup from
- * @len:	maximum length @len should be interpreted to
- *
- * Note that this routine is purely a helper for filesystem usage and should
- * not be called by generic code.
- *
- * The caller must hold base->i_mutex.
- */
-struct dentry *lookup_one_len(const char *name, struct dentry *base, int len)
+static int lookup_one_len_init(struct qstr *this, struct dentry *base,
+			       const char *name, int len)
 {
-	struct qstr this;
 	unsigned int c;
-	int err;
-
-	WARN_ON_ONCE(!inode_is_locked(base->d_inode));
 
-	this.name = name;
-	this.len = len;
-	this.hash = full_name_hash(base, name, len);
+	this->name = name;
+	this->len = len;
+	this->hash = full_name_hash(base, name, len);
 	if (!len)
-		return ERR_PTR(-EACCES);
+		return -EACCES;
 
 	if (unlikely(name[0] == '.')) {
 		if (len < 2 || (len == 2 && name[1] == '.'))
-			return ERR_PTR(-EACCES);
+			return -EACCES;
 	}
 
 	while (len--) {
 		c = *(const unsigned char *)name++;
 		if (c == '/' || c == '\0')
-			return ERR_PTR(-EACCES);
+			return -EACCES;
 	}
 	/*
 	 * See if the low-level filesystem might want
 	 * to use its own hash..
 	 */
 	if (base->d_flags & DCACHE_OP_HASH) {
-		int err = base->d_op->d_hash(base, &this);
+		int err = base->d_op->d_hash(base, this);
+
 		if (err < 0)
-			return ERR_PTR(err);
+			return err;
 	}
+	return 0;
+}
+
+/**
+ * lookup_one_len - filesystem helper to lookup single pathname component
+ * @name:	pathname component to lookup
+ * @base:	base directory to lookup from
+ * @len:	maximum length @len should be interpreted to
+ *
+ * Note that this routine is purely a helper for filesystem usage and should
+ * not be called by generic code.
+ *
+ * The caller must hold base->i_mutex.
+ */
+struct dentry *lookup_one_len(const char *name, struct dentry *base, int len)
+{
+	struct qstr this;
+	int err;
+
+	WARN_ON_ONCE(!inode_is_locked(base->d_inode));
+
+	err = lookup_one_len_init(&this, base, name, len);
+	if (err)
+		return ERR_PTR(err);
 
 	err = inode_permission(base->d_inode, MAY_EXEC);
 	if (err)
@@ -2515,35 +2526,12 @@ struct dentry *lookup_one_len_unlocked(const char *name,
 				       struct dentry *base, int len)
 {
 	struct qstr this;
-	unsigned int c;
 	int err;
 	struct dentry *ret;
 
-	this.name = name;
-	this.len = len;
-	this.hash = full_name_hash(base, name, len);
-	if (!len)
-		return ERR_PTR(-EACCES);
-
-	if (unlikely(name[0] == '.')) {
-		if (len < 2 || (len == 2 && name[1] == '.'))
-			return ERR_PTR(-EACCES);
-	}
-
-	while (len--) {
-		c = *(const unsigned char *)name++;
-		if (c == '/' || c == '\0')
-			return ERR_PTR(-EACCES);
-	}
-	/*
-	 * See if the low-level filesystem might want
-	 * to use its own hash..
-	 */
-	if (base->d_flags & DCACHE_OP_HASH) {
-		int err = base->d_op->d_hash(base, &this);
-		if (err < 0)
-			return ERR_PTR(err);
-	}
+	err = lookup_one_len_init(&this, base, name, len);
+	if (err)
+		return ERR_PTR(err);
 
 	err = inode_permission(base->d_inode, MAY_EXEC);
 	if (err)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v3 4/5] vfs: add helper lookup_one_len_noperm()
  2017-06-01  9:02 [PATCH v3 0/5] ovl: constant inode numbers (cont.) Amir Goldstein
                   ` (2 preceding siblings ...)
  2017-06-01  9:02 ` [PATCH v3 3/5] vfs: factor out lookup_one_len_init() Amir Goldstein
@ 2017-06-01  9:02 ` Amir Goldstein
  2017-06-05 12:36   ` Amir Goldstein
  2017-06-01  9:02 ` [PATCH v3 5/5] ovl: consistent st_ino/d_ino Amir Goldstein
  4 siblings, 1 reply; 38+ messages in thread
From: Amir Goldstein @ 2017-06-01  9:02 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, linux-unionfs

This is a variant of lookup_one_len() that does not check task
permissions.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/namei.c            | 29 +++++++++++++++++++++++++++++
 include/linux/namei.h |  1 +
 2 files changed, 30 insertions(+)

diff --git a/fs/namei.c b/fs/namei.c
index 263caf0efafe..837da8bfb36f 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2511,6 +2511,35 @@ struct dentry *lookup_one_len(const char *name, struct dentry *base, int len)
 EXPORT_SYMBOL(lookup_one_len);
 
 /**
+ * lookup_one_len_noperm - filesystem helper to lookup single pathname component
+ * @name:	pathname component to lookup
+ * @base:	base directory to lookup from
+ * @len:	maximum length @len should be interpreted to
+ *
+ * Note that this routine is purely a helper for filesystem usage and should
+ * not be called by generic code.
+ *
+ * Unlike lookup_one_len, it does not check the task permissions.
+ *
+ * The caller must hold base->i_mutex.
+ */
+struct dentry *lookup_one_len_noperm(const char *name, struct dentry *base,
+				     int len)
+{
+	struct qstr this;
+	int err;
+
+	WARN_ON_ONCE(!inode_is_locked(base->d_inode));
+
+	err = lookup_one_len_init(&this, base, name, len);
+	if (err)
+		return ERR_PTR(err);
+
+	return __lookup_hash(&this, base, 0);
+}
+EXPORT_SYMBOL(lookup_one_len);
+
+/**
  * lookup_one_len_unlocked - filesystem helper to lookup single pathname component
  * @name:	pathname component to lookup
  * @base:	base directory to lookup from
diff --git a/include/linux/namei.h b/include/linux/namei.h
index 8b4794e83196..afafd38f3680 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -81,6 +81,7 @@ extern struct dentry *kern_path_locked(const char *, struct path *);
 extern int kern_path_mountpoint(int, const char *, struct path *, unsigned int);
 
 extern struct dentry *lookup_one_len(const char *, struct dentry *, int);
+extern struct dentry *lookup_one_len_noperm(const char *, struct dentry *, int);
 extern struct dentry *lookup_one_len_unlocked(const char *, struct dentry *, int);
 
 extern int follow_down_one(struct path *);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-06-01  9:02 [PATCH v3 0/5] ovl: constant inode numbers (cont.) Amir Goldstein
                   ` (3 preceding siblings ...)
  2017-06-01  9:02 ` [PATCH v3 4/5] vfs: add helper lookup_one_len_noperm() Amir Goldstein
@ 2017-06-01  9:02 ` Amir Goldstein
  2017-06-20 21:25   ` Miklos Szeredi
  4 siblings, 1 reply; 38+ messages in thread
From: Amir Goldstein @ 2017-06-01  9:02 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, linux-unionfs

This patch is based on an earlier POC by Miklos Szeredi.

When iterating an "impure" upper directory which may contain copy up
entries, lookup the overlay dentries to see if we know their copy up
origin and if we do, call vfs_getattr() on the overlay dentry
to make sure that d_ino will be consistent with st_ino from stat(2).

This may cause performance overhead of lookup_slow() per upper entry
for entries are not cached and it may cause memory pressure overhead
populating dcache entries that would not have been populated otherwise.

The overhead is minimal if the iterated entries are already in dcache.
It is also quite useful for the common case of 'ls -l' that readdir()
pre populates the dcache with the listed entries, making the following
stat() calls faster.

One exception to consistent d_ino is the case of a merge dir that was
moved into an opaque parent prior to v4.12.  This case will not have
marked parent opaque dir "impure" and therefore, ovl_iterate() will
optimize this to direct iteration on upper dir.  This situation will sort
itself out after another move into the opquae parent.  Worst case is that
the moved dir d_ino is inconsistent with its st_ino, which is not a
regression from version < 4.12.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/namei.c     |   7 ++-
 fs/overlayfs/overlayfs.h |   2 +-
 fs/overlayfs/readdir.c   | 133 ++++++++++++++++++++++++++++++++++++++++++++---
 3 files changed, 132 insertions(+), 10 deletions(-)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 04b3d6cf4148..4ca6061f7bfa 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -308,18 +308,21 @@ static int ovl_check_origin(struct dentry *dentry, struct dentry *upperdentry,
  * Returns next layer in stack starting from top.
  * Returns -1 if this is the last layer.
  */
-int ovl_path_next(int idx, struct dentry *dentry, struct path *path)
+int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp)
 {
 	struct ovl_entry *oe = dentry->d_fsdata;
 
 	BUG_ON(idx < 0);
 	if (idx == 0) {
 		ovl_path_upper(dentry, path);
-		if (path->dentry)
+		if (path->dentry) {
+			*idxp = 0;
 			return oe->numlower ? 1 : -1;
+		}
 		idx++;
 	}
 	BUG_ON(idx > oe->numlower);
+	*idxp = idx;
 	*path = oe->lowerstack[idx - 1];
 
 	return (idx < oe->numlower) ? idx + 1 : -1;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 0623cebeefff..513e25e56eed 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -233,7 +233,7 @@ static inline bool ovl_is_impuredir(struct dentry *dentry)
 
 
 /* namei.c */
-int ovl_path_next(int idx, struct dentry *dentry, struct path *path);
+int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp);
 struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags);
 bool ovl_lower_positive(struct dentry *dentry);
 
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index f241b4ee3d8a..8b7e6cb435bb 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -15,15 +15,18 @@
 #include <linux/rbtree.h>
 #include <linux/security.h>
 #include <linux/cred.h>
+#include <linux/ratelimit.h>
 #include "overlayfs.h"
 
 struct ovl_cache_entry {
 	unsigned int len;
 	unsigned int type;
+	u64 real_ino;
 	u64 ino;
 	struct list_head l_node;
 	struct rb_node node;
 	struct ovl_cache_entry *next_maybe_whiteout;
+	int idx;
 	bool is_whiteout;
 	char name[];
 };
@@ -43,6 +46,7 @@ struct ovl_readdir_data {
 	struct list_head middle;
 	struct ovl_cache_entry *first_maybe_whiteout;
 	int count;
+	int idx;
 	int err;
 	bool d_type_supported;
 };
@@ -97,8 +101,20 @@ static struct ovl_cache_entry *ovl_cache_entry_new(struct ovl_readdir_data *rdd,
 	p->name[len] = '\0';
 	p->len = len;
 	p->type = d_type;
-	p->ino = ino;
+	p->ino = p->real_ino = ino;
+	/*
+	 * For upper dir entries and for upper non-dir entries of impure dir,
+	 * d_ino will be updated during ovl_iterate() to the copy up origin ino.
+	 * A non-impure upper dir cannot have non-dir entries with origin xattr,
+	 * but it can have legacy merge dir entries without origin xattr.
+	 * stat will show the lower ino for those legacy merge dir entries, so
+	 * readdir should show it as well.
+	 */
+	if (rdd->idx == 0 && rdd->dentry &&
+	    (d_type == DT_DIR || ovl_dentry_is_impure(rdd->dentry)))
+		p->ino = 0;
 	p->is_whiteout = false;
+	p->idx = rdd->idx;
 
 	if (d_type == DT_CHR) {
 		p->next_maybe_whiteout = rdd->first_maybe_whiteout;
@@ -225,6 +241,7 @@ static int ovl_check_whiteouts(struct dentry *dir, struct ovl_readdir_data *rdd)
 	}
 	revert_creds(old_cred);
 
+
 	return err;
 }
 
@@ -256,21 +273,41 @@ static inline int ovl_dir_read(struct path *realpath,
 	return err;
 }
 
+/* Can we iterate real dir directly? */
+static bool ovl_dir_is_real(struct dentry *dir)
+{
+	enum ovl_path_type type = ovl_path_type(dir);
+
+	if (OVL_TYPE_MERGE(type))
+		return false;
+
+	/* Upper dir may contain copied up entries that were moved into it */
+	return !OVL_TYPE_UPPER(type) || !ovl_dentry_is_impure(dir);
+}
+
 static void ovl_dir_reset(struct file *file)
 {
 	struct ovl_dir_file *od = file->private_data;
 	struct ovl_dir_cache *cache = od->cache;
 	struct dentry *dentry = file->f_path.dentry;
-	enum ovl_path_type type = ovl_path_type(dentry);
+	bool is_real;
 
 	if (cache && ovl_dentry_version_get(dentry) != cache->version) {
 		ovl_cache_put(od, dentry);
 		od->cache = NULL;
 		od->cursor = NULL;
 	}
-	WARN_ON(!od->is_real && !OVL_TYPE_MERGE(type));
-	if (od->is_real && OVL_TYPE_MERGE(type))
+	is_real = ovl_dir_is_real(dentry);
+	if (od->is_real != is_real) {
+		/*
+		 * is_real can only become false. This can happen after dir
+		 * itself is copied up or after child is moved into pure upper
+		 * dir and makes the dir impure.
+		 */
+		if (WARN_ON(is_real))
+			return;
 		od->is_real = false;
+	}
 }
 
 static int ovl_dir_read_merged(struct dentry *dentry, struct list_head *list)
@@ -287,7 +324,7 @@ static int ovl_dir_read_merged(struct dentry *dentry, struct list_head *list)
 	int idx, next;
 
 	for (idx = 0; idx != -1; idx = next) {
-		next = ovl_path_next(idx, dentry, &realpath);
+		next = ovl_path_next(idx, dentry, &realpath, &rdd.idx);
 
 		if (next != -1) {
 			err = ovl_dir_read(&realpath, &rdd);
@@ -353,11 +390,87 @@ static struct ovl_dir_cache *ovl_cache_get(struct dentry *dentry)
 	return cache;
 }
 
+/*
+ * Set d_ino for upper entries with known copy up origin.
+ *
+ * Non-upper entries should always report the uppermost real inode ino and
+ * this function should never be called for them.
+ *
+ * Upper entries should report the ino of their copy up origin if it is known.
+ * Call vfs_getattr() on the overlay entries to make sure that d_ino will be
+ * consistent with st_ino from stat(2).
+ */
+static int ovl_cache_update_ino(struct path *path, struct ovl_cache_entry *p)
+
+{
+	struct dentry *dir = path->dentry;
+	struct dentry *this = NULL;
+	enum ovl_path_type type;
+	u64 ino = p->real_ino;
+	int err = 0;
+
+	if (WARN_ON(p->idx))
+		goto out;
+
+	if (p->name[0] == '.') {
+		if (p->len == 1) {
+			this = dget(dir);
+			goto get;
+		}
+		if (p->len == 2 && p->name[1] == '.') {
+			/* we shall not be moved */
+			this = dget(dir->d_parent);
+			goto get;
+		}
+	}
+
+	/* Task doing readdir may not have exec permissions on dir */
+	this = lookup_one_len_noperm(p->name, dir, p->len);
+	if (IS_ERR_OR_NULL(this) || !this->d_inode) {
+		if (IS_ERR(this)) {
+			err = PTR_ERR(this);
+			this = NULL;
+			goto fail;
+		}
+		goto out;
+	}
+
+get:
+	type = ovl_path_type(this);
+	if (OVL_TYPE_ORIGIN(type)) {
+		struct kstat stat;
+		struct path statpath = *path;
+		bool samefs = ovl_same_sb(dir->d_sb);
+		bool is_dir = S_ISDIR(this->d_inode->i_mode);
+
+		statpath.dentry = this;
+		err = vfs_getattr(&statpath, &stat, STATX_INO, 0);
+		if (err)
+			goto fail;
+
+		if (samefs || is_dir)
+			WARN_ON_ONCE(dir->d_sb->s_dev != stat.dev);
+
+		ino = stat.ino;
+	}
+
+out:
+	p->ino = ino;
+	dput(this);
+	return err;
+
+fail:
+	pr_warn_ratelimited("overlay: failed to look up (%s) for ino (%i)\n",
+			    p->name, err);
+	goto out;
+}
+
 static int ovl_iterate(struct file *file, struct dir_context *ctx)
 {
 	struct ovl_dir_file *od = file->private_data;
 	struct dentry *dentry = file->f_path.dentry;
 	struct ovl_cache_entry *p;
+	int err;
 
 	if (!ctx->pos)
 		ovl_dir_reset(file);
@@ -378,9 +491,15 @@ static int ovl_iterate(struct file *file, struct dir_context *ctx)
 
 	while (od->cursor != &od->cache->entries) {
 		p = list_entry(od->cursor, struct ovl_cache_entry, l_node);
-		if (!p->is_whiteout)
+		if (!p->is_whiteout) {
+			if (!p->ino) {
+				err = ovl_cache_update_ino(&file->f_path, p);
+				if (err)
+					return err;
+			}
 			if (!dir_emit(ctx, p->name, p->len, p->ino, p->type))
 				break;
+		}
 		od->cursor = p->l_node.next;
 		ctx->pos++;
 	}
@@ -502,7 +621,7 @@ static int ovl_dir_open(struct inode *inode, struct file *file)
 		return PTR_ERR(realfile);
 	}
 	od->realfile = realfile;
-	od->is_real = !OVL_TYPE_MERGE(type);
+	od->is_real = ovl_dir_is_real(file->f_path.dentry);
 	od->is_upper = OVL_TYPE_UPPER(type);
 	file->private_data = od;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 2/5] ovl: relax same fs constrain for constant st_ino
  2017-06-01  9:02 ` [PATCH v3 2/5] ovl: relax same fs constrain for constant st_ino Amir Goldstein
@ 2017-06-01 12:30   ` Chandan Rajendra
  2017-06-01 13:32     ` Amir Goldstein
  2017-06-01 19:55   ` Miklos Szeredi
  1 sibling, 1 reply; 38+ messages in thread
From: Chandan Rajendra @ 2017-06-01 12:30 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, linux-unionfs

On Thursday, June 1, 2017 2:32:56 PM IST Amir Goldstein wrote:
> For the case of all layers not on the same fs, use the copy up
> origin st_ino/st_dev for non-dir, if we know it.
> 
> This guarantied constant and persistent st_ino/st_dev for non-dir,
> but not system-wide uniqueness, like in the same fs case.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/overlayfs/inode.c | 32 +++++++++++++++++++-------------
>  1 file changed, 19 insertions(+), 13 deletions(-)
> 
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index d613e2c41242..5f285c179bbd 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -65,6 +65,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
>  	struct path realpath;
>  	const struct cred *old_cred;
>  	bool is_dir = S_ISDIR(dentry->d_inode->i_mode);
> +	bool samefs = ovl_same_sb(dentry->d_sb);
>  	int err;
> 
>  	type = ovl_path_real(dentry, &realpath);
> @@ -74,16 +75,16 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
>  		goto out;
> 
>  	/*
> -	 * When all layers are on the same fs, all real inode number are
> -	 * unique, so we use the overlay st_dev, which is friendly to du -x.
> -	 *
> -	 * We also use st_ino of the copy up origin, if we know it.
> -	 * This guaranties constant st_dev/st_ino across copy up.
> +	 * For non-dir or same fs, we use st_ino of the copy up origin, if we
> +	 * know it. This guaranties constant st_dev/st_ino across copy up.
>  	 *
>  	 * If filesystem supports NFS export ops, this also guaranties
>  	 * persistent st_ino across mount cycle.
> +	 *
> +	 * When all layers are on the same fs, all real inode number are
> +	 * unique, so we use the overlay st_dev, which is friendly to du -x.
>  	 */
> -	if (ovl_same_sb(dentry->d_sb)) {
> +	if (!is_dir || samefs) {
>  		if (OVL_TYPE_ORIGIN(type)) {
>  			struct kstat lowerstat;
>  			u32 lowermask = STATX_INO | (!is_dir ? STATX_NLINK : 0);
> @@ -94,7 +95,6 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
>  			if (err)
>  				goto out;
> 
> -			WARN_ON_ONCE(stat->dev != lowerstat.dev);
>  			/*
>  			 * Lower hardlinks are broken on copy up to different
>  			 * upper files, so we cannot use the lower origin st_ino
> @@ -102,17 +102,23 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
>  			 */
>  			if (is_dir || lowerstat.nlink == 1)
>  				stat->ino = lowerstat.ino;
> +
> +			if (samefs)
> +				WARN_ON_ONCE(stat->dev != lowerstat.dev);
> +			else
> +				stat->dev = lowerstat.dev;
>  		}
> -		stat->dev = dentry->d_sb->s_dev;
> -	} else if (is_dir) {
> +		if (samefs)
> +			stat->dev = dentry->d_sb->s_dev;
> +	} else {
>  		/*
> -		 * If not all layers are on the same fs the pair {real st_ino;
> -		 * overlay st_dev} is not unique, so use the non persistent
> -		 * overlay st_ino.
> -		 *
>  		 * Always use the overlay st_dev for directories, so 'find
>  		 * -xdev' will scan the entire overlay mount and won't cross the
>  		 * overlay mount boundaries.
> +		 *
> +		 * If not all layers are on the same fs the pair {real st_ino;
> +		 * overlay st_dev} is not unique, so use the non persistent
> +		 * overlay st_ino for directories.
>  		 */
>  		stat->dev = dentry->d_sb->s_dev;
>  		stat->ino = dentry->d_inode->i_ino;
> 

Hi Amir,

The following question is not directly related to your patch. 

With this patch applied, we have the following code snippet inside
ovl_getattr(),

       if (!is_dir || samefs) {
                if (OVL_TYPE_ORIGIN(type)) {
                        struct kstat lowerstat;
                        u32 lowermask = STATX_INO | (!is_dir ? STATX_NLINK : 0);

                        ovl_path_lower(dentry, &realpath);
                        err = vfs_getattr(&realpath, &lowerstat,
                                          lowermask, flags);
                        if (err)
                                goto out;

When invoking vfs_getattr() on a non-dir file, the code assumes that the
copy-up origin inode always exists in lowerdir. Are we assuming that after an
overlayfs filesystem has been unmounted, the lower filesystem should never be
manipulated (e.g. deleting the file which was the copy-up origin)? If the
copy-up origin file gets deleted, On a later mount of the overlayfs
filesystem, calls to stat(2) for the corresponding upperdir file will fail
because of vfs_getattr() returning an error code.

-- 
chandan

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 2/5] ovl: relax same fs constrain for constant st_ino
  2017-06-01 12:30   ` Chandan Rajendra
@ 2017-06-01 13:32     ` Amir Goldstein
  0 siblings, 0 replies; 38+ messages in thread
From: Amir Goldstein @ 2017-06-01 13:32 UTC (permalink / raw)
  To: Chandan Rajendra; +Cc: Miklos Szeredi, overlayfs

On Thu, Jun 1, 2017 at 3:30 PM, Chandan Rajendra
<chandan@linux.vnet.ibm.com> wrote:
> On Thursday, June 1, 2017 2:32:56 PM IST Amir Goldstein wrote:
>> For the case of all layers not on the same fs, use the copy up
>> origin st_ino/st_dev for non-dir, if we know it.
>>
...
> Hi Amir,
>
> The following question is not directly related to your patch.
>
> With this patch applied, we have the following code snippet inside
> ovl_getattr(),
>
>        if (!is_dir || samefs) {
>                 if (OVL_TYPE_ORIGIN(type)) {
>                         struct kstat lowerstat;
>                         u32 lowermask = STATX_INO | (!is_dir ? STATX_NLINK : 0);
>
>                         ovl_path_lower(dentry, &realpath);
>                         err = vfs_getattr(&realpath, &lowerstat,
>                                           lowermask, flags);
>                         if (err)
>                                 goto out;
>
> When invoking vfs_getattr() on a non-dir file, the code assumes that the
> copy-up origin inode always exists in lowerdir.

Code does not assume that.
Code verifies that the copy up origin stored in xattr overlay.origin
is still valid
and not stale in ovl_check_origin().

> Are we assuming that after an
> overlayfs filesystem has been unmounted, the lower filesystem should never be
> manipulated (e.g. deleting the file which was the copy-up origin)? If the
> copy-up origin file gets deleted, On a later mount of the overlayfs
> filesystem, calls to stat(2) for the corresponding upperdir file will fail
> because of vfs_getattr() returning an error code.
>

The case you are describing can happen. It will cause ovl_check_origin()
to NOT follow the stored copy up origin and in the snippet above
OVL_TYPE_ORIGIN(type) will be false.

Furthermore, even if copy up origin was not deleted, but the lower
layer has been copied (cp -a) or the lower filesystem does not support
NFS export, or lower filesystem does not have a valid UUID (e.g. tmpfs).
All those cases will end up with OVL_TYPE_ORIGIN(type) being false
and then inode numbers will not remain constant across a mount cycle
(or drop caches).

Amir.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 2/5] ovl: relax same fs constrain for constant st_ino
  2017-06-01  9:02 ` [PATCH v3 2/5] ovl: relax same fs constrain for constant st_ino Amir Goldstein
  2017-06-01 12:30   ` Chandan Rajendra
@ 2017-06-01 19:55   ` Miklos Szeredi
  2017-06-02  6:32     ` Amir Goldstein
  1 sibling, 1 reply; 38+ messages in thread
From: Miklos Szeredi @ 2017-06-01 19:55 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Chandan Rajendra, linux-unionfs

On Thu, Jun 1, 2017 at 11:02 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> For the case of all layers not on the same fs, use the copy up
> origin st_ino/st_dev for non-dir, if we know it.
>
> This guarantied constant and persistent st_ino/st_dev for non-dir,
> but not system-wide uniqueness, like in the same fs case.

I'd rather leave this until the st_dev uniqueness is sorted out.   Non
unique st_ino/st_dev is just asking for trouble.

Thanks,
Miklos

>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/overlayfs/inode.c | 32 +++++++++++++++++++-------------
>  1 file changed, 19 insertions(+), 13 deletions(-)
>
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index d613e2c41242..5f285c179bbd 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -65,6 +65,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
>         struct path realpath;
>         const struct cred *old_cred;
>         bool is_dir = S_ISDIR(dentry->d_inode->i_mode);
> +       bool samefs = ovl_same_sb(dentry->d_sb);
>         int err;
>
>         type = ovl_path_real(dentry, &realpath);
> @@ -74,16 +75,16 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
>                 goto out;
>
>         /*
> -        * When all layers are on the same fs, all real inode number are
> -        * unique, so we use the overlay st_dev, which is friendly to du -x.
> -        *
> -        * We also use st_ino of the copy up origin, if we know it.
> -        * This guaranties constant st_dev/st_ino across copy up.
> +        * For non-dir or same fs, we use st_ino of the copy up origin, if we
> +        * know it. This guaranties constant st_dev/st_ino across copy up.
>          *
>          * If filesystem supports NFS export ops, this also guaranties
>          * persistent st_ino across mount cycle.
> +        *
> +        * When all layers are on the same fs, all real inode number are
> +        * unique, so we use the overlay st_dev, which is friendly to du -x.
>          */
> -       if (ovl_same_sb(dentry->d_sb)) {
> +       if (!is_dir || samefs) {
>                 if (OVL_TYPE_ORIGIN(type)) {
>                         struct kstat lowerstat;
>                         u32 lowermask = STATX_INO | (!is_dir ? STATX_NLINK : 0);
> @@ -94,7 +95,6 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
>                         if (err)
>                                 goto out;
>
> -                       WARN_ON_ONCE(stat->dev != lowerstat.dev);
>                         /*
>                          * Lower hardlinks are broken on copy up to different
>                          * upper files, so we cannot use the lower origin st_ino
> @@ -102,17 +102,23 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
>                          */
>                         if (is_dir || lowerstat.nlink == 1)
>                                 stat->ino = lowerstat.ino;
> +
> +                       if (samefs)
> +                               WARN_ON_ONCE(stat->dev != lowerstat.dev);
> +                       else
> +                               stat->dev = lowerstat.dev;
>                 }
> -               stat->dev = dentry->d_sb->s_dev;
> -       } else if (is_dir) {
> +               if (samefs)
> +                       stat->dev = dentry->d_sb->s_dev;
> +       } else {
>                 /*
> -                * If not all layers are on the same fs the pair {real st_ino;
> -                * overlay st_dev} is not unique, so use the non persistent
> -                * overlay st_ino.
> -                *
>                  * Always use the overlay st_dev for directories, so 'find
>                  * -xdev' will scan the entire overlay mount and won't cross the
>                  * overlay mount boundaries.
> +                *
> +                * If not all layers are on the same fs the pair {real st_ino;
> +                * overlay st_dev} is not unique, so use the non persistent
> +                * overlay st_ino for directories.
>                  */
>                 stat->dev = dentry->d_sb->s_dev;
>                 stat->ino = dentry->d_inode->i_ino;
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 2/5] ovl: relax same fs constrain for constant st_ino
  2017-06-01 19:55   ` Miklos Szeredi
@ 2017-06-02  6:32     ` Amir Goldstein
  2017-06-02 12:27       ` Miklos Szeredi
  0 siblings, 1 reply; 38+ messages in thread
From: Amir Goldstein @ 2017-06-02  6:32 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, linux-unionfs

On Thu, Jun 1, 2017 at 10:55 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Thu, Jun 1, 2017 at 11:02 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>> For the case of all layers not on the same fs, use the copy up
>> origin st_ino/st_dev for non-dir, if we know it.
>>
>> This guarantied constant and persistent st_ino/st_dev for non-dir,
>> but not system-wide uniqueness, like in the same fs case.
>
> I'd rather leave this until the st_dev uniqueness is sorted out.   Non
> unique st_ino/st_dev is just asking for trouble.
>

I don't mind leaving this out, but you do realize that this commit
doesn't change st_dev/st_ino uniqueness one way or the other, right?

- Directory st_dev/st_ino *remains* constant, unique and non-persistent
- Non-dir st_dev/st_ino *remains* non-unique, but becomes constant and
persistent

So I can't really see the harm in this patch.

Nevertheless, let me know if you want me to re-post consistent d_ino without
the non-samefs relax patches. Actually, consistent d_ino patch is
correct with or
without the non-samefs relax patches, but it misses an optimization for skipping
vfs_getattr() for (impure && !samefs) case.

Amir.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 2/5] ovl: relax same fs constrain for constant st_ino
  2017-06-02  6:32     ` Amir Goldstein
@ 2017-06-02 12:27       ` Miklos Szeredi
  2017-06-02 13:23         ` Amir Goldstein
  0 siblings, 1 reply; 38+ messages in thread
From: Miklos Szeredi @ 2017-06-02 12:27 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Chandan Rajendra, linux-unionfs

On Fri, Jun 2, 2017 at 8:32 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Thu, Jun 1, 2017 at 10:55 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Thu, Jun 1, 2017 at 11:02 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>>> For the case of all layers not on the same fs, use the copy up
>>> origin st_ino/st_dev for non-dir, if we know it.
>>>
>>> This guarantied constant and persistent st_ino/st_dev for non-dir,
>>> but not system-wide uniqueness, like in the same fs case.
>>
>> I'd rather leave this until the st_dev uniqueness is sorted out.   Non
>> unique st_ino/st_dev is just asking for trouble.
>>
>
> I don't mind leaving this out, but you do realize that this commit
> doesn't change st_dev/st_ino uniqueness one way or the other, right?

Depends on how you look at it.

Before the patch it was (just the non-samefs, non-directory case):

 - if lower: st_ino/st_dev comes from lower
 - if upper: st_ino/st_dev comes from upper

This only violates the uniqueness rule if check+use of st_ino/st_dev
races with the copy up

After the patch it's:

 - if copied up: st_ino/st_dev comes from origin
 - otherwise same as above

Now the uniqueness is violated for all copied up files (origin and
overlay file will have the same st_ino/st_dev pair even though they
possibly have different contents).

So that transforms a race into a permanent error, which is not good.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 2/5] ovl: relax same fs constrain for constant st_ino
  2017-06-02 12:27       ` Miklos Szeredi
@ 2017-06-02 13:23         ` Amir Goldstein
  2017-06-02 13:26           ` Miklos Szeredi
  0 siblings, 1 reply; 38+ messages in thread
From: Amir Goldstein @ 2017-06-02 13:23 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, linux-unionfs

On Fri, Jun 2, 2017 at 3:27 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Fri, Jun 2, 2017 at 8:32 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>> On Thu, Jun 1, 2017 at 10:55 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>> On Thu, Jun 1, 2017 at 11:02 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>>>> For the case of all layers not on the same fs, use the copy up
>>>> origin st_ino/st_dev for non-dir, if we know it.
>>>>
>>>> This guarantied constant and persistent st_ino/st_dev for non-dir,
>>>> but not system-wide uniqueness, like in the same fs case.
>>>
>>> I'd rather leave this until the st_dev uniqueness is sorted out.   Non
>>> unique st_ino/st_dev is just asking for trouble.
>>>
>>
>> I don't mind leaving this out, but you do realize that this commit
>> doesn't change st_dev/st_ino uniqueness one way or the other, right?
>
> Depends on how you look at it.
>
> Before the patch it was (just the non-samefs, non-directory case):
>
>  - if lower: st_ino/st_dev comes from lower
>  - if upper: st_ino/st_dev comes from upper
>
> This only violates the uniqueness rule if check+use of st_ino/st_dev
> races with the copy up
>
> After the patch it's:
>
>  - if copied up: st_ino/st_dev comes from origin
>  - otherwise same as above
>
> Now the uniqueness is violated for all copied up files (origin and
> overlay file will have the same st_ino/st_dev pair even though they
> possibly have different contents).
>
> So that transforms a race into a permanent error, which is not good.
>

Right. of course.
So do you want me to re-post the consistent d_ino for samefs case only?

I would like to start posting the index/hardlink patches, so though I would
flush out the constino buffer first.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 2/5] ovl: relax same fs constrain for constant st_ino
  2017-06-02 13:23         ` Amir Goldstein
@ 2017-06-02 13:26           ` Miklos Szeredi
  2017-06-02 13:34             ` Amir Goldstein
  0 siblings, 1 reply; 38+ messages in thread
From: Miklos Szeredi @ 2017-06-02 13:26 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Chandan Rajendra, linux-unionfs

On Fri, Jun 2, 2017 at 3:23 PM, Amir Goldstein <amir73il@gmail.com> wrote:

> I would like to start posting the index/hardlink patches, so though I would
> flush out the constino buffer first.

No need to resend, I'll just leave this out.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 2/5] ovl: relax same fs constrain for constant st_ino
  2017-06-02 13:26           ` Miklos Szeredi
@ 2017-06-02 13:34             ` Amir Goldstein
  0 siblings, 0 replies; 38+ messages in thread
From: Amir Goldstein @ 2017-06-02 13:34 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, linux-unionfs

On Fri, Jun 2, 2017 at 4:26 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Fri, Jun 2, 2017 at 3:23 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>
>> I would like to start posting the index/hardlink patches, so though I would
>> flush out the constino buffer first.
>
> No need to resend, I'll just leave this out.
>

OK. Please do leave this patch in:
ovl: relax same fs constrain for ovl_check_origin()

index/hardlinks code does NOT assume samefs
and requires that lookup will find the origin also for non samefs case.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 4/5] vfs: add helper lookup_one_len_noperm()
  2017-06-01  9:02 ` [PATCH v3 4/5] vfs: add helper lookup_one_len_noperm() Amir Goldstein
@ 2017-06-05 12:36   ` Amir Goldstein
  0 siblings, 0 replies; 38+ messages in thread
From: Amir Goldstein @ 2017-06-05 12:36 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, overlayfs

On Thu, Jun 1, 2017 at 12:02 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> This is a variant of lookup_one_len() that does not check task
> permissions.
>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/namei.c            | 29 +++++++++++++++++++++++++++++
>  include/linux/namei.h |  1 +
>  2 files changed, 30 insertions(+)
>
> diff --git a/fs/namei.c b/fs/namei.c
> index 263caf0efafe..837da8bfb36f 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -2511,6 +2511,35 @@ struct dentry *lookup_one_len(const char *name, struct dentry *base, int len)
>  EXPORT_SYMBOL(lookup_one_len);
>
>  /**
> + * lookup_one_len_noperm - filesystem helper to lookup single pathname component
> + * @name:      pathname component to lookup
> + * @base:      base directory to lookup from
> + * @len:       maximum length @len should be interpreted to
> + *
> + * Note that this routine is purely a helper for filesystem usage and should
> + * not be called by generic code.
> + *
> + * Unlike lookup_one_len, it does not check the task permissions.
> + *
> + * The caller must hold base->i_mutex.
> + */
> +struct dentry *lookup_one_len_noperm(const char *name, struct dentry *base,
> +                                    int len)
> +{
> +       struct qstr this;
> +       int err;
> +
> +       WARN_ON_ONCE(!inode_is_locked(base->d_inode));
> +
> +       err = lookup_one_len_init(&this, base, name, len);
> +       if (err)
> +               return ERR_PTR(err);
> +
> +       return __lookup_hash(&this, base, 0);
> +}
> +EXPORT_SYMBOL(lookup_one_len);

Miklos,

Chandan pointed out to me that I have a copy&paste typo:

+EXPORT_SYMBOL(lookup_one_len_noperm);

Please fix when applying if I don't end up sending another version.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-06-01  9:02 ` [PATCH v3 5/5] ovl: consistent st_ino/d_ino Amir Goldstein
@ 2017-06-20 21:25   ` Miklos Szeredi
  2017-06-21  8:03     ` Amir Goldstein
  0 siblings, 1 reply; 38+ messages in thread
From: Miklos Szeredi @ 2017-06-20 21:25 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Chandan Rajendra, linux-unionfs

On Thu, Jun 1, 2017 at 11:02 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> This patch is based on an earlier POC by Miklos Szeredi.
>
> When iterating an "impure" upper directory which may contain copy up
> entries, lookup the overlay dentries to see if we know their copy up
> origin and if we do, call vfs_getattr() on the overlay dentry
> to make sure that d_ino will be consistent with st_ino from stat(2).

Getting back to this, because there are issues:

 - ".." can have different origin even if dir is pure (pure upper
residing in merged dir, pure lower residing in dir which has another
lower layer (which is the origin) above it)

 - we check impure at open time, but dir can become impure between two
getdents*(2) calls, we can get wrong inode numbers

Becoming impure needs to be handled differently from what this patch
does, because we can't switch to cached readdir in the middle of the
directory listing.

To handle that we need to add our own actor to the is_real case to fix
up the inode numbers.  This would fix the ".." case as well.

The problem is, we are inside underlying fs code when  actor is
called, so there's not much we can do without risking deadlocks.  One
thing we can do (hopefully) is call d_lookup() to check for a cached
overlay dentry, and get the inode number out.

 We need to store the full inode number in the overlay inode, because
we can't do vfs_getattr() either, but that's a good optimization
anyway.

If the overlay dentry  is not in the cache we need to abort the
iteration, lookup the dentry, and continue iterating.

Thoughts?  I'll have a go at this tomorrow.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-06-20 21:25   ` Miklos Szeredi
@ 2017-06-21  8:03     ` Amir Goldstein
  2017-06-21  8:20       ` Miklos Szeredi
  0 siblings, 1 reply; 38+ messages in thread
From: Amir Goldstein @ 2017-06-21  8:03 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, linux-unionfs

On Wed, Jun 21, 2017 at 12:25 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Thu, Jun 1, 2017 at 11:02 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>> This patch is based on an earlier POC by Miklos Szeredi.
>>
>> When iterating an "impure" upper directory which may contain copy up
>> entries, lookup the overlay dentries to see if we know their copy up
>> origin and if we do, call vfs_getattr() on the overlay dentry
>> to make sure that d_ino will be consistent with st_ino from stat(2).
>
> Getting back to this, because there are issues:
>
>  - ".." can have different origin even if dir is pure (pure upper
> residing in merged dir, pure lower residing in dir which has another
> lower layer (which is the origin) above it)

If we need to add complexity only for the sake of consistent d_ino for
"..", then IMO we should leave it inconsistent (as is it today).
Directories don't have consistent d_ino today and nobody complains
(right?). As I wrote before, 'find' doesn't look at d_ino for directories,
it always checks st_ino for dir (i.e. with find -ino N).
So as long as won't we are not aware of any application for which
that matters, let's leave directories at "best effort" w.r.t consistent
d_ino. They are already best effort w.r.t persistent st_ino (!samefs).


>
>  - we check impure at open time,

Not only. we also check impure (ovl_dir_is_real) at ovl_dir_reset(),
so at first call to getdents(2).

> but dir can become impure between two
> getdents*(2) calls, we can get wrong inode numbers
>
> Becoming impure needs to be handled differently from what this patch
> does, because we can't switch to cached readdir in the middle of the
> directory listing.
>
> To handle that we need to add our own actor to the is_real case to fix
> up the inode numbers.  This would fix the ".." case as well.
>
> The problem is, we are inside underlying fs code when  actor is
> called, so there's not much we can do without risking deadlocks.  One
> thing we can do (hopefully) is call d_lookup() to check for a cached
> overlay dentry, and get the inode number out.
>
>  We need to store the full inode number in the overlay inode, because
> we can't do vfs_getattr() either, but that's a good optimization
> anyway.
>
> If the overlay dentry  is not in the cache we need to abort the
> iteration, lookup the dentry, and continue iterating.
>
> Thoughts?  I'll have a go at this tomorrow.
>

Thinking out loud: If we store od->version for is_real
dir containing the dentry version when we decided that
od->is_real. Then we can compare version in ovl_iterate() and if
version has changed then call ovl_dir_reset() and if is_real became
false then "abort the iteration, lookup the dentry, and continue iterating."
without having to go through all the pain involved with acting in underlying
fs code context.

I might note, though you are probably aware, that dir becoming impure
due to copy up is not a problem, because then dir is either already merge
before copy up or we are already iterating real lower dir.

The rename origin into pure upper dir case, is synchronized with overlay
dir inode lock against any single getdents(2) call, so dir becoming impure
due to rename can only happen between two getdents(2) calls as you wrote.

Amir.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-06-21  8:03     ` Amir Goldstein
@ 2017-06-21  8:20       ` Miklos Szeredi
  2017-06-21  8:38         ` Miklos Szeredi
  0 siblings, 1 reply; 38+ messages in thread
From: Miklos Szeredi @ 2017-06-21  8:20 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Chandan Rajendra, linux-unionfs

On Wed, Jun 21, 2017 at 10:03 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Wed, Jun 21, 2017 at 12:25 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Thu, Jun 1, 2017 at 11:02 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>>> This patch is based on an earlier POC by Miklos Szeredi.
>>>
>>> When iterating an "impure" upper directory which may contain copy up
>>> entries, lookup the overlay dentries to see if we know their copy up
>>> origin and if we do, call vfs_getattr() on the overlay dentry
>>> to make sure that d_ino will be consistent with st_ino from stat(2).
>>
>> Getting back to this, because there are issues:
>>
>>  - ".." can have different origin even if dir is pure (pure upper
>> residing in merged dir, pure lower residing in dir which has another
>> lower layer (which is the origin) above it)
>
> If we need to add complexity only for the sake of consistent d_ino for
> "..", then IMO we should leave it inconsistent (as is it today).
> Directories don't have consistent d_ino today and nobody complains
> (right?). As I wrote before, 'find' doesn't look at d_ino for directories,
> it always checks st_ino for dir (i.e. with find -ino N).
> So as long as won't we are not aware of any application for which
> that matters, let's leave directories at "best effort" w.r.t consistent
> d_ino. They are already best effort w.r.t persistent st_ino (!samefs).
>
>
>>
>>  - we check impure at open time,
>
> Not only. we also check impure (ovl_dir_is_real) at ovl_dir_reset(),
> so at first call to getdents(2).
>
>> but dir can become impure between two
>> getdents*(2) calls, we can get wrong inode numbers
>>
>> Becoming impure needs to be handled differently from what this patch
>> does, because we can't switch to cached readdir in the middle of the
>> directory listing.
>>
>> To handle that we need to add our own actor to the is_real case to fix
>> up the inode numbers.  This would fix the ".." case as well.
>>
>> The problem is, we are inside underlying fs code when  actor is
>> called, so there's not much we can do without risking deadlocks.  One
>> thing we can do (hopefully) is call d_lookup() to check for a cached
>> overlay dentry, and get the inode number out.
>>
>>  We need to store the full inode number in the overlay inode, because
>> we can't do vfs_getattr() either, but that's a good optimization
>> anyway.
>>
>> If the overlay dentry  is not in the cache we need to abort the
>> iteration, lookup the dentry, and continue iterating.
>>
>> Thoughts?  I'll have a go at this tomorrow.
>>
>
> Thinking out loud: If we store od->version for is_real
> dir containing the dentry version when we decided that
> od->is_real. Then we can compare version in ovl_iterate() and if
> version has changed then call ovl_dir_reset() and if is_real became
> false then "abort the iteration, lookup the dentry, and continue iterating."
> without having to go through all the pain involved with acting in underlying
> fs code context.
>
> I might note, though you are probably aware, that dir becoming impure
> due to copy up is not a problem, because then dir is either already merge
> before copy up or we are already iterating real lower dir.

Right.  The problem is when dir becomes impure due to rename between
two getdents(2) calls.  We can't call ovl_dir_reset() because the
offset is not zero.  We could do a "cache with head cut off" starting
from the current offset in the dir and finish with that.  Yeah, it's
probably less complexity than trying to intercept the actor...

BTW, the ".." case could be handled pretty simply, parent is locked,
we can query the inode number before calling underlying iterate.  But
surely we can leave this to the end.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-06-21  8:20       ` Miklos Szeredi
@ 2017-06-21  8:38         ` Miklos Szeredi
  2017-06-21  8:45           ` Miklos Szeredi
  0 siblings, 1 reply; 38+ messages in thread
From: Miklos Szeredi @ 2017-06-21  8:38 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Chandan Rajendra, linux-unionfs

On Wed, Jun 21, 2017 at 10:20 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:

> Right.  The problem is when dir becomes impure due to rename between
> two getdents(2) calls.  We can't call ovl_dir_reset() because the
> offset is not zero.  We could do a "cache with head cut off" starting
> from the current offset in the dir and finish with that.  Yeah, it's
> probably less complexity than trying to intercept the actor...

On the other hand, we'll need to accommodate the native directory
indexing (i.e. seekdir(3) hell) in the cache, which is going to add to
complexity.  Ugh.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-06-21  8:38         ` Miklos Szeredi
@ 2017-06-21  8:45           ` Miklos Szeredi
  2017-06-21  8:49             ` Amir Goldstein
  0 siblings, 1 reply; 38+ messages in thread
From: Miklos Szeredi @ 2017-06-21  8:45 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Chandan Rajendra, linux-unionfs

On Wed, Jun 21, 2017 at 10:38 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Wed, Jun 21, 2017 at 10:20 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>
>> Right.  The problem is when dir becomes impure due to rename between
>> two getdents(2) calls.  We can't call ovl_dir_reset() because the
>> offset is not zero.  We could do a "cache with head cut off" starting
>> from the current offset in the dir and finish with that.  Yeah, it's
>> probably less complexity than trying to intercept the actor...
>
> On the other hand, we'll need to accommodate the native directory
> indexing (i.e. seekdir(3) hell) in the cache, which is going to add to
> complexity.  Ugh.

Another idea:  we are allowed to omit directory entries added after
the opendir/rewinddir.  So if there was a simple way to filter out
newly added entries which have origin then we are fine.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-06-21  8:45           ` Miklos Szeredi
@ 2017-06-21  8:49             ` Amir Goldstein
  2017-06-21  8:53               ` Miklos Szeredi
  0 siblings, 1 reply; 38+ messages in thread
From: Amir Goldstein @ 2017-06-21  8:49 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, linux-unionfs

On Wed, Jun 21, 2017 at 11:45 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Wed, Jun 21, 2017 at 10:38 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Wed, Jun 21, 2017 at 10:20 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>
>>> Right.  The problem is when dir becomes impure due to rename between
>>> two getdents(2) calls.  We can't call ovl_dir_reset() because the
>>> offset is not zero.  We could do a "cache with head cut off" starting
>>> from the current offset in the dir and finish with that.  Yeah, it's
>>> probably less complexity than trying to intercept the actor...
>>
>> On the other hand, we'll need to accommodate the native directory
>> indexing (i.e. seekdir(3) hell) in the cache, which is going to add to
>> complexity.  Ugh.
>
> Another idea:  we are allowed to omit directory entries added after
> the opendir/rewinddir.  So if there was a simple way to filter out
> newly added entries which have origin then we are fine.
>

Then how about this: implement an actor that ONLY looks in
overlay dcache for the entries.
If entry is in dcache then can take inode number from overlay
inode.
If entry is not in dcache AND the file was just moved into dir
(quite unlinkely) then write off d_ino as a transient inconsistency.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-06-21  8:49             ` Amir Goldstein
@ 2017-06-21  8:53               ` Miklos Szeredi
  2017-06-21  9:05                 ` Amir Goldstein
  0 siblings, 1 reply; 38+ messages in thread
From: Miklos Szeredi @ 2017-06-21  8:53 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Chandan Rajendra, linux-unionfs

On Wed, Jun 21, 2017 at 10:49 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Wed, Jun 21, 2017 at 11:45 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Wed, Jun 21, 2017 at 10:38 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>> On Wed, Jun 21, 2017 at 10:20 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>
>>>> Right.  The problem is when dir becomes impure due to rename between
>>>> two getdents(2) calls.  We can't call ovl_dir_reset() because the
>>>> offset is not zero.  We could do a "cache with head cut off" starting
>>>> from the current offset in the dir and finish with that.  Yeah, it's
>>>> probably less complexity than trying to intercept the actor...
>>>
>>> On the other hand, we'll need to accommodate the native directory
>>> indexing (i.e. seekdir(3) hell) in the cache, which is going to add to
>>> complexity.  Ugh.
>>
>> Another idea:  we are allowed to omit directory entries added after
>> the opendir/rewinddir.  So if there was a simple way to filter out
>> newly added entries which have origin then we are fine.
>>
>
> Then how about this: implement an actor that ONLY looks in
> overlay dcache for the entries.
> If entry is in dcache then can take inode number from overlay
> inode.
> If entry is not in dcache AND the file was just moved into dir
> (quite unlinkely) then write off d_ino as a transient inconsistency.

Note: in theory the "just moved into dir" can be two weeks before.  In
practice it's unlikely that two calls of getdents(2) will be so far
apart in time.

Still, this feels a bit too hacky.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-06-21  8:53               ` Miklos Szeredi
@ 2017-06-21  9:05                 ` Amir Goldstein
  2017-06-21  9:20                   ` Miklos Szeredi
  0 siblings, 1 reply; 38+ messages in thread
From: Amir Goldstein @ 2017-06-21  9:05 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, linux-unionfs

On Wed, Jun 21, 2017 at 11:53 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Wed, Jun 21, 2017 at 10:49 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>> On Wed, Jun 21, 2017 at 11:45 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>> On Wed, Jun 21, 2017 at 10:38 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>> On Wed, Jun 21, 2017 at 10:20 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>>
>>>>> Right.  The problem is when dir becomes impure due to rename between
>>>>> two getdents(2) calls.  We can't call ovl_dir_reset() because the
>>>>> offset is not zero.  We could do a "cache with head cut off" starting
>>>>> from the current offset in the dir and finish with that.  Yeah, it's
>>>>> probably less complexity than trying to intercept the actor...
>>>>
>>>> On the other hand, we'll need to accommodate the native directory
>>>> indexing (i.e. seekdir(3) hell) in the cache, which is going to add to
>>>> complexity.  Ugh.
>>>
>>> Another idea:  we are allowed to omit directory entries added after
>>> the opendir/rewinddir.  So if there was a simple way to filter out
>>> newly added entries which have origin then we are fine.
>>>
>>

Following up on your idea:
- check in ovl_iterate() if version has changed and if dir became impure
- if it did, populate od->cache, but keep the dir od->is_real
- iterate upper cache entries and call ovl_cache_update_ino()
- Then actor of real dir iterator can use the cache to ommit entries or use
  p->ino from cache if p->real_ino match real d_ino, but differs from p->ino.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-06-21  9:05                 ` Amir Goldstein
@ 2017-06-21  9:20                   ` Miklos Szeredi
  2017-06-21  9:36                     ` Amir Goldstein
  0 siblings, 1 reply; 38+ messages in thread
From: Miklos Szeredi @ 2017-06-21  9:20 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Chandan Rajendra, linux-unionfs

On Wed, Jun 21, 2017 at 11:05 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Wed, Jun 21, 2017 at 11:53 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Wed, Jun 21, 2017 at 10:49 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>>> On Wed, Jun 21, 2017 at 11:45 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>> On Wed, Jun 21, 2017 at 10:38 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>>> On Wed, Jun 21, 2017 at 10:20 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>>>
>>>>>> Right.  The problem is when dir becomes impure due to rename between
>>>>>> two getdents(2) calls.  We can't call ovl_dir_reset() because the
>>>>>> offset is not zero.  We could do a "cache with head cut off" starting
>>>>>> from the current offset in the dir and finish with that.  Yeah, it's
>>>>>> probably less complexity than trying to intercept the actor...
>>>>>
>>>>> On the other hand, we'll need to accommodate the native directory
>>>>> indexing (i.e. seekdir(3) hell) in the cache, which is going to add to
>>>>> complexity.  Ugh.
>>>>
>>>> Another idea:  we are allowed to omit directory entries added after
>>>> the opendir/rewinddir.  So if there was a simple way to filter out
>>>> newly added entries which have origin then we are fine.
>>>>
>>>
>
> Following up on your idea:
> - check in ovl_iterate() if version has changed and if dir became impure
> - if it did, populate od->cache, but keep the dir od->is_real
> - iterate upper cache entries and call ovl_cache_update_ino()
> - Then actor of real dir iterator can use the cache to ommit entries or use
>   p->ino from cache if p->real_ino match real d_ino, but differs from p->ino.

For non-merge dirs we can have a simplified cache just containing the
entries with origin, recreated when the version changes or updated in
rename, whichever is simpler.  A non-merge dir will never become a
merge one, so we can keep the handling separate.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-06-21  9:20                   ` Miklos Szeredi
@ 2017-06-21  9:36                     ` Amir Goldstein
  2017-06-21  9:38                       ` Amir Goldstein
  2017-06-21  9:48                       ` Miklos Szeredi
  0 siblings, 2 replies; 38+ messages in thread
From: Amir Goldstein @ 2017-06-21  9:36 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, linux-unionfs

On Wed, Jun 21, 2017 at 12:20 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Wed, Jun 21, 2017 at 11:05 AM, Amir Goldstein <amir73il@gmail.com> wrote:

>>
>> Following up on your idea:
>> - check in ovl_iterate() if version has changed and if dir became impure
>> - if it did, populate od->cache, but keep the dir od->is_real
>> - iterate upper cache entries and call ovl_cache_update_ino()
>> - Then actor of real dir iterator can use the cache to ommit entries or use
>>   p->ino from cache if p->real_ino match real d_ino, but differs from p->ino.
>
> For non-merge dirs we can have a simplified cache just containing the
> entries with origin, recreated when the version changes or updated in
> rename, whichever is simpler.  A non-merge dir will never become a
> merge one, so we can keep the handling separate.
>

And use this cache to ommit/fix entries with actor?

Just thought of another issue related to pseudo lower st_dev
we need the underlying iter actor also for replacing real st_dev
with pseudo st_dev.

Also, just realized that with inodes index, the upper non-dir inodes
with nlink == 1 are not origin suspects. Can that help us somehow?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-06-21  9:36                     ` Amir Goldstein
@ 2017-06-21  9:38                       ` Amir Goldstein
  2017-06-21  9:48                       ` Miklos Szeredi
  1 sibling, 0 replies; 38+ messages in thread
From: Amir Goldstein @ 2017-06-21  9:38 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, linux-unionfs

On Wed, Jun 21, 2017 at 12:36 PM, Amir Goldstein <amir73il@gmail.com> wrote:

> Just thought of another issue related to pseudo lower st_dev
> we need the underlying iter actor also for replacing real st_dev
> with pseudo st_dev.
>

Yeh, that was stupid. please ignore.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-06-21  9:36                     ` Amir Goldstein
  2017-06-21  9:38                       ` Amir Goldstein
@ 2017-06-21  9:48                       ` Miklos Szeredi
  2017-06-23 13:56                         ` Amir Goldstein
                                           ` (2 more replies)
  1 sibling, 3 replies; 38+ messages in thread
From: Miklos Szeredi @ 2017-06-21  9:48 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Chandan Rajendra, linux-unionfs

On Wed, Jun 21, 2017 at 11:36 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Wed, Jun 21, 2017 at 12:20 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Wed, Jun 21, 2017 at 11:05 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>
>>>
>>> Following up on your idea:
>>> - check in ovl_iterate() if version has changed and if dir became impure
>>> - if it did, populate od->cache, but keep the dir od->is_real
>>> - iterate upper cache entries and call ovl_cache_update_ino()
>>> - Then actor of real dir iterator can use the cache to ommit entries or use
>>>   p->ino from cache if p->real_ino match real d_ino, but differs from p->ino.
>>
>> For non-merge dirs we can have a simplified cache just containing the
>> entries with origin, recreated when the version changes or updated in
>> rename, whichever is simpler.  A non-merge dir will never become a
>> merge one, so we can keep the handling separate.
>>
>
> And use this cache to ommit/fix entries with actor?

Right.  I think fixing up is better, because to correctly omit entries
we'd need separate lists for each open directory.  For fixing up we
can use a common one, just like for the merged dir.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-06-21  9:48                       ` Miklos Szeredi
@ 2017-06-23 13:56                         ` Amir Goldstein
  2017-06-30 16:23                         ` Chandan Rajendra
  2017-07-27 20:00                         ` Miklos Szeredi
  2 siblings, 0 replies; 38+ messages in thread
From: Amir Goldstein @ 2017-06-23 13:56 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, linux-unionfs

On Wed, Jun 21, 2017 at 12:48 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Wed, Jun 21, 2017 at 11:36 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>> On Wed, Jun 21, 2017 at 12:20 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>> On Wed, Jun 21, 2017 at 11:05 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>>
>>>>
>>>> Following up on your idea:
>>>> - check in ovl_iterate() if version has changed and if dir became impure
>>>> - if it did, populate od->cache, but keep the dir od->is_real
>>>> - iterate upper cache entries and call ovl_cache_update_ino()
>>>> - Then actor of real dir iterator can use the cache to ommit entries or use
>>>>   p->ino from cache if p->real_ino match real d_ino, but differs from p->ino.
>>>
>>> For non-merge dirs we can have a simplified cache just containing the
>>> entries with origin, recreated when the version changes or updated in
>>> rename, whichever is simpler.  A non-merge dir will never become a
>>> merge one, so we can keep the handling separate.


Wait a minute, if we want to keep the handling for merge dir separate,
then this patch is perfectly valid for handling merge dir (and impure
dir if we know about it at open time).

Yes, it does not cover the two corner cases that you mentioned, but it
covers the likely real world cases, as proven by xfstest overlay/017.

Given that solving the corner cases requires completely new code that
is mostly independent of this patch, why not merge this patch for v4.13?

Amir.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-06-21  9:48                       ` Miklos Szeredi
  2017-06-23 13:56                         ` Amir Goldstein
@ 2017-06-30 16:23                         ` Chandan Rajendra
  2017-06-30 19:01                           ` Amir Goldstein
  2017-07-27 20:00                         ` Miklos Szeredi
  2 siblings, 1 reply; 38+ messages in thread
From: Chandan Rajendra @ 2017-06-30 16:23 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, linux-unionfs

On Wednesday, June 21, 2017 3:18:20 PM IST Miklos Szeredi wrote:
> On Wed, Jun 21, 2017 at 11:36 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> > On Wed, Jun 21, 2017 at 12:20 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> >> On Wed, Jun 21, 2017 at 11:05 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> >
> >>>
> >>> Following up on your idea:
> >>> - check in ovl_iterate() if version has changed and if dir became impure
> >>> - if it did, populate od->cache, but keep the dir od->is_real
> >>> - iterate upper cache entries and call ovl_cache_update_ino()
> >>> - Then actor of real dir iterator can use the cache to ommit entries or use
> >>>   p->ino from cache if p->real_ino match real d_ino, but differs from p->ino.
> >>
> >> For non-merge dirs we can have a simplified cache just containing the
> >> entries with origin, recreated when the version changes or updated in
> >> rename, whichever is simpler.  A non-merge dir will never become a
> >> merge one, so we can keep the handling separate.
> >>
> >
> > And use this cache to ommit/fix entries with actor?
> 
> Right.  I think fixing up is better, because to correctly omit entries
> we'd need separate lists for each open directory.  For fixing up we
> can use a common one, just like for the merged dir.
> 

Amir, Can I please take over the implementation of this feature.

-- 
chandan

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-06-30 16:23                         ` Chandan Rajendra
@ 2017-06-30 19:01                           ` Amir Goldstein
  0 siblings, 0 replies; 38+ messages in thread
From: Amir Goldstein @ 2017-06-30 19:01 UTC (permalink / raw)
  To: Chandan Rajendra; +Cc: Miklos Szeredi, linux-unionfs

On Fri, Jun 30, 2017 at 7:23 PM, Chandan Rajendra
<chandan@linux.vnet.ibm.com> wrote:
> On Wednesday, June 21, 2017 3:18:20 PM IST Miklos Szeredi wrote:
>> On Wed, Jun 21, 2017 at 11:36 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>> > On Wed, Jun 21, 2017 at 12:20 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> >> On Wed, Jun 21, 2017 at 11:05 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>> >
>> >>>
>> >>> Following up on your idea:
>> >>> - check in ovl_iterate() if version has changed and if dir became impure
>> >>> - if it did, populate od->cache, but keep the dir od->is_real
>> >>> - iterate upper cache entries and call ovl_cache_update_ino()
>> >>> - Then actor of real dir iterator can use the cache to ommit entries or use
>> >>>   p->ino from cache if p->real_ino match real d_ino, but differs from p->ino.
>> >>
>> >> For non-merge dirs we can have a simplified cache just containing the
>> >> entries with origin, recreated when the version changes or updated in
>> >> rename, whichever is simpler.  A non-merge dir will never become a
>> >> merge one, so we can keep the handling separate.
>> >>
>> >
>> > And use this cache to ommit/fix entries with actor?
>>
>> Right.  I think fixing up is better, because to correctly omit entries
>> we'd need separate lists for each open directory.  For fixing up we
>> can use a common one, just like for the merged dir.
>>
>
> Amir, Can I please take over the implementation of this feature.
>

Yeh, sure.
I suggest that you start from understanding the overlay readdir code.
It's not trivial IMO, so I expect you come back with more questions.
You should understand how od->cache is populated and when it is
updated. The current od->cache is implemented for merge dirs and
contains a merged list of entries from all layers.

The case of an impure non-merge dir needs a different sort of cache
(list of entries with origin that were moved into upper dir), whose version
is only ever updated on rename (and always under the overlay dir inode lock).

So I think you should start with my patch for the classic merge dir case
and then add patches to maintain and iterate the "impure dir cache".

As my patch adds quite a bit of performance overhead for merge dir
readdir, I expect there will be more optimization work needed before
the work can land in upstream.

Good luck,
Amir.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-06-21  9:48                       ` Miklos Szeredi
  2017-06-23 13:56                         ` Amir Goldstein
  2017-06-30 16:23                         ` Chandan Rajendra
@ 2017-07-27 20:00                         ` Miklos Szeredi
  2017-07-28  9:25                           ` Amir Goldstein
  2 siblings, 1 reply; 38+ messages in thread
From: Miklos Szeredi @ 2017-07-27 20:00 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Chandan Rajendra, linux-unionfs

On Wed, Jun 21, 2017 at 11:48 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Wed, Jun 21, 2017 at 11:36 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>> On Wed, Jun 21, 2017 at 12:20 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>> On Wed, Jun 21, 2017 at 11:05 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>>
>>>>
>>>> Following up on your idea:
>>>> - check in ovl_iterate() if version has changed and if dir became impure
>>>> - if it did, populate od->cache, but keep the dir od->is_real
>>>> - iterate upper cache entries and call ovl_cache_update_ino()
>>>> - Then actor of real dir iterator can use the cache to ommit entries or use
>>>>   p->ino from cache if p->real_ino match real d_ino, but differs from p->ino.
>>>
>>> For non-merge dirs we can have a simplified cache just containing the
>>> entries with origin, recreated when the version changes or updated in
>>> rename, whichever is simpler.  A non-merge dir will never become a
>>> merge one, so we can keep the handling separate.
>>>
>>
>> And use this cache to ommit/fix entries with actor?
>
> Right.  I think fixing up is better, because to correctly omit entries
> we'd need separate lists for each open directory.  For fixing up we
> can use a common one, just like for the merged dir.

And back to this, I pushed a branch named "ovl-d_ino" to my vfs tree.

Please test.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-07-27 20:00                         ` Miklos Szeredi
@ 2017-07-28  9:25                           ` Amir Goldstein
  2017-07-29 10:33                             ` Amir Goldstein
                                               ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Amir Goldstein @ 2017-07-28  9:25 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, linux-unionfs

On Thu, Jul 27, 2017 at 11:00 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Wed, Jun 21, 2017 at 11:48 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Wed, Jun 21, 2017 at 11:36 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>>> On Wed, Jun 21, 2017 at 12:20 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>> On Wed, Jun 21, 2017 at 11:05 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>>>
>>>>>
>>>>> Following up on your idea:
>>>>> - check in ovl_iterate() if version has changed and if dir became impure
>>>>> - if it did, populate od->cache, but keep the dir od->is_real
>>>>> - iterate upper cache entries and call ovl_cache_update_ino()
>>>>> - Then actor of real dir iterator can use the cache to ommit entries or use
>>>>>   p->ino from cache if p->real_ino match real d_ino, but differs from p->ino.
>>>>
>>>> For non-merge dirs we can have a simplified cache just containing the
>>>> entries with origin, recreated when the version changes or updated in
>>>> rename, whichever is simpler.  A non-merge dir will never become a
>>>> merge one, so we can keep the handling separate.
>>>>
>>>
>>> And use this cache to ommit/fix entries with actor?
>>
>> Right.  I think fixing up is better, because to correctly omit entries
>> we'd need separate lists for each open directory.  For fixing up we
>> can use a common one, just like for the merged dir.
>
> And back to this, I pushed a branch named "ovl-d_ino" to my vfs tree.
>
> Please test.
>

I can confirm that xfstest  overlay/017, the only  test I have for
constant d_ino,
passes. Although it does not cover all cases handled by the patch set
(e.g. d_ino of merge parent).

Chandan,

Can you take on the task of improving test coverage for constant d_ino?
You should create a new xfstest and cover more cases of constant d_ino
and possibly also a test for clearing of impure xattr.

When the basic test is in place, I will be able to point at more corner cases
to cover.

Also need to prepare a test to verify constant d_ino for non samefs
for after this patch is integrated with your work.
overlay/017 is a good starting point, but can't use the standard _scratch_mount
helpers that mount samefs overlay.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-07-28  9:25                           ` Amir Goldstein
@ 2017-07-29 10:33                             ` Amir Goldstein
  2017-07-31 10:58                               ` Miklos Szeredi
  2017-07-29 11:26                             ` Chandan Rajendra
  2017-08-03  7:18                             ` Chandan Rajendra
  2 siblings, 1 reply; 38+ messages in thread
From: Amir Goldstein @ 2017-07-29 10:33 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, linux-unionfs

On Fri, Jul 28, 2017 at 12:25 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Thu, Jul 27, 2017 at 11:00 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Wed, Jun 21, 2017 at 11:48 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>> On Wed, Jun 21, 2017 at 11:36 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>>>> On Wed, Jun 21, 2017 at 12:20 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>>> On Wed, Jun 21, 2017 at 11:05 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>>>>
>>>>>>
>>>>>> Following up on your idea:
>>>>>> - check in ovl_iterate() if version has changed and if dir became impure
>>>>>> - if it did, populate od->cache, but keep the dir od->is_real
>>>>>> - iterate upper cache entries and call ovl_cache_update_ino()
>>>>>> - Then actor of real dir iterator can use the cache to ommit entries or use
>>>>>>   p->ino from cache if p->real_ino match real d_ino, but differs from p->ino.
>>>>>
>>>>> For non-merge dirs we can have a simplified cache just containing the
>>>>> entries with origin, recreated when the version changes or updated in
>>>>> rename, whichever is simpler.  A non-merge dir will never become a
>>>>> merge one, so we can keep the handling separate.
>>>>>
>>>>
>>>> And use this cache to ommit/fix entries with actor?
>>>
>>> Right.  I think fixing up is better, because to correctly omit entries
>>> we'd need separate lists for each open directory.  For fixing up we
>>> can use a common one, just like for the merged dir.
>>
>> And back to this, I pushed a branch named "ovl-d_ino" to my vfs tree.
>>
>> Please test.
>>
>
> I can confirm that xfstest  overlay/017, the only  test I have for
> constant d_ino,
> passes. Although it does not cover all cases handled by the patch set
> (e.g. d_ino of merge parent).
>

Miklos,

Gave the code a quick review pass.
Overall looking good.
One thing that stands out in "my" patch ("ovl: constant d_ino across copy up")
The comments and commit message claim "When all layers are on the same fs",
but the patch doesn't actually checks that.
The reason is that my original patch did "call vfs_getattr() on the
overlay entries",
which meant constant d_ino IFF constant st_ino.
Your modifications turned the code to do constant d_ino, even for
non samefs, which is not that bad, but I don't think you meant for it.

Amir.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-07-28  9:25                           ` Amir Goldstein
  2017-07-29 10:33                             ` Amir Goldstein
@ 2017-07-29 11:26                             ` Chandan Rajendra
  2017-08-03  7:18                             ` Chandan Rajendra
  2 siblings, 0 replies; 38+ messages in thread
From: Chandan Rajendra @ 2017-07-29 11:26 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, linux-unionfs

On Friday, July 28, 2017 2:55:56 PM IST Amir Goldstein wrote:
> On Thu, Jul 27, 2017 at 11:00 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > On Wed, Jun 21, 2017 at 11:48 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> >> On Wed, Jun 21, 2017 at 11:36 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> >>> On Wed, Jun 21, 2017 at 12:20 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> >>>> On Wed, Jun 21, 2017 at 11:05 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> >>>
> >>>>>
> >>>>> Following up on your idea:
> >>>>> - check in ovl_iterate() if version has changed and if dir became impure
> >>>>> - if it did, populate od->cache, but keep the dir od->is_real
> >>>>> - iterate upper cache entries and call ovl_cache_update_ino()
> >>>>> - Then actor of real dir iterator can use the cache to ommit entries or use
> >>>>>   p->ino from cache if p->real_ino match real d_ino, but differs from p->ino.
> >>>>
> >>>> For non-merge dirs we can have a simplified cache just containing the
> >>>> entries with origin, recreated when the version changes or updated in
> >>>> rename, whichever is simpler.  A non-merge dir will never become a
> >>>> merge one, so we can keep the handling separate.
> >>>>
> >>>
> >>> And use this cache to ommit/fix entries with actor?
> >>
> >> Right.  I think fixing up is better, because to correctly omit entries
> >> we'd need separate lists for each open directory.  For fixing up we
> >> can use a common one, just like for the merged dir.
> >
> > And back to this, I pushed a branch named "ovl-d_ino" to my vfs tree.
> >
> > Please test.
> >
> 
> I can confirm that xfstest  overlay/017, the only  test I have for
> constant d_ino,
> passes. Although it does not cover all cases handled by the patch set
> (e.g. d_ino of merge parent).
> 
> Chandan,
> 
> Can you take on the task of improving test coverage for constant d_ino?
> You should create a new xfstest and cover more cases of constant d_ino
> and possibly also a test for clearing of impure xattr.
> 
> When the basic test is in place, I will be able to point at more corner cases
> to cover.
> 
> Also need to prepare a test to verify constant d_ino for non samefs
> for after this patch is integrated with your work.
> overlay/017 is a good starting point, but can't use the standard _scratch_mount
> helpers that mount samefs overlay.

I will work on that.

-- 
chandan

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-07-29 10:33                             ` Amir Goldstein
@ 2017-07-31 10:58                               ` Miklos Szeredi
  2017-08-16 11:16                                 ` Amir Goldstein
  0 siblings, 1 reply; 38+ messages in thread
From: Miklos Szeredi @ 2017-07-31 10:58 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Chandan Rajendra, linux-unionfs

On Sat, Jul 29, 2017 at 12:33 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Fri, Jul 28, 2017 at 12:25 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>> On Thu, Jul 27, 2017 at 11:00 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>> On Wed, Jun 21, 2017 at 11:48 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>> On Wed, Jun 21, 2017 at 11:36 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>>>>> On Wed, Jun 21, 2017 at 12:20 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>>>> On Wed, Jun 21, 2017 at 11:05 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>>>>>
>>>>>>>
>>>>>>> Following up on your idea:
>>>>>>> - check in ovl_iterate() if version has changed and if dir became impure
>>>>>>> - if it did, populate od->cache, but keep the dir od->is_real
>>>>>>> - iterate upper cache entries and call ovl_cache_update_ino()
>>>>>>> - Then actor of real dir iterator can use the cache to ommit entries or use
>>>>>>>   p->ino from cache if p->real_ino match real d_ino, but differs from p->ino.
>>>>>>
>>>>>> For non-merge dirs we can have a simplified cache just containing the
>>>>>> entries with origin, recreated when the version changes or updated in
>>>>>> rename, whichever is simpler.  A non-merge dir will never become a
>>>>>> merge one, so we can keep the handling separate.
>>>>>>
>>>>>
>>>>> And use this cache to ommit/fix entries with actor?
>>>>
>>>> Right.  I think fixing up is better, because to correctly omit entries
>>>> we'd need separate lists for each open directory.  For fixing up we
>>>> can use a common one, just like for the merged dir.
>>>
>>> And back to this, I pushed a branch named "ovl-d_ino" to my vfs tree.
>>>
>>> Please test.
>>>
>>
>> I can confirm that xfstest  overlay/017, the only  test I have for
>> constant d_ino,
>> passes. Although it does not cover all cases handled by the patch set
>> (e.g. d_ino of merge parent).
>>
>
> Miklos,
>
> Gave the code a quick review pass.
> Overall looking good.

Thanks for the review.

> One thing that stands out in "my" patch ("ovl: constant d_ino across copy up")
> The comments and commit message claim "When all layers are on the same fs",
> but the patch doesn't actually checks that.

I does in  ovl_cache_update_ino().   That check would be more logical
in ovl_calc_d_ino(), though.   Fixed.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-07-28  9:25                           ` Amir Goldstein
  2017-07-29 10:33                             ` Amir Goldstein
  2017-07-29 11:26                             ` Chandan Rajendra
@ 2017-08-03  7:18                             ` Chandan Rajendra
  2017-08-03  9:53                               ` Chandan Rajendra
  2 siblings, 1 reply; 38+ messages in thread
From: Chandan Rajendra @ 2017-08-03  7:18 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, linux-unionfs

On Friday, July 28, 2017 2:55:56 PM IST Amir Goldstein wrote:
> On Thu, Jul 27, 2017 at 11:00 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > On Wed, Jun 21, 2017 at 11:48 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> >> On Wed, Jun 21, 2017 at 11:36 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> >>> On Wed, Jun 21, 2017 at 12:20 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> >>>> On Wed, Jun 21, 2017 at 11:05 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> >>>
> >>>>>
> >>>>> Following up on your idea:
> >>>>> - check in ovl_iterate() if version has changed and if dir became impure
> >>>>> - if it did, populate od->cache, but keep the dir od->is_real
> >>>>> - iterate upper cache entries and call ovl_cache_update_ino()
> >>>>> - Then actor of real dir iterator can use the cache to ommit entries or use
> >>>>>   p->ino from cache if p->real_ino match real d_ino, but differs from p->ino.
> >>>>
> >>>> For non-merge dirs we can have a simplified cache just containing the
> >>>> entries with origin, recreated when the version changes or updated in
> >>>> rename, whichever is simpler.  A non-merge dir will never become a
> >>>> merge one, so we can keep the handling separate.
> >>>>
> >>>
> >>> And use this cache to ommit/fix entries with actor?
> >>
> >> Right.  I think fixing up is better, because to correctly omit entries
> >> we'd need separate lists for each open directory.  For fixing up we
> >> can use a common one, just like for the merged dir.
> >
> > And back to this, I pushed a branch named "ovl-d_ino" to my vfs tree.
> >
> > Please test.
> >
> 
> I can confirm that xfstest  overlay/017, the only  test I have for
> constant d_ino,
> passes. Although it does not cover all cases handled by the patch set
> (e.g. d_ino of merge parent).
> 
> Chandan,
> 
> Can you take on the task of improving test coverage for constant d_ino?
> You should create a new xfstest and cover more cases of constant d_ino
> and possibly also a test for clearing of impure xattr.
> 
> When the basic test is in place, I will be able to point at more corner cases
> to cover.
> 
> Also need to prepare a test to verify constant d_ino for non samefs
> for after this patch is integrated with your work.
> overlay/017 is a good starting point, but can't use the standard _scratch_mount
> helpers that mount samefs overlay.

Hi Amir,

In ovl_calc_d_ino() we have the following code snippet,

if ((p->name[0] == '.' && p->len == 1) ||
    ovl_test_flag(OVL_IMPURE, d_inode(rdd->dentry)))
        return true;

At this point in the function, the directory which is being iterated must have
an entry in upperdir. W.r.t the case of processing "." directory entry, I
think we can limit it to be applicable only when rdd->dentry is either a merge
dir or an impure dir. For a directory which is both pure and non-merge, we can
return the value of p->ino as is.

The second part of the "if" condition takes care of the 'impure dir' part. So
the first part of the "if" condition can be written as,

(p->name[0] == '.' && p->len == 1 && !OVL_TYPE_MERGE(ovl_path_type(rdd->dentry)))


-- 
chandan

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-08-03  7:18                             ` Chandan Rajendra
@ 2017-08-03  9:53                               ` Chandan Rajendra
  0 siblings, 0 replies; 38+ messages in thread
From: Chandan Rajendra @ 2017-08-03  9:53 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, linux-unionfs

On Thursday, August 3, 2017 12:48:46 PM IST Chandan Rajendra wrote:
> On Friday, July 28, 2017 2:55:56 PM IST Amir Goldstein wrote:
> > On Thu, Jul 27, 2017 at 11:00 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > > On Wed, Jun 21, 2017 at 11:48 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > >> On Wed, Jun 21, 2017 at 11:36 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> > >>> On Wed, Jun 21, 2017 at 12:20 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > >>>> On Wed, Jun 21, 2017 at 11:05 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> > >>>
> > >>>>>
> > >>>>> Following up on your idea:
> > >>>>> - check in ovl_iterate() if version has changed and if dir became impure
> > >>>>> - if it did, populate od->cache, but keep the dir od->is_real
> > >>>>> - iterate upper cache entries and call ovl_cache_update_ino()
> > >>>>> - Then actor of real dir iterator can use the cache to ommit entries or use
> > >>>>>   p->ino from cache if p->real_ino match real d_ino, but differs from p->ino.
> > >>>>
> > >>>> For non-merge dirs we can have a simplified cache just containing the
> > >>>> entries with origin, recreated when the version changes or updated in
> > >>>> rename, whichever is simpler.  A non-merge dir will never become a
> > >>>> merge one, so we can keep the handling separate.
> > >>>>
> > >>>
> > >>> And use this cache to ommit/fix entries with actor?
> > >>
> > >> Right.  I think fixing up is better, because to correctly omit entries
> > >> we'd need separate lists for each open directory.  For fixing up we
> > >> can use a common one, just like for the merged dir.
> > >
> > > And back to this, I pushed a branch named "ovl-d_ino" to my vfs tree.
> > >
> > > Please test.
> > >
> > 
> > I can confirm that xfstest  overlay/017, the only  test I have for
> > constant d_ino,
> > passes. Although it does not cover all cases handled by the patch set
> > (e.g. d_ino of merge parent).
> > 
> > Chandan,
> > 
> > Can you take on the task of improving test coverage for constant d_ino?
> > You should create a new xfstest and cover more cases of constant d_ino
> > and possibly also a test for clearing of impure xattr.
> > 
> > When the basic test is in place, I will be able to point at more corner cases
> > to cover.
> > 
> > Also need to prepare a test to verify constant d_ino for non samefs
> > for after this patch is integrated with your work.
> > overlay/017 is a good starting point, but can't use the standard _scratch_mount
> > helpers that mount samefs overlay.
> 
> Hi Amir,
> 
> In ovl_calc_d_ino() we have the following code snippet,
> 
> if ((p->name[0] == '.' && p->len == 1) ||
>     ovl_test_flag(OVL_IMPURE, d_inode(rdd->dentry)))
>         return true;
> 
> At this point in the function, the directory which is being iterated must have
> an entry in upperdir. W.r.t the case of processing "." directory entry, I
> think we can limit it to be applicable only when rdd->dentry is either a merge
> dir or an impure dir. For a directory which is both pure and non-merge, we can
> return the value of p->ino as is.
> 
> The second part of the "if" condition takes care of the 'impure dir' part. So
> the first part of the "if" condition can be written as,
> 
> (p->name[0] == '.' && p->len == 1 && !OVL_TYPE_MERGE(ovl_path_type(rdd->dentry)))
> 
> 

Sorry about the noise. I found out that in ovl_iterate(), the code which is
responsible for merging entries from a "merge dir" can also end up invoking
ovl_calc_d_ino().

-- 
chandan

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3 5/5] ovl: consistent st_ino/d_ino
  2017-07-31 10:58                               ` Miklos Szeredi
@ 2017-08-16 11:16                                 ` Amir Goldstein
  0 siblings, 0 replies; 38+ messages in thread
From: Amir Goldstein @ 2017-08-16 11:16 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, linux-unionfs

On Mon, Jul 31, 2017 at 12:58 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Sat, Jul 29, 2017 at 12:33 PM, Amir Goldstein <amir73il@gmail.com> wrote:

>
>> One thing that stands out in "my" patch ("ovl: constant d_ino across copy up")
>> The comments and commit message claim "When all layers are on the same fs",
>> but the patch doesn't actually checks that.
>
> I does in  ovl_cache_update_ino().   That check would be more logical
> in ovl_calc_d_ino(), though.   Fixed.
>

I see the patches in overlayfs-next still don't have the ovl_same_sb()
test in ovl_calc_d_ino(). I don't really mind.
BTW, do you intend to apply/rework Chandan's non-samefs patches for next?
if so, the current location of ovl_same_sb() really doesn't matter..

Amir.

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2017-08-16 11:16 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-01  9:02 [PATCH v3 0/5] ovl: constant inode numbers (cont.) Amir Goldstein
2017-06-01  9:02 ` [PATCH v3 1/5] ovl: relax same fs constrain for ovl_check_origin() Amir Goldstein
2017-06-01  9:02 ` [PATCH v3 2/5] ovl: relax same fs constrain for constant st_ino Amir Goldstein
2017-06-01 12:30   ` Chandan Rajendra
2017-06-01 13:32     ` Amir Goldstein
2017-06-01 19:55   ` Miklos Szeredi
2017-06-02  6:32     ` Amir Goldstein
2017-06-02 12:27       ` Miklos Szeredi
2017-06-02 13:23         ` Amir Goldstein
2017-06-02 13:26           ` Miklos Szeredi
2017-06-02 13:34             ` Amir Goldstein
2017-06-01  9:02 ` [PATCH v3 3/5] vfs: factor out lookup_one_len_init() Amir Goldstein
2017-06-01  9:02 ` [PATCH v3 4/5] vfs: add helper lookup_one_len_noperm() Amir Goldstein
2017-06-05 12:36   ` Amir Goldstein
2017-06-01  9:02 ` [PATCH v3 5/5] ovl: consistent st_ino/d_ino Amir Goldstein
2017-06-20 21:25   ` Miklos Szeredi
2017-06-21  8:03     ` Amir Goldstein
2017-06-21  8:20       ` Miklos Szeredi
2017-06-21  8:38         ` Miklos Szeredi
2017-06-21  8:45           ` Miklos Szeredi
2017-06-21  8:49             ` Amir Goldstein
2017-06-21  8:53               ` Miklos Szeredi
2017-06-21  9:05                 ` Amir Goldstein
2017-06-21  9:20                   ` Miklos Szeredi
2017-06-21  9:36                     ` Amir Goldstein
2017-06-21  9:38                       ` Amir Goldstein
2017-06-21  9:48                       ` Miklos Szeredi
2017-06-23 13:56                         ` Amir Goldstein
2017-06-30 16:23                         ` Chandan Rajendra
2017-06-30 19:01                           ` Amir Goldstein
2017-07-27 20:00                         ` Miklos Szeredi
2017-07-28  9:25                           ` Amir Goldstein
2017-07-29 10:33                             ` Amir Goldstein
2017-07-31 10:58                               ` Miklos Szeredi
2017-08-16 11:16                                 ` Amir Goldstein
2017-07-29 11:26                             ` Chandan Rajendra
2017-08-03  7:18                             ` Chandan Rajendra
2017-08-03  9:53                               ` Chandan Rajendra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.