All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v6 0/9] Overlayfs: constant st_ino/d_ino for non-samefs
@ 2017-11-01 20:22 Amir Goldstein
  2017-11-01 20:22 ` [PATCH v6 1/9] ovl: move include of ovl_entry.h into overlayfs.h Amir Goldstein
                   ` (9 more replies)
  0 siblings, 10 replies; 14+ messages in thread
From: Amir Goldstein @ 2017-11-01 20:22 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, Vivek Goyal, linux-unionfs

Miklos,

My trigger for fixing up the non-samefs patches is that I found out
while testing NFS export that reconnecting a dir dentry doesn't work
for non-samefs case.
I tracked the problem down to the default implementation of get_name()
in expfs.c that looks up a child inode in parent dir by d_ino.

First, I rebased NFS export patches onto Chandan's patches and tested,
but that still didn't fix the problem. I found and fixed more cases
that we need to include in impure dir cache (see 2 last patches) and
now my dir file handle unit tests [2] pass.

Changes since v5:
- Justification for moving ovl_entry.h into overlayfs.h
- Split allocate anonymous devs patch to 3 patches
- Explain why we need to return pseudo st_dev for lower
- Fix d_ino of subdirs in non-samefs setup

Besides the NFS export unit tests, I tested these patches with upstream
xfstest overlay/041 by Chandan and with unionmount test
'./run --ov --verify' that is avialble on my unionmount devel branch [3].

I asked Chandan to post fixes to his xfstests to cover the impure cases
I fixes.

Please verify that I didn't mess up authorship when posting Chandan's
patches as I did with zhangyi's patch yesterday. On my branch [1], all
the commits have correct authorship now.

Thanks,
Amir.

[1] https://github.com/amir73il/linux/commits/ovl-nonsamefs-v6
[2] https://github.com/amir73il/xfstests/commits/ovl-nfs-export
[3] https://github.com/amir73il/unionmount-testsuite/commits/overlayfs-devel

Amir Goldstein (5):
  ovl: move include of ovl_entry.h into overlayfs.h
  ovl: relax same fs constraint for constant st_ino
  ovl: update cache version of impure parent on rename
  ovl: update merge dir cache for non-samefs case
  ovl: update non-merge dir cache for non-samefs case

Chandan Rajendra (4):
  ovl: re-structure overlay lower layers in-memory
  ovl: allocate anonymous devs for lowerdirs
  ovl: return anonymous st_dev for lower inodes
  ovl: fix d_ino of current pure upper in non-samefs case

 fs/overlayfs/copy_up.c   |  1 -
 fs/overlayfs/dir.c       | 12 ++++---
 fs/overlayfs/inode.c     | 51 +++++++++++++++++++++---------
 fs/overlayfs/namei.c     | 53 ++++++++++++++++++-------------
 fs/overlayfs/overlayfs.h |  7 +++--
 fs/overlayfs/ovl_entry.h | 14 +++++++--
 fs/overlayfs/readdir.c   | 50 +++++++++++++++++++++++-------
 fs/overlayfs/super.c     | 81 ++++++++++++++++++++++++++++++------------------
 fs/overlayfs/util.c      | 17 +++++++---
 9 files changed, 191 insertions(+), 95 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v6 1/9] ovl: move include of ovl_entry.h into overlayfs.h
  2017-11-01 20:22 [PATCH v6 0/9] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
@ 2017-11-01 20:22 ` Amir Goldstein
  2017-11-01 20:22 ` [PATCH v6 2/9] ovl: re-structure overlay lower layers in-memory Amir Goldstein
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Amir Goldstein @ 2017-11-01 20:22 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, Vivek Goyal, linux-unionfs

Most overlayfs c files already explicitly include ovl_entry.h
to use overlay entry struct definitions and upcoming changes
are going to require even more c files to include this header.

All overlayfs c files include overlayfs.h and overlayfs.h itself
refers to some structs defined in ovl_entry.h, so it seems more
logic to include ovl_entry.h from overlayfs.h than from c files.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c   | 1 -
 fs/overlayfs/inode.c     | 1 -
 fs/overlayfs/namei.c     | 1 -
 fs/overlayfs/overlayfs.h | 1 +
 fs/overlayfs/super.c     | 1 -
 fs/overlayfs/util.c      | 1 -
 6 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index c441f9387a1b..9311d183b532 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -22,7 +22,6 @@
 #include <linux/ratelimit.h>
 #include <linux/exportfs.h>
 #include "overlayfs.h"
-#include "ovl_entry.h"
 
 #define OVL_COPY_UP_CHUNK_SIZE (1 << 20)
 
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 2949448d0c7d..b772e5a2a730 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -14,7 +14,6 @@
 #include <linux/posix_acl.h>
 #include <linux/ratelimit.h>
 #include "overlayfs.h"
-#include "ovl_entry.h"
 
 int ovl_setattr(struct dentry *dentry, struct iattr *attr)
 {
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index a12dc10bf726..505a4b8902fc 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -15,7 +15,6 @@
 #include <linux/mount.h>
 #include <linux/exportfs.h>
 #include "overlayfs.h"
-#include "ovl_entry.h"
 
 struct ovl_lookup_data {
 	struct qstr name;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index d53157ccf0d7..1cf3bdd193a4 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -9,6 +9,7 @@
 
 #include <linux/kernel.h>
 #include <linux/uuid.h>
+#include "ovl_entry.h"
 
 enum ovl_path_type {
 	__OVL_PATH_UPPER	= (1 << 0),
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index f5738e96a052..8702803ba328 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -18,7 +18,6 @@
 #include <linux/seq_file.h>
 #include <linux/posix_acl_xattr.h>
 #include "overlayfs.h"
-#include "ovl_entry.h"
 
 MODULE_AUTHOR("Miklos Szeredi <miklos@szeredi.hu>");
 MODULE_DESCRIPTION("Overlay filesystem");
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 51ca8bd16009..9158d17bb320 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -17,7 +17,6 @@
 #include <linux/namei.h>
 #include <linux/ratelimit.h>
 #include "overlayfs.h"
-#include "ovl_entry.h"
 
 int ovl_want_write(struct dentry *dentry)
 {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 2/9] ovl: re-structure overlay lower layers in-memory
  2017-11-01 20:22 [PATCH v6 0/9] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
  2017-11-01 20:22 ` [PATCH v6 1/9] ovl: move include of ovl_entry.h into overlayfs.h Amir Goldstein
@ 2017-11-01 20:22 ` Amir Goldstein
  2017-11-01 20:22 ` [PATCH v6 3/9] ovl: allocate anonymous devs for lowerdirs Amir Goldstein
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Amir Goldstein @ 2017-11-01 20:22 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, Vivek Goyal, linux-unionfs

From: Chandan Rajendra <chandan@linux.vnet.ibm.com>

Define new structures to represent overlay instance lower layers and
overlay merge dir lower layers to make room for storing more per layer
information in-memory.

Instead of keeping the fs instance lower layers in an array of struct
vfsmount, keep them in an array of new struct ovl_layer, that has a
pointer to struct vfsmount.

Instead of keeping the dentry lower layers in an array of struct path,
keep them in an array of new struct ovl_path, that has a pointer to
struct dentry and to struct ovl_layer.

Add a small helper to find the fs layer id that correspopnds to a lower
struct ovl_path and use it in ovl_lookup().

[amir: split re-structure from anonymous bdev patch]

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/namei.c     | 52 ++++++++++++++++++++++++----------------
 fs/overlayfs/overlayfs.h |  4 ++--
 fs/overlayfs/ovl_entry.h | 13 ++++++++--
 fs/overlayfs/readdir.c   |  4 ++--
 fs/overlayfs/super.c     | 62 ++++++++++++++++++++++++++----------------------
 fs/overlayfs/util.c      |  7 +++++-
 6 files changed, 86 insertions(+), 56 deletions(-)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 505a4b8902fc..6cc3ece3417d 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -285,16 +285,15 @@ static int ovl_lookup_layer(struct dentry *base, struct ovl_lookup_data *d,
 
 
 static int ovl_check_origin(struct dentry *upperdentry,
-			    struct path *lowerstack, unsigned int numlower,
-			    struct path **stackp, unsigned int *ctrp)
+			    struct ovl_path *lower, unsigned int numlower,
+			    struct ovl_path **stackp, unsigned int *ctrp)
 {
 	struct vfsmount *mnt;
 	struct dentry *origin = NULL;
 	int i;
 
-
 	for (i = 0; i < numlower; i++) {
-		mnt = lowerstack[i].mnt;
+		mnt = lower[i].layer->mnt;
 		origin = ovl_get_origin(upperdentry, mnt);
 		if (IS_ERR(origin))
 			return PTR_ERR(origin);
@@ -308,12 +307,12 @@ static int ovl_check_origin(struct dentry *upperdentry,
 
 	BUG_ON(*ctrp);
 	if (!*stackp)
-		*stackp = kmalloc(sizeof(struct path), GFP_KERNEL);
+		*stackp = kmalloc(sizeof(struct ovl_path), GFP_KERNEL);
 	if (!*stackp) {
 		dput(origin);
 		return -ENOMEM;
 	}
-	**stackp = (struct path) { .dentry = origin, .mnt = mnt };
+	**stackp = (struct ovl_path){.dentry = origin, .layer = lower[i].layer};
 	*ctrp = 1;
 
 	return 0;
@@ -383,13 +382,13 @@ int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
  * OVL_XATTR_ORIGIN and that origin file handle can be decoded to lower path.
  * Return 0 on match, -ESTALE on mismatch or stale origin, < 0 on error.
  */
-int ovl_verify_index(struct dentry *index, struct path *lowerstack,
+int ovl_verify_index(struct dentry *index, struct ovl_path *lower,
 		     unsigned int numlower)
 {
 	struct ovl_fh *fh = NULL;
 	size_t len;
-	struct path origin = { };
-	struct path *stack = &origin;
+	struct ovl_path origin = { };
+	struct ovl_path *stack = &origin;
 	unsigned int ctr = 0;
 	int err;
 
@@ -428,7 +427,7 @@ int ovl_verify_index(struct dentry *index, struct path *lowerstack,
 	if (err)
 		goto fail;
 
-	err = ovl_check_origin(index, lowerstack, numlower, &stack, &ctr);
+	err = ovl_check_origin(index, lower, numlower, &stack, &ctr);
 	if (!err && !ctr)
 		err = -ESTALE;
 	if (err)
@@ -567,11 +566,24 @@ int ovl_path_next(int idx, struct dentry *dentry, struct path *path)
 		idx++;
 	}
 	BUG_ON(idx > oe->numlower);
-	*path = oe->lowerstack[idx - 1];
+	path->dentry = oe->lowerstack[idx - 1].dentry;
+	path->mnt = oe->lowerstack[idx - 1].layer->mnt;
 
 	return (idx < oe->numlower) ? idx + 1 : -1;
 }
 
+static int ovl_find_layer(struct ovl_fs *ofs, struct ovl_path *path)
+{
+	int i;
+
+	for (i = 0; i < ofs->numlower; i++) {
+		if (ofs->lower_layers[i].mnt == path->layer->mnt)
+			break;
+	}
+
+	return i;
+}
+
 struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 			  unsigned int flags)
 {
@@ -580,7 +592,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
 	struct ovl_entry *poe = dentry->d_parent->d_fsdata;
 	struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata;
-	struct path *stack = NULL;
+	struct ovl_path *stack = NULL;
 	struct dentry *upperdir, *upperdentry = NULL;
 	struct dentry *index = NULL;
 	unsigned int ctr = 0;
@@ -645,17 +657,17 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 
 	if (!d.stop && poe->numlower) {
 		err = -ENOMEM;
-		stack = kcalloc(ofs->numlower, sizeof(struct path),
+		stack = kcalloc(ofs->numlower, sizeof(struct ovl_path),
 				GFP_KERNEL);
 		if (!stack)
 			goto out_put_upper;
 	}
 
 	for (i = 0; !d.stop && i < poe->numlower; i++) {
-		struct path lowerpath = poe->lowerstack[i];
+		struct ovl_path lower = poe->lowerstack[i];
 
 		d.last = i == poe->numlower - 1;
-		err = ovl_lookup_layer(lowerpath.dentry, &d, &this);
+		err = ovl_lookup_layer(lower.dentry, &d, &this);
 		if (err)
 			goto out_put;
 
@@ -663,7 +675,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 			continue;
 
 		stack[ctr].dentry = this;
-		stack[ctr].mnt = lowerpath.mnt;
+		stack[ctr].layer = lower.layer;
 		ctr++;
 
 		if (d.stop)
@@ -673,10 +685,8 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 			poe = roe;
 
 			/* Find the current layer on the root dentry */
-			for (i = 0; i < poe->numlower; i++)
-				if (poe->lowerstack[i].mnt == lowerpath.mnt)
-					break;
-			if (WARN_ON(i == poe->numlower))
+			i = ovl_find_layer(ofs, &lower);
+			if (WARN_ON(i == ofs->numlower))
 				break;
 		}
 	}
@@ -699,7 +709,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		goto out_put;
 
 	oe->opaque = upperopaque;
-	memcpy(oe->lowerstack, stack, sizeof(struct path) * ctr);
+	memcpy(oe->lowerstack, stack, sizeof(struct ovl_path) * ctr);
 	dentry->d_fsdata = oe;
 
 	if (upperdentry)
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 1cf3bdd193a4..cefe5a97d048 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -251,7 +251,7 @@ static inline bool ovl_is_impuredir(struct dentry *dentry)
 /* namei.c */
 int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
 		      struct dentry *origin, bool is_upper, bool set);
-int ovl_verify_index(struct dentry *index, struct path *lowerstack,
+int ovl_verify_index(struct dentry *index, struct ovl_path *lower,
 		     unsigned int numlower);
 int ovl_get_index_name(struct dentry *origin, struct qstr *name);
 int ovl_path_next(int idx, struct dentry *dentry, struct path *path);
@@ -268,7 +268,7 @@ int ovl_check_d_type_supported(struct path *realpath);
 void ovl_workdir_cleanup(struct inode *dir, struct vfsmount *mnt,
 			 struct dentry *dentry, int level);
 int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
-			 struct path *lowerstack, unsigned int numlower);
+			 struct ovl_path *lower, unsigned int numlower);
 
 /* inode.c */
 int ovl_set_nlink_upper(struct dentry *dentry);
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 25d9b5adcd42..1e28329b5db8 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -17,11 +17,20 @@ struct ovl_config {
 	bool index;
 };
 
+struct ovl_layer {
+	struct vfsmount *mnt;
+};
+
+struct ovl_path {
+	struct ovl_layer *layer;
+	struct dentry *dentry;
+};
+
 /* private information held for overlayfs's superblock */
 struct ovl_fs {
 	struct vfsmount *upper_mnt;
 	unsigned numlower;
-	struct vfsmount **lower_mnt;
+	struct ovl_layer *lower_layers;
 	/* workbasedir is the path at workdir= mount option */
 	struct dentry *workbasedir;
 	/* workdir is the 'work' directory under workbasedir */
@@ -52,7 +61,7 @@ struct ovl_entry {
 		struct rcu_head rcu;
 	};
 	unsigned numlower;
-	struct path lowerstack[];
+	struct ovl_path lowerstack[];
 };
 
 struct ovl_entry *ovl_alloc_entry(unsigned int numlower);
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 333de85d6de4..4f2479b6ef91 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -1019,7 +1019,7 @@ void ovl_workdir_cleanup(struct inode *dir, struct vfsmount *mnt,
 }
 
 int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
-			 struct path *lowerstack, unsigned int numlower)
+			 struct ovl_path *lower, unsigned int numlower)
 {
 	int err;
 	struct dentry *index = NULL;
@@ -1054,7 +1054,7 @@ int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
 			index = NULL;
 			break;
 		}
-		err = ovl_verify_index(index, lowerstack, numlower);
+		err = ovl_verify_index(index, lower, numlower);
 		/* Cleanup stale and orphan index entries */
 		if (err && (err == -ESTALE || err == -ENOENT))
 			err = ovl_cleanup(dir, index);
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 8702803ba328..e6e13d9588d1 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -220,8 +220,8 @@ static void ovl_put_super(struct super_block *sb)
 		ovl_inuse_unlock(ufs->upper_mnt->mnt_root);
 	mntput(ufs->upper_mnt);
 	for (i = 0; i < ufs->numlower; i++)
-		mntput(ufs->lower_mnt[i]);
-	kfree(ufs->lower_mnt);
+		mntput(ufs->lower_layers[i].mnt);
+	kfree(ufs->lower_layers);
 
 	kfree(ufs->config.lowerdir);
 	kfree(ufs->config.upperdir);
@@ -1026,24 +1026,26 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	}
 
 	err = -ENOMEM;
-	ufs->lower_mnt = kcalloc(numlower, sizeof(struct vfsmount *), GFP_KERNEL);
-	if (ufs->lower_mnt == NULL)
+	ufs->lower_layers = kcalloc(numlower, sizeof(struct ovl_layer),
+				    GFP_KERNEL);
+	if (ufs->lower_layers == NULL)
 		goto out_put_workdir;
 	for (i = 0; i < numlower; i++) {
-		struct vfsmount *mnt = clone_private_mount(&stack[i]);
+		struct vfsmount *mnt;
 
+		mnt = clone_private_mount(&stack[i]);
 		err = PTR_ERR(mnt);
 		if (IS_ERR(mnt)) {
 			pr_err("overlayfs: failed to clone lowerpath\n");
-			goto out_put_lower_mnt;
+			goto out_put_lower_layers;
 		}
 		/*
-		 * Make lower_mnt R/O.  That way fchmod/fchown on lower file
+		 * Make lower layers R/O.  That way fchmod/fchown on lower file
 		 * will fail instead of modifying lower fs.
 		 */
 		mnt->mnt_flags |= MNT_READONLY | MNT_NOATIME;
 
-		ufs->lower_mnt[ufs->numlower] = mnt;
+		ufs->lower_layers[ufs->numlower].mnt = mnt;
 		ufs->numlower++;
 
 		/* Check if all lower layers are on same sb */
@@ -1059,13 +1061,25 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	else if (ufs->upper_mnt->mnt_sb != ufs->same_sb)
 		ufs->same_sb = NULL;
 
+	err = -ENOMEM;
+	oe = ovl_alloc_entry(numlower);
+	if (!oe)
+		goto out_put_lower_layers;
+
+	for (i = 0; i < numlower; i++) {
+		oe->lowerstack[i].dentry = stack[i].dentry;
+		oe->lowerstack[i].layer = &(ufs->lower_layers[i]);
+	}
+
 	if (!(ovl_force_readonly(ufs)) && ufs->config.index) {
 		/* Verify lower root is upper root origin */
-		err = ovl_verify_origin(upperpath.dentry, ufs->lower_mnt[0],
-					stack[0].dentry, false, true);
+		err = ovl_verify_origin(upperpath.dentry,
+					oe->lowerstack[0].layer->mnt,
+					oe->lowerstack[0].dentry,
+					false, true);
 		if (err) {
 			pr_err("overlayfs: failed to verify upper root origin\n");
-			goto out_put_lower_mnt;
+			goto out_free_oe;
 		}
 
 		ufs->indexdir = ovl_workdir_create(sb, ufs, workpath.dentry,
@@ -1081,7 +1095,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 			if (!err)
 				err = ovl_indexdir_cleanup(ufs->indexdir,
 							   ufs->upper_mnt,
-							   stack, numlower);
+							   oe->lowerstack,
+							   numlower);
 		}
 		if (err || !ufs->indexdir)
 			pr_warn("overlayfs: try deleting index dir or mounting with '-o index=off' to disable inodes index.\n");
@@ -1106,11 +1121,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	/* Never override disk quota limits or use reserved space */
 	cap_lower(cred->cap_effective, CAP_SYS_RESOURCE);
 
-	err = -ENOMEM;
-	oe = ovl_alloc_entry(numlower);
-	if (!oe)
-		goto out_put_cred;
-
 	sb->s_magic = OVERLAYFS_SUPER_MAGIC;
 	sb->s_op = &ovl_super_operations;
 	sb->s_xattr = ovl_xattr_handlers;
@@ -1119,11 +1129,12 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 
 	root_dentry = d_make_root(ovl_new_inode(sb, S_IFDIR, 0));
 	if (!root_dentry)
-		goto out_free_oe;
+		goto out_put_cred;
 
 	mntput(upperpath.mnt);
 	for (i = 0; i < numlower; i++)
 		mntput(stack[i].mnt);
+	kfree(stack);
 	mntput(workpath.mnt);
 	kfree(lowertmp);
 
@@ -1132,11 +1143,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 		if (ovl_is_impuredir(upperpath.dentry))
 			ovl_set_flag(OVL_IMPURE, d_inode(root_dentry));
 	}
-	for (i = 0; i < numlower; i++) {
-		oe->lowerstack[i].dentry = stack[i].dentry;
-		oe->lowerstack[i].mnt = ufs->lower_mnt[i];
-	}
-	kfree(stack);
 
 	root_dentry->d_fsdata = oe;
 
@@ -1147,16 +1153,16 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 
 	return 0;
 
-out_free_oe:
-	kfree(oe);
 out_put_cred:
 	put_cred(ufs->creator_cred);
 out_put_indexdir:
 	dput(ufs->indexdir);
-out_put_lower_mnt:
+out_free_oe:
+	kfree(oe);
+out_put_lower_layers:
 	for (i = 0; i < ufs->numlower; i++)
-		mntput(ufs->lower_mnt[i]);
-	kfree(ufs->lower_mnt);
+		mntput(ufs->lower_layers[i].mnt);
+	kfree(ufs->lower_layers);
 out_put_workdir:
 	dput(ufs->workdir);
 	mntput(ufs->upper_mnt);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 9158d17bb320..d6bb1c9f5e7a 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -124,7 +124,12 @@ void ovl_path_lower(struct dentry *dentry, struct path *path)
 {
 	struct ovl_entry *oe = dentry->d_fsdata;
 
-	*path = oe->numlower ? oe->lowerstack[0] : (struct path) { };
+	if (oe->numlower) {
+		path->mnt = oe->lowerstack[0].layer->mnt;
+		path->dentry = oe->lowerstack[0].dentry;
+	} else {
+		*path = (struct path) { };
+	}
 }
 
 enum ovl_path_type ovl_path_real(struct dentry *dentry, struct path *path)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 3/9] ovl: allocate anonymous devs for lowerdirs
  2017-11-01 20:22 [PATCH v6 0/9] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
  2017-11-01 20:22 ` [PATCH v6 1/9] ovl: move include of ovl_entry.h into overlayfs.h Amir Goldstein
  2017-11-01 20:22 ` [PATCH v6 2/9] ovl: re-structure overlay lower layers in-memory Amir Goldstein
@ 2017-11-01 20:22 ` Amir Goldstein
  2017-11-02 12:28   ` Vivek Goyal
  2017-11-01 20:22 ` [PATCH v6 4/9] ovl: return anonymous st_dev for lower inodes Amir Goldstein
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 14+ messages in thread
From: Amir Goldstein @ 2017-11-01 20:22 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, Vivek Goyal, linux-unionfs

From: Chandan Rajendra <chandan@linux.vnet.ibm.com>

Generate unique values of st_dev per lower layer for non-samefs
overlay mount. The unique values are obtained by allocating anonymous
bdevs for each of the lowerdirs in the overlayfs instance.

The anonymous bdev is going to be returned by stat(2) for lowerdir
non-dir entries in non-samefs case.

[amir: split from ovl_getattr() and re-structure patches]

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/ovl_entry.h |  1 +
 fs/overlayfs/super.c     | 18 ++++++++++++++++--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 1e28329b5db8..93eb6a044dd2 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -19,6 +19,7 @@ struct ovl_config {
 
 struct ovl_layer {
 	struct vfsmount *mnt;
+	dev_t pseudo_dev;
 };
 
 struct ovl_path {
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index e6e13d9588d1..323dfd7a0236 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -219,8 +219,10 @@ static void ovl_put_super(struct super_block *sb)
 	if (ufs->upper_mnt && ufs->upperdir_locked)
 		ovl_inuse_unlock(ufs->upper_mnt->mnt_root);
 	mntput(ufs->upper_mnt);
-	for (i = 0; i < ufs->numlower; i++)
+	for (i = 0; i < ufs->numlower; i++) {
 		mntput(ufs->lower_layers[i].mnt);
+		free_anon_bdev(ufs->lower_layers[i].pseudo_dev);
+	}
 	kfree(ufs->lower_layers);
 
 	kfree(ufs->config.lowerdir);
@@ -1032,11 +1034,19 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 		goto out_put_workdir;
 	for (i = 0; i < numlower; i++) {
 		struct vfsmount *mnt;
+		dev_t dev;
+
+		err = get_anon_bdev(&dev);
+		if (err) {
+			pr_err("overlayfs: failed to get anonymous bdev for lowerpath\n");
+			goto out_put_lower_layers;
+		}
 
 		mnt = clone_private_mount(&stack[i]);
 		err = PTR_ERR(mnt);
 		if (IS_ERR(mnt)) {
 			pr_err("overlayfs: failed to clone lowerpath\n");
+			free_anon_bdev(dev);
 			goto out_put_lower_layers;
 		}
 		/*
@@ -1046,6 +1056,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 		mnt->mnt_flags |= MNT_READONLY | MNT_NOATIME;
 
 		ufs->lower_layers[ufs->numlower].mnt = mnt;
+		ufs->lower_layers[ufs->numlower].pseudo_dev = dev;
 		ufs->numlower++;
 
 		/* Check if all lower layers are on same sb */
@@ -1160,8 +1171,11 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 out_free_oe:
 	kfree(oe);
 out_put_lower_layers:
-	for (i = 0; i < ufs->numlower; i++)
+	for (i = 0; i < ufs->numlower; i++) {
+		if (ufs->lower_layers[i].mnt)
+			free_anon_bdev(ufs->lower_layers[i].pseudo_dev);
 		mntput(ufs->lower_layers[i].mnt);
+	}
 	kfree(ufs->lower_layers);
 out_put_workdir:
 	dput(ufs->workdir);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 4/9] ovl: return anonymous st_dev for lower inodes
  2017-11-01 20:22 [PATCH v6 0/9] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
                   ` (2 preceding siblings ...)
  2017-11-01 20:22 ` [PATCH v6 3/9] ovl: allocate anonymous devs for lowerdirs Amir Goldstein
@ 2017-11-01 20:22 ` Amir Goldstein
  2017-11-01 20:22 ` [PATCH v6 5/9] ovl: relax same fs constraint for constant st_ino Amir Goldstein
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Amir Goldstein @ 2017-11-01 20:22 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, Vivek Goyal, linux-unionfs

From: Chandan Rajendra <chandan@linux.vnet.ibm.com>

For non-samefs setup, to make sure that st_dev/st_ino pair
is unique across the system, we return a unique anonymous
st_dev for stat(2) of lower layer inode.

We need to make this change before fixing constant st_dev/st_ino across
copy up for non-samefs. Otherwise, we can end up with two objects
in the system, the real lower inode and the overlay inode which
have same st_dev/st_ino value, but different content.

[amir: simplify ovl_get_pseudo_dev()
       split from allocate anonymous bdev patch]

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index b772e5a2a730..607a50c50042 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -15,6 +15,14 @@
 #include <linux/ratelimit.h>
 #include "overlayfs.h"
 
+
+static dev_t ovl_get_pseudo_dev(struct dentry *dentry)
+{
+	struct ovl_entry *oe = dentry->d_fsdata;
+
+	return oe->lowerstack[0].layer->pseudo_dev;
+}
+
 int ovl_setattr(struct dentry *dentry, struct iattr *attr)
 {
 	int err;
@@ -121,6 +129,13 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 		 */
 		stat->dev = dentry->d_sb->s_dev;
 		stat->ino = dentry->d_inode->i_ino;
+	} else if (!OVL_TYPE_UPPER(type)) {
+		/*
+		 * For non-samefs setup, to make sure that st_dev/st_ino pair
+		 * is unique across the system, we use a unique anonymous
+		 * st_dev for lower layer inode.
+		 */
+		stat->dev = ovl_get_pseudo_dev(dentry);
 	}
 
 	/*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 5/9] ovl: relax same fs constraint for constant st_ino
  2017-11-01 20:22 [PATCH v6 0/9] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
                   ` (3 preceding siblings ...)
  2017-11-01 20:22 ` [PATCH v6 4/9] ovl: return anonymous st_dev for lower inodes Amir Goldstein
@ 2017-11-01 20:22 ` Amir Goldstein
  2017-11-01 20:22 ` [PATCH v6 6/9] ovl: fix d_ino of current pure upper in non-samefs case Amir Goldstein
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Amir Goldstein @ 2017-11-01 20:22 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, Vivek Goyal, linux-unionfs

For the case of all layers not on the same fs, return the copy up origin
inode st_dev/st_ino for non-dir from stat(2).

This guaranties constant and system-wide unique st_dev/st_ino for non-dir
across copy up. Like the same fs case, st_ino of non-dir is also
persistent. Unlike the same fs case, st_dev is not persistent.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c | 49 +++++++++++++++++++++++++++++--------------------
 1 file changed, 29 insertions(+), 20 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 607a50c50042..d06de98808f6 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -74,6 +74,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 	struct path realpath;
 	const struct cred *old_cred;
 	bool is_dir = S_ISDIR(dentry->d_inode->i_mode);
+	bool samefs = ovl_same_sb(dentry->d_sb);
 	int err;
 
 	type = ovl_path_real(dentry, &realpath);
@@ -83,16 +84,13 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 		goto out;
 
 	/*
-	 * When all layers are on the same fs, all real inode number are
-	 * unique, so we use the overlay st_dev, which is friendly to du -x.
-	 *
-	 * We also use st_ino of the copy up origin, if we know it.
-	 * This guaranties constant st_dev/st_ino across copy up.
+	 * For non-dir or same fs, we use st_ino of the copy up origin, if we
+	 * know it. This guaranties constant st_dev/st_ino across copy up.
 	 *
 	 * If filesystem supports NFS export ops, this also guaranties
 	 * persistent st_ino across mount cycle.
 	 */
-	if (ovl_same_sb(dentry->d_sb)) {
+	if (!is_dir || samefs) {
 		if (OVL_TYPE_ORIGIN(type)) {
 			struct kstat lowerstat;
 			u32 lowermask = STATX_INO | (!is_dir ? STATX_NLINK : 0);
@@ -103,7 +101,6 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 			if (err)
 				goto out;
 
-			WARN_ON_ONCE(stat->dev != lowerstat.dev);
 			/*
 			 * Lower hardlinks may be broken on copy up to different
 			 * upper files, so we cannot use the lower origin st_ino
@@ -115,27 +112,39 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 			if (is_dir || lowerstat.nlink == 1 ||
 			    ovl_test_flag(OVL_INDEX, d_inode(dentry)))
 				stat->ino = lowerstat.ino;
+
+			if (samefs)
+				WARN_ON_ONCE(stat->dev != lowerstat.dev);
+			else
+				stat->dev = ovl_get_pseudo_dev(dentry);
 		}
-		stat->dev = dentry->d_sb->s_dev;
-	} else if (is_dir) {
+		if (samefs) {
+			/*
+			 * When all layers are on the same fs, all real inode
+			 * number are unique, so we use the overlay st_dev,
+			 * which is friendly to du -x.
+			 */
+			stat->dev = dentry->d_sb->s_dev;
+		} else if (!OVL_TYPE_UPPER(type)) {
+			/*
+			 * For non-samefs setup, to make sure that st_dev/st_ino
+			 * pair is unique across the system, we use a unique
+			 * anonymous st_dev for lower layer inode.
+			 */
+			stat->dev = ovl_get_pseudo_dev(dentry);
+		}
+	} else {
 		/*
-		 * If not all layers are on the same fs the pair {real st_ino;
-		 * overlay st_dev} is not unique, so use the non persistent
-		 * overlay st_ino.
-		 *
 		 * Always use the overlay st_dev for directories, so 'find
 		 * -xdev' will scan the entire overlay mount and won't cross the
 		 * overlay mount boundaries.
+		 *
+		 * If not all layers are on the same fs the pair {real st_ino;
+		 * overlay st_dev} is not unique, so use the non persistent
+		 * overlay st_ino for directories.
 		 */
 		stat->dev = dentry->d_sb->s_dev;
 		stat->ino = dentry->d_inode->i_ino;
-	} else if (!OVL_TYPE_UPPER(type)) {
-		/*
-		 * For non-samefs setup, to make sure that st_dev/st_ino pair
-		 * is unique across the system, we use a unique anonymous
-		 * st_dev for lower layer inode.
-		 */
-		stat->dev = ovl_get_pseudo_dev(dentry);
 	}
 
 	/*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 6/9] ovl: fix d_ino of current pure upper in non-samefs case
  2017-11-01 20:22 [PATCH v6 0/9] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
                   ` (4 preceding siblings ...)
  2017-11-01 20:22 ` [PATCH v6 5/9] ovl: relax same fs constraint for constant st_ino Amir Goldstein
@ 2017-11-01 20:22 ` Amir Goldstein
  2017-11-01 20:22 ` [PATCH v6 7/9] ovl: update cache version of impure parent on rename Amir Goldstein
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Amir Goldstein @ 2017-11-01 20:22 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, Vivek Goyal, linux-unionfs

From: Chandan Rajendra <chandan@linux.vnet.ibm.com>

For a pure upper/lower dir in a non-samefs setup, stat(2) returns the
overlay inode number as the value of st_ino. To keep in line with this
behaviour, ovl_fill_real() has been modified to return overlay inode
number of the "." entry as the d_ino of a pure upper dir.

[amir: change the subject line]

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/readdir.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 4f2479b6ef91..74b859ed569e 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -451,7 +451,6 @@ static struct ovl_dir_cache *ovl_cache_get(struct dentry *dentry)
  * sure that d_ino will be consistent with st_ino from stat(2).
  */
 static int ovl_cache_update_ino(struct path *path, struct ovl_cache_entry *p)
-
 {
 	struct dentry *dir = path->dentry;
 	struct dentry *this = NULL;
@@ -459,9 +458,6 @@ static int ovl_cache_update_ino(struct path *path, struct ovl_cache_entry *p)
 	u64 ino = p->real_ino;
 	int err = 0;
 
-	if (!ovl_same_sb(dir->d_sb))
-		goto out;
-
 	if (p->name[0] == '.') {
 		if (p->len == 1) {
 			this = dget(dir);
@@ -493,8 +489,8 @@ static int ovl_cache_update_ino(struct path *path, struct ovl_cache_entry *p)
 		err = vfs_getattr(&statpath, &stat, STATX_INO, 0);
 		if (err)
 			goto fail;
-
-		WARN_ON_ONCE(dir->d_sb->s_dev != stat.dev);
+		if (d_is_dir(this) || ovl_same_sb(this->d_sb))
+			WARN_ON_ONCE(dir->d_sb->s_dev != stat.dev);
 		ino = stat.ino;
 	}
 
@@ -616,6 +612,7 @@ struct ovl_readdir_translate {
 	struct ovl_dir_cache *cache;
 	struct dir_context ctx;
 	u64 parent_ino;
+	u64 current_ino;
 };
 
 static int ovl_fill_real(struct dir_context *ctx, const char *name,
@@ -628,6 +625,8 @@ static int ovl_fill_real(struct dir_context *ctx, const char *name,
 
 	if (rdt->parent_ino && strcmp(name, "..") == 0)
 		ino = rdt->parent_ino;
+	else if (rdt->current_ino && namelen == 1 && name[0] == '.')
+		ino = rdt->current_ino;
 	else if (rdt->cache) {
 		struct ovl_cache_entry *p;
 
@@ -668,6 +667,9 @@ static int ovl_iterate_real(struct file *file, struct dir_context *ctx)
 			return PTR_ERR(rdt.cache);
 	}
 
+	if (!ovl_same_sb(dir->d_sb))
+		rdt.current_ino = dir->d_inode->i_ino;
+
 	return iterate_dir(od->realfile, &rdt.ctx);
 }
 
@@ -688,9 +690,9 @@ static int ovl_iterate(struct file *file, struct dir_context *ctx)
 		 * dir is impure then need to adjust d_ino for copied up
 		 * entries.
 		 */
-		if (ovl_same_sb(dentry->d_sb) &&
-		    (ovl_test_flag(OVL_IMPURE, d_inode(dentry)) ||
-		     OVL_TYPE_MERGE(ovl_path_type(dentry->d_parent)))) {
+		if (ovl_test_flag(OVL_IMPURE, d_inode(dentry)) ||
+		     OVL_TYPE_MERGE(ovl_path_type(dentry->d_parent)) ||
+		     !ovl_same_sb(dentry->d_sb)) {
 			return ovl_iterate_real(file, ctx);
 		}
 		return iterate_dir(od->realfile, ctx);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 7/9] ovl: update cache version of impure parent on rename
  2017-11-01 20:22 [PATCH v6 0/9] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
                   ` (5 preceding siblings ...)
  2017-11-01 20:22 ` [PATCH v6 6/9] ovl: fix d_ino of current pure upper in non-samefs case Amir Goldstein
@ 2017-11-01 20:22 ` Amir Goldstein
  2017-11-01 20:22 ` [PATCH v6 8/9] ovl: update merge dir cache for non-samefs case Amir Goldstein
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Amir Goldstein @ 2017-11-01 20:22 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, Vivek Goyal, linux-unionfs

ovl_rename() updates dir cache version for impure old parent if an entry
with copy up origin is moved into old parent, but it did not update
cache version if the entry moved out of old parent has a copy up origin.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/dir.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index ef533198be45..70af6b470420 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -1075,6 +1075,7 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
 			drop_nlink(d_inode(new));
 	}
 
+	ovl_dentry_version_inc(old->d_parent, ovl_type_origin(old));
 	ovl_dentry_version_inc(old->d_parent,
 			       !overwrite && ovl_type_origin(new));
 	ovl_dentry_version_inc(new->d_parent, ovl_type_origin(old));
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 8/9] ovl: update merge dir cache for non-samefs case
  2017-11-01 20:22 [PATCH v6 0/9] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
                   ` (6 preceding siblings ...)
  2017-11-01 20:22 ` [PATCH v6 7/9] ovl: update cache version of impure parent on rename Amir Goldstein
@ 2017-11-01 20:22 ` Amir Goldstein
  2017-11-01 20:22 ` [PATCH v6 9/9] ovl: update non-merge " Amir Goldstein
  2017-11-02 16:33 ` [PATCH v6 0/9] Overlayfs: constant st_ino/d_ino for non-samefs Miklos Szeredi
  9 siblings, 0 replies; 14+ messages in thread
From: Amir Goldstein @ 2017-11-01 20:22 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, Vivek Goyal, linux-unionfs

st_ino returned by stat(2) for dirs on non-samefs is the non persistent
overlay i_ino. Update the dir cache with volatile overlay i_ino values.
Overlay dir cache and inode cache may get out of sync after child inodes
exit and re-enter inode cache. This means that user may see inconsistent
st_ino/d_ino, but user can already see inconsistent st_ino of same object
in different times, so this is better than nothing.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/readdir.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 74b859ed569e..fa1be3fe68fa 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -120,6 +120,10 @@ static bool ovl_calc_d_ino(struct ovl_readdir_data *rdd,
 	if (!rdd->dentry)
 		return false;
 
+	/* Always recalc d_ino from i_ino for dir on non-samefs */
+	if (p->type == DT_DIR && !ovl_same_sb(rdd->dentry->d_sb))
+		return true;
+
 	/* Always recalc d_ino for parent */
 	if (strcmp(p->name, "..") == 0)
 		return true;
@@ -480,6 +484,19 @@ static int ovl_cache_update_ino(struct path *path, struct ovl_cache_entry *p)
 	}
 
 get:
+	/*
+	 * st_ino for dirs on non-samefs is the non persistent overlay i_ino.
+	 * Update the dir cache with volatile overlay i_ino, even though
+	 * dir cache and inode cache may get out of sync after child inodes
+	 * exit and re-enter inode cache. This means that user may see
+	 * inconsistent st_ino/d_ino, but user can already see inconsistent
+	 * st_ino of same object in different times.
+	 */
+	if (p->type == DT_DIR && !ovl_same_sb(dir->d_sb)) {
+		ino = this->d_inode->i_ino;
+		goto out;
+	}
+
 	type = ovl_path_type(this);
 	if (OVL_TYPE_ORIGIN(type)) {
 		struct kstat stat;
@@ -489,7 +506,7 @@ static int ovl_cache_update_ino(struct path *path, struct ovl_cache_entry *p)
 		err = vfs_getattr(&statpath, &stat, STATX_INO, 0);
 		if (err)
 			goto fail;
-		if (d_is_dir(this) || ovl_same_sb(this->d_sb))
+		if (ovl_same_sb(dir->d_sb))
 			WARN_ON_ONCE(dir->d_sb->s_dev != stat.dev);
 		ino = stat.ino;
 	}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 9/9] ovl: update non-merge dir cache for non-samefs case
  2017-11-01 20:22 [PATCH v6 0/9] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
                   ` (7 preceding siblings ...)
  2017-11-01 20:22 ` [PATCH v6 8/9] ovl: update merge dir cache for non-samefs case Amir Goldstein
@ 2017-11-01 20:22 ` Amir Goldstein
  2017-11-02 16:33 ` [PATCH v6 0/9] Overlayfs: constant st_ino/d_ino for non-samefs Miklos Szeredi
  9 siblings, 0 replies; 14+ messages in thread
From: Amir Goldstein @ 2017-11-01 20:22 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, Vivek Goyal, linux-unionfs

For non-samefs case, all subdirectories are "impure" because the
st_ino value returned from stat(2) is not the st_ino value of the
real inode.

Update non-merge dir cache version on every change in subdir entry
and cache st_ino values of subdirs in impure dir cache.

Build the impure dir cache for non-samefs case even if there is no
"impure" xattr on dir, but if real dir has nlink > 2 or nlink == 1,
which indicates that is has subdirs.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/dir.c       | 13 +++++++------
 fs/overlayfs/overlayfs.h |  2 +-
 fs/overlayfs/readdir.c   |  9 ++++++++-
 fs/overlayfs/util.c      |  9 ++++++---
 4 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 70af6b470420..b41f9d15345b 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -155,7 +155,7 @@ static int ovl_set_opaque(struct dentry *dentry, struct dentry *upperdentry)
 static void ovl_instantiate(struct dentry *dentry, struct inode *inode,
 			    struct dentry *newdentry, bool hardlink)
 {
-	ovl_dentry_version_inc(dentry->d_parent, false);
+	ovl_dentry_version_inc(dentry->d_parent, false, S_ISDIR(inode->i_mode));
 	ovl_dentry_set_upper_alias(dentry);
 	if (!hardlink) {
 		ovl_inode_update(inode, newdentry);
@@ -676,7 +676,7 @@ static int ovl_remove_and_whiteout(struct dentry *dentry, bool is_dir)
 	if (flags)
 		ovl_cleanup(wdir, upper);
 
-	ovl_dentry_version_inc(dentry->d_parent, true);
+	ovl_dentry_version_inc(dentry->d_parent, true, is_dir);
 out_d_drop:
 	d_drop(dentry);
 	dput(whiteout);
@@ -727,7 +727,8 @@ static int ovl_remove_upper(struct dentry *dentry, bool is_dir)
 		err = vfs_rmdir(dir, upper);
 	else
 		err = vfs_unlink(dir, upper, NULL);
-	ovl_dentry_version_inc(dentry->d_parent, ovl_type_origin(dentry));
+	ovl_dentry_version_inc(dentry->d_parent, ovl_type_origin(dentry),
+			       is_dir);
 
 	/*
 	 * Keeping this dentry hashed would mean having to release
@@ -1075,10 +1076,10 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
 			drop_nlink(d_inode(new));
 	}
 
-	ovl_dentry_version_inc(old->d_parent, ovl_type_origin(old));
+	ovl_dentry_version_inc(old->d_parent, ovl_type_origin(old), is_dir);
 	ovl_dentry_version_inc(old->d_parent,
-			       !overwrite && ovl_type_origin(new));
-	ovl_dentry_version_inc(new->d_parent, ovl_type_origin(old));
+			       !overwrite && ovl_type_origin(new), new_is_dir);
+	ovl_dentry_version_inc(new->d_parent, ovl_type_origin(old), is_dir);
 
 out_dput:
 	dput(newdentry);
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index cefe5a97d048..17fc4622dfba 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -221,7 +221,7 @@ void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
 void ovl_inode_init(struct inode *inode, struct dentry *upperdentry,
 		    struct dentry *lowerdentry);
 void ovl_inode_update(struct inode *inode, struct dentry *upperdentry);
-void ovl_dentry_version_inc(struct dentry *dentry, bool impurity);
+void ovl_dentry_version_inc(struct dentry *dentry, bool impurity, bool subdir);
 u64 ovl_dentry_version_get(struct dentry *dentry);
 bool ovl_is_whiteout(struct dentry *dentry);
 struct file *ovl_path_open(struct path *path, int flags);
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index fa1be3fe68fa..469837c06814 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -660,6 +660,7 @@ static int ovl_iterate_real(struct file *file, struct dir_context *ctx)
 	int err;
 	struct ovl_dir_file *od = file->private_data;
 	struct dentry *dir = file->f_path.dentry;
+	struct inode *realinode = d_inode(od->realfile->f_path.dentry);
 	struct ovl_readdir_translate rdt = {
 		.ctx.actor = ovl_fill_real,
 		.orig_ctx = ctx,
@@ -678,7 +679,13 @@ static int ovl_iterate_real(struct file *file, struct dir_context *ctx)
 		rdt.parent_ino = stat.ino;
 	}
 
-	if (ovl_test_flag(OVL_IMPURE, d_inode(dir))) {
+	/*
+	 * If dir is impure, we need to cache st_ino values of copy up origins.
+	 * For non-samefs, if dir nlink is either 1 (=any) or > 2, then it may
+	 * have subdirs and we need to cache their non persistent st_ino values.
+	 */
+	if (ovl_test_flag(OVL_IMPURE, d_inode(dir)) ||
+	    (!ovl_same_sb(dir->d_sb) && realinode->i_nlink != 2)) {
 		rdt.cache = ovl_cache_get_impure(&file->f_path);
 		if (IS_ERR(rdt.cache))
 			return PTR_ERR(rdt.cache);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index d6bb1c9f5e7a..b4d729dff66b 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -279,7 +279,7 @@ void ovl_inode_update(struct inode *inode, struct dentry *upperdentry)
 	}
 }
 
-void ovl_dentry_version_inc(struct dentry *dentry, bool impurity)
+void ovl_dentry_version_inc(struct dentry *dentry, bool impurity, bool subdir)
 {
 	struct inode *inode = d_inode(dentry);
 
@@ -288,9 +288,12 @@ void ovl_dentry_version_inc(struct dentry *dentry, bool impurity)
 	 * Version is used by readdir code to keep cache consistent.  For merge
 	 * dirs all changes need to be noted.  For non-merge dirs, cache only
 	 * contains impure (ones which have been copied up and have origins)
-	 * entries, so only need to note changes to impure entries.
+	 * entries, so only need to note changes to impure entries.  For
+	 * non-merge dirs in non-samefs case, we need to also note changes for
+	 * all sub-dirs, because we cache all overlay dir inode numbers.
 	 */
-	if (OVL_TYPE_MERGE(ovl_path_type(dentry)) || impurity)
+	if (OVL_TYPE_MERGE(ovl_path_type(dentry)) || impurity ||
+	    (subdir && !ovl_same_sb(dentry->d_sb)))
 		OVL_I(inode)->version++;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 3/9] ovl: allocate anonymous devs for lowerdirs
  2017-11-01 20:22 ` [PATCH v6 3/9] ovl: allocate anonymous devs for lowerdirs Amir Goldstein
@ 2017-11-02 12:28   ` Vivek Goyal
  2017-11-02 13:05     ` Amir Goldstein
  0 siblings, 1 reply; 14+ messages in thread
From: Vivek Goyal @ 2017-11-02 12:28 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, Chandan Rajendra, linux-unionfs

On Wed, Nov 01, 2017 at 10:22:47PM +0200, Amir Goldstein wrote:
> From: Chandan Rajendra <chandan@linux.vnet.ibm.com>
> 
> Generate unique values of st_dev per lower layer for non-samefs
> overlay mount. The unique values are obtained by allocating anonymous
> bdevs for each of the lowerdirs in the overlayfs instance.

Still not sure that why are we allocating anonymous bdev only for
lower and not upper. Until and unless we have a good reason not not
do so, treating them in same way will make it atleast easier to
understand?

Vivek

> 
> The anonymous bdev is going to be returned by stat(2) for lowerdir
> non-dir entries in non-samefs case.
> 
> [amir: split from ovl_getattr() and re-structure patches]
> 
> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/overlayfs/ovl_entry.h |  1 +
>  fs/overlayfs/super.c     | 18 ++++++++++++++++--
>  2 files changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> index 1e28329b5db8..93eb6a044dd2 100644
> --- a/fs/overlayfs/ovl_entry.h
> +++ b/fs/overlayfs/ovl_entry.h
> @@ -19,6 +19,7 @@ struct ovl_config {
>  
>  struct ovl_layer {
>  	struct vfsmount *mnt;
> +	dev_t pseudo_dev;
>  };
>  
>  struct ovl_path {
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index e6e13d9588d1..323dfd7a0236 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -219,8 +219,10 @@ static void ovl_put_super(struct super_block *sb)
>  	if (ufs->upper_mnt && ufs->upperdir_locked)
>  		ovl_inuse_unlock(ufs->upper_mnt->mnt_root);
>  	mntput(ufs->upper_mnt);
> -	for (i = 0; i < ufs->numlower; i++)
> +	for (i = 0; i < ufs->numlower; i++) {
>  		mntput(ufs->lower_layers[i].mnt);
> +		free_anon_bdev(ufs->lower_layers[i].pseudo_dev);
> +	}
>  	kfree(ufs->lower_layers);
>  
>  	kfree(ufs->config.lowerdir);
> @@ -1032,11 +1034,19 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  		goto out_put_workdir;
>  	for (i = 0; i < numlower; i++) {
>  		struct vfsmount *mnt;
> +		dev_t dev;
> +
> +		err = get_anon_bdev(&dev);
> +		if (err) {
> +			pr_err("overlayfs: failed to get anonymous bdev for lowerpath\n");
> +			goto out_put_lower_layers;
> +		}
>  
>  		mnt = clone_private_mount(&stack[i]);
>  		err = PTR_ERR(mnt);
>  		if (IS_ERR(mnt)) {
>  			pr_err("overlayfs: failed to clone lowerpath\n");
> +			free_anon_bdev(dev);
>  			goto out_put_lower_layers;
>  		}
>  		/*
> @@ -1046,6 +1056,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  		mnt->mnt_flags |= MNT_READONLY | MNT_NOATIME;
>  
>  		ufs->lower_layers[ufs->numlower].mnt = mnt;
> +		ufs->lower_layers[ufs->numlower].pseudo_dev = dev;
>  		ufs->numlower++;
>  
>  		/* Check if all lower layers are on same sb */
> @@ -1160,8 +1171,11 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  out_free_oe:
>  	kfree(oe);
>  out_put_lower_layers:
> -	for (i = 0; i < ufs->numlower; i++)
> +	for (i = 0; i < ufs->numlower; i++) {
> +		if (ufs->lower_layers[i].mnt)
> +			free_anon_bdev(ufs->lower_layers[i].pseudo_dev);
>  		mntput(ufs->lower_layers[i].mnt);
> +	}
>  	kfree(ufs->lower_layers);
>  out_put_workdir:
>  	dput(ufs->workdir);
> -- 
> 2.7.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 3/9] ovl: allocate anonymous devs for lowerdirs
  2017-11-02 12:28   ` Vivek Goyal
@ 2017-11-02 13:05     ` Amir Goldstein
  0 siblings, 0 replies; 14+ messages in thread
From: Amir Goldstein @ 2017-11-02 13:05 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Miklos Szeredi, Chandan Rajendra, overlayfs

On Thu, Nov 2, 2017 at 2:28 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Wed, Nov 01, 2017 at 10:22:47PM +0200, Amir Goldstein wrote:
>> From: Chandan Rajendra <chandan@linux.vnet.ibm.com>
>>
>> Generate unique values of st_dev per lower layer for non-samefs
>> overlay mount. The unique values are obtained by allocating anonymous
>> bdevs for each of the lowerdirs in the overlayfs instance.
>
> Still not sure that why are we allocating anonymous bdev only for
> lower and not upper. Until and unless we have a good reason not not
> do so, treating them in same way will make it atleast easier to
> understand?
>

I too find it hard to understand and to explain why anonymous bdev are needed.
But I really don't see how allocating anonymous bdev for upper makes it easier
to understand...

If you think that it makes the code easier to understand, please show me where,
because I don't see it.
Both in ovl_getattr() and in ovl_lookup(), lower and upper are handled with
completely different code anyway and upper dentry is stored
differently (in overlay
inode) than lower dentry (in overlay dentry).

Anyway, as I wrote, I don't remember if there was a concrete reason why *not*
to allocate anon bdev for upper, or simply that there was no good
reason to *yes*
allocate anon bdev to upper, so need to refer this question to Miklos who
suggested this solution:
https://marc.info/?l=linux-unionfs&m=149259338809700&w=2

Amir.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 0/9] Overlayfs: constant st_ino/d_ino for non-samefs
  2017-11-01 20:22 [PATCH v6 0/9] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
                   ` (8 preceding siblings ...)
  2017-11-01 20:22 ` [PATCH v6 9/9] ovl: update non-merge " Amir Goldstein
@ 2017-11-02 16:33 ` Miklos Szeredi
  2017-11-02 18:06   ` Amir Goldstein
  9 siblings, 1 reply; 14+ messages in thread
From: Miklos Szeredi @ 2017-11-02 16:33 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Chandan Rajendra, Vivek Goyal, linux-unionfs

On Wed, Nov 1, 2017 at 9:22 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> Miklos,
>
> My trigger for fixing up the non-samefs patches is that I found out
> while testing NFS export that reconnecting a dir dentry doesn't work
> for non-samefs case.
> I tracked the problem down to the default implementation of get_name()
> in expfs.c that looks up a child inode in parent dir by d_ino.

That may be a good fix, but.. we shouldn't have to rely on correct
d_ino to get the name.

In fact we can just recursively call exportfs_get_name() on underlying
fs, since we have the parent and the child handle in both the dir and
non-dir cases.

Am I missing something?

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 0/9] Overlayfs: constant st_ino/d_ino for non-samefs
  2017-11-02 16:33 ` [PATCH v6 0/9] Overlayfs: constant st_ino/d_ino for non-samefs Miklos Szeredi
@ 2017-11-02 18:06   ` Amir Goldstein
  0 siblings, 0 replies; 14+ messages in thread
From: Amir Goldstein @ 2017-11-02 18:06 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, Vivek Goyal, linux-unionfs

On Thu, Nov 2, 2017 at 6:33 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Wed, Nov 1, 2017 at 9:22 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>> Miklos,
>>
>> My trigger for fixing up the non-samefs patches is that I found out
>> while testing NFS export that reconnecting a dir dentry doesn't work
>> for non-samefs case.
>> I tracked the problem down to the default implementation of get_name()
>> in expfs.c that looks up a child inode in parent dir by d_ino.
>
> That may be a good fix, but..

Not that good, it creates dir cache for pure lower without handling
NULL upper properly.

Pushed tested fix to ovl-nonsamefs-v7
with some re-arrange of the patches and more blurb in commit messages.


> we shouldn't have to rely on correct
> d_ino to get the name.
>
> In fact we can just recursively call exportfs_get_name() on underlying
> fs, since we have the parent and the child handle in both the dir and
> non-dir cases.
>

Yes, I have that implementation in the early pure upper decode patches,
but threw it out later on.

> Am I missing something?
>

No, in fact implementation of get_name() is even simpler than that.

The trick with copy up of dir on encode is that we are guarantied
to have a connected upper real  dir dentry before we decode the
overlay dentry (see simple implementation of get_parent() for example),
so we can lookup the name in dcache.

I just thought, hey, I'll apply Chandan's patch to fix get_name(),
but it turned out to be a lot more.
V7 should be fine now, although I still need to improve the xfstest
to make sure I covered all cases properly.

I did already torture V7 (+ovl-nfs-export-wip +ovl_snapshot_wip)
with snapshot tests, so I am more confident about it than V6.

Will post V7 patched to list after I have the test, but feel free to review.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-11-02 18:06 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-01 20:22 [PATCH v6 0/9] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
2017-11-01 20:22 ` [PATCH v6 1/9] ovl: move include of ovl_entry.h into overlayfs.h Amir Goldstein
2017-11-01 20:22 ` [PATCH v6 2/9] ovl: re-structure overlay lower layers in-memory Amir Goldstein
2017-11-01 20:22 ` [PATCH v6 3/9] ovl: allocate anonymous devs for lowerdirs Amir Goldstein
2017-11-02 12:28   ` Vivek Goyal
2017-11-02 13:05     ` Amir Goldstein
2017-11-01 20:22 ` [PATCH v6 4/9] ovl: return anonymous st_dev for lower inodes Amir Goldstein
2017-11-01 20:22 ` [PATCH v6 5/9] ovl: relax same fs constraint for constant st_ino Amir Goldstein
2017-11-01 20:22 ` [PATCH v6 6/9] ovl: fix d_ino of current pure upper in non-samefs case Amir Goldstein
2017-11-01 20:22 ` [PATCH v6 7/9] ovl: update cache version of impure parent on rename Amir Goldstein
2017-11-01 20:22 ` [PATCH v6 8/9] ovl: update merge dir cache for non-samefs case Amir Goldstein
2017-11-01 20:22 ` [PATCH v6 9/9] ovl: update non-merge " Amir Goldstein
2017-11-02 16:33 ` [PATCH v6 0/9] Overlayfs: constant st_ino/d_ino for non-samefs Miklos Szeredi
2017-11-02 18:06   ` Amir Goldstein

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.