All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/4] Overlayfs: constant st_ino/d_ino for non-samefs
@ 2017-10-30 20:27 Amir Goldstein
  2017-10-30 20:27 ` [PATCH v5 1/4] ovl: move include of ovl_entry.h into overlayfs.h Amir Goldstein
                   ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Amir Goldstein @ 2017-10-30 20:27 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, zhangyi, linux-unionfs

Miklos,

I rebased and pushed a new version of Chandan's patches to my
overlayfs devel branch [1].

I am posting this for review here as v5, although I lost count on the real
revision number.

Beyond rebase, cleanup and checkpatch warning fixes I made the following
cosmetic modifications since previous version:

1. s/struct ovl_lower_mnt/struct ovl_layer
2. s/struct ovl_lower/struct ovl_path
3. s/ofs->lower_mnt/ofs->lower_layers
4. ovl_find_layer() mini helper for ovl_lookup()
5. split cleanup patch of include ovl_entry.h

The cosmetic changes make the struct member variables names .layer and
.lower_layers consistent with their type (struct ovl_layer).
The type name struct ovl_path fits better IMO to the argument list of
ovl_check_origin() which will later be used to check "upper origin"
as well (index=all patches).

I tested these patches with upstream xfstest overlay/041 by Chandan
and with unionmount test './run --ov --verify' that is avialble on my
unionmount devel branch [2].

My overlayfs devel branch [1] contains also the patches by Zhangyi
for not exposing whiteouts when lower dir is gone missing. I tested these
changes with upstream xfstest overlay/031 by Zhangyi.

Please consider the patches from my devel branch [1] for v4.15.

Thanks,
Amir.

[1] https://github.com/amir73il/linux/commits/overlayfs-devel
[2] https://github.com/amir73il/unionmount-testsuite/commits/overlayfs-devel

Amir Goldstein (2):
  ovl: move include of ovl_entry.h into overlayfs.h
  ovl: relax same fs constrain for constant st_ino

Chandan Rajendra (2):
  ovl: allocate anonymous devs for lowerdirs
  ovl: relax same fs constraint for constant d_ino

 fs/overlayfs/copy_up.c   |  1 -
 fs/overlayfs/inode.c     | 51 +++++++++++++++++++++---------
 fs/overlayfs/namei.c     | 53 ++++++++++++++++++-------------
 fs/overlayfs/overlayfs.h |  5 +--
 fs/overlayfs/ovl_entry.h | 14 +++++++--
 fs/overlayfs/readdir.c   | 24 +++++++-------
 fs/overlayfs/super.c     | 81 ++++++++++++++++++++++++++++++------------------
 fs/overlayfs/util.c      |  8 +++--
 8 files changed, 152 insertions(+), 85 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v5 1/4] ovl: move include of ovl_entry.h into overlayfs.h
  2017-10-30 20:27 [PATCH v5 0/4] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
@ 2017-10-30 20:27 ` Amir Goldstein
  2017-10-31 13:14   ` Miklos Szeredi
  2017-10-30 20:27 ` [PATCH v5 2/4] ovl: allocate anonymous devs for lowerdirs Amir Goldstein
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 22+ messages in thread
From: Amir Goldstein @ 2017-10-30 20:27 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, zhangyi, linux-unionfs

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c   | 1 -
 fs/overlayfs/inode.c     | 1 -
 fs/overlayfs/namei.c     | 1 -
 fs/overlayfs/overlayfs.h | 1 +
 fs/overlayfs/super.c     | 1 -
 fs/overlayfs/util.c      | 1 -
 6 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index c441f9387a1b..9311d183b532 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -22,7 +22,6 @@
 #include <linux/ratelimit.h>
 #include <linux/exportfs.h>
 #include "overlayfs.h"
-#include "ovl_entry.h"
 
 #define OVL_COPY_UP_CHUNK_SIZE (1 << 20)
 
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 03f0ec2b73eb..50e233ccca53 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -14,7 +14,6 @@
 #include <linux/posix_acl.h>
 #include <linux/ratelimit.h>
 #include "overlayfs.h"
-#include "ovl_entry.h"
 
 int ovl_setattr(struct dentry *dentry, struct iattr *attr)
 {
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 63200578ce5b..de2dac98e147 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -15,7 +15,6 @@
 #include <linux/mount.h>
 #include <linux/exportfs.h>
 #include "overlayfs.h"
-#include "ovl_entry.h"
 
 struct ovl_lookup_data {
 	struct qstr name;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 4fa30633780d..73ef1e850635 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -9,6 +9,7 @@
 
 #include <linux/kernel.h>
 #include <linux/uuid.h>
+#include "ovl_entry.h"
 
 enum ovl_path_type {
 	__OVL_PATH_UPPER	= (1 << 0),
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index f5738e96a052..8702803ba328 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -18,7 +18,6 @@
 #include <linux/seq_file.h>
 #include <linux/posix_acl_xattr.h>
 #include "overlayfs.h"
-#include "ovl_entry.h"
 
 MODULE_AUTHOR("Miklos Szeredi <miklos@szeredi.hu>");
 MODULE_DESCRIPTION("Overlay filesystem");
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 51ca8bd16009..9158d17bb320 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -17,7 +17,6 @@
 #include <linux/namei.h>
 #include <linux/ratelimit.h>
 #include "overlayfs.h"
-#include "ovl_entry.h"
 
 int ovl_want_write(struct dentry *dentry)
 {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v5 2/4] ovl: allocate anonymous devs for lowerdirs
  2017-10-30 20:27 [PATCH v5 0/4] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
  2017-10-30 20:27 ` [PATCH v5 1/4] ovl: move include of ovl_entry.h into overlayfs.h Amir Goldstein
@ 2017-10-30 20:27 ` Amir Goldstein
  2017-10-31 13:43   ` Miklos Szeredi
                     ` (2 more replies)
  2017-10-30 20:27 ` [PATCH v5 3/4] ovl: relax same fs constrain for constant st_ino Amir Goldstein
  2017-10-30 20:27 ` [PATCH v5 4/4] ovl: relax same fs constraint for constant d_ino Amir Goldstein
  3 siblings, 3 replies; 22+ messages in thread
From: Amir Goldstein @ 2017-10-30 20:27 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, zhangyi, linux-unionfs

From: Chandan Rajendra <chandan@linux.vnet.ibm.com>

For stat(2) on lowerdir non-dir entries in non-samefs case, this commit
provides unique values for st_dev. The unique values are obtained by
allocating anonymous bdevs for each of the lowerdirs in the overlayfs
instance.

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
[amir: rename to struct ovl_layer lower_layers and struct ovl_path]
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c     | 18 +++++++++++
 fs/overlayfs/namei.c     | 52 ++++++++++++++++++-------------
 fs/overlayfs/overlayfs.h |  4 +--
 fs/overlayfs/ovl_entry.h | 14 +++++++--
 fs/overlayfs/readdir.c   |  4 +--
 fs/overlayfs/super.c     | 80 ++++++++++++++++++++++++++++++------------------
 fs/overlayfs/util.c      |  7 ++++-
 7 files changed, 121 insertions(+), 58 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 50e233ccca53..ec5c3ce91868 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -8,6 +8,7 @@
  */
 
 #include <linux/fs.h>
+#include <linux/mount.h>
 #include <linux/slab.h>
 #include <linux/cred.h>
 #include <linux/xattr.h>
@@ -15,6 +16,21 @@
 #include <linux/ratelimit.h>
 #include "overlayfs.h"
 
+
+static dev_t ovl_get_pseudo_dev(struct dentry *dentry, dev_t dev)
+{
+	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
+	struct ovl_entry *oe = dentry->d_fsdata;
+
+	if (ofs->upper_mnt && ofs->upper_mnt->mnt_sb->s_dev == dev)
+		return dev;
+
+	if (oe->numlower)
+		return oe->lowerstack[0].layer->pseudo_dev;
+
+	return dev;
+}
+
 int ovl_setattr(struct dentry *dentry, struct iattr *attr)
 {
 	int err;
@@ -121,6 +137,8 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 		 */
 		stat->dev = dentry->d_sb->s_dev;
 		stat->ino = dentry->d_inode->i_ino;
+	} else {
+		stat->dev = ovl_get_pseudo_dev(dentry, stat->dev);
 	}
 
 	/*
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index de2dac98e147..78e965527167 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -285,16 +285,15 @@ static int ovl_lookup_layer(struct dentry *base, struct ovl_lookup_data *d,
 
 
 static int ovl_check_origin(struct dentry *upperdentry,
-			    struct path *lowerstack, unsigned int numlower,
-			    struct path **stackp, unsigned int *ctrp)
+			    struct ovl_path *lower, unsigned int numlower,
+			    struct ovl_path **stackp, unsigned int *ctrp)
 {
 	struct vfsmount *mnt;
 	struct dentry *origin = NULL;
 	int i;
 
-
 	for (i = 0; i < numlower; i++) {
-		mnt = lowerstack[i].mnt;
+		mnt = lower[i].layer->mnt;
 		origin = ovl_get_origin(upperdentry, mnt);
 		if (IS_ERR(origin))
 			return PTR_ERR(origin);
@@ -308,12 +307,12 @@ static int ovl_check_origin(struct dentry *upperdentry,
 
 	BUG_ON(*ctrp);
 	if (!*stackp)
-		*stackp = kmalloc(sizeof(struct path), GFP_KERNEL);
+		*stackp = kmalloc(sizeof(struct ovl_path), GFP_KERNEL);
 	if (!*stackp) {
 		dput(origin);
 		return -ENOMEM;
 	}
-	**stackp = (struct path) { .dentry = origin, .mnt = mnt };
+	**stackp = (struct ovl_path){.dentry = origin, .layer = lower[i].layer};
 	*ctrp = 1;
 
 	return 0;
@@ -383,13 +382,13 @@ int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
  * OVL_XATTR_ORIGIN and that origin file handle can be decoded to lower path.
  * Return 0 on match, -ESTALE on mismatch or stale origin, < 0 on error.
  */
-int ovl_verify_index(struct dentry *index, struct path *lowerstack,
+int ovl_verify_index(struct dentry *index, struct ovl_path *lower,
 		     unsigned int numlower)
 {
 	struct ovl_fh *fh = NULL;
 	size_t len;
-	struct path origin = { };
-	struct path *stack = &origin;
+	struct ovl_path origin = { };
+	struct ovl_path *stack = &origin;
 	unsigned int ctr = 0;
 	int err;
 
@@ -428,7 +427,7 @@ int ovl_verify_index(struct dentry *index, struct path *lowerstack,
 	if (err)
 		goto fail;
 
-	err = ovl_check_origin(index, lowerstack, numlower, &stack, &ctr);
+	err = ovl_check_origin(index, lower, numlower, &stack, &ctr);
 	if (!err && !ctr)
 		err = -ESTALE;
 	if (err)
@@ -570,11 +569,24 @@ int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp)
 	}
 	BUG_ON(idx > oe->numlower);
 	*idxp = idx;
-	*path = oe->lowerstack[idx - 1];
+	path->dentry = oe->lowerstack[idx - 1].dentry;
+	path->mnt = oe->lowerstack[idx - 1].layer->mnt;
 
 	return (idx < oe->numlower) ? idx + 1 : -1;
 }
 
+static int ovl_find_layer(struct ovl_fs *ofs, struct ovl_path *path)
+{
+	int i;
+
+	for (i = 0; i < ofs->numlower; i++) {
+		if (ofs->lower_layers[i].mnt == path->layer->mnt)
+			break;
+	}
+
+	return i;
+}
+
 struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 			  unsigned int flags)
 {
@@ -583,7 +595,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
 	struct ovl_entry *poe = dentry->d_parent->d_fsdata;
 	struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata;
-	struct path *stack = NULL;
+	struct ovl_path *stack = NULL;
 	struct dentry *upperdir, *upperdentry = NULL;
 	struct dentry *index = NULL;
 	unsigned int ctr = 0;
@@ -648,17 +660,17 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 
 	if (!d.stop && poe->numlower) {
 		err = -ENOMEM;
-		stack = kcalloc(ofs->numlower, sizeof(struct path),
+		stack = kcalloc(ofs->numlower, sizeof(struct ovl_path),
 				GFP_KERNEL);
 		if (!stack)
 			goto out_put_upper;
 	}
 
 	for (i = 0; !d.stop && i < poe->numlower; i++) {
-		struct path lowerpath = poe->lowerstack[i];
+		struct ovl_path lower = poe->lowerstack[i];
 
 		d.last = i == poe->numlower - 1;
-		err = ovl_lookup_layer(lowerpath.dentry, &d, &this);
+		err = ovl_lookup_layer(lower.dentry, &d, &this);
 		if (err)
 			goto out_put;
 
@@ -666,7 +678,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 			continue;
 
 		stack[ctr].dentry = this;
-		stack[ctr].mnt = lowerpath.mnt;
+		stack[ctr].layer = lower.layer;
 		ctr++;
 
 		if (d.stop)
@@ -676,10 +688,8 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 			poe = roe;
 
 			/* Find the current layer on the root dentry */
-			for (i = 0; i < poe->numlower; i++)
-				if (poe->lowerstack[i].mnt == lowerpath.mnt)
-					break;
-			if (WARN_ON(i == poe->numlower))
+			i = ovl_find_layer(ofs, &lower);
+			if (WARN_ON(i == ofs->numlower))
 				break;
 		}
 	}
@@ -702,7 +712,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		goto out_put;
 
 	oe->opaque = upperopaque;
-	memcpy(oe->lowerstack, stack, sizeof(struct path) * ctr);
+	memcpy(oe->lowerstack, stack, sizeof(struct ovl_path) * ctr);
 	dentry->d_fsdata = oe;
 
 	if (upperdentry)
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 73ef1e850635..335e9a052995 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -248,7 +248,7 @@ static inline bool ovl_is_impuredir(struct dentry *dentry)
 /* namei.c */
 int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
 		      struct dentry *origin, bool is_upper, bool set);
-int ovl_verify_index(struct dentry *index, struct path *lowerstack,
+int ovl_verify_index(struct dentry *index, struct ovl_path *lower,
 		     unsigned int numlower);
 int ovl_get_index_name(struct dentry *origin, struct qstr *name);
 int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp);
@@ -265,7 +265,7 @@ int ovl_check_d_type_supported(struct path *realpath);
 void ovl_workdir_cleanup(struct inode *dir, struct vfsmount *mnt,
 			 struct dentry *dentry, int level);
 int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
-			 struct path *lowerstack, unsigned int numlower);
+			 struct ovl_path *lower, unsigned int numlower);
 
 /* inode.c */
 int ovl_set_nlink_upper(struct dentry *dentry);
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 25d9b5adcd42..93eb6a044dd2 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -17,11 +17,21 @@ struct ovl_config {
 	bool index;
 };
 
+struct ovl_layer {
+	struct vfsmount *mnt;
+	dev_t pseudo_dev;
+};
+
+struct ovl_path {
+	struct ovl_layer *layer;
+	struct dentry *dentry;
+};
+
 /* private information held for overlayfs's superblock */
 struct ovl_fs {
 	struct vfsmount *upper_mnt;
 	unsigned numlower;
-	struct vfsmount **lower_mnt;
+	struct ovl_layer *lower_layers;
 	/* workbasedir is the path at workdir= mount option */
 	struct dentry *workbasedir;
 	/* workdir is the 'work' directory under workbasedir */
@@ -52,7 +62,7 @@ struct ovl_entry {
 		struct rcu_head rcu;
 	};
 	unsigned numlower;
-	struct path lowerstack[];
+	struct ovl_path lowerstack[];
 };
 
 struct ovl_entry *ovl_alloc_entry(unsigned int numlower);
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index bcc123c7706c..0ab657f2c1c8 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -1020,7 +1020,7 @@ void ovl_workdir_cleanup(struct inode *dir, struct vfsmount *mnt,
 }
 
 int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
-			 struct path *lowerstack, unsigned int numlower)
+			 struct ovl_path *lower, unsigned int numlower)
 {
 	int err;
 	struct dentry *index = NULL;
@@ -1055,7 +1055,7 @@ int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
 			index = NULL;
 			break;
 		}
-		err = ovl_verify_index(index, lowerstack, numlower);
+		err = ovl_verify_index(index, lower, numlower);
 		/* Cleanup stale and orphan index entries */
 		if (err && (err == -ESTALE || err == -ENOENT))
 			err = ovl_cleanup(dir, index);
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 8702803ba328..323dfd7a0236 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -219,9 +219,11 @@ static void ovl_put_super(struct super_block *sb)
 	if (ufs->upper_mnt && ufs->upperdir_locked)
 		ovl_inuse_unlock(ufs->upper_mnt->mnt_root);
 	mntput(ufs->upper_mnt);
-	for (i = 0; i < ufs->numlower; i++)
-		mntput(ufs->lower_mnt[i]);
-	kfree(ufs->lower_mnt);
+	for (i = 0; i < ufs->numlower; i++) {
+		mntput(ufs->lower_layers[i].mnt);
+		free_anon_bdev(ufs->lower_layers[i].pseudo_dev);
+	}
+	kfree(ufs->lower_layers);
 
 	kfree(ufs->config.lowerdir);
 	kfree(ufs->config.upperdir);
@@ -1026,24 +1028,35 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	}
 
 	err = -ENOMEM;
-	ufs->lower_mnt = kcalloc(numlower, sizeof(struct vfsmount *), GFP_KERNEL);
-	if (ufs->lower_mnt == NULL)
+	ufs->lower_layers = kcalloc(numlower, sizeof(struct ovl_layer),
+				    GFP_KERNEL);
+	if (ufs->lower_layers == NULL)
 		goto out_put_workdir;
 	for (i = 0; i < numlower; i++) {
-		struct vfsmount *mnt = clone_private_mount(&stack[i]);
+		struct vfsmount *mnt;
+		dev_t dev;
+
+		err = get_anon_bdev(&dev);
+		if (err) {
+			pr_err("overlayfs: failed to get anonymous bdev for lowerpath\n");
+			goto out_put_lower_layers;
+		}
 
+		mnt = clone_private_mount(&stack[i]);
 		err = PTR_ERR(mnt);
 		if (IS_ERR(mnt)) {
 			pr_err("overlayfs: failed to clone lowerpath\n");
-			goto out_put_lower_mnt;
+			free_anon_bdev(dev);
+			goto out_put_lower_layers;
 		}
 		/*
-		 * Make lower_mnt R/O.  That way fchmod/fchown on lower file
+		 * Make lower layers R/O.  That way fchmod/fchown on lower file
 		 * will fail instead of modifying lower fs.
 		 */
 		mnt->mnt_flags |= MNT_READONLY | MNT_NOATIME;
 
-		ufs->lower_mnt[ufs->numlower] = mnt;
+		ufs->lower_layers[ufs->numlower].mnt = mnt;
+		ufs->lower_layers[ufs->numlower].pseudo_dev = dev;
 		ufs->numlower++;
 
 		/* Check if all lower layers are on same sb */
@@ -1059,13 +1072,25 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	else if (ufs->upper_mnt->mnt_sb != ufs->same_sb)
 		ufs->same_sb = NULL;
 
+	err = -ENOMEM;
+	oe = ovl_alloc_entry(numlower);
+	if (!oe)
+		goto out_put_lower_layers;
+
+	for (i = 0; i < numlower; i++) {
+		oe->lowerstack[i].dentry = stack[i].dentry;
+		oe->lowerstack[i].layer = &(ufs->lower_layers[i]);
+	}
+
 	if (!(ovl_force_readonly(ufs)) && ufs->config.index) {
 		/* Verify lower root is upper root origin */
-		err = ovl_verify_origin(upperpath.dentry, ufs->lower_mnt[0],
-					stack[0].dentry, false, true);
+		err = ovl_verify_origin(upperpath.dentry,
+					oe->lowerstack[0].layer->mnt,
+					oe->lowerstack[0].dentry,
+					false, true);
 		if (err) {
 			pr_err("overlayfs: failed to verify upper root origin\n");
-			goto out_put_lower_mnt;
+			goto out_free_oe;
 		}
 
 		ufs->indexdir = ovl_workdir_create(sb, ufs, workpath.dentry,
@@ -1081,7 +1106,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 			if (!err)
 				err = ovl_indexdir_cleanup(ufs->indexdir,
 							   ufs->upper_mnt,
-							   stack, numlower);
+							   oe->lowerstack,
+							   numlower);
 		}
 		if (err || !ufs->indexdir)
 			pr_warn("overlayfs: try deleting index dir or mounting with '-o index=off' to disable inodes index.\n");
@@ -1106,11 +1132,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	/* Never override disk quota limits or use reserved space */
 	cap_lower(cred->cap_effective, CAP_SYS_RESOURCE);
 
-	err = -ENOMEM;
-	oe = ovl_alloc_entry(numlower);
-	if (!oe)
-		goto out_put_cred;
-
 	sb->s_magic = OVERLAYFS_SUPER_MAGIC;
 	sb->s_op = &ovl_super_operations;
 	sb->s_xattr = ovl_xattr_handlers;
@@ -1119,11 +1140,12 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 
 	root_dentry = d_make_root(ovl_new_inode(sb, S_IFDIR, 0));
 	if (!root_dentry)
-		goto out_free_oe;
+		goto out_put_cred;
 
 	mntput(upperpath.mnt);
 	for (i = 0; i < numlower; i++)
 		mntput(stack[i].mnt);
+	kfree(stack);
 	mntput(workpath.mnt);
 	kfree(lowertmp);
 
@@ -1132,11 +1154,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 		if (ovl_is_impuredir(upperpath.dentry))
 			ovl_set_flag(OVL_IMPURE, d_inode(root_dentry));
 	}
-	for (i = 0; i < numlower; i++) {
-		oe->lowerstack[i].dentry = stack[i].dentry;
-		oe->lowerstack[i].mnt = ufs->lower_mnt[i];
-	}
-	kfree(stack);
 
 	root_dentry->d_fsdata = oe;
 
@@ -1147,16 +1164,19 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 
 	return 0;
 
-out_free_oe:
-	kfree(oe);
 out_put_cred:
 	put_cred(ufs->creator_cred);
 out_put_indexdir:
 	dput(ufs->indexdir);
-out_put_lower_mnt:
-	for (i = 0; i < ufs->numlower; i++)
-		mntput(ufs->lower_mnt[i]);
-	kfree(ufs->lower_mnt);
+out_free_oe:
+	kfree(oe);
+out_put_lower_layers:
+	for (i = 0; i < ufs->numlower; i++) {
+		if (ufs->lower_layers[i].mnt)
+			free_anon_bdev(ufs->lower_layers[i].pseudo_dev);
+		mntput(ufs->lower_layers[i].mnt);
+	}
+	kfree(ufs->lower_layers);
 out_put_workdir:
 	dput(ufs->workdir);
 	mntput(ufs->upper_mnt);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 9158d17bb320..d6bb1c9f5e7a 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -124,7 +124,12 @@ void ovl_path_lower(struct dentry *dentry, struct path *path)
 {
 	struct ovl_entry *oe = dentry->d_fsdata;
 
-	*path = oe->numlower ? oe->lowerstack[0] : (struct path) { };
+	if (oe->numlower) {
+		path->mnt = oe->lowerstack[0].layer->mnt;
+		path->dentry = oe->lowerstack[0].dentry;
+	} else {
+		*path = (struct path) { };
+	}
 }
 
 enum ovl_path_type ovl_path_real(struct dentry *dentry, struct path *path)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v5 3/4] ovl: relax same fs constrain for constant st_ino
  2017-10-30 20:27 [PATCH v5 0/4] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
  2017-10-30 20:27 ` [PATCH v5 1/4] ovl: move include of ovl_entry.h into overlayfs.h Amir Goldstein
  2017-10-30 20:27 ` [PATCH v5 2/4] ovl: allocate anonymous devs for lowerdirs Amir Goldstein
@ 2017-10-30 20:27 ` Amir Goldstein
  2017-10-30 20:27 ` [PATCH v5 4/4] ovl: relax same fs constraint for constant d_ino Amir Goldstein
  3 siblings, 0 replies; 22+ messages in thread
From: Amir Goldstein @ 2017-10-30 20:27 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, zhangyi, linux-unionfs

For the case of all layers not on the same fs, use the copy up
origin st_ino for non-dir, if we know it.

This guaranties constant and system-wide unique st_ino/st_dev
for non-dir across copy up. Like the same fs case, st_ino of
non-dir is also persistent. Unlike the same fs case, st_dev is
not persistent.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c | 36 +++++++++++++++++++++---------------
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index ec5c3ce91868..a65746e000a3 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -82,6 +82,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 	struct path realpath;
 	const struct cred *old_cred;
 	bool is_dir = S_ISDIR(dentry->d_inode->i_mode);
+	bool samefs = ovl_same_sb(dentry->d_sb);
 	int err;
 
 	type = ovl_path_real(dentry, &realpath);
@@ -91,16 +92,16 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 		goto out;
 
 	/*
-	 * When all layers are on the same fs, all real inode number are
-	 * unique, so we use the overlay st_dev, which is friendly to du -x.
-	 *
-	 * We also use st_ino of the copy up origin, if we know it.
-	 * This guaranties constant st_dev/st_ino across copy up.
+	 * For non-dir or same fs, we use st_ino of the copy up origin, if we
+	 * know it. This guaranties constant st_dev/st_ino across copy up.
 	 *
 	 * If filesystem supports NFS export ops, this also guaranties
 	 * persistent st_ino across mount cycle.
+	 *
+	 * When all layers are on the same fs, all real inode number are
+	 * unique, so we use the overlay st_dev, which is friendly to du -x.
 	 */
-	if (ovl_same_sb(dentry->d_sb)) {
+	if (!is_dir || samefs) {
 		if (OVL_TYPE_ORIGIN(type)) {
 			struct kstat lowerstat;
 			u32 lowermask = STATX_INO | (!is_dir ? STATX_NLINK : 0);
@@ -111,7 +112,6 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 			if (err)
 				goto out;
 
-			WARN_ON_ONCE(stat->dev != lowerstat.dev);
 			/*
 			 * Lower hardlinks may be broken on copy up to different
 			 * upper files, so we cannot use the lower origin st_ino
@@ -123,22 +123,28 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 			if (is_dir || lowerstat.nlink == 1 ||
 			    ovl_test_flag(OVL_INDEX, d_inode(dentry)))
 				stat->ino = lowerstat.ino;
+
+			if (samefs)
+				WARN_ON_ONCE(stat->dev != lowerstat.dev);
+			else
+				stat->dev = lowerstat.dev;
 		}
-		stat->dev = dentry->d_sb->s_dev;
-	} else if (is_dir) {
+		if (samefs)
+			stat->dev = dentry->d_sb->s_dev;
+		else
+			stat->dev = ovl_get_pseudo_dev(dentry, stat->dev);
+	} else {
 		/*
-		 * If not all layers are on the same fs the pair {real st_ino;
-		 * overlay st_dev} is not unique, so use the non persistent
-		 * overlay st_ino.
-		 *
 		 * Always use the overlay st_dev for directories, so 'find
 		 * -xdev' will scan the entire overlay mount and won't cross the
 		 * overlay mount boundaries.
+		 *
+		 * If not all layers are on the same fs the pair {real st_ino;
+		 * overlay st_dev} is not unique, so use the non persistent
+		 * overlay st_ino for directories.
 		 */
 		stat->dev = dentry->d_sb->s_dev;
 		stat->ino = dentry->d_inode->i_ino;
-	} else {
-		stat->dev = ovl_get_pseudo_dev(dentry, stat->dev);
 	}
 
 	/*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v5 4/4] ovl: relax same fs constraint for constant d_ino
  2017-10-30 20:27 [PATCH v5 0/4] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
                   ` (2 preceding siblings ...)
  2017-10-30 20:27 ` [PATCH v5 3/4] ovl: relax same fs constrain for constant st_ino Amir Goldstein
@ 2017-10-30 20:27 ` Amir Goldstein
  3 siblings, 0 replies; 22+ messages in thread
From: Amir Goldstein @ 2017-10-30 20:27 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, zhangyi, linux-unionfs

From: Chandan Rajendra <chandan@linux.vnet.ibm.com>

For a pure upper dir in a non-samefs setup, ovl_getattr() returns the
overlay inode number as the value of st_ino. To keep in line with this
behaviour, ovl_fill_real() has been modified to return overlay inode
number of the "." entry as the d_ino of a pure upper dir.

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
---
 fs/overlayfs/readdir.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 0ab657f2c1c8..aad74f0891a5 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -452,7 +452,6 @@ static struct ovl_dir_cache *ovl_cache_get(struct dentry *dentry)
  * sure that d_ino will be consistent with st_ino from stat(2).
  */
 static int ovl_cache_update_ino(struct path *path, struct ovl_cache_entry *p)
-
 {
 	struct dentry *dir = path->dentry;
 	struct dentry *this = NULL;
@@ -460,9 +459,6 @@ static int ovl_cache_update_ino(struct path *path, struct ovl_cache_entry *p)
 	u64 ino = p->real_ino;
 	int err = 0;
 
-	if (!ovl_same_sb(dir->d_sb))
-		goto out;
-
 	if (p->name[0] == '.') {
 		if (p->len == 1) {
 			this = dget(dir);
@@ -494,8 +490,8 @@ static int ovl_cache_update_ino(struct path *path, struct ovl_cache_entry *p)
 		err = vfs_getattr(&statpath, &stat, STATX_INO, 0);
 		if (err)
 			goto fail;
-
-		WARN_ON_ONCE(dir->d_sb->s_dev != stat.dev);
+		if (d_is_dir(this) || ovl_same_sb(this->d_sb))
+			WARN_ON_ONCE(dir->d_sb->s_dev != stat.dev);
 		ino = stat.ino;
 	}
 
@@ -617,6 +613,7 @@ struct ovl_readdir_translate {
 	struct ovl_dir_cache *cache;
 	struct dir_context ctx;
 	u64 parent_ino;
+	u64 current_ino;
 };
 
 static int ovl_fill_real(struct dir_context *ctx, const char *name,
@@ -629,6 +626,8 @@ static int ovl_fill_real(struct dir_context *ctx, const char *name,
 
 	if (rdt->parent_ino && strcmp(name, "..") == 0)
 		ino = rdt->parent_ino;
+	else if (rdt->current_ino && namelen == 1 && name[0] == '.')
+		ino = rdt->current_ino;
 	else if (rdt->cache) {
 		struct ovl_cache_entry *p;
 
@@ -669,6 +668,9 @@ static int ovl_iterate_real(struct file *file, struct dir_context *ctx)
 			return PTR_ERR(rdt.cache);
 	}
 
+	if (!ovl_same_sb(dir->d_sb))
+		rdt.current_ino = dir->d_inode->i_ino;
+
 	return iterate_dir(od->realfile, &rdt.ctx);
 }
 
@@ -689,9 +691,9 @@ static int ovl_iterate(struct file *file, struct dir_context *ctx)
 		 * dir is impure then need to adjust d_ino for copied up
 		 * entries.
 		 */
-		if (ovl_same_sb(dentry->d_sb) &&
-		    (ovl_test_flag(OVL_IMPURE, d_inode(dentry)) ||
-		     OVL_TYPE_MERGE(ovl_path_type(dentry->d_parent)))) {
+		if (ovl_test_flag(OVL_IMPURE, d_inode(dentry)) ||
+		     OVL_TYPE_MERGE(ovl_path_type(dentry->d_parent)) ||
+		     !ovl_same_sb(dentry->d_sb)) {
 			return ovl_iterate_real(file, ctx);
 		}
 		return iterate_dir(od->realfile, ctx);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 1/4] ovl: move include of ovl_entry.h into overlayfs.h
  2017-10-30 20:27 ` [PATCH v5 1/4] ovl: move include of ovl_entry.h into overlayfs.h Amir Goldstein
@ 2017-10-31 13:14   ` Miklos Szeredi
  2017-10-31 13:22     ` Amir Goldstein
  0 siblings, 1 reply; 22+ messages in thread
From: Miklos Szeredi @ 2017-10-31 13:14 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Chandan Rajendra, zhangyi, linux-unionfs

Is this really necessary?

"ovl_entry.h" is supposed to be more of a low level thing, and I'm not
sure having that proliferate to all parts of the code is a good idea.

At least some justification would be good here.

Thanks,
Miklos

On Mon, Oct 30, 2017 at 9:27 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/overlayfs/copy_up.c   | 1 -
>  fs/overlayfs/inode.c     | 1 -
>  fs/overlayfs/namei.c     | 1 -
>  fs/overlayfs/overlayfs.h | 1 +
>  fs/overlayfs/super.c     | 1 -
>  fs/overlayfs/util.c      | 1 -
>  6 files changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index c441f9387a1b..9311d183b532 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -22,7 +22,6 @@
>  #include <linux/ratelimit.h>
>  #include <linux/exportfs.h>
>  #include "overlayfs.h"
> -#include "ovl_entry.h"
>
>  #define OVL_COPY_UP_CHUNK_SIZE (1 << 20)
>
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 03f0ec2b73eb..50e233ccca53 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -14,7 +14,6 @@
>  #include <linux/posix_acl.h>
>  #include <linux/ratelimit.h>
>  #include "overlayfs.h"
> -#include "ovl_entry.h"
>
>  int ovl_setattr(struct dentry *dentry, struct iattr *attr)
>  {
> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> index 63200578ce5b..de2dac98e147 100644
> --- a/fs/overlayfs/namei.c
> +++ b/fs/overlayfs/namei.c
> @@ -15,7 +15,6 @@
>  #include <linux/mount.h>
>  #include <linux/exportfs.h>
>  #include "overlayfs.h"
> -#include "ovl_entry.h"
>
>  struct ovl_lookup_data {
>         struct qstr name;
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 4fa30633780d..73ef1e850635 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -9,6 +9,7 @@
>
>  #include <linux/kernel.h>
>  #include <linux/uuid.h>
> +#include "ovl_entry.h"
>
>  enum ovl_path_type {
>         __OVL_PATH_UPPER        = (1 << 0),
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index f5738e96a052..8702803ba328 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -18,7 +18,6 @@
>  #include <linux/seq_file.h>
>  #include <linux/posix_acl_xattr.h>
>  #include "overlayfs.h"
> -#include "ovl_entry.h"
>
>  MODULE_AUTHOR("Miklos Szeredi <miklos@szeredi.hu>");
>  MODULE_DESCRIPTION("Overlay filesystem");
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 51ca8bd16009..9158d17bb320 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -17,7 +17,6 @@
>  #include <linux/namei.h>
>  #include <linux/ratelimit.h>
>  #include "overlayfs.h"
> -#include "ovl_entry.h"
>
>  int ovl_want_write(struct dentry *dentry)
>  {
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 1/4] ovl: move include of ovl_entry.h into overlayfs.h
  2017-10-31 13:14   ` Miklos Szeredi
@ 2017-10-31 13:22     ` Amir Goldstein
  0 siblings, 0 replies; 22+ messages in thread
From: Amir Goldstein @ 2017-10-31 13:22 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, zhangyi, linux-unionfs

On Tue, Oct 31, 2017 at 3:14 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> Is this really necessary?
>
> "ovl_entry.h" is supposed to be more of a low level thing, and I'm not
> sure having that proliferate to all parts of the code is a good idea.
>
> At least some justification would be good here.

The "problem" is that include "ovl_entry.h" is already included by 5/8
overlayfs c files and by the end of the NFS export series its 7/9 and
there are exported functions in overlayfs.h whose signature include
pointers to structs defined in ovl_entry.h, so I started forward declaring those
structs in overlayfs.h.

After you look at the next patch "anonymous devs for lowerdirs"
let me know if you prefer a different distribution between the 2 include
files, or a 3rd include file.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 2/4] ovl: allocate anonymous devs for lowerdirs
  2017-10-30 20:27 ` [PATCH v5 2/4] ovl: allocate anonymous devs for lowerdirs Amir Goldstein
@ 2017-10-31 13:43   ` Miklos Szeredi
  2017-10-31 13:53     ` Amir Goldstein
  2017-11-01 14:42   ` Vivek Goyal
  2017-11-01 15:41   ` Vivek Goyal
  2 siblings, 1 reply; 22+ messages in thread
From: Miklos Szeredi @ 2017-10-31 13:43 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Chandan Rajendra, zhangyi, linux-unionfs

On Mon, Oct 30, 2017 at 9:27 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> From: Chandan Rajendra <chandan@linux.vnet.ibm.com>
>
> For stat(2) on lowerdir non-dir entries in non-samefs case, this commit
> provides unique values for st_dev. The unique values are obtained by
> allocating anonymous bdevs for each of the lowerdirs in the overlayfs
> instance.
>
> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
> [amir: rename to struct ovl_layer lower_layers and struct ovl_path]
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/overlayfs/inode.c     | 18 +++++++++++
>  fs/overlayfs/namei.c     | 52 ++++++++++++++++++-------------
>  fs/overlayfs/overlayfs.h |  4 +--
>  fs/overlayfs/ovl_entry.h | 14 +++++++--
>  fs/overlayfs/readdir.c   |  4 +--
>  fs/overlayfs/super.c     | 80 ++++++++++++++++++++++++++++++------------------
>  fs/overlayfs/util.c      |  7 ++++-
>  7 files changed, 121 insertions(+), 58 deletions(-)
>
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 50e233ccca53..ec5c3ce91868 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -8,6 +8,7 @@
>   */
>
>  #include <linux/fs.h>
> +#include <linux/mount.h>
>  #include <linux/slab.h>
>  #include <linux/cred.h>
>  #include <linux/xattr.h>
> @@ -15,6 +16,21 @@
>  #include <linux/ratelimit.h>
>  #include "overlayfs.h"
>
> +
> +static dev_t ovl_get_pseudo_dev(struct dentry *dentry, dev_t dev)
> +{
> +       struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
> +       struct ovl_entry *oe = dentry->d_fsdata;
> +
> +       if (ofs->upper_mnt && ofs->upper_mnt->mnt_sb->s_dev == dev)
> +               return dev;

This doesn't feel right. Okay, we may want to check against upper
device number for sanity and WARN if doesn't match, although we can
have all sorts of weirdness with btrfs, for example.   So the logic
should be:

if type is upper then use the dev returned by stat
else return the pseudo dev


> +
> +       if (oe->numlower)
> +               return oe->lowerstack[0].layer->pseudo_dev;
> +
> +       return dev;
> +}
> +
>  int ovl_setattr(struct dentry *dentry, struct iattr *attr)
>  {
>         int err;
> @@ -121,6 +137,8 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
>                  */
>                 stat->dev = dentry->d_sb->s_dev;
>                 stat->ino = dentry->d_inode->i_ino;
> +       } else {
> +               stat->dev = ovl_get_pseudo_dev(dentry, stat->dev);
>         }
>
>         /*

The above can be a separate patch.  I.e. split this into:

 1) add ovl_layer infrastructure
 2) use that infrastructure in getattr

1) doesn't change functionality while 2) does.

> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> index de2dac98e147..78e965527167 100644
> --- a/fs/overlayfs/namei.c
> +++ b/fs/overlayfs/namei.c
> @@ -285,16 +285,15 @@ static int ovl_lookup_layer(struct dentry *base, struct ovl_lookup_data *d,
>
>
>  static int ovl_check_origin(struct dentry *upperdentry,
> -                           struct path *lowerstack, unsigned int numlower,
> -                           struct path **stackp, unsigned int *ctrp)
> +                           struct ovl_path *lower, unsigned int numlower,
> +                           struct ovl_path **stackp, unsigned int *ctrp)
>  {
>         struct vfsmount *mnt;
>         struct dentry *origin = NULL;
>         int i;
>
> -
>         for (i = 0; i < numlower; i++) {
> -               mnt = lowerstack[i].mnt;
> +               mnt = lower[i].layer->mnt;
>                 origin = ovl_get_origin(upperdentry, mnt);
>                 if (IS_ERR(origin))
>                         return PTR_ERR(origin);
> @@ -308,12 +307,12 @@ static int ovl_check_origin(struct dentry *upperdentry,
>
>         BUG_ON(*ctrp);
>         if (!*stackp)
> -               *stackp = kmalloc(sizeof(struct path), GFP_KERNEL);
> +               *stackp = kmalloc(sizeof(struct ovl_path), GFP_KERNEL);
>         if (!*stackp) {
>                 dput(origin);
>                 return -ENOMEM;
>         }
> -       **stackp = (struct path) { .dentry = origin, .mnt = mnt };
> +       **stackp = (struct ovl_path){.dentry = origin, .layer = lower[i].layer};
>         *ctrp = 1;
>
>         return 0;
> @@ -383,13 +382,13 @@ int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
>   * OVL_XATTR_ORIGIN and that origin file handle can be decoded to lower path.
>   * Return 0 on match, -ESTALE on mismatch or stale origin, < 0 on error.
>   */
> -int ovl_verify_index(struct dentry *index, struct path *lowerstack,
> +int ovl_verify_index(struct dentry *index, struct ovl_path *lower,
>                      unsigned int numlower)
>  {
>         struct ovl_fh *fh = NULL;
>         size_t len;
> -       struct path origin = { };
> -       struct path *stack = &origin;
> +       struct ovl_path origin = { };
> +       struct ovl_path *stack = &origin;
>         unsigned int ctr = 0;
>         int err;
>
> @@ -428,7 +427,7 @@ int ovl_verify_index(struct dentry *index, struct path *lowerstack,
>         if (err)
>                 goto fail;
>
> -       err = ovl_check_origin(index, lowerstack, numlower, &stack, &ctr);
> +       err = ovl_check_origin(index, lower, numlower, &stack, &ctr);
>         if (!err && !ctr)
>                 err = -ESTALE;
>         if (err)
> @@ -570,11 +569,24 @@ int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp)
>         }
>         BUG_ON(idx > oe->numlower);
>         *idxp = idx;
> -       *path = oe->lowerstack[idx - 1];
> +       path->dentry = oe->lowerstack[idx - 1].dentry;
> +       path->mnt = oe->lowerstack[idx - 1].layer->mnt;
>
>         return (idx < oe->numlower) ? idx + 1 : -1;
>  }
>
> +static int ovl_find_layer(struct ovl_fs *ofs, struct ovl_path *path)
> +{
> +       int i;
> +
> +       for (i = 0; i < ofs->numlower; i++) {
> +               if (ofs->lower_layers[i].mnt == path->layer->mnt)
> +                       break;
> +       }
> +
> +       return i;
> +}
> +
>  struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                           unsigned int flags)
>  {
> @@ -583,7 +595,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>         struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
>         struct ovl_entry *poe = dentry->d_parent->d_fsdata;
>         struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata;
> -       struct path *stack = NULL;
> +       struct ovl_path *stack = NULL;
>         struct dentry *upperdir, *upperdentry = NULL;
>         struct dentry *index = NULL;
>         unsigned int ctr = 0;
> @@ -648,17 +660,17 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>
>         if (!d.stop && poe->numlower) {
>                 err = -ENOMEM;
> -               stack = kcalloc(ofs->numlower, sizeof(struct path),
> +               stack = kcalloc(ofs->numlower, sizeof(struct ovl_path),
>                                 GFP_KERNEL);
>                 if (!stack)
>                         goto out_put_upper;
>         }
>
>         for (i = 0; !d.stop && i < poe->numlower; i++) {
> -               struct path lowerpath = poe->lowerstack[i];
> +               struct ovl_path lower = poe->lowerstack[i];
>
>                 d.last = i == poe->numlower - 1;
> -               err = ovl_lookup_layer(lowerpath.dentry, &d, &this);
> +               err = ovl_lookup_layer(lower.dentry, &d, &this);
>                 if (err)
>                         goto out_put;
>
> @@ -666,7 +678,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                         continue;
>
>                 stack[ctr].dentry = this;
> -               stack[ctr].mnt = lowerpath.mnt;
> +               stack[ctr].layer = lower.layer;
>                 ctr++;
>
>                 if (d.stop)
> @@ -676,10 +688,8 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                         poe = roe;
>
>                         /* Find the current layer on the root dentry */
> -                       for (i = 0; i < poe->numlower; i++)
> -                               if (poe->lowerstack[i].mnt == lowerpath.mnt)
> -                                       break;
> -                       if (WARN_ON(i == poe->numlower))
> +                       i = ovl_find_layer(ofs, &lower);
> +                       if (WARN_ON(i == ofs->numlower))
>                                 break;
>                 }
>         }
> @@ -702,7 +712,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                 goto out_put;
>
>         oe->opaque = upperopaque;
> -       memcpy(oe->lowerstack, stack, sizeof(struct path) * ctr);
> +       memcpy(oe->lowerstack, stack, sizeof(struct ovl_path) * ctr);
>         dentry->d_fsdata = oe;
>
>         if (upperdentry)
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 73ef1e850635..335e9a052995 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -248,7 +248,7 @@ static inline bool ovl_is_impuredir(struct dentry *dentry)
>  /* namei.c */
>  int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
>                       struct dentry *origin, bool is_upper, bool set);
> -int ovl_verify_index(struct dentry *index, struct path *lowerstack,
> +int ovl_verify_index(struct dentry *index, struct ovl_path *lower,
>                      unsigned int numlower);
>  int ovl_get_index_name(struct dentry *origin, struct qstr *name);
>  int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp);
> @@ -265,7 +265,7 @@ int ovl_check_d_type_supported(struct path *realpath);
>  void ovl_workdir_cleanup(struct inode *dir, struct vfsmount *mnt,
>                          struct dentry *dentry, int level);
>  int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
> -                        struct path *lowerstack, unsigned int numlower);
> +                        struct ovl_path *lower, unsigned int numlower);
>
>  /* inode.c */
>  int ovl_set_nlink_upper(struct dentry *dentry);
> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> index 25d9b5adcd42..93eb6a044dd2 100644
> --- a/fs/overlayfs/ovl_entry.h
> +++ b/fs/overlayfs/ovl_entry.h
> @@ -17,11 +17,21 @@ struct ovl_config {
>         bool index;
>  };
>
> +struct ovl_layer {
> +       struct vfsmount *mnt;
> +       dev_t pseudo_dev;
> +};
> +
> +struct ovl_path {
> +       struct ovl_layer *layer;
> +       struct dentry *dentry;
> +};
> +
>  /* private information held for overlayfs's superblock */
>  struct ovl_fs {
>         struct vfsmount *upper_mnt;
>         unsigned numlower;
> -       struct vfsmount **lower_mnt;
> +       struct ovl_layer *lower_layers;
>         /* workbasedir is the path at workdir= mount option */
>         struct dentry *workbasedir;
>         /* workdir is the 'work' directory under workbasedir */
> @@ -52,7 +62,7 @@ struct ovl_entry {
>                 struct rcu_head rcu;
>         };
>         unsigned numlower;
> -       struct path lowerstack[];
> +       struct ovl_path lowerstack[];
>  };
>
>  struct ovl_entry *ovl_alloc_entry(unsigned int numlower);
> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> index bcc123c7706c..0ab657f2c1c8 100644
> --- a/fs/overlayfs/readdir.c
> +++ b/fs/overlayfs/readdir.c
> @@ -1020,7 +1020,7 @@ void ovl_workdir_cleanup(struct inode *dir, struct vfsmount *mnt,
>  }
>
>  int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
> -                        struct path *lowerstack, unsigned int numlower)
> +                        struct ovl_path *lower, unsigned int numlower)
>  {
>         int err;
>         struct dentry *index = NULL;
> @@ -1055,7 +1055,7 @@ int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
>                         index = NULL;
>                         break;
>                 }
> -               err = ovl_verify_index(index, lowerstack, numlower);
> +               err = ovl_verify_index(index, lower, numlower);
>                 /* Cleanup stale and orphan index entries */
>                 if (err && (err == -ESTALE || err == -ENOENT))
>                         err = ovl_cleanup(dir, index);
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 8702803ba328..323dfd7a0236 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -219,9 +219,11 @@ static void ovl_put_super(struct super_block *sb)
>         if (ufs->upper_mnt && ufs->upperdir_locked)
>                 ovl_inuse_unlock(ufs->upper_mnt->mnt_root);
>         mntput(ufs->upper_mnt);
> -       for (i = 0; i < ufs->numlower; i++)
> -               mntput(ufs->lower_mnt[i]);
> -       kfree(ufs->lower_mnt);
> +       for (i = 0; i < ufs->numlower; i++) {
> +               mntput(ufs->lower_layers[i].mnt);
> +               free_anon_bdev(ufs->lower_layers[i].pseudo_dev);
> +       }
> +       kfree(ufs->lower_layers);
>
>         kfree(ufs->config.lowerdir);
>         kfree(ufs->config.upperdir);
> @@ -1026,24 +1028,35 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>         }
>
>         err = -ENOMEM;
> -       ufs->lower_mnt = kcalloc(numlower, sizeof(struct vfsmount *), GFP_KERNEL);
> -       if (ufs->lower_mnt == NULL)
> +       ufs->lower_layers = kcalloc(numlower, sizeof(struct ovl_layer),
> +                                   GFP_KERNEL);
> +       if (ufs->lower_layers == NULL)
>                 goto out_put_workdir;
>         for (i = 0; i < numlower; i++) {
> -               struct vfsmount *mnt = clone_private_mount(&stack[i]);
> +               struct vfsmount *mnt;
> +               dev_t dev;
> +
> +               err = get_anon_bdev(&dev);
> +               if (err) {
> +                       pr_err("overlayfs: failed to get anonymous bdev for lowerpath\n");
> +                       goto out_put_lower_layers;
> +               }
>
> +               mnt = clone_private_mount(&stack[i]);
>                 err = PTR_ERR(mnt);
>                 if (IS_ERR(mnt)) {
>                         pr_err("overlayfs: failed to clone lowerpath\n");
> -                       goto out_put_lower_mnt;
> +                       free_anon_bdev(dev);
> +                       goto out_put_lower_layers;
>                 }
>                 /*
> -                * Make lower_mnt R/O.  That way fchmod/fchown on lower file
> +                * Make lower layers R/O.  That way fchmod/fchown on lower file
>                  * will fail instead of modifying lower fs.
>                  */
>                 mnt->mnt_flags |= MNT_READONLY | MNT_NOATIME;
>
> -               ufs->lower_mnt[ufs->numlower] = mnt;
> +               ufs->lower_layers[ufs->numlower].mnt = mnt;
> +               ufs->lower_layers[ufs->numlower].pseudo_dev = dev;
>                 ufs->numlower++;
>
>                 /* Check if all lower layers are on same sb */
> @@ -1059,13 +1072,25 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>         else if (ufs->upper_mnt->mnt_sb != ufs->same_sb)
>                 ufs->same_sb = NULL;
>
> +       err = -ENOMEM;
> +       oe = ovl_alloc_entry(numlower);
> +       if (!oe)
> +               goto out_put_lower_layers;
> +
> +       for (i = 0; i < numlower; i++) {
> +               oe->lowerstack[i].dentry = stack[i].dentry;
> +               oe->lowerstack[i].layer = &(ufs->lower_layers[i]);
> +       }
> +
>         if (!(ovl_force_readonly(ufs)) && ufs->config.index) {
>                 /* Verify lower root is upper root origin */
> -               err = ovl_verify_origin(upperpath.dentry, ufs->lower_mnt[0],
> -                                       stack[0].dentry, false, true);
> +               err = ovl_verify_origin(upperpath.dentry,
> +                                       oe->lowerstack[0].layer->mnt,
> +                                       oe->lowerstack[0].dentry,
> +                                       false, true);
>                 if (err) {
>                         pr_err("overlayfs: failed to verify upper root origin\n");
> -                       goto out_put_lower_mnt;
> +                       goto out_free_oe;
>                 }
>
>                 ufs->indexdir = ovl_workdir_create(sb, ufs, workpath.dentry,
> @@ -1081,7 +1106,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>                         if (!err)
>                                 err = ovl_indexdir_cleanup(ufs->indexdir,
>                                                            ufs->upper_mnt,
> -                                                          stack, numlower);
> +                                                          oe->lowerstack,
> +                                                          numlower);
>                 }
>                 if (err || !ufs->indexdir)
>                         pr_warn("overlayfs: try deleting index dir or mounting with '-o index=off' to disable inodes index.\n");
> @@ -1106,11 +1132,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>         /* Never override disk quota limits or use reserved space */
>         cap_lower(cred->cap_effective, CAP_SYS_RESOURCE);
>
> -       err = -ENOMEM;
> -       oe = ovl_alloc_entry(numlower);
> -       if (!oe)
> -               goto out_put_cred;
> -
>         sb->s_magic = OVERLAYFS_SUPER_MAGIC;
>         sb->s_op = &ovl_super_operations;
>         sb->s_xattr = ovl_xattr_handlers;
> @@ -1119,11 +1140,12 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>
>         root_dentry = d_make_root(ovl_new_inode(sb, S_IFDIR, 0));
>         if (!root_dentry)
> -               goto out_free_oe;
> +               goto out_put_cred;
>
>         mntput(upperpath.mnt);
>         for (i = 0; i < numlower; i++)
>                 mntput(stack[i].mnt);
> +       kfree(stack);
>         mntput(workpath.mnt);
>         kfree(lowertmp);
>
> @@ -1132,11 +1154,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>                 if (ovl_is_impuredir(upperpath.dentry))
>                         ovl_set_flag(OVL_IMPURE, d_inode(root_dentry));
>         }
> -       for (i = 0; i < numlower; i++) {
> -               oe->lowerstack[i].dentry = stack[i].dentry;
> -               oe->lowerstack[i].mnt = ufs->lower_mnt[i];
> -       }
> -       kfree(stack);
>
>         root_dentry->d_fsdata = oe;
>
> @@ -1147,16 +1164,19 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>
>         return 0;
>
> -out_free_oe:
> -       kfree(oe);
>  out_put_cred:
>         put_cred(ufs->creator_cred);
>  out_put_indexdir:
>         dput(ufs->indexdir);
> -out_put_lower_mnt:
> -       for (i = 0; i < ufs->numlower; i++)
> -               mntput(ufs->lower_mnt[i]);
> -       kfree(ufs->lower_mnt);
> +out_free_oe:
> +       kfree(oe);
> +out_put_lower_layers:
> +       for (i = 0; i < ufs->numlower; i++) {
> +               if (ufs->lower_layers[i].mnt)
> +                       free_anon_bdev(ufs->lower_layers[i].pseudo_dev);
> +               mntput(ufs->lower_layers[i].mnt);
> +       }
> +       kfree(ufs->lower_layers);
>  out_put_workdir:
>         dput(ufs->workdir);
>         mntput(ufs->upper_mnt);

I'm not even attempting to review the ovl_fill_super() changes.  We
need to do something about that function (it's my fault for letting it
grow this big and convoluted).  Do you want me to have a go at
cleaning it up?  Or do you feel an urging desire to do so?

Thanks,
Miklos


> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 9158d17bb320..d6bb1c9f5e7a 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -124,7 +124,12 @@ void ovl_path_lower(struct dentry *dentry, struct path *path)
>  {
>         struct ovl_entry *oe = dentry->d_fsdata;
>
> -       *path = oe->numlower ? oe->lowerstack[0] : (struct path) { };
> +       if (oe->numlower) {
> +               path->mnt = oe->lowerstack[0].layer->mnt;
> +               path->dentry = oe->lowerstack[0].dentry;
> +       } else {
> +               *path = (struct path) { };
> +       }
>  }
>
>  enum ovl_path_type ovl_path_real(struct dentry *dentry, struct path *path)
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 2/4] ovl: allocate anonymous devs for lowerdirs
  2017-10-31 13:43   ` Miklos Szeredi
@ 2017-10-31 13:53     ` Amir Goldstein
  2017-10-31 23:01       ` Amir Goldstein
  0 siblings, 1 reply; 22+ messages in thread
From: Amir Goldstein @ 2017-10-31 13:53 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, zhangyi, linux-unionfs

On Tue, Oct 31, 2017 at 3:43 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Mon, Oct 30, 2017 at 9:27 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>> From: Chandan Rajendra <chandan@linux.vnet.ibm.com>
>>
>> For stat(2) on lowerdir non-dir entries in non-samefs case, this commit
>> provides unique values for st_dev. The unique values are obtained by
>> allocating anonymous bdevs for each of the lowerdirs in the overlayfs
>> instance.
>>
>> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
>> [amir: rename to struct ovl_layer lower_layers and struct ovl_path]
>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>> ---
>>  fs/overlayfs/inode.c     | 18 +++++++++++
>>  fs/overlayfs/namei.c     | 52 ++++++++++++++++++-------------
>>  fs/overlayfs/overlayfs.h |  4 +--
>>  fs/overlayfs/ovl_entry.h | 14 +++++++--
>>  fs/overlayfs/readdir.c   |  4 +--
>>  fs/overlayfs/super.c     | 80 ++++++++++++++++++++++++++++++------------------
>>  fs/overlayfs/util.c      |  7 ++++-
>>  7 files changed, 121 insertions(+), 58 deletions(-)
>>
>> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
>> index 50e233ccca53..ec5c3ce91868 100644
>> --- a/fs/overlayfs/inode.c
>> +++ b/fs/overlayfs/inode.c
>> @@ -8,6 +8,7 @@
>>   */
>>
>>  #include <linux/fs.h>
>> +#include <linux/mount.h>
>>  #include <linux/slab.h>
>>  #include <linux/cred.h>
>>  #include <linux/xattr.h>
>> @@ -15,6 +16,21 @@
>>  #include <linux/ratelimit.h>
>>  #include "overlayfs.h"
>>
>> +
>> +static dev_t ovl_get_pseudo_dev(struct dentry *dentry, dev_t dev)
>> +{
>> +       struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
>> +       struct ovl_entry *oe = dentry->d_fsdata;
>> +
>> +       if (ofs->upper_mnt && ofs->upper_mnt->mnt_sb->s_dev == dev)
>> +               return dev;
>
> This doesn't feel right. Okay, we may want to check against upper
> device number for sanity and WARN if doesn't match, although we can
> have all sorts of weirdness with btrfs, for example.   So the logic
> should be:
>
> if type is upper then use the dev returned by stat
> else return the pseudo dev
>
>
>> +
>> +       if (oe->numlower)
>> +               return oe->lowerstack[0].layer->pseudo_dev;
>> +
>> +       return dev;
>> +}
>> +
>>  int ovl_setattr(struct dentry *dentry, struct iattr *attr)
>>  {
>>         int err;
>> @@ -121,6 +137,8 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
>>                  */
>>                 stat->dev = dentry->d_sb->s_dev;
>>                 stat->ino = dentry->d_inode->i_ino;
>> +       } else {
>> +               stat->dev = ovl_get_pseudo_dev(dentry, stat->dev);
>>         }
>>
>>         /*
>
> The above can be a separate patch.  I.e. split this into:
>
>  1) add ovl_layer infrastructure
>  2) use that infrastructure in getattr
>
> 1) doesn't change functionality while 2) does.
>
>> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
>> index de2dac98e147..78e965527167 100644
>> --- a/fs/overlayfs/namei.c
>> +++ b/fs/overlayfs/namei.c
>> @@ -285,16 +285,15 @@ static int ovl_lookup_layer(struct dentry *base, struct ovl_lookup_data *d,
>>
>>
>>  static int ovl_check_origin(struct dentry *upperdentry,
>> -                           struct path *lowerstack, unsigned int numlower,
>> -                           struct path **stackp, unsigned int *ctrp)
>> +                           struct ovl_path *lower, unsigned int numlower,
>> +                           struct ovl_path **stackp, unsigned int *ctrp)
>>  {
>>         struct vfsmount *mnt;
>>         struct dentry *origin = NULL;
>>         int i;
>>
>> -
>>         for (i = 0; i < numlower; i++) {
>> -               mnt = lowerstack[i].mnt;
>> +               mnt = lower[i].layer->mnt;
>>                 origin = ovl_get_origin(upperdentry, mnt);
>>                 if (IS_ERR(origin))
>>                         return PTR_ERR(origin);
>> @@ -308,12 +307,12 @@ static int ovl_check_origin(struct dentry *upperdentry,
>>
>>         BUG_ON(*ctrp);
>>         if (!*stackp)
>> -               *stackp = kmalloc(sizeof(struct path), GFP_KERNEL);
>> +               *stackp = kmalloc(sizeof(struct ovl_path), GFP_KERNEL);
>>         if (!*stackp) {
>>                 dput(origin);
>>                 return -ENOMEM;
>>         }
>> -       **stackp = (struct path) { .dentry = origin, .mnt = mnt };
>> +       **stackp = (struct ovl_path){.dentry = origin, .layer = lower[i].layer};
>>         *ctrp = 1;
>>
>>         return 0;
>> @@ -383,13 +382,13 @@ int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
>>   * OVL_XATTR_ORIGIN and that origin file handle can be decoded to lower path.
>>   * Return 0 on match, -ESTALE on mismatch or stale origin, < 0 on error.
>>   */
>> -int ovl_verify_index(struct dentry *index, struct path *lowerstack,
>> +int ovl_verify_index(struct dentry *index, struct ovl_path *lower,
>>                      unsigned int numlower)
>>  {
>>         struct ovl_fh *fh = NULL;
>>         size_t len;
>> -       struct path origin = { };
>> -       struct path *stack = &origin;
>> +       struct ovl_path origin = { };
>> +       struct ovl_path *stack = &origin;
>>         unsigned int ctr = 0;
>>         int err;
>>
>> @@ -428,7 +427,7 @@ int ovl_verify_index(struct dentry *index, struct path *lowerstack,
>>         if (err)
>>                 goto fail;
>>
>> -       err = ovl_check_origin(index, lowerstack, numlower, &stack, &ctr);
>> +       err = ovl_check_origin(index, lower, numlower, &stack, &ctr);
>>         if (!err && !ctr)
>>                 err = -ESTALE;
>>         if (err)
>> @@ -570,11 +569,24 @@ int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp)
>>         }
>>         BUG_ON(idx > oe->numlower);
>>         *idxp = idx;
>> -       *path = oe->lowerstack[idx - 1];
>> +       path->dentry = oe->lowerstack[idx - 1].dentry;
>> +       path->mnt = oe->lowerstack[idx - 1].layer->mnt;
>>
>>         return (idx < oe->numlower) ? idx + 1 : -1;
>>  }
>>
>> +static int ovl_find_layer(struct ovl_fs *ofs, struct ovl_path *path)
>> +{
>> +       int i;
>> +
>> +       for (i = 0; i < ofs->numlower; i++) {
>> +               if (ofs->lower_layers[i].mnt == path->layer->mnt)
>> +                       break;
>> +       }
>> +
>> +       return i;
>> +}
>> +
>>  struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>>                           unsigned int flags)
>>  {
>> @@ -583,7 +595,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>>         struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
>>         struct ovl_entry *poe = dentry->d_parent->d_fsdata;
>>         struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata;
>> -       struct path *stack = NULL;
>> +       struct ovl_path *stack = NULL;
>>         struct dentry *upperdir, *upperdentry = NULL;
>>         struct dentry *index = NULL;
>>         unsigned int ctr = 0;
>> @@ -648,17 +660,17 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>>
>>         if (!d.stop && poe->numlower) {
>>                 err = -ENOMEM;
>> -               stack = kcalloc(ofs->numlower, sizeof(struct path),
>> +               stack = kcalloc(ofs->numlower, sizeof(struct ovl_path),
>>                                 GFP_KERNEL);
>>                 if (!stack)
>>                         goto out_put_upper;
>>         }
>>
>>         for (i = 0; !d.stop && i < poe->numlower; i++) {
>> -               struct path lowerpath = poe->lowerstack[i];
>> +               struct ovl_path lower = poe->lowerstack[i];
>>
>>                 d.last = i == poe->numlower - 1;
>> -               err = ovl_lookup_layer(lowerpath.dentry, &d, &this);
>> +               err = ovl_lookup_layer(lower.dentry, &d, &this);
>>                 if (err)
>>                         goto out_put;
>>
>> @@ -666,7 +678,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>>                         continue;
>>
>>                 stack[ctr].dentry = this;
>> -               stack[ctr].mnt = lowerpath.mnt;
>> +               stack[ctr].layer = lower.layer;
>>                 ctr++;
>>
>>                 if (d.stop)
>> @@ -676,10 +688,8 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>>                         poe = roe;
>>
>>                         /* Find the current layer on the root dentry */
>> -                       for (i = 0; i < poe->numlower; i++)
>> -                               if (poe->lowerstack[i].mnt == lowerpath.mnt)
>> -                                       break;
>> -                       if (WARN_ON(i == poe->numlower))
>> +                       i = ovl_find_layer(ofs, &lower);
>> +                       if (WARN_ON(i == ofs->numlower))
>>                                 break;
>>                 }
>>         }
>> @@ -702,7 +712,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>>                 goto out_put;
>>
>>         oe->opaque = upperopaque;
>> -       memcpy(oe->lowerstack, stack, sizeof(struct path) * ctr);
>> +       memcpy(oe->lowerstack, stack, sizeof(struct ovl_path) * ctr);
>>         dentry->d_fsdata = oe;
>>
>>         if (upperdentry)
>> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
>> index 73ef1e850635..335e9a052995 100644
>> --- a/fs/overlayfs/overlayfs.h
>> +++ b/fs/overlayfs/overlayfs.h
>> @@ -248,7 +248,7 @@ static inline bool ovl_is_impuredir(struct dentry *dentry)
>>  /* namei.c */
>>  int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
>>                       struct dentry *origin, bool is_upper, bool set);
>> -int ovl_verify_index(struct dentry *index, struct path *lowerstack,
>> +int ovl_verify_index(struct dentry *index, struct ovl_path *lower,
>>                      unsigned int numlower);
>>  int ovl_get_index_name(struct dentry *origin, struct qstr *name);
>>  int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp);
>> @@ -265,7 +265,7 @@ int ovl_check_d_type_supported(struct path *realpath);
>>  void ovl_workdir_cleanup(struct inode *dir, struct vfsmount *mnt,
>>                          struct dentry *dentry, int level);
>>  int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
>> -                        struct path *lowerstack, unsigned int numlower);
>> +                        struct ovl_path *lower, unsigned int numlower);
>>
>>  /* inode.c */
>>  int ovl_set_nlink_upper(struct dentry *dentry);
>> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
>> index 25d9b5adcd42..93eb6a044dd2 100644
>> --- a/fs/overlayfs/ovl_entry.h
>> +++ b/fs/overlayfs/ovl_entry.h
>> @@ -17,11 +17,21 @@ struct ovl_config {
>>         bool index;
>>  };
>>
>> +struct ovl_layer {
>> +       struct vfsmount *mnt;
>> +       dev_t pseudo_dev;
>> +};
>> +
>> +struct ovl_path {
>> +       struct ovl_layer *layer;
>> +       struct dentry *dentry;
>> +};
>> +
>>  /* private information held for overlayfs's superblock */
>>  struct ovl_fs {
>>         struct vfsmount *upper_mnt;
>>         unsigned numlower;
>> -       struct vfsmount **lower_mnt;
>> +       struct ovl_layer *lower_layers;
>>         /* workbasedir is the path at workdir= mount option */
>>         struct dentry *workbasedir;
>>         /* workdir is the 'work' directory under workbasedir */
>> @@ -52,7 +62,7 @@ struct ovl_entry {
>>                 struct rcu_head rcu;
>>         };
>>         unsigned numlower;
>> -       struct path lowerstack[];
>> +       struct ovl_path lowerstack[];
>>  };
>>
>>  struct ovl_entry *ovl_alloc_entry(unsigned int numlower);
>> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
>> index bcc123c7706c..0ab657f2c1c8 100644
>> --- a/fs/overlayfs/readdir.c
>> +++ b/fs/overlayfs/readdir.c
>> @@ -1020,7 +1020,7 @@ void ovl_workdir_cleanup(struct inode *dir, struct vfsmount *mnt,
>>  }
>>
>>  int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
>> -                        struct path *lowerstack, unsigned int numlower)
>> +                        struct ovl_path *lower, unsigned int numlower)
>>  {
>>         int err;
>>         struct dentry *index = NULL;
>> @@ -1055,7 +1055,7 @@ int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
>>                         index = NULL;
>>                         break;
>>                 }
>> -               err = ovl_verify_index(index, lowerstack, numlower);
>> +               err = ovl_verify_index(index, lower, numlower);
>>                 /* Cleanup stale and orphan index entries */
>>                 if (err && (err == -ESTALE || err == -ENOENT))
>>                         err = ovl_cleanup(dir, index);
>> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
>> index 8702803ba328..323dfd7a0236 100644
>> --- a/fs/overlayfs/super.c
>> +++ b/fs/overlayfs/super.c
>> @@ -219,9 +219,11 @@ static void ovl_put_super(struct super_block *sb)
>>         if (ufs->upper_mnt && ufs->upperdir_locked)
>>                 ovl_inuse_unlock(ufs->upper_mnt->mnt_root);
>>         mntput(ufs->upper_mnt);
>> -       for (i = 0; i < ufs->numlower; i++)
>> -               mntput(ufs->lower_mnt[i]);
>> -       kfree(ufs->lower_mnt);
>> +       for (i = 0; i < ufs->numlower; i++) {
>> +               mntput(ufs->lower_layers[i].mnt);
>> +               free_anon_bdev(ufs->lower_layers[i].pseudo_dev);
>> +       }
>> +       kfree(ufs->lower_layers);
>>
>>         kfree(ufs->config.lowerdir);
>>         kfree(ufs->config.upperdir);
>> @@ -1026,24 +1028,35 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>>         }
>>
>>         err = -ENOMEM;
>> -       ufs->lower_mnt = kcalloc(numlower, sizeof(struct vfsmount *), GFP_KERNEL);
>> -       if (ufs->lower_mnt == NULL)
>> +       ufs->lower_layers = kcalloc(numlower, sizeof(struct ovl_layer),
>> +                                   GFP_KERNEL);
>> +       if (ufs->lower_layers == NULL)
>>                 goto out_put_workdir;
>>         for (i = 0; i < numlower; i++) {
>> -               struct vfsmount *mnt = clone_private_mount(&stack[i]);
>> +               struct vfsmount *mnt;
>> +               dev_t dev;
>> +
>> +               err = get_anon_bdev(&dev);
>> +               if (err) {
>> +                       pr_err("overlayfs: failed to get anonymous bdev for lowerpath\n");
>> +                       goto out_put_lower_layers;
>> +               }
>>
>> +               mnt = clone_private_mount(&stack[i]);
>>                 err = PTR_ERR(mnt);
>>                 if (IS_ERR(mnt)) {
>>                         pr_err("overlayfs: failed to clone lowerpath\n");
>> -                       goto out_put_lower_mnt;
>> +                       free_anon_bdev(dev);
>> +                       goto out_put_lower_layers;
>>                 }
>>                 /*
>> -                * Make lower_mnt R/O.  That way fchmod/fchown on lower file
>> +                * Make lower layers R/O.  That way fchmod/fchown on lower file
>>                  * will fail instead of modifying lower fs.
>>                  */
>>                 mnt->mnt_flags |= MNT_READONLY | MNT_NOATIME;
>>
>> -               ufs->lower_mnt[ufs->numlower] = mnt;
>> +               ufs->lower_layers[ufs->numlower].mnt = mnt;
>> +               ufs->lower_layers[ufs->numlower].pseudo_dev = dev;
>>                 ufs->numlower++;
>>
>>                 /* Check if all lower layers are on same sb */
>> @@ -1059,13 +1072,25 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>>         else if (ufs->upper_mnt->mnt_sb != ufs->same_sb)
>>                 ufs->same_sb = NULL;
>>
>> +       err = -ENOMEM;
>> +       oe = ovl_alloc_entry(numlower);
>> +       if (!oe)
>> +               goto out_put_lower_layers;
>> +
>> +       for (i = 0; i < numlower; i++) {
>> +               oe->lowerstack[i].dentry = stack[i].dentry;
>> +               oe->lowerstack[i].layer = &(ufs->lower_layers[i]);
>> +       }
>> +
>>         if (!(ovl_force_readonly(ufs)) && ufs->config.index) {
>>                 /* Verify lower root is upper root origin */
>> -               err = ovl_verify_origin(upperpath.dentry, ufs->lower_mnt[0],
>> -                                       stack[0].dentry, false, true);
>> +               err = ovl_verify_origin(upperpath.dentry,
>> +                                       oe->lowerstack[0].layer->mnt,
>> +                                       oe->lowerstack[0].dentry,
>> +                                       false, true);
>>                 if (err) {
>>                         pr_err("overlayfs: failed to verify upper root origin\n");
>> -                       goto out_put_lower_mnt;
>> +                       goto out_free_oe;
>>                 }
>>
>>                 ufs->indexdir = ovl_workdir_create(sb, ufs, workpath.dentry,
>> @@ -1081,7 +1106,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>>                         if (!err)
>>                                 err = ovl_indexdir_cleanup(ufs->indexdir,
>>                                                            ufs->upper_mnt,
>> -                                                          stack, numlower);
>> +                                                          oe->lowerstack,
>> +                                                          numlower);
>>                 }
>>                 if (err || !ufs->indexdir)
>>                         pr_warn("overlayfs: try deleting index dir or mounting with '-o index=off' to disable inodes index.\n");
>> @@ -1106,11 +1132,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>>         /* Never override disk quota limits or use reserved space */
>>         cap_lower(cred->cap_effective, CAP_SYS_RESOURCE);
>>
>> -       err = -ENOMEM;
>> -       oe = ovl_alloc_entry(numlower);
>> -       if (!oe)
>> -               goto out_put_cred;
>> -
>>         sb->s_magic = OVERLAYFS_SUPER_MAGIC;
>>         sb->s_op = &ovl_super_operations;
>>         sb->s_xattr = ovl_xattr_handlers;
>> @@ -1119,11 +1140,12 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>>
>>         root_dentry = d_make_root(ovl_new_inode(sb, S_IFDIR, 0));
>>         if (!root_dentry)
>> -               goto out_free_oe;
>> +               goto out_put_cred;
>>
>>         mntput(upperpath.mnt);
>>         for (i = 0; i < numlower; i++)
>>                 mntput(stack[i].mnt);
>> +       kfree(stack);
>>         mntput(workpath.mnt);
>>         kfree(lowertmp);
>>
>> @@ -1132,11 +1154,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>>                 if (ovl_is_impuredir(upperpath.dentry))
>>                         ovl_set_flag(OVL_IMPURE, d_inode(root_dentry));
>>         }
>> -       for (i = 0; i < numlower; i++) {
>> -               oe->lowerstack[i].dentry = stack[i].dentry;
>> -               oe->lowerstack[i].mnt = ufs->lower_mnt[i];
>> -       }
>> -       kfree(stack);
>>
>>         root_dentry->d_fsdata = oe;
>>
>> @@ -1147,16 +1164,19 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>>
>>         return 0;
>>
>> -out_free_oe:
>> -       kfree(oe);
>>  out_put_cred:
>>         put_cred(ufs->creator_cred);
>>  out_put_indexdir:
>>         dput(ufs->indexdir);
>> -out_put_lower_mnt:
>> -       for (i = 0; i < ufs->numlower; i++)
>> -               mntput(ufs->lower_mnt[i]);
>> -       kfree(ufs->lower_mnt);
>> +out_free_oe:
>> +       kfree(oe);
>> +out_put_lower_layers:
>> +       for (i = 0; i < ufs->numlower; i++) {
>> +               if (ufs->lower_layers[i].mnt)
>> +                       free_anon_bdev(ufs->lower_layers[i].pseudo_dev);
>> +               mntput(ufs->lower_layers[i].mnt);
>> +       }
>> +       kfree(ufs->lower_layers);
>>  out_put_workdir:
>>         dput(ufs->workdir);
>>         mntput(ufs->upper_mnt);
>
> I'm not even attempting to review the ovl_fill_super() changes.  We
> need to do something about that function (it's my fault for letting it
> grow this big and convoluted).  Do you want me to have a go at
> cleaning it up?  Or do you feel an urging desire to do so?
>

Well, I am up to my neck with the NFS export patches and besides
I probably won't get the cleanup done the way you would ;-)
so I prefer you have a go at it and I will rebase the patches on top
of your cleanup.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 2/4] ovl: allocate anonymous devs for lowerdirs
  2017-10-31 13:53     ` Amir Goldstein
@ 2017-10-31 23:01       ` Amir Goldstein
  2017-11-01 13:17         ` Chandan Rajendra
  0 siblings, 1 reply; 22+ messages in thread
From: Amir Goldstein @ 2017-10-31 23:01 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Chandan Rajendra, zhangyi, linux-unionfs

On Tue, Oct 31, 2017 at 3:53 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Tue, Oct 31, 2017 at 3:43 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Mon, Oct 30, 2017 at 9:27 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>>> From: Chandan Rajendra <chandan@linux.vnet.ibm.com>
>>>
>>> For stat(2) on lowerdir non-dir entries in non-samefs case, this commit
>>> provides unique values for st_dev. The unique values are obtained by
>>> allocating anonymous bdevs for each of the lowerdirs in the overlayfs
>>> instance.
>>>
>>> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
>>> [amir: rename to struct ovl_layer lower_layers and struct ovl_path]
>>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>>> ---
>>>  fs/overlayfs/inode.c     | 18 +++++++++++
>>>  fs/overlayfs/namei.c     | 52 ++++++++++++++++++-------------
>>>  fs/overlayfs/overlayfs.h |  4 +--
>>>  fs/overlayfs/ovl_entry.h | 14 +++++++--
>>>  fs/overlayfs/readdir.c   |  4 +--
>>>  fs/overlayfs/super.c     | 80 ++++++++++++++++++++++++++++++------------------
>>>  fs/overlayfs/util.c      |  7 ++++-
>>>  7 files changed, 121 insertions(+), 58 deletions(-)
>>>
>>> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
>>> index 50e233ccca53..ec5c3ce91868 100644
>>> --- a/fs/overlayfs/inode.c
>>> +++ b/fs/overlayfs/inode.c
>>> @@ -8,6 +8,7 @@
>>>   */
>>>
>>>  #include <linux/fs.h>
>>> +#include <linux/mount.h>
>>>  #include <linux/slab.h>
>>>  #include <linux/cred.h>
>>>  #include <linux/xattr.h>
>>> @@ -15,6 +16,21 @@
>>>  #include <linux/ratelimit.h>
>>>  #include "overlayfs.h"
>>>
>>> +
>>> +static dev_t ovl_get_pseudo_dev(struct dentry *dentry, dev_t dev)
>>> +{
>>> +       struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
>>> +       struct ovl_entry *oe = dentry->d_fsdata;
>>> +
>>> +       if (ofs->upper_mnt && ofs->upper_mnt->mnt_sb->s_dev == dev)
>>> +               return dev;
>>
>> This doesn't feel right. Okay, we may want to check against upper
>> device number for sanity and WARN if doesn't match, although we can
>> have all sorts of weirdness with btrfs, for example.   So the logic
>> should be:
>>
>> if type is upper then use the dev returned by stat
>> else return the pseudo dev
>>
>>
>>> +
>>> +       if (oe->numlower)
>>> +               return oe->lowerstack[0].layer->pseudo_dev;
>>> +
>>> +       return dev;
>>> +}
>>> +
>>>  int ovl_setattr(struct dentry *dentry, struct iattr *attr)
>>>  {
>>>         int err;
>>> @@ -121,6 +137,8 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
>>>                  */
>>>                 stat->dev = dentry->d_sb->s_dev;
>>>                 stat->ino = dentry->d_inode->i_ino;
>>> +       } else {
>>> +               stat->dev = ovl_get_pseudo_dev(dentry, stat->dev);
>>>         }
>>>
>>>         /*
>>
>> The above can be a separate patch.  I.e. split this into:
>>
>>  1) add ovl_layer infrastructure
>>  2) use that infrastructure in getattr
>>
>> 1) doesn't change functionality while 2) does.
>>
>>> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
>>> index de2dac98e147..78e965527167 100644
>>> --- a/fs/overlayfs/namei.c
>>> +++ b/fs/overlayfs/namei.c
>>> @@ -285,16 +285,15 @@ static int ovl_lookup_layer(struct dentry *base, struct ovl_lookup_data *d,
>>>
>>>
>>>  static int ovl_check_origin(struct dentry *upperdentry,
>>> -                           struct path *lowerstack, unsigned int numlower,
>>> -                           struct path **stackp, unsigned int *ctrp)
>>> +                           struct ovl_path *lower, unsigned int numlower,
>>> +                           struct ovl_path **stackp, unsigned int *ctrp)
>>>  {
>>>         struct vfsmount *mnt;
>>>         struct dentry *origin = NULL;
>>>         int i;
>>>
>>> -
>>>         for (i = 0; i < numlower; i++) {
>>> -               mnt = lowerstack[i].mnt;
>>> +               mnt = lower[i].layer->mnt;
>>>                 origin = ovl_get_origin(upperdentry, mnt);
>>>                 if (IS_ERR(origin))
>>>                         return PTR_ERR(origin);
>>> @@ -308,12 +307,12 @@ static int ovl_check_origin(struct dentry *upperdentry,
>>>
>>>         BUG_ON(*ctrp);
>>>         if (!*stackp)
>>> -               *stackp = kmalloc(sizeof(struct path), GFP_KERNEL);
>>> +               *stackp = kmalloc(sizeof(struct ovl_path), GFP_KERNEL);
>>>         if (!*stackp) {
>>>                 dput(origin);
>>>                 return -ENOMEM;
>>>         }
>>> -       **stackp = (struct path) { .dentry = origin, .mnt = mnt };
>>> +       **stackp = (struct ovl_path){.dentry = origin, .layer = lower[i].layer};
>>>         *ctrp = 1;
>>>
>>>         return 0;
>>> @@ -383,13 +382,13 @@ int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
>>>   * OVL_XATTR_ORIGIN and that origin file handle can be decoded to lower path.
>>>   * Return 0 on match, -ESTALE on mismatch or stale origin, < 0 on error.
>>>   */
>>> -int ovl_verify_index(struct dentry *index, struct path *lowerstack,
>>> +int ovl_verify_index(struct dentry *index, struct ovl_path *lower,
>>>                      unsigned int numlower)
>>>  {
>>>         struct ovl_fh *fh = NULL;
>>>         size_t len;
>>> -       struct path origin = { };
>>> -       struct path *stack = &origin;
>>> +       struct ovl_path origin = { };
>>> +       struct ovl_path *stack = &origin;
>>>         unsigned int ctr = 0;
>>>         int err;
>>>
>>> @@ -428,7 +427,7 @@ int ovl_verify_index(struct dentry *index, struct path *lowerstack,
>>>         if (err)
>>>                 goto fail;
>>>
>>> -       err = ovl_check_origin(index, lowerstack, numlower, &stack, &ctr);
>>> +       err = ovl_check_origin(index, lower, numlower, &stack, &ctr);
>>>         if (!err && !ctr)
>>>                 err = -ESTALE;
>>>         if (err)
>>> @@ -570,11 +569,24 @@ int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp)
>>>         }
>>>         BUG_ON(idx > oe->numlower);
>>>         *idxp = idx;
>>> -       *path = oe->lowerstack[idx - 1];
>>> +       path->dentry = oe->lowerstack[idx - 1].dentry;
>>> +       path->mnt = oe->lowerstack[idx - 1].layer->mnt;
>>>
>>>         return (idx < oe->numlower) ? idx + 1 : -1;
>>>  }
>>>
>>> +static int ovl_find_layer(struct ovl_fs *ofs, struct ovl_path *path)
>>> +{
>>> +       int i;
>>> +
>>> +       for (i = 0; i < ofs->numlower; i++) {
>>> +               if (ofs->lower_layers[i].mnt == path->layer->mnt)
>>> +                       break;
>>> +       }
>>> +
>>> +       return i;
>>> +}
>>> +
>>>  struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>>>                           unsigned int flags)
>>>  {
>>> @@ -583,7 +595,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>>>         struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
>>>         struct ovl_entry *poe = dentry->d_parent->d_fsdata;
>>>         struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata;
>>> -       struct path *stack = NULL;
>>> +       struct ovl_path *stack = NULL;
>>>         struct dentry *upperdir, *upperdentry = NULL;
>>>         struct dentry *index = NULL;
>>>         unsigned int ctr = 0;
>>> @@ -648,17 +660,17 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>>>
>>>         if (!d.stop && poe->numlower) {
>>>                 err = -ENOMEM;
>>> -               stack = kcalloc(ofs->numlower, sizeof(struct path),
>>> +               stack = kcalloc(ofs->numlower, sizeof(struct ovl_path),
>>>                                 GFP_KERNEL);
>>>                 if (!stack)
>>>                         goto out_put_upper;
>>>         }
>>>
>>>         for (i = 0; !d.stop && i < poe->numlower; i++) {
>>> -               struct path lowerpath = poe->lowerstack[i];
>>> +               struct ovl_path lower = poe->lowerstack[i];
>>>
>>>                 d.last = i == poe->numlower - 1;
>>> -               err = ovl_lookup_layer(lowerpath.dentry, &d, &this);
>>> +               err = ovl_lookup_layer(lower.dentry, &d, &this);
>>>                 if (err)
>>>                         goto out_put;
>>>
>>> @@ -666,7 +678,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>>>                         continue;
>>>
>>>                 stack[ctr].dentry = this;
>>> -               stack[ctr].mnt = lowerpath.mnt;
>>> +               stack[ctr].layer = lower.layer;
>>>                 ctr++;
>>>
>>>                 if (d.stop)
>>> @@ -676,10 +688,8 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>>>                         poe = roe;
>>>
>>>                         /* Find the current layer on the root dentry */
>>> -                       for (i = 0; i < poe->numlower; i++)
>>> -                               if (poe->lowerstack[i].mnt == lowerpath.mnt)
>>> -                                       break;
>>> -                       if (WARN_ON(i == poe->numlower))
>>> +                       i = ovl_find_layer(ofs, &lower);
>>> +                       if (WARN_ON(i == ofs->numlower))
>>>                                 break;
>>>                 }
>>>         }
>>> @@ -702,7 +712,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>>>                 goto out_put;
>>>
>>>         oe->opaque = upperopaque;
>>> -       memcpy(oe->lowerstack, stack, sizeof(struct path) * ctr);
>>> +       memcpy(oe->lowerstack, stack, sizeof(struct ovl_path) * ctr);
>>>         dentry->d_fsdata = oe;
>>>
>>>         if (upperdentry)
>>> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
>>> index 73ef1e850635..335e9a052995 100644
>>> --- a/fs/overlayfs/overlayfs.h
>>> +++ b/fs/overlayfs/overlayfs.h
>>> @@ -248,7 +248,7 @@ static inline bool ovl_is_impuredir(struct dentry *dentry)
>>>  /* namei.c */
>>>  int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
>>>                       struct dentry *origin, bool is_upper, bool set);
>>> -int ovl_verify_index(struct dentry *index, struct path *lowerstack,
>>> +int ovl_verify_index(struct dentry *index, struct ovl_path *lower,
>>>                      unsigned int numlower);
>>>  int ovl_get_index_name(struct dentry *origin, struct qstr *name);
>>>  int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp);
>>> @@ -265,7 +265,7 @@ int ovl_check_d_type_supported(struct path *realpath);
>>>  void ovl_workdir_cleanup(struct inode *dir, struct vfsmount *mnt,
>>>                          struct dentry *dentry, int level);
>>>  int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
>>> -                        struct path *lowerstack, unsigned int numlower);
>>> +                        struct ovl_path *lower, unsigned int numlower);
>>>
>>>  /* inode.c */
>>>  int ovl_set_nlink_upper(struct dentry *dentry);
>>> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
>>> index 25d9b5adcd42..93eb6a044dd2 100644
>>> --- a/fs/overlayfs/ovl_entry.h
>>> +++ b/fs/overlayfs/ovl_entry.h
>>> @@ -17,11 +17,21 @@ struct ovl_config {
>>>         bool index;
>>>  };
>>>
>>> +struct ovl_layer {
>>> +       struct vfsmount *mnt;
>>> +       dev_t pseudo_dev;
>>> +};
>>> +
>>> +struct ovl_path {
>>> +       struct ovl_layer *layer;
>>> +       struct dentry *dentry;
>>> +};
>>> +
>>>  /* private information held for overlayfs's superblock */
>>>  struct ovl_fs {
>>>         struct vfsmount *upper_mnt;
>>>         unsigned numlower;
>>> -       struct vfsmount **lower_mnt;
>>> +       struct ovl_layer *lower_layers;
>>>         /* workbasedir is the path at workdir= mount option */
>>>         struct dentry *workbasedir;
>>>         /* workdir is the 'work' directory under workbasedir */
>>> @@ -52,7 +62,7 @@ struct ovl_entry {
>>>                 struct rcu_head rcu;
>>>         };
>>>         unsigned numlower;
>>> -       struct path lowerstack[];
>>> +       struct ovl_path lowerstack[];
>>>  };
>>>
>>>  struct ovl_entry *ovl_alloc_entry(unsigned int numlower);
>>> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
>>> index bcc123c7706c..0ab657f2c1c8 100644
>>> --- a/fs/overlayfs/readdir.c
>>> +++ b/fs/overlayfs/readdir.c
>>> @@ -1020,7 +1020,7 @@ void ovl_workdir_cleanup(struct inode *dir, struct vfsmount *mnt,
>>>  }
>>>
>>>  int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
>>> -                        struct path *lowerstack, unsigned int numlower)
>>> +                        struct ovl_path *lower, unsigned int numlower)
>>>  {
>>>         int err;
>>>         struct dentry *index = NULL;
>>> @@ -1055,7 +1055,7 @@ int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
>>>                         index = NULL;
>>>                         break;
>>>                 }
>>> -               err = ovl_verify_index(index, lowerstack, numlower);
>>> +               err = ovl_verify_index(index, lower, numlower);
>>>                 /* Cleanup stale and orphan index entries */
>>>                 if (err && (err == -ESTALE || err == -ENOENT))
>>>                         err = ovl_cleanup(dir, index);
>>> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
>>> index 8702803ba328..323dfd7a0236 100644
>>> --- a/fs/overlayfs/super.c
>>> +++ b/fs/overlayfs/super.c
>>> @@ -219,9 +219,11 @@ static void ovl_put_super(struct super_block *sb)
>>>         if (ufs->upper_mnt && ufs->upperdir_locked)
>>>                 ovl_inuse_unlock(ufs->upper_mnt->mnt_root);
>>>         mntput(ufs->upper_mnt);
>>> -       for (i = 0; i < ufs->numlower; i++)
>>> -               mntput(ufs->lower_mnt[i]);
>>> -       kfree(ufs->lower_mnt);
>>> +       for (i = 0; i < ufs->numlower; i++) {
>>> +               mntput(ufs->lower_layers[i].mnt);
>>> +               free_anon_bdev(ufs->lower_layers[i].pseudo_dev);
>>> +       }
>>> +       kfree(ufs->lower_layers);
>>>
>>>         kfree(ufs->config.lowerdir);
>>>         kfree(ufs->config.upperdir);
>>> @@ -1026,24 +1028,35 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>>>         }
>>>
>>>         err = -ENOMEM;
>>> -       ufs->lower_mnt = kcalloc(numlower, sizeof(struct vfsmount *), GFP_KERNEL);
>>> -       if (ufs->lower_mnt == NULL)
>>> +       ufs->lower_layers = kcalloc(numlower, sizeof(struct ovl_layer),
>>> +                                   GFP_KERNEL);
>>> +       if (ufs->lower_layers == NULL)
>>>                 goto out_put_workdir;
>>>         for (i = 0; i < numlower; i++) {
>>> -               struct vfsmount *mnt = clone_private_mount(&stack[i]);
>>> +               struct vfsmount *mnt;
>>> +               dev_t dev;
>>> +
>>> +               err = get_anon_bdev(&dev);
>>> +               if (err) {
>>> +                       pr_err("overlayfs: failed to get anonymous bdev for lowerpath\n");
>>> +                       goto out_put_lower_layers;
>>> +               }
>>>
>>> +               mnt = clone_private_mount(&stack[i]);
>>>                 err = PTR_ERR(mnt);
>>>                 if (IS_ERR(mnt)) {
>>>                         pr_err("overlayfs: failed to clone lowerpath\n");
>>> -                       goto out_put_lower_mnt;
>>> +                       free_anon_bdev(dev);
>>> +                       goto out_put_lower_layers;
>>>                 }
>>>                 /*
>>> -                * Make lower_mnt R/O.  That way fchmod/fchown on lower file
>>> +                * Make lower layers R/O.  That way fchmod/fchown on lower file
>>>                  * will fail instead of modifying lower fs.
>>>                  */
>>>                 mnt->mnt_flags |= MNT_READONLY | MNT_NOATIME;
>>>
>>> -               ufs->lower_mnt[ufs->numlower] = mnt;
>>> +               ufs->lower_layers[ufs->numlower].mnt = mnt;
>>> +               ufs->lower_layers[ufs->numlower].pseudo_dev = dev;
>>>                 ufs->numlower++;
>>>
>>>                 /* Check if all lower layers are on same sb */
>>> @@ -1059,13 +1072,25 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>>>         else if (ufs->upper_mnt->mnt_sb != ufs->same_sb)
>>>                 ufs->same_sb = NULL;
>>>
>>> +       err = -ENOMEM;
>>> +       oe = ovl_alloc_entry(numlower);
>>> +       if (!oe)
>>> +               goto out_put_lower_layers;
>>> +
>>> +       for (i = 0; i < numlower; i++) {
>>> +               oe->lowerstack[i].dentry = stack[i].dentry;
>>> +               oe->lowerstack[i].layer = &(ufs->lower_layers[i]);
>>> +       }
>>> +
>>>         if (!(ovl_force_readonly(ufs)) && ufs->config.index) {
>>>                 /* Verify lower root is upper root origin */
>>> -               err = ovl_verify_origin(upperpath.dentry, ufs->lower_mnt[0],
>>> -                                       stack[0].dentry, false, true);
>>> +               err = ovl_verify_origin(upperpath.dentry,
>>> +                                       oe->lowerstack[0].layer->mnt,
>>> +                                       oe->lowerstack[0].dentry,
>>> +                                       false, true);
>>>                 if (err) {
>>>                         pr_err("overlayfs: failed to verify upper root origin\n");
>>> -                       goto out_put_lower_mnt;
>>> +                       goto out_free_oe;
>>>                 }
>>>
>>>                 ufs->indexdir = ovl_workdir_create(sb, ufs, workpath.dentry,
>>> @@ -1081,7 +1106,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>>>                         if (!err)
>>>                                 err = ovl_indexdir_cleanup(ufs->indexdir,
>>>                                                            ufs->upper_mnt,
>>> -                                                          stack, numlower);
>>> +                                                          oe->lowerstack,
>>> +                                                          numlower);
>>>                 }
>>>                 if (err || !ufs->indexdir)
>>>                         pr_warn("overlayfs: try deleting index dir or mounting with '-o index=off' to disable inodes index.\n");
>>> @@ -1106,11 +1132,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>>>         /* Never override disk quota limits or use reserved space */
>>>         cap_lower(cred->cap_effective, CAP_SYS_RESOURCE);
>>>
>>> -       err = -ENOMEM;
>>> -       oe = ovl_alloc_entry(numlower);
>>> -       if (!oe)
>>> -               goto out_put_cred;
>>> -
>>>         sb->s_magic = OVERLAYFS_SUPER_MAGIC;
>>>         sb->s_op = &ovl_super_operations;
>>>         sb->s_xattr = ovl_xattr_handlers;
>>> @@ -1119,11 +1140,12 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>>>
>>>         root_dentry = d_make_root(ovl_new_inode(sb, S_IFDIR, 0));
>>>         if (!root_dentry)
>>> -               goto out_free_oe;
>>> +               goto out_put_cred;
>>>
>>>         mntput(upperpath.mnt);
>>>         for (i = 0; i < numlower; i++)
>>>                 mntput(stack[i].mnt);
>>> +       kfree(stack);
>>>         mntput(workpath.mnt);
>>>         kfree(lowertmp);
>>>
>>> @@ -1132,11 +1154,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>>>                 if (ovl_is_impuredir(upperpath.dentry))
>>>                         ovl_set_flag(OVL_IMPURE, d_inode(root_dentry));
>>>         }
>>> -       for (i = 0; i < numlower; i++) {
>>> -               oe->lowerstack[i].dentry = stack[i].dentry;
>>> -               oe->lowerstack[i].mnt = ufs->lower_mnt[i];
>>> -       }
>>> -       kfree(stack);
>>>
>>>         root_dentry->d_fsdata = oe;
>>>
>>> @@ -1147,16 +1164,19 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>>>
>>>         return 0;
>>>
>>> -out_free_oe:
>>> -       kfree(oe);
>>>  out_put_cred:
>>>         put_cred(ufs->creator_cred);
>>>  out_put_indexdir:
>>>         dput(ufs->indexdir);
>>> -out_put_lower_mnt:
>>> -       for (i = 0; i < ufs->numlower; i++)
>>> -               mntput(ufs->lower_mnt[i]);
>>> -       kfree(ufs->lower_mnt);
>>> +out_free_oe:
>>> +       kfree(oe);
>>> +out_put_lower_layers:
>>> +       for (i = 0; i < ufs->numlower; i++) {
>>> +               if (ufs->lower_layers[i].mnt)
>>> +                       free_anon_bdev(ufs->lower_layers[i].pseudo_dev);
>>> +               mntput(ufs->lower_layers[i].mnt);
>>> +       }
>>> +       kfree(ufs->lower_layers);
>>>  out_put_workdir:
>>>         dput(ufs->workdir);
>>>         mntput(ufs->upper_mnt);
>>
>> I'm not even attempting to review the ovl_fill_super() changes.  We
>> need to do something about that function (it's my fault for letting it
>> grow this big and convoluted).  Do you want me to have a go at
>> cleaning it up?  Or do you feel an urging desire to do so?
>>
>
> Well, I am up to my neck with the NFS export patches and besides
> I probably won't get the cleanup done the way you would ;-)
> so I prefer you have a go at it and I will rebase the patches on top
> of your cleanup.
>

Pushed the requested changes to branch:
https://github.com/amir73il/linux/commits/ovl-nonsamefs-v6

Currently same tip as overlayfs-devel.

Amir.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 2/4] ovl: allocate anonymous devs for lowerdirs
  2017-10-31 23:01       ` Amir Goldstein
@ 2017-11-01 13:17         ` Chandan Rajendra
  2017-11-01 13:34           ` Amir Goldstein
  0 siblings, 1 reply; 22+ messages in thread
From: Chandan Rajendra @ 2017-11-01 13:17 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, zhangyi, linux-unionfs

On Wednesday, November 1, 2017 4:31:00 AM IST Amir Goldstein wrote:
> On Tue, Oct 31, 2017 at 3:53 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> > On Tue, Oct 31, 2017 at 3:43 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> >> On Mon, Oct 30, 2017 at 9:27 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> >>> From: Chandan Rajendra <chandan@linux.vnet.ibm.com>
> >>>
> >>> For stat(2) on lowerdir non-dir entries in non-samefs case, this commit
> >>> provides unique values for st_dev. The unique values are obtained by
> >>> allocating anonymous bdevs for each of the lowerdirs in the overlayfs
> >>> instance.
> >>>
> >>> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
> >>> [amir: rename to struct ovl_layer lower_layers and struct ovl_path]
> >>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> >>> ---
> >>>  fs/overlayfs/inode.c     | 18 +++++++++++
> >>>  fs/overlayfs/namei.c     | 52 ++++++++++++++++++-------------
> >>>  fs/overlayfs/overlayfs.h |  4 +--
> >>>  fs/overlayfs/ovl_entry.h | 14 +++++++--
> >>>  fs/overlayfs/readdir.c   |  4 +--
> >>>  fs/overlayfs/super.c     | 80 ++++++++++++++++++++++++++++++------------------
> >>>  fs/overlayfs/util.c      |  7 ++++-
> >>>  7 files changed, 121 insertions(+), 58 deletions(-)
> >>>
> >>> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> >>> index 50e233ccca53..ec5c3ce91868 100644
> >>> --- a/fs/overlayfs/inode.c
> >>> +++ b/fs/overlayfs/inode.c
> >>> @@ -8,6 +8,7 @@
> >>>   */
> >>>
> >>>  #include <linux/fs.h>
> >>> +#include <linux/mount.h>
> >>>  #include <linux/slab.h>
> >>>  #include <linux/cred.h>
> >>>  #include <linux/xattr.h>
> >>> @@ -15,6 +16,21 @@
> >>>  #include <linux/ratelimit.h>
> >>>  #include "overlayfs.h"
> >>>
> >>> +
> >>> +static dev_t ovl_get_pseudo_dev(struct dentry *dentry, dev_t dev)
> >>> +{
> >>> +       struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
> >>> +       struct ovl_entry *oe = dentry->d_fsdata;
> >>> +
> >>> +       if (ofs->upper_mnt && ofs->upper_mnt->mnt_sb->s_dev == dev)
> >>> +               return dev;
> >>
> >> This doesn't feel right. Okay, we may want to check against upper
> >> device number for sanity and WARN if doesn't match, although we can
> >> have all sorts of weirdness with btrfs, for example.   So the logic
> >> should be:
> >>
> >> if type is upper then use the dev returned by stat
> >> else return the pseudo dev
> >>
> >>
> >>> +
> >>> +       if (oe->numlower)
> >>> +               return oe->lowerstack[0].layer->pseudo_dev;
> >>> +
> >>> +       return dev;
> >>> +}
> >>> +
> >>>  int ovl_setattr(struct dentry *dentry, struct iattr *attr)
> >>>  {
> >>>         int err;
> >>> @@ -121,6 +137,8 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
> >>>                  */
> >>>                 stat->dev = dentry->d_sb->s_dev;
> >>>                 stat->ino = dentry->d_inode->i_ino;
> >>> +       } else {
> >>> +               stat->dev = ovl_get_pseudo_dev(dentry, stat->dev);
> >>>         }
> >>>
> >>>         /*
> >>
> >> The above can be a separate patch.  I.e. split this into:
> >>
> >>  1) add ovl_layer infrastructure
> >>  2) use that infrastructure in getattr
> >>
> >> 1) doesn't change functionality while 2) does.
> >>
> >>> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> >>> index de2dac98e147..78e965527167 100644
> >>> --- a/fs/overlayfs/namei.c
> >>> +++ b/fs/overlayfs/namei.c
> >>> @@ -285,16 +285,15 @@ static int ovl_lookup_layer(struct dentry *base, struct ovl_lookup_data *d,
> >>>
> >>>
> >>>  static int ovl_check_origin(struct dentry *upperdentry,
> >>> -                           struct path *lowerstack, unsigned int numlower,
> >>> -                           struct path **stackp, unsigned int *ctrp)
> >>> +                           struct ovl_path *lower, unsigned int numlower,
> >>> +                           struct ovl_path **stackp, unsigned int *ctrp)
> >>>  {
> >>>         struct vfsmount *mnt;
> >>>         struct dentry *origin = NULL;
> >>>         int i;
> >>>
> >>> -
> >>>         for (i = 0; i < numlower; i++) {
> >>> -               mnt = lowerstack[i].mnt;
> >>> +               mnt = lower[i].layer->mnt;
> >>>                 origin = ovl_get_origin(upperdentry, mnt);
> >>>                 if (IS_ERR(origin))
> >>>                         return PTR_ERR(origin);
> >>> @@ -308,12 +307,12 @@ static int ovl_check_origin(struct dentry *upperdentry,
> >>>
> >>>         BUG_ON(*ctrp);
> >>>         if (!*stackp)
> >>> -               *stackp = kmalloc(sizeof(struct path), GFP_KERNEL);
> >>> +               *stackp = kmalloc(sizeof(struct ovl_path), GFP_KERNEL);
> >>>         if (!*stackp) {
> >>>                 dput(origin);
> >>>                 return -ENOMEM;
> >>>         }
> >>> -       **stackp = (struct path) { .dentry = origin, .mnt = mnt };
> >>> +       **stackp = (struct ovl_path){.dentry = origin, .layer = lower[i].layer};
> >>>         *ctrp = 1;
> >>>
> >>>         return 0;
> >>> @@ -383,13 +382,13 @@ int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
> >>>   * OVL_XATTR_ORIGIN and that origin file handle can be decoded to lower path.
> >>>   * Return 0 on match, -ESTALE on mismatch or stale origin, < 0 on error.
> >>>   */
> >>> -int ovl_verify_index(struct dentry *index, struct path *lowerstack,
> >>> +int ovl_verify_index(struct dentry *index, struct ovl_path *lower,
> >>>                      unsigned int numlower)
> >>>  {
> >>>         struct ovl_fh *fh = NULL;
> >>>         size_t len;
> >>> -       struct path origin = { };
> >>> -       struct path *stack = &origin;
> >>> +       struct ovl_path origin = { };
> >>> +       struct ovl_path *stack = &origin;
> >>>         unsigned int ctr = 0;
> >>>         int err;
> >>>
> >>> @@ -428,7 +427,7 @@ int ovl_verify_index(struct dentry *index, struct path *lowerstack,
> >>>         if (err)
> >>>                 goto fail;
> >>>
> >>> -       err = ovl_check_origin(index, lowerstack, numlower, &stack, &ctr);
> >>> +       err = ovl_check_origin(index, lower, numlower, &stack, &ctr);
> >>>         if (!err && !ctr)
> >>>                 err = -ESTALE;
> >>>         if (err)
> >>> @@ -570,11 +569,24 @@ int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp)
> >>>         }
> >>>         BUG_ON(idx > oe->numlower);
> >>>         *idxp = idx;
> >>> -       *path = oe->lowerstack[idx - 1];
> >>> +       path->dentry = oe->lowerstack[idx - 1].dentry;
> >>> +       path->mnt = oe->lowerstack[idx - 1].layer->mnt;
> >>>
> >>>         return (idx < oe->numlower) ? idx + 1 : -1;
> >>>  }
> >>>
> >>> +static int ovl_find_layer(struct ovl_fs *ofs, struct ovl_path *path)
> >>> +{
> >>> +       int i;
> >>> +
> >>> +       for (i = 0; i < ofs->numlower; i++) {
> >>> +               if (ofs->lower_layers[i].mnt == path->layer->mnt)
> >>> +                       break;
> >>> +       }
> >>> +
> >>> +       return i;
> >>> +}
> >>> +
> >>>  struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
> >>>                           unsigned int flags)
> >>>  {
> >>> @@ -583,7 +595,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
> >>>         struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
> >>>         struct ovl_entry *poe = dentry->d_parent->d_fsdata;
> >>>         struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata;
> >>> -       struct path *stack = NULL;
> >>> +       struct ovl_path *stack = NULL;
> >>>         struct dentry *upperdir, *upperdentry = NULL;
> >>>         struct dentry *index = NULL;
> >>>         unsigned int ctr = 0;
> >>> @@ -648,17 +660,17 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
> >>>
> >>>         if (!d.stop && poe->numlower) {
> >>>                 err = -ENOMEM;
> >>> -               stack = kcalloc(ofs->numlower, sizeof(struct path),
> >>> +               stack = kcalloc(ofs->numlower, sizeof(struct ovl_path),
> >>>                                 GFP_KERNEL);
> >>>                 if (!stack)
> >>>                         goto out_put_upper;
> >>>         }
> >>>
> >>>         for (i = 0; !d.stop && i < poe->numlower; i++) {
> >>> -               struct path lowerpath = poe->lowerstack[i];
> >>> +               struct ovl_path lower = poe->lowerstack[i];
> >>>
> >>>                 d.last = i == poe->numlower - 1;
> >>> -               err = ovl_lookup_layer(lowerpath.dentry, &d, &this);
> >>> +               err = ovl_lookup_layer(lower.dentry, &d, &this);
> >>>                 if (err)
> >>>                         goto out_put;
> >>>
> >>> @@ -666,7 +678,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
> >>>                         continue;
> >>>
> >>>                 stack[ctr].dentry = this;
> >>> -               stack[ctr].mnt = lowerpath.mnt;
> >>> +               stack[ctr].layer = lower.layer;
> >>>                 ctr++;
> >>>
> >>>                 if (d.stop)
> >>> @@ -676,10 +688,8 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
> >>>                         poe = roe;
> >>>
> >>>                         /* Find the current layer on the root dentry */
> >>> -                       for (i = 0; i < poe->numlower; i++)
> >>> -                               if (poe->lowerstack[i].mnt == lowerpath.mnt)
> >>> -                                       break;
> >>> -                       if (WARN_ON(i == poe->numlower))
> >>> +                       i = ovl_find_layer(ofs, &lower);
> >>> +                       if (WARN_ON(i == ofs->numlower))
> >>>                                 break;
> >>>                 }
> >>>         }
> >>> @@ -702,7 +712,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
> >>>                 goto out_put;
> >>>
> >>>         oe->opaque = upperopaque;
> >>> -       memcpy(oe->lowerstack, stack, sizeof(struct path) * ctr);
> >>> +       memcpy(oe->lowerstack, stack, sizeof(struct ovl_path) * ctr);
> >>>         dentry->d_fsdata = oe;
> >>>
> >>>         if (upperdentry)
> >>> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> >>> index 73ef1e850635..335e9a052995 100644
> >>> --- a/fs/overlayfs/overlayfs.h
> >>> +++ b/fs/overlayfs/overlayfs.h
> >>> @@ -248,7 +248,7 @@ static inline bool ovl_is_impuredir(struct dentry *dentry)
> >>>  /* namei.c */
> >>>  int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
> >>>                       struct dentry *origin, bool is_upper, bool set);
> >>> -int ovl_verify_index(struct dentry *index, struct path *lowerstack,
> >>> +int ovl_verify_index(struct dentry *index, struct ovl_path *lower,
> >>>                      unsigned int numlower);
> >>>  int ovl_get_index_name(struct dentry *origin, struct qstr *name);
> >>>  int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp);
> >>> @@ -265,7 +265,7 @@ int ovl_check_d_type_supported(struct path *realpath);
> >>>  void ovl_workdir_cleanup(struct inode *dir, struct vfsmount *mnt,
> >>>                          struct dentry *dentry, int level);
> >>>  int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
> >>> -                        struct path *lowerstack, unsigned int numlower);
> >>> +                        struct ovl_path *lower, unsigned int numlower);
> >>>
> >>>  /* inode.c */
> >>>  int ovl_set_nlink_upper(struct dentry *dentry);
> >>> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> >>> index 25d9b5adcd42..93eb6a044dd2 100644
> >>> --- a/fs/overlayfs/ovl_entry.h
> >>> +++ b/fs/overlayfs/ovl_entry.h
> >>> @@ -17,11 +17,21 @@ struct ovl_config {
> >>>         bool index;
> >>>  };
> >>>
> >>> +struct ovl_layer {
> >>> +       struct vfsmount *mnt;
> >>> +       dev_t pseudo_dev;
> >>> +};
> >>> +
> >>> +struct ovl_path {
> >>> +       struct ovl_layer *layer;
> >>> +       struct dentry *dentry;
> >>> +};
> >>> +
> >>>  /* private information held for overlayfs's superblock */
> >>>  struct ovl_fs {
> >>>         struct vfsmount *upper_mnt;
> >>>         unsigned numlower;
> >>> -       struct vfsmount **lower_mnt;
> >>> +       struct ovl_layer *lower_layers;
> >>>         /* workbasedir is the path at workdir= mount option */
> >>>         struct dentry *workbasedir;
> >>>         /* workdir is the 'work' directory under workbasedir */
> >>> @@ -52,7 +62,7 @@ struct ovl_entry {
> >>>                 struct rcu_head rcu;
> >>>         };
> >>>         unsigned numlower;
> >>> -       struct path lowerstack[];
> >>> +       struct ovl_path lowerstack[];
> >>>  };
> >>>
> >>>  struct ovl_entry *ovl_alloc_entry(unsigned int numlower);
> >>> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> >>> index bcc123c7706c..0ab657f2c1c8 100644
> >>> --- a/fs/overlayfs/readdir.c
> >>> +++ b/fs/overlayfs/readdir.c
> >>> @@ -1020,7 +1020,7 @@ void ovl_workdir_cleanup(struct inode *dir, struct vfsmount *mnt,
> >>>  }
> >>>
> >>>  int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
> >>> -                        struct path *lowerstack, unsigned int numlower)
> >>> +                        struct ovl_path *lower, unsigned int numlower)
> >>>  {
> >>>         int err;
> >>>         struct dentry *index = NULL;
> >>> @@ -1055,7 +1055,7 @@ int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
> >>>                         index = NULL;
> >>>                         break;
> >>>                 }
> >>> -               err = ovl_verify_index(index, lowerstack, numlower);
> >>> +               err = ovl_verify_index(index, lower, numlower);
> >>>                 /* Cleanup stale and orphan index entries */
> >>>                 if (err && (err == -ESTALE || err == -ENOENT))
> >>>                         err = ovl_cleanup(dir, index);
> >>> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> >>> index 8702803ba328..323dfd7a0236 100644
> >>> --- a/fs/overlayfs/super.c
> >>> +++ b/fs/overlayfs/super.c
> >>> @@ -219,9 +219,11 @@ static void ovl_put_super(struct super_block *sb)
> >>>         if (ufs->upper_mnt && ufs->upperdir_locked)
> >>>                 ovl_inuse_unlock(ufs->upper_mnt->mnt_root);
> >>>         mntput(ufs->upper_mnt);
> >>> -       for (i = 0; i < ufs->numlower; i++)
> >>> -               mntput(ufs->lower_mnt[i]);
> >>> -       kfree(ufs->lower_mnt);
> >>> +       for (i = 0; i < ufs->numlower; i++) {
> >>> +               mntput(ufs->lower_layers[i].mnt);
> >>> +               free_anon_bdev(ufs->lower_layers[i].pseudo_dev);
> >>> +       }
> >>> +       kfree(ufs->lower_layers);
> >>>
> >>>         kfree(ufs->config.lowerdir);
> >>>         kfree(ufs->config.upperdir);
> >>> @@ -1026,24 +1028,35 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
> >>>         }
> >>>
> >>>         err = -ENOMEM;
> >>> -       ufs->lower_mnt = kcalloc(numlower, sizeof(struct vfsmount *), GFP_KERNEL);
> >>> -       if (ufs->lower_mnt == NULL)
> >>> +       ufs->lower_layers = kcalloc(numlower, sizeof(struct ovl_layer),
> >>> +                                   GFP_KERNEL);
> >>> +       if (ufs->lower_layers == NULL)
> >>>                 goto out_put_workdir;
> >>>         for (i = 0; i < numlower; i++) {
> >>> -               struct vfsmount *mnt = clone_private_mount(&stack[i]);
> >>> +               struct vfsmount *mnt;
> >>> +               dev_t dev;
> >>> +
> >>> +               err = get_anon_bdev(&dev);
> >>> +               if (err) {
> >>> +                       pr_err("overlayfs: failed to get anonymous bdev for lowerpath\n");
> >>> +                       goto out_put_lower_layers;
> >>> +               }
> >>>
> >>> +               mnt = clone_private_mount(&stack[i]);
> >>>                 err = PTR_ERR(mnt);
> >>>                 if (IS_ERR(mnt)) {
> >>>                         pr_err("overlayfs: failed to clone lowerpath\n");
> >>> -                       goto out_put_lower_mnt;
> >>> +                       free_anon_bdev(dev);
> >>> +                       goto out_put_lower_layers;
> >>>                 }
> >>>                 /*
> >>> -                * Make lower_mnt R/O.  That way fchmod/fchown on lower file
> >>> +                * Make lower layers R/O.  That way fchmod/fchown on lower file
> >>>                  * will fail instead of modifying lower fs.
> >>>                  */
> >>>                 mnt->mnt_flags |= MNT_READONLY | MNT_NOATIME;
> >>>
> >>> -               ufs->lower_mnt[ufs->numlower] = mnt;
> >>> +               ufs->lower_layers[ufs->numlower].mnt = mnt;
> >>> +               ufs->lower_layers[ufs->numlower].pseudo_dev = dev;
> >>>                 ufs->numlower++;
> >>>
> >>>                 /* Check if all lower layers are on same sb */
> >>> @@ -1059,13 +1072,25 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
> >>>         else if (ufs->upper_mnt->mnt_sb != ufs->same_sb)
> >>>                 ufs->same_sb = NULL;
> >>>
> >>> +       err = -ENOMEM;
> >>> +       oe = ovl_alloc_entry(numlower);
> >>> +       if (!oe)
> >>> +               goto out_put_lower_layers;
> >>> +
> >>> +       for (i = 0; i < numlower; i++) {
> >>> +               oe->lowerstack[i].dentry = stack[i].dentry;
> >>> +               oe->lowerstack[i].layer = &(ufs->lower_layers[i]);
> >>> +       }
> >>> +
> >>>         if (!(ovl_force_readonly(ufs)) && ufs->config.index) {
> >>>                 /* Verify lower root is upper root origin */
> >>> -               err = ovl_verify_origin(upperpath.dentry, ufs->lower_mnt[0],
> >>> -                                       stack[0].dentry, false, true);
> >>> +               err = ovl_verify_origin(upperpath.dentry,
> >>> +                                       oe->lowerstack[0].layer->mnt,
> >>> +                                       oe->lowerstack[0].dentry,
> >>> +                                       false, true);
> >>>                 if (err) {
> >>>                         pr_err("overlayfs: failed to verify upper root origin\n");
> >>> -                       goto out_put_lower_mnt;
> >>> +                       goto out_free_oe;
> >>>                 }
> >>>
> >>>                 ufs->indexdir = ovl_workdir_create(sb, ufs, workpath.dentry,
> >>> @@ -1081,7 +1106,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
> >>>                         if (!err)
> >>>                                 err = ovl_indexdir_cleanup(ufs->indexdir,
> >>>                                                            ufs->upper_mnt,
> >>> -                                                          stack, numlower);
> >>> +                                                          oe->lowerstack,
> >>> +                                                          numlower);
> >>>                 }
> >>>                 if (err || !ufs->indexdir)
> >>>                         pr_warn("overlayfs: try deleting index dir or mounting with '-o index=off' to disable inodes index.\n");
> >>> @@ -1106,11 +1132,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
> >>>         /* Never override disk quota limits or use reserved space */
> >>>         cap_lower(cred->cap_effective, CAP_SYS_RESOURCE);
> >>>
> >>> -       err = -ENOMEM;
> >>> -       oe = ovl_alloc_entry(numlower);
> >>> -       if (!oe)
> >>> -               goto out_put_cred;
> >>> -
> >>>         sb->s_magic = OVERLAYFS_SUPER_MAGIC;
> >>>         sb->s_op = &ovl_super_operations;
> >>>         sb->s_xattr = ovl_xattr_handlers;
> >>> @@ -1119,11 +1140,12 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
> >>>
> >>>         root_dentry = d_make_root(ovl_new_inode(sb, S_IFDIR, 0));
> >>>         if (!root_dentry)
> >>> -               goto out_free_oe;
> >>> +               goto out_put_cred;
> >>>
> >>>         mntput(upperpath.mnt);
> >>>         for (i = 0; i < numlower; i++)
> >>>                 mntput(stack[i].mnt);
> >>> +       kfree(stack);
> >>>         mntput(workpath.mnt);
> >>>         kfree(lowertmp);
> >>>
> >>> @@ -1132,11 +1154,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
> >>>                 if (ovl_is_impuredir(upperpath.dentry))
> >>>                         ovl_set_flag(OVL_IMPURE, d_inode(root_dentry));
> >>>         }
> >>> -       for (i = 0; i < numlower; i++) {
> >>> -               oe->lowerstack[i].dentry = stack[i].dentry;
> >>> -               oe->lowerstack[i].mnt = ufs->lower_mnt[i];
> >>> -       }
> >>> -       kfree(stack);
> >>>
> >>>         root_dentry->d_fsdata = oe;
> >>>
> >>> @@ -1147,16 +1164,19 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
> >>>
> >>>         return 0;
> >>>
> >>> -out_free_oe:
> >>> -       kfree(oe);
> >>>  out_put_cred:
> >>>         put_cred(ufs->creator_cred);
> >>>  out_put_indexdir:
> >>>         dput(ufs->indexdir);
> >>> -out_put_lower_mnt:
> >>> -       for (i = 0; i < ufs->numlower; i++)
> >>> -               mntput(ufs->lower_mnt[i]);
> >>> -       kfree(ufs->lower_mnt);
> >>> +out_free_oe:
> >>> +       kfree(oe);
> >>> +out_put_lower_layers:
> >>> +       for (i = 0; i < ufs->numlower; i++) {
> >>> +               if (ufs->lower_layers[i].mnt)
> >>> +                       free_anon_bdev(ufs->lower_layers[i].pseudo_dev);
> >>> +               mntput(ufs->lower_layers[i].mnt);
> >>> +       }
> >>> +       kfree(ufs->lower_layers);
> >>>  out_put_workdir:
> >>>         dput(ufs->workdir);
> >>>         mntput(ufs->upper_mnt);
> >>
> >> I'm not even attempting to review the ovl_fill_super() changes.  We
> >> need to do something about that function (it's my fault for letting it
> >> grow this big and convoluted).  Do you want me to have a go at
> >> cleaning it up?  Or do you feel an urging desire to do so?
> >>
> >
> > Well, I am up to my neck with the NFS export patches and besides
> > I probably won't get the cleanup done the way you would ;-)
> > so I prefer you have a go at it and I will rebase the patches on top
> > of your cleanup.
> >
> 
> Pushed the requested changes to branch:
> https://github.com/amir73il/linux/commits/ovl-nonsamefs-v6
> 
> Currently same tip as overlayfs-devel.
> 

Amir, The changes look fine to me. Thanks for addressing the review comments.

-- 
chandan

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 2/4] ovl: allocate anonymous devs for lowerdirs
  2017-11-01 13:17         ` Chandan Rajendra
@ 2017-11-01 13:34           ` Amir Goldstein
  0 siblings, 0 replies; 22+ messages in thread
From: Amir Goldstein @ 2017-11-01 13:34 UTC (permalink / raw)
  To: Chandan Rajendra; +Cc: Miklos Szeredi, zhangyi, linux-unionfs

On Wed, Nov 1, 2017 at 3:17 PM, Chandan Rajendra
<chandan@linux.vnet.ibm.com> wrote:
> On Wednesday, November 1, 2017 4:31:00 AM IST Amir Goldstein wrote:
...
>>
>> Pushed the requested changes to branch:
>> https://github.com/amir73il/linux/commits/ovl-nonsamefs-v6
>>
>> Currently same tip as overlayfs-devel.
>>
>
> Amir, The changes look fine to me. Thanks for addressing the review comments.
>

No problem, but just found another problem in testing -
When we iterate pure lower or upper dir type entries
we do not adjust their d_ino to overlay ino, which is differnt
than real ino for non-samefs dirs.
example ((mnt/b is pure upper):
=====================================
# ls -li /mnt/
total 0
16237 drwxr-xr-x  2 root root   40 Nov  1 13:12 b

# find /mnt -inum 16237
/mnt/b

# ../xfstests/src/t_dir_type /mnt/b 16237
. d

# ../xfstests/src/t_dir_type /mnt/b 16237

<nothing>
=====================================

This problem does not affect 'find -inum' because 'find' always
does stat(2) for directories to check st_ino and more.

But the problem does affect decoding of NFS file handles
because the default get_name() implementation relies on
d_ino lookup to match child name with disconnected child dentry.

I will fix it up.

Chandan,

When you have time, please send a fix to xfstests 038 and 041 to test
this missed use case and please send a nonsamefs test variant of 017.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 2/4] ovl: allocate anonymous devs for lowerdirs
  2017-10-30 20:27 ` [PATCH v5 2/4] ovl: allocate anonymous devs for lowerdirs Amir Goldstein
  2017-10-31 13:43   ` Miklos Szeredi
@ 2017-11-01 14:42   ` Vivek Goyal
  2017-11-01 15:02     ` Amir Goldstein
  2017-11-01 15:41   ` Vivek Goyal
  2 siblings, 1 reply; 22+ messages in thread
From: Vivek Goyal @ 2017-11-01 14:42 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, Chandan Rajendra, zhangyi, linux-unionfs

On Mon, Oct 30, 2017 at 10:27:25PM +0200, Amir Goldstein wrote:
> From: Chandan Rajendra <chandan@linux.vnet.ibm.com>
> 
> For stat(2) on lowerdir non-dir entries in non-samefs case, this commit
> provides unique values for st_dev. The unique values are obtained by
> allocating anonymous bdevs for each of the lowerdirs in the overlayfs
> instance.

Hi Amir, Chandan,

In the commit message, can we also mention what's the current behavior
and why this new behavior beneficial/desirable.

Vivek

> 
> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
> [amir: rename to struct ovl_layer lower_layers and struct ovl_path]
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/overlayfs/inode.c     | 18 +++++++++++
>  fs/overlayfs/namei.c     | 52 ++++++++++++++++++-------------
>  fs/overlayfs/overlayfs.h |  4 +--
>  fs/overlayfs/ovl_entry.h | 14 +++++++--
>  fs/overlayfs/readdir.c   |  4 +--
>  fs/overlayfs/super.c     | 80 ++++++++++++++++++++++++++++++------------------
>  fs/overlayfs/util.c      |  7 ++++-
>  7 files changed, 121 insertions(+), 58 deletions(-)
> 
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 50e233ccca53..ec5c3ce91868 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -8,6 +8,7 @@
>   */
>  
>  #include <linux/fs.h>
> +#include <linux/mount.h>
>  #include <linux/slab.h>
>  #include <linux/cred.h>
>  #include <linux/xattr.h>
> @@ -15,6 +16,21 @@
>  #include <linux/ratelimit.h>
>  #include "overlayfs.h"
>  
> +
> +static dev_t ovl_get_pseudo_dev(struct dentry *dentry, dev_t dev)
> +{
> +	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
> +	struct ovl_entry *oe = dentry->d_fsdata;
> +
> +	if (ofs->upper_mnt && ofs->upper_mnt->mnt_sb->s_dev == dev)
> +		return dev;
> +
> +	if (oe->numlower)
> +		return oe->lowerstack[0].layer->pseudo_dev;
> +
> +	return dev;
> +}
> +
>  int ovl_setattr(struct dentry *dentry, struct iattr *attr)
>  {
>  	int err;
> @@ -121,6 +137,8 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
>  		 */
>  		stat->dev = dentry->d_sb->s_dev;
>  		stat->ino = dentry->d_inode->i_ino;
> +	} else {
> +		stat->dev = ovl_get_pseudo_dev(dentry, stat->dev);
>  	}
>  
>  	/*
> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> index de2dac98e147..78e965527167 100644
> --- a/fs/overlayfs/namei.c
> +++ b/fs/overlayfs/namei.c
> @@ -285,16 +285,15 @@ static int ovl_lookup_layer(struct dentry *base, struct ovl_lookup_data *d,
>  
>  
>  static int ovl_check_origin(struct dentry *upperdentry,
> -			    struct path *lowerstack, unsigned int numlower,
> -			    struct path **stackp, unsigned int *ctrp)
> +			    struct ovl_path *lower, unsigned int numlower,
> +			    struct ovl_path **stackp, unsigned int *ctrp)
>  {
>  	struct vfsmount *mnt;
>  	struct dentry *origin = NULL;
>  	int i;
>  
> -
>  	for (i = 0; i < numlower; i++) {
> -		mnt = lowerstack[i].mnt;
> +		mnt = lower[i].layer->mnt;
>  		origin = ovl_get_origin(upperdentry, mnt);
>  		if (IS_ERR(origin))
>  			return PTR_ERR(origin);
> @@ -308,12 +307,12 @@ static int ovl_check_origin(struct dentry *upperdentry,
>  
>  	BUG_ON(*ctrp);
>  	if (!*stackp)
> -		*stackp = kmalloc(sizeof(struct path), GFP_KERNEL);
> +		*stackp = kmalloc(sizeof(struct ovl_path), GFP_KERNEL);
>  	if (!*stackp) {
>  		dput(origin);
>  		return -ENOMEM;
>  	}
> -	**stackp = (struct path) { .dentry = origin, .mnt = mnt };
> +	**stackp = (struct ovl_path){.dentry = origin, .layer = lower[i].layer};
>  	*ctrp = 1;
>  
>  	return 0;
> @@ -383,13 +382,13 @@ int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
>   * OVL_XATTR_ORIGIN and that origin file handle can be decoded to lower path.
>   * Return 0 on match, -ESTALE on mismatch or stale origin, < 0 on error.
>   */
> -int ovl_verify_index(struct dentry *index, struct path *lowerstack,
> +int ovl_verify_index(struct dentry *index, struct ovl_path *lower,
>  		     unsigned int numlower)
>  {
>  	struct ovl_fh *fh = NULL;
>  	size_t len;
> -	struct path origin = { };
> -	struct path *stack = &origin;
> +	struct ovl_path origin = { };
> +	struct ovl_path *stack = &origin;
>  	unsigned int ctr = 0;
>  	int err;
>  
> @@ -428,7 +427,7 @@ int ovl_verify_index(struct dentry *index, struct path *lowerstack,
>  	if (err)
>  		goto fail;
>  
> -	err = ovl_check_origin(index, lowerstack, numlower, &stack, &ctr);
> +	err = ovl_check_origin(index, lower, numlower, &stack, &ctr);
>  	if (!err && !ctr)
>  		err = -ESTALE;
>  	if (err)
> @@ -570,11 +569,24 @@ int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp)
>  	}
>  	BUG_ON(idx > oe->numlower);
>  	*idxp = idx;
> -	*path = oe->lowerstack[idx - 1];
> +	path->dentry = oe->lowerstack[idx - 1].dentry;
> +	path->mnt = oe->lowerstack[idx - 1].layer->mnt;
>  
>  	return (idx < oe->numlower) ? idx + 1 : -1;
>  }
>  
> +static int ovl_find_layer(struct ovl_fs *ofs, struct ovl_path *path)
> +{
> +	int i;
> +
> +	for (i = 0; i < ofs->numlower; i++) {
> +		if (ofs->lower_layers[i].mnt == path->layer->mnt)
> +			break;
> +	}
> +
> +	return i;
> +}
> +
>  struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  			  unsigned int flags)
>  {
> @@ -583,7 +595,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
>  	struct ovl_entry *poe = dentry->d_parent->d_fsdata;
>  	struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata;
> -	struct path *stack = NULL;
> +	struct ovl_path *stack = NULL;
>  	struct dentry *upperdir, *upperdentry = NULL;
>  	struct dentry *index = NULL;
>  	unsigned int ctr = 0;
> @@ -648,17 +660,17 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  
>  	if (!d.stop && poe->numlower) {
>  		err = -ENOMEM;
> -		stack = kcalloc(ofs->numlower, sizeof(struct path),
> +		stack = kcalloc(ofs->numlower, sizeof(struct ovl_path),
>  				GFP_KERNEL);
>  		if (!stack)
>  			goto out_put_upper;
>  	}
>  
>  	for (i = 0; !d.stop && i < poe->numlower; i++) {
> -		struct path lowerpath = poe->lowerstack[i];
> +		struct ovl_path lower = poe->lowerstack[i];
>  
>  		d.last = i == poe->numlower - 1;
> -		err = ovl_lookup_layer(lowerpath.dentry, &d, &this);
> +		err = ovl_lookup_layer(lower.dentry, &d, &this);
>  		if (err)
>  			goto out_put;
>  
> @@ -666,7 +678,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  			continue;
>  
>  		stack[ctr].dentry = this;
> -		stack[ctr].mnt = lowerpath.mnt;
> +		stack[ctr].layer = lower.layer;
>  		ctr++;
>  
>  		if (d.stop)
> @@ -676,10 +688,8 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  			poe = roe;
>  
>  			/* Find the current layer on the root dentry */
> -			for (i = 0; i < poe->numlower; i++)
> -				if (poe->lowerstack[i].mnt == lowerpath.mnt)
> -					break;
> -			if (WARN_ON(i == poe->numlower))
> +			i = ovl_find_layer(ofs, &lower);
> +			if (WARN_ON(i == ofs->numlower))
>  				break;
>  		}
>  	}
> @@ -702,7 +712,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  		goto out_put;
>  
>  	oe->opaque = upperopaque;
> -	memcpy(oe->lowerstack, stack, sizeof(struct path) * ctr);
> +	memcpy(oe->lowerstack, stack, sizeof(struct ovl_path) * ctr);
>  	dentry->d_fsdata = oe;
>  
>  	if (upperdentry)
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 73ef1e850635..335e9a052995 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -248,7 +248,7 @@ static inline bool ovl_is_impuredir(struct dentry *dentry)
>  /* namei.c */
>  int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
>  		      struct dentry *origin, bool is_upper, bool set);
> -int ovl_verify_index(struct dentry *index, struct path *lowerstack,
> +int ovl_verify_index(struct dentry *index, struct ovl_path *lower,
>  		     unsigned int numlower);
>  int ovl_get_index_name(struct dentry *origin, struct qstr *name);
>  int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp);
> @@ -265,7 +265,7 @@ int ovl_check_d_type_supported(struct path *realpath);
>  void ovl_workdir_cleanup(struct inode *dir, struct vfsmount *mnt,
>  			 struct dentry *dentry, int level);
>  int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
> -			 struct path *lowerstack, unsigned int numlower);
> +			 struct ovl_path *lower, unsigned int numlower);
>  
>  /* inode.c */
>  int ovl_set_nlink_upper(struct dentry *dentry);
> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> index 25d9b5adcd42..93eb6a044dd2 100644
> --- a/fs/overlayfs/ovl_entry.h
> +++ b/fs/overlayfs/ovl_entry.h
> @@ -17,11 +17,21 @@ struct ovl_config {
>  	bool index;
>  };
>  
> +struct ovl_layer {
> +	struct vfsmount *mnt;
> +	dev_t pseudo_dev;
> +};
> +
> +struct ovl_path {
> +	struct ovl_layer *layer;
> +	struct dentry *dentry;
> +};
> +
>  /* private information held for overlayfs's superblock */
>  struct ovl_fs {
>  	struct vfsmount *upper_mnt;
>  	unsigned numlower;
> -	struct vfsmount **lower_mnt;
> +	struct ovl_layer *lower_layers;
>  	/* workbasedir is the path at workdir= mount option */
>  	struct dentry *workbasedir;
>  	/* workdir is the 'work' directory under workbasedir */
> @@ -52,7 +62,7 @@ struct ovl_entry {
>  		struct rcu_head rcu;
>  	};
>  	unsigned numlower;
> -	struct path lowerstack[];
> +	struct ovl_path lowerstack[];
>  };
>  
>  struct ovl_entry *ovl_alloc_entry(unsigned int numlower);
> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> index bcc123c7706c..0ab657f2c1c8 100644
> --- a/fs/overlayfs/readdir.c
> +++ b/fs/overlayfs/readdir.c
> @@ -1020,7 +1020,7 @@ void ovl_workdir_cleanup(struct inode *dir, struct vfsmount *mnt,
>  }
>  
>  int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
> -			 struct path *lowerstack, unsigned int numlower)
> +			 struct ovl_path *lower, unsigned int numlower)
>  {
>  	int err;
>  	struct dentry *index = NULL;
> @@ -1055,7 +1055,7 @@ int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
>  			index = NULL;
>  			break;
>  		}
> -		err = ovl_verify_index(index, lowerstack, numlower);
> +		err = ovl_verify_index(index, lower, numlower);
>  		/* Cleanup stale and orphan index entries */
>  		if (err && (err == -ESTALE || err == -ENOENT))
>  			err = ovl_cleanup(dir, index);
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 8702803ba328..323dfd7a0236 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -219,9 +219,11 @@ static void ovl_put_super(struct super_block *sb)
>  	if (ufs->upper_mnt && ufs->upperdir_locked)
>  		ovl_inuse_unlock(ufs->upper_mnt->mnt_root);
>  	mntput(ufs->upper_mnt);
> -	for (i = 0; i < ufs->numlower; i++)
> -		mntput(ufs->lower_mnt[i]);
> -	kfree(ufs->lower_mnt);
> +	for (i = 0; i < ufs->numlower; i++) {
> +		mntput(ufs->lower_layers[i].mnt);
> +		free_anon_bdev(ufs->lower_layers[i].pseudo_dev);
> +	}
> +	kfree(ufs->lower_layers);
>  
>  	kfree(ufs->config.lowerdir);
>  	kfree(ufs->config.upperdir);
> @@ -1026,24 +1028,35 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  	}
>  
>  	err = -ENOMEM;
> -	ufs->lower_mnt = kcalloc(numlower, sizeof(struct vfsmount *), GFP_KERNEL);
> -	if (ufs->lower_mnt == NULL)
> +	ufs->lower_layers = kcalloc(numlower, sizeof(struct ovl_layer),
> +				    GFP_KERNEL);
> +	if (ufs->lower_layers == NULL)
>  		goto out_put_workdir;
>  	for (i = 0; i < numlower; i++) {
> -		struct vfsmount *mnt = clone_private_mount(&stack[i]);
> +		struct vfsmount *mnt;
> +		dev_t dev;
> +
> +		err = get_anon_bdev(&dev);
> +		if (err) {
> +			pr_err("overlayfs: failed to get anonymous bdev for lowerpath\n");
> +			goto out_put_lower_layers;
> +		}
>  
> +		mnt = clone_private_mount(&stack[i]);
>  		err = PTR_ERR(mnt);
>  		if (IS_ERR(mnt)) {
>  			pr_err("overlayfs: failed to clone lowerpath\n");
> -			goto out_put_lower_mnt;
> +			free_anon_bdev(dev);
> +			goto out_put_lower_layers;
>  		}
>  		/*
> -		 * Make lower_mnt R/O.  That way fchmod/fchown on lower file
> +		 * Make lower layers R/O.  That way fchmod/fchown on lower file
>  		 * will fail instead of modifying lower fs.
>  		 */
>  		mnt->mnt_flags |= MNT_READONLY | MNT_NOATIME;
>  
> -		ufs->lower_mnt[ufs->numlower] = mnt;
> +		ufs->lower_layers[ufs->numlower].mnt = mnt;
> +		ufs->lower_layers[ufs->numlower].pseudo_dev = dev;
>  		ufs->numlower++;
>  
>  		/* Check if all lower layers are on same sb */
> @@ -1059,13 +1072,25 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  	else if (ufs->upper_mnt->mnt_sb != ufs->same_sb)
>  		ufs->same_sb = NULL;
>  
> +	err = -ENOMEM;
> +	oe = ovl_alloc_entry(numlower);
> +	if (!oe)
> +		goto out_put_lower_layers;
> +
> +	for (i = 0; i < numlower; i++) {
> +		oe->lowerstack[i].dentry = stack[i].dentry;
> +		oe->lowerstack[i].layer = &(ufs->lower_layers[i]);
> +	}
> +
>  	if (!(ovl_force_readonly(ufs)) && ufs->config.index) {
>  		/* Verify lower root is upper root origin */
> -		err = ovl_verify_origin(upperpath.dentry, ufs->lower_mnt[0],
> -					stack[0].dentry, false, true);
> +		err = ovl_verify_origin(upperpath.dentry,
> +					oe->lowerstack[0].layer->mnt,
> +					oe->lowerstack[0].dentry,
> +					false, true);
>  		if (err) {
>  			pr_err("overlayfs: failed to verify upper root origin\n");
> -			goto out_put_lower_mnt;
> +			goto out_free_oe;
>  		}
>  
>  		ufs->indexdir = ovl_workdir_create(sb, ufs, workpath.dentry,
> @@ -1081,7 +1106,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  			if (!err)
>  				err = ovl_indexdir_cleanup(ufs->indexdir,
>  							   ufs->upper_mnt,
> -							   stack, numlower);
> +							   oe->lowerstack,
> +							   numlower);
>  		}
>  		if (err || !ufs->indexdir)
>  			pr_warn("overlayfs: try deleting index dir or mounting with '-o index=off' to disable inodes index.\n");
> @@ -1106,11 +1132,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  	/* Never override disk quota limits or use reserved space */
>  	cap_lower(cred->cap_effective, CAP_SYS_RESOURCE);
>  
> -	err = -ENOMEM;
> -	oe = ovl_alloc_entry(numlower);
> -	if (!oe)
> -		goto out_put_cred;
> -
>  	sb->s_magic = OVERLAYFS_SUPER_MAGIC;
>  	sb->s_op = &ovl_super_operations;
>  	sb->s_xattr = ovl_xattr_handlers;
> @@ -1119,11 +1140,12 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  
>  	root_dentry = d_make_root(ovl_new_inode(sb, S_IFDIR, 0));
>  	if (!root_dentry)
> -		goto out_free_oe;
> +		goto out_put_cred;
>  
>  	mntput(upperpath.mnt);
>  	for (i = 0; i < numlower; i++)
>  		mntput(stack[i].mnt);
> +	kfree(stack);
>  	mntput(workpath.mnt);
>  	kfree(lowertmp);
>  
> @@ -1132,11 +1154,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  		if (ovl_is_impuredir(upperpath.dentry))
>  			ovl_set_flag(OVL_IMPURE, d_inode(root_dentry));
>  	}
> -	for (i = 0; i < numlower; i++) {
> -		oe->lowerstack[i].dentry = stack[i].dentry;
> -		oe->lowerstack[i].mnt = ufs->lower_mnt[i];
> -	}
> -	kfree(stack);
>  
>  	root_dentry->d_fsdata = oe;
>  
> @@ -1147,16 +1164,19 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  
>  	return 0;
>  
> -out_free_oe:
> -	kfree(oe);
>  out_put_cred:
>  	put_cred(ufs->creator_cred);
>  out_put_indexdir:
>  	dput(ufs->indexdir);
> -out_put_lower_mnt:
> -	for (i = 0; i < ufs->numlower; i++)
> -		mntput(ufs->lower_mnt[i]);
> -	kfree(ufs->lower_mnt);
> +out_free_oe:
> +	kfree(oe);
> +out_put_lower_layers:
> +	for (i = 0; i < ufs->numlower; i++) {
> +		if (ufs->lower_layers[i].mnt)
> +			free_anon_bdev(ufs->lower_layers[i].pseudo_dev);
> +		mntput(ufs->lower_layers[i].mnt);
> +	}
> +	kfree(ufs->lower_layers);
>  out_put_workdir:
>  	dput(ufs->workdir);
>  	mntput(ufs->upper_mnt);
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 9158d17bb320..d6bb1c9f5e7a 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -124,7 +124,12 @@ void ovl_path_lower(struct dentry *dentry, struct path *path)
>  {
>  	struct ovl_entry *oe = dentry->d_fsdata;
>  
> -	*path = oe->numlower ? oe->lowerstack[0] : (struct path) { };
> +	if (oe->numlower) {
> +		path->mnt = oe->lowerstack[0].layer->mnt;
> +		path->dentry = oe->lowerstack[0].dentry;
> +	} else {
> +		*path = (struct path) { };
> +	}
>  }
>  
>  enum ovl_path_type ovl_path_real(struct dentry *dentry, struct path *path)
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 2/4] ovl: allocate anonymous devs for lowerdirs
  2017-11-01 14:42   ` Vivek Goyal
@ 2017-11-01 15:02     ` Amir Goldstein
  2017-11-01 15:47       ` Vivek Goyal
  0 siblings, 1 reply; 22+ messages in thread
From: Amir Goldstein @ 2017-11-01 15:02 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Miklos Szeredi, Chandan Rajendra, zhangyi, overlayfs

On Wed, Nov 1, 2017 at 4:42 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Mon, Oct 30, 2017 at 10:27:25PM +0200, Amir Goldstein wrote:
>> From: Chandan Rajendra <chandan@linux.vnet.ibm.com>
>>
>> For stat(2) on lowerdir non-dir entries in non-samefs case, this commit
>> provides unique values for st_dev. The unique values are obtained by
>> allocating anonymous bdevs for each of the lowerdirs in the overlayfs
>> instance.
>
> Hi Amir, Chandan,
>
> In the commit message, can we also mention what's the current behavior
> and why this new behavior beneficial/desirable.
>

This is the blurb from the uptodate patch on my branch:

     For non-samefs setup, to make sure that st_dev/st_ino pair
    is unique across the system, we return a unique anonymous
    st_dev for stat(2) of lower layer inode.

A bit fatter, but not fat enough...

Actually, it is not accurate, because st_dev/st_ino pair of pure
upper is still same values as underlying inode for non-samefs so the
values are not unique among all inodes in the system.

I can't remember if there was a reason for not allocating anonymous bdev
for upper or if it just because we did not need it to guaranty uniqueness
of st_dev/st_ino *among* overlay inodes while guarantying constant
st_dev/st_ino across copy up.

I will update commit message.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 2/4] ovl: allocate anonymous devs for lowerdirs
  2017-10-30 20:27 ` [PATCH v5 2/4] ovl: allocate anonymous devs for lowerdirs Amir Goldstein
  2017-10-31 13:43   ` Miklos Szeredi
  2017-11-01 14:42   ` Vivek Goyal
@ 2017-11-01 15:41   ` Vivek Goyal
  2017-11-01 16:30     ` Amir Goldstein
  2 siblings, 1 reply; 22+ messages in thread
From: Vivek Goyal @ 2017-11-01 15:41 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, Chandan Rajendra, zhangyi, linux-unionfs

On Mon, Oct 30, 2017 at 10:27:25PM +0200, Amir Goldstein wrote:
> From: Chandan Rajendra <chandan@linux.vnet.ibm.com>
> 
> For stat(2) on lowerdir non-dir entries in non-samefs case, this commit
> provides unique values for st_dev. The unique values are obtained by
> allocating anonymous bdevs for each of the lowerdirs in the overlayfs
> instance.
> 
> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
> [amir: rename to struct ovl_layer lower_layers and struct ovl_path]

Hi Amir,

Would it make sense to do all the renaming in a separate patch. It becomes
much easier to review this patch then. May be renaming can go first and
then anonymous bdev patch can come in.

Vivek

> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/overlayfs/inode.c     | 18 +++++++++++
>  fs/overlayfs/namei.c     | 52 ++++++++++++++++++-------------
>  fs/overlayfs/overlayfs.h |  4 +--
>  fs/overlayfs/ovl_entry.h | 14 +++++++--
>  fs/overlayfs/readdir.c   |  4 +--
>  fs/overlayfs/super.c     | 80 ++++++++++++++++++++++++++++++------------------
>  fs/overlayfs/util.c      |  7 ++++-
>  7 files changed, 121 insertions(+), 58 deletions(-)
> 
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 50e233ccca53..ec5c3ce91868 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -8,6 +8,7 @@
>   */
>  
>  #include <linux/fs.h>
> +#include <linux/mount.h>
>  #include <linux/slab.h>
>  #include <linux/cred.h>
>  #include <linux/xattr.h>
> @@ -15,6 +16,21 @@
>  #include <linux/ratelimit.h>
>  #include "overlayfs.h"
>  
> +
> +static dev_t ovl_get_pseudo_dev(struct dentry *dentry, dev_t dev)
> +{
> +	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
> +	struct ovl_entry *oe = dentry->d_fsdata;
> +
> +	if (ofs->upper_mnt && ofs->upper_mnt->mnt_sb->s_dev == dev)
> +		return dev;
> +
> +	if (oe->numlower)
> +		return oe->lowerstack[0].layer->pseudo_dev;
> +
> +	return dev;
> +}
> +
>  int ovl_setattr(struct dentry *dentry, struct iattr *attr)
>  {
>  	int err;
> @@ -121,6 +137,8 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
>  		 */
>  		stat->dev = dentry->d_sb->s_dev;
>  		stat->ino = dentry->d_inode->i_ino;
> +	} else {
> +		stat->dev = ovl_get_pseudo_dev(dentry, stat->dev);
>  	}
>  
>  	/*
> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> index de2dac98e147..78e965527167 100644
> --- a/fs/overlayfs/namei.c
> +++ b/fs/overlayfs/namei.c
> @@ -285,16 +285,15 @@ static int ovl_lookup_layer(struct dentry *base, struct ovl_lookup_data *d,
>  
>  
>  static int ovl_check_origin(struct dentry *upperdentry,
> -			    struct path *lowerstack, unsigned int numlower,
> -			    struct path **stackp, unsigned int *ctrp)
> +			    struct ovl_path *lower, unsigned int numlower,
> +			    struct ovl_path **stackp, unsigned int *ctrp)
>  {
>  	struct vfsmount *mnt;
>  	struct dentry *origin = NULL;
>  	int i;
>  
> -
>  	for (i = 0; i < numlower; i++) {
> -		mnt = lowerstack[i].mnt;
> +		mnt = lower[i].layer->mnt;
>  		origin = ovl_get_origin(upperdentry, mnt);
>  		if (IS_ERR(origin))
>  			return PTR_ERR(origin);
> @@ -308,12 +307,12 @@ static int ovl_check_origin(struct dentry *upperdentry,
>  
>  	BUG_ON(*ctrp);
>  	if (!*stackp)
> -		*stackp = kmalloc(sizeof(struct path), GFP_KERNEL);
> +		*stackp = kmalloc(sizeof(struct ovl_path), GFP_KERNEL);
>  	if (!*stackp) {
>  		dput(origin);
>  		return -ENOMEM;
>  	}
> -	**stackp = (struct path) { .dentry = origin, .mnt = mnt };
> +	**stackp = (struct ovl_path){.dentry = origin, .layer = lower[i].layer};
>  	*ctrp = 1;
>  
>  	return 0;
> @@ -383,13 +382,13 @@ int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
>   * OVL_XATTR_ORIGIN and that origin file handle can be decoded to lower path.
>   * Return 0 on match, -ESTALE on mismatch or stale origin, < 0 on error.
>   */
> -int ovl_verify_index(struct dentry *index, struct path *lowerstack,
> +int ovl_verify_index(struct dentry *index, struct ovl_path *lower,
>  		     unsigned int numlower)
>  {
>  	struct ovl_fh *fh = NULL;
>  	size_t len;
> -	struct path origin = { };
> -	struct path *stack = &origin;
> +	struct ovl_path origin = { };
> +	struct ovl_path *stack = &origin;
>  	unsigned int ctr = 0;
>  	int err;
>  
> @@ -428,7 +427,7 @@ int ovl_verify_index(struct dentry *index, struct path *lowerstack,
>  	if (err)
>  		goto fail;
>  
> -	err = ovl_check_origin(index, lowerstack, numlower, &stack, &ctr);
> +	err = ovl_check_origin(index, lower, numlower, &stack, &ctr);
>  	if (!err && !ctr)
>  		err = -ESTALE;
>  	if (err)
> @@ -570,11 +569,24 @@ int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp)
>  	}
>  	BUG_ON(idx > oe->numlower);
>  	*idxp = idx;
> -	*path = oe->lowerstack[idx - 1];
> +	path->dentry = oe->lowerstack[idx - 1].dentry;
> +	path->mnt = oe->lowerstack[idx - 1].layer->mnt;
>  
>  	return (idx < oe->numlower) ? idx + 1 : -1;
>  }
>  
> +static int ovl_find_layer(struct ovl_fs *ofs, struct ovl_path *path)
> +{
> +	int i;
> +
> +	for (i = 0; i < ofs->numlower; i++) {
> +		if (ofs->lower_layers[i].mnt == path->layer->mnt)
> +			break;
> +	}
> +
> +	return i;
> +}
> +
>  struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  			  unsigned int flags)
>  {
> @@ -583,7 +595,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
>  	struct ovl_entry *poe = dentry->d_parent->d_fsdata;
>  	struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata;
> -	struct path *stack = NULL;
> +	struct ovl_path *stack = NULL;
>  	struct dentry *upperdir, *upperdentry = NULL;
>  	struct dentry *index = NULL;
>  	unsigned int ctr = 0;
> @@ -648,17 +660,17 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  
>  	if (!d.stop && poe->numlower) {
>  		err = -ENOMEM;
> -		stack = kcalloc(ofs->numlower, sizeof(struct path),
> +		stack = kcalloc(ofs->numlower, sizeof(struct ovl_path),
>  				GFP_KERNEL);
>  		if (!stack)
>  			goto out_put_upper;
>  	}
>  
>  	for (i = 0; !d.stop && i < poe->numlower; i++) {
> -		struct path lowerpath = poe->lowerstack[i];
> +		struct ovl_path lower = poe->lowerstack[i];
>  
>  		d.last = i == poe->numlower - 1;
> -		err = ovl_lookup_layer(lowerpath.dentry, &d, &this);
> +		err = ovl_lookup_layer(lower.dentry, &d, &this);
>  		if (err)
>  			goto out_put;
>  
> @@ -666,7 +678,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  			continue;
>  
>  		stack[ctr].dentry = this;
> -		stack[ctr].mnt = lowerpath.mnt;
> +		stack[ctr].layer = lower.layer;
>  		ctr++;
>  
>  		if (d.stop)
> @@ -676,10 +688,8 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  			poe = roe;
>  
>  			/* Find the current layer on the root dentry */
> -			for (i = 0; i < poe->numlower; i++)
> -				if (poe->lowerstack[i].mnt == lowerpath.mnt)
> -					break;
> -			if (WARN_ON(i == poe->numlower))
> +			i = ovl_find_layer(ofs, &lower);
> +			if (WARN_ON(i == ofs->numlower))
>  				break;
>  		}
>  	}
> @@ -702,7 +712,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  		goto out_put;
>  
>  	oe->opaque = upperopaque;
> -	memcpy(oe->lowerstack, stack, sizeof(struct path) * ctr);
> +	memcpy(oe->lowerstack, stack, sizeof(struct ovl_path) * ctr);
>  	dentry->d_fsdata = oe;
>  
>  	if (upperdentry)
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 73ef1e850635..335e9a052995 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -248,7 +248,7 @@ static inline bool ovl_is_impuredir(struct dentry *dentry)
>  /* namei.c */
>  int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
>  		      struct dentry *origin, bool is_upper, bool set);
> -int ovl_verify_index(struct dentry *index, struct path *lowerstack,
> +int ovl_verify_index(struct dentry *index, struct ovl_path *lower,
>  		     unsigned int numlower);
>  int ovl_get_index_name(struct dentry *origin, struct qstr *name);
>  int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp);
> @@ -265,7 +265,7 @@ int ovl_check_d_type_supported(struct path *realpath);
>  void ovl_workdir_cleanup(struct inode *dir, struct vfsmount *mnt,
>  			 struct dentry *dentry, int level);
>  int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
> -			 struct path *lowerstack, unsigned int numlower);
> +			 struct ovl_path *lower, unsigned int numlower);
>  
>  /* inode.c */
>  int ovl_set_nlink_upper(struct dentry *dentry);
> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> index 25d9b5adcd42..93eb6a044dd2 100644
> --- a/fs/overlayfs/ovl_entry.h
> +++ b/fs/overlayfs/ovl_entry.h
> @@ -17,11 +17,21 @@ struct ovl_config {
>  	bool index;
>  };
>  
> +struct ovl_layer {
> +	struct vfsmount *mnt;
> +	dev_t pseudo_dev;
> +};
> +
> +struct ovl_path {
> +	struct ovl_layer *layer;
> +	struct dentry *dentry;
> +};
> +
>  /* private information held for overlayfs's superblock */
>  struct ovl_fs {
>  	struct vfsmount *upper_mnt;
>  	unsigned numlower;
> -	struct vfsmount **lower_mnt;
> +	struct ovl_layer *lower_layers;
>  	/* workbasedir is the path at workdir= mount option */
>  	struct dentry *workbasedir;
>  	/* workdir is the 'work' directory under workbasedir */
> @@ -52,7 +62,7 @@ struct ovl_entry {
>  		struct rcu_head rcu;
>  	};
>  	unsigned numlower;
> -	struct path lowerstack[];
> +	struct ovl_path lowerstack[];
>  };
>  
>  struct ovl_entry *ovl_alloc_entry(unsigned int numlower);
> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> index bcc123c7706c..0ab657f2c1c8 100644
> --- a/fs/overlayfs/readdir.c
> +++ b/fs/overlayfs/readdir.c
> @@ -1020,7 +1020,7 @@ void ovl_workdir_cleanup(struct inode *dir, struct vfsmount *mnt,
>  }
>  
>  int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
> -			 struct path *lowerstack, unsigned int numlower)
> +			 struct ovl_path *lower, unsigned int numlower)
>  {
>  	int err;
>  	struct dentry *index = NULL;
> @@ -1055,7 +1055,7 @@ int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt,
>  			index = NULL;
>  			break;
>  		}
> -		err = ovl_verify_index(index, lowerstack, numlower);
> +		err = ovl_verify_index(index, lower, numlower);
>  		/* Cleanup stale and orphan index entries */
>  		if (err && (err == -ESTALE || err == -ENOENT))
>  			err = ovl_cleanup(dir, index);
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 8702803ba328..323dfd7a0236 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -219,9 +219,11 @@ static void ovl_put_super(struct super_block *sb)
>  	if (ufs->upper_mnt && ufs->upperdir_locked)
>  		ovl_inuse_unlock(ufs->upper_mnt->mnt_root);
>  	mntput(ufs->upper_mnt);
> -	for (i = 0; i < ufs->numlower; i++)
> -		mntput(ufs->lower_mnt[i]);
> -	kfree(ufs->lower_mnt);
> +	for (i = 0; i < ufs->numlower; i++) {
> +		mntput(ufs->lower_layers[i].mnt);
> +		free_anon_bdev(ufs->lower_layers[i].pseudo_dev);
> +	}
> +	kfree(ufs->lower_layers);
>  
>  	kfree(ufs->config.lowerdir);
>  	kfree(ufs->config.upperdir);
> @@ -1026,24 +1028,35 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  	}
>  
>  	err = -ENOMEM;
> -	ufs->lower_mnt = kcalloc(numlower, sizeof(struct vfsmount *), GFP_KERNEL);
> -	if (ufs->lower_mnt == NULL)
> +	ufs->lower_layers = kcalloc(numlower, sizeof(struct ovl_layer),
> +				    GFP_KERNEL);
> +	if (ufs->lower_layers == NULL)
>  		goto out_put_workdir;
>  	for (i = 0; i < numlower; i++) {
> -		struct vfsmount *mnt = clone_private_mount(&stack[i]);
> +		struct vfsmount *mnt;
> +		dev_t dev;
> +
> +		err = get_anon_bdev(&dev);
> +		if (err) {
> +			pr_err("overlayfs: failed to get anonymous bdev for lowerpath\n");
> +			goto out_put_lower_layers;
> +		}
>  
> +		mnt = clone_private_mount(&stack[i]);
>  		err = PTR_ERR(mnt);
>  		if (IS_ERR(mnt)) {
>  			pr_err("overlayfs: failed to clone lowerpath\n");
> -			goto out_put_lower_mnt;
> +			free_anon_bdev(dev);
> +			goto out_put_lower_layers;
>  		}
>  		/*
> -		 * Make lower_mnt R/O.  That way fchmod/fchown on lower file
> +		 * Make lower layers R/O.  That way fchmod/fchown on lower file
>  		 * will fail instead of modifying lower fs.
>  		 */
>  		mnt->mnt_flags |= MNT_READONLY | MNT_NOATIME;
>  
> -		ufs->lower_mnt[ufs->numlower] = mnt;
> +		ufs->lower_layers[ufs->numlower].mnt = mnt;
> +		ufs->lower_layers[ufs->numlower].pseudo_dev = dev;
>  		ufs->numlower++;
>  
>  		/* Check if all lower layers are on same sb */
> @@ -1059,13 +1072,25 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  	else if (ufs->upper_mnt->mnt_sb != ufs->same_sb)
>  		ufs->same_sb = NULL;
>  
> +	err = -ENOMEM;
> +	oe = ovl_alloc_entry(numlower);
> +	if (!oe)
> +		goto out_put_lower_layers;
> +
> +	for (i = 0; i < numlower; i++) {
> +		oe->lowerstack[i].dentry = stack[i].dentry;
> +		oe->lowerstack[i].layer = &(ufs->lower_layers[i]);
> +	}
> +
>  	if (!(ovl_force_readonly(ufs)) && ufs->config.index) {
>  		/* Verify lower root is upper root origin */
> -		err = ovl_verify_origin(upperpath.dentry, ufs->lower_mnt[0],
> -					stack[0].dentry, false, true);
> +		err = ovl_verify_origin(upperpath.dentry,
> +					oe->lowerstack[0].layer->mnt,
> +					oe->lowerstack[0].dentry,
> +					false, true);
>  		if (err) {
>  			pr_err("overlayfs: failed to verify upper root origin\n");
> -			goto out_put_lower_mnt;
> +			goto out_free_oe;
>  		}
>  
>  		ufs->indexdir = ovl_workdir_create(sb, ufs, workpath.dentry,
> @@ -1081,7 +1106,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  			if (!err)
>  				err = ovl_indexdir_cleanup(ufs->indexdir,
>  							   ufs->upper_mnt,
> -							   stack, numlower);
> +							   oe->lowerstack,
> +							   numlower);
>  		}
>  		if (err || !ufs->indexdir)
>  			pr_warn("overlayfs: try deleting index dir or mounting with '-o index=off' to disable inodes index.\n");
> @@ -1106,11 +1132,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  	/* Never override disk quota limits or use reserved space */
>  	cap_lower(cred->cap_effective, CAP_SYS_RESOURCE);
>  
> -	err = -ENOMEM;
> -	oe = ovl_alloc_entry(numlower);
> -	if (!oe)
> -		goto out_put_cred;
> -
>  	sb->s_magic = OVERLAYFS_SUPER_MAGIC;
>  	sb->s_op = &ovl_super_operations;
>  	sb->s_xattr = ovl_xattr_handlers;
> @@ -1119,11 +1140,12 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  
>  	root_dentry = d_make_root(ovl_new_inode(sb, S_IFDIR, 0));
>  	if (!root_dentry)
> -		goto out_free_oe;
> +		goto out_put_cred;
>  
>  	mntput(upperpath.mnt);
>  	for (i = 0; i < numlower; i++)
>  		mntput(stack[i].mnt);
> +	kfree(stack);
>  	mntput(workpath.mnt);
>  	kfree(lowertmp);
>  
> @@ -1132,11 +1154,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  		if (ovl_is_impuredir(upperpath.dentry))
>  			ovl_set_flag(OVL_IMPURE, d_inode(root_dentry));
>  	}
> -	for (i = 0; i < numlower; i++) {
> -		oe->lowerstack[i].dentry = stack[i].dentry;
> -		oe->lowerstack[i].mnt = ufs->lower_mnt[i];
> -	}
> -	kfree(stack);
>  
>  	root_dentry->d_fsdata = oe;
>  
> @@ -1147,16 +1164,19 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  
>  	return 0;
>  
> -out_free_oe:
> -	kfree(oe);
>  out_put_cred:
>  	put_cred(ufs->creator_cred);
>  out_put_indexdir:
>  	dput(ufs->indexdir);
> -out_put_lower_mnt:
> -	for (i = 0; i < ufs->numlower; i++)
> -		mntput(ufs->lower_mnt[i]);
> -	kfree(ufs->lower_mnt);
> +out_free_oe:
> +	kfree(oe);
> +out_put_lower_layers:
> +	for (i = 0; i < ufs->numlower; i++) {
> +		if (ufs->lower_layers[i].mnt)
> +			free_anon_bdev(ufs->lower_layers[i].pseudo_dev);
> +		mntput(ufs->lower_layers[i].mnt);
> +	}
> +	kfree(ufs->lower_layers);
>  out_put_workdir:
>  	dput(ufs->workdir);
>  	mntput(ufs->upper_mnt);
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 9158d17bb320..d6bb1c9f5e7a 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -124,7 +124,12 @@ void ovl_path_lower(struct dentry *dentry, struct path *path)
>  {
>  	struct ovl_entry *oe = dentry->d_fsdata;
>  
> -	*path = oe->numlower ? oe->lowerstack[0] : (struct path) { };
> +	if (oe->numlower) {
> +		path->mnt = oe->lowerstack[0].layer->mnt;
> +		path->dentry = oe->lowerstack[0].dentry;
> +	} else {
> +		*path = (struct path) { };
> +	}
>  }
>  
>  enum ovl_path_type ovl_path_real(struct dentry *dentry, struct path *path)
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 2/4] ovl: allocate anonymous devs for lowerdirs
  2017-11-01 15:02     ` Amir Goldstein
@ 2017-11-01 15:47       ` Vivek Goyal
  2017-11-01 16:41         ` Amir Goldstein
  0 siblings, 1 reply; 22+ messages in thread
From: Vivek Goyal @ 2017-11-01 15:47 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, Chandan Rajendra, zhangyi, overlayfs

On Wed, Nov 01, 2017 at 05:02:55PM +0200, Amir Goldstein wrote:
> On Wed, Nov 1, 2017 at 4:42 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > On Mon, Oct 30, 2017 at 10:27:25PM +0200, Amir Goldstein wrote:
> >> From: Chandan Rajendra <chandan@linux.vnet.ibm.com>
> >>
> >> For stat(2) on lowerdir non-dir entries in non-samefs case, this commit
> >> provides unique values for st_dev. The unique values are obtained by
> >> allocating anonymous bdevs for each of the lowerdirs in the overlayfs
> >> instance.
> >
> > Hi Amir, Chandan,
> >
> > In the commit message, can we also mention what's the current behavior
> > and why this new behavior beneficial/desirable.
> >
> 
> This is the blurb from the uptodate patch on my branch:
> 
>      For non-samefs setup, to make sure that st_dev/st_ino pair
>     is unique across the system, we return a unique anonymous
>     st_dev for stat(2) of lower layer inode.
> 
> A bit fatter, but not fat enough...
> 
> Actually, it is not accurate, because st_dev/st_ino pair of pure
> upper is still same values as underlying inode for non-samefs so the
> values are not unique among all inodes in the system.

Hi Amir,

So as of now for non-samefs non-dir case we return st_dev/st_ino of
lower inode. And with this change we will return st_dev of overlayfs
while inode of lower, right?

What does unique mean in this context. IIUC, st_dev/st_inode of lower
will be unique in the system, isn't it. Which other inode can have
same st_dev/st_ino pair.

Or is it the case that if same inode is accessed through overlayfs, we
want to report a different st_dev.

> 
> I can't remember if there was a reason for not allocating anonymous bdev
> for upper

That's a good point.

> or if it just because we did not need it to guaranty uniqueness
> of st_dev/st_ino *among* overlay inodes

Even for lower, st_dev will be unique for different lower on non same-fs,
right. IOW, when it come to uniqueness of st_dev/st_ino pair, among
overlay inodes, lower and upper should have same requirements.

> while guarantying constant
> st_dev/st_ino across copy up.

Hmm..., I did not get this point. Over copy up, atleast st_ino will change
for non-samefs case. 

I will spend more time on patch.

Vivek

> 
> I will update commit message.
> 
> Thanks,
> Amir.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 2/4] ovl: allocate anonymous devs for lowerdirs
  2017-11-01 15:41   ` Vivek Goyal
@ 2017-11-01 16:30     ` Amir Goldstein
  0 siblings, 0 replies; 22+ messages in thread
From: Amir Goldstein @ 2017-11-01 16:30 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Miklos Szeredi, Chandan Rajendra, zhangyi, overlayfs

On Wed, Nov 1, 2017 at 5:41 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Mon, Oct 30, 2017 at 10:27:25PM +0200, Amir Goldstein wrote:
>> From: Chandan Rajendra <chandan@linux.vnet.ibm.com>
>>
>> For stat(2) on lowerdir non-dir entries in non-samefs case, this commit
>> provides unique values for st_dev. The unique values are obtained by
>> allocating anonymous bdevs for each of the lowerdirs in the overlayfs
>> instance.
>>
>> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
>> [amir: rename to struct ovl_layer lower_layers and struct ovl_path]
>
> Hi Amir,
>
> Would it make sense to do all the renaming in a separate patch. It becomes
> much easier to review this patch then. May be renaming can go first and
> then anonymous bdev patch can come in.
>

Sounds good. I'll do it.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 2/4] ovl: allocate anonymous devs for lowerdirs
  2017-11-01 15:47       ` Vivek Goyal
@ 2017-11-01 16:41         ` Amir Goldstein
  2017-11-02 12:27           ` Vivek Goyal
  0 siblings, 1 reply; 22+ messages in thread
From: Amir Goldstein @ 2017-11-01 16:41 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Miklos Szeredi, Chandan Rajendra, zhangyi, overlayfs

On Wed, Nov 1, 2017 at 5:47 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Wed, Nov 01, 2017 at 05:02:55PM +0200, Amir Goldstein wrote:
>> On Wed, Nov 1, 2017 at 4:42 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> > On Mon, Oct 30, 2017 at 10:27:25PM +0200, Amir Goldstein wrote:
>> >> From: Chandan Rajendra <chandan@linux.vnet.ibm.com>
>> >>
>> >> For stat(2) on lowerdir non-dir entries in non-samefs case, this commit
>> >> provides unique values for st_dev. The unique values are obtained by
>> >> allocating anonymous bdevs for each of the lowerdirs in the overlayfs
>> >> instance.
>> >
>> > Hi Amir, Chandan,
>> >
>> > In the commit message, can we also mention what's the current behavior
>> > and why this new behavior beneficial/desirable.
>> >
>>
>> This is the blurb from the uptodate patch on my branch:
>>
>>      For non-samefs setup, to make sure that st_dev/st_ino pair
>>     is unique across the system, we return a unique anonymous
>>     st_dev for stat(2) of lower layer inode.
>>
>> A bit fatter, but not fat enough...
>>
>> Actually, it is not accurate, because st_dev/st_ino pair of pure
>> upper is still same values as underlying inode for non-samefs so the
>> values are not unique among all inodes in the system.
>
> Hi Amir,
>
> So as of now for non-samefs non-dir case we return st_dev/st_ino of
> lower inode. And with this change we will return st_dev of overlayfs
> while inode of lower, right?
>
> What does unique mean in this context. IIUC, st_dev/st_inode of lower
> will be unique in the system, isn't it. Which other inode can have
> same st_dev/st_ino pair.
>
> Or is it the case that if same inode is accessed through overlayfs, we
> want to report a different st_dev.
>

That is the case.
It is not that important and we did not fix this case for upper inodes,
but this effectively the change of this patch.
We actually need this patch for constant st_ino/st_dev across copy up,
which comes right after this.

>>
>> I can't remember if there was a reason for not allocating anonymous bdev
>> for upper
>
> That's a good point.
>
>> or if it just because we did not need it to guaranty uniqueness
>> of st_dev/st_ino *among* overlay inodes
>
> Even for lower, st_dev will be unique for different lower on non same-fs,
> right. IOW, when it come to uniqueness of st_dev/st_ino pair, among
> overlay inodes, lower and upper should have same requirements.
>
>> while guarantying constant
>> st_dev/st_ino across copy up.
>
> Hmm..., I did not get this point. Over copy up, atleast st_ino will change
> for non-samefs case.
>
> I will spend more time on patch.
>

Urgh! It took me a while to remember the reason why system wide uniqueness
is important for lower but less for upper.
An upper object has the same content as the "real" object and they have the
same st_ino/st_dev so its ok that diff will skip comparing them.
A copy-up object does not have the same content as the lower "real" object,
so if it has the same st_ino/st_dev as real object, diff will skip compare and
we have a problem.

Amir.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 2/4] ovl: allocate anonymous devs for lowerdirs
  2017-11-01 16:41         ` Amir Goldstein
@ 2017-11-02 12:27           ` Vivek Goyal
  2017-11-02 12:47             ` Amir Goldstein
  0 siblings, 1 reply; 22+ messages in thread
From: Vivek Goyal @ 2017-11-02 12:27 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, Chandan Rajendra, zhangyi, overlayfs

On Wed, Nov 01, 2017 at 06:41:51PM +0200, Amir Goldstein wrote:

[..]
> >> I can't remember if there was a reason for not allocating anonymous bdev
> >> for upper
> >
> > That's a good point.
> >
> >> or if it just because we did not need it to guaranty uniqueness
> >> of st_dev/st_ino *among* overlay inodes
> >
> > Even for lower, st_dev will be unique for different lower on non same-fs,
> > right. IOW, when it come to uniqueness of st_dev/st_ino pair, among
> > overlay inodes, lower and upper should have same requirements.
> >
> >> while guarantying constant
> >> st_dev/st_ino across copy up.
> >
> > Hmm..., I did not get this point. Over copy up, atleast st_ino will change
> > for non-samefs case.
> >
> > I will spend more time on patch.
> >
> 
> Urgh! It took me a while to remember the reason why system wide uniqueness
> is important for lower but less for upper.
> An upper object has the same content as the "real" object and they have the
> same st_ino/st_dev so its ok that diff will skip comparing them.
> A copy-up object does not have the same content as the lower "real" object,
> so if it has the same st_ino/st_dev as real object, diff will skip compare and
> we have a problem.

I am not sure I understand this.  So you are doing a diff between a file
on overlayfs and same file accessed outside overlayfs?

If a file is on lower, then it has not been modified and diff skipping
it makes perfect sense?

Vivek

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 2/4] ovl: allocate anonymous devs for lowerdirs
  2017-11-02 12:27           ` Vivek Goyal
@ 2017-11-02 12:47             ` Amir Goldstein
  2017-11-02 14:05               ` Vivek Goyal
  0 siblings, 1 reply; 22+ messages in thread
From: Amir Goldstein @ 2017-11-02 12:47 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Miklos Szeredi, Chandan Rajendra, zhangyi, overlayfs

On Thu, Nov 2, 2017 at 2:27 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Wed, Nov 01, 2017 at 06:41:51PM +0200, Amir Goldstein wrote:
>
> [..]
>> >> I can't remember if there was a reason for not allocating anonymous bdev
>> >> for upper
>> >
>> > That's a good point.
>> >
>> >> or if it just because we did not need it to guaranty uniqueness
>> >> of st_dev/st_ino *among* overlay inodes
>> >
>> > Even for lower, st_dev will be unique for different lower on non same-fs,
>> > right. IOW, when it come to uniqueness of st_dev/st_ino pair, among
>> > overlay inodes, lower and upper should have same requirements.
>> >
>> >> while guarantying constant
>> >> st_dev/st_ino across copy up.
>> >
>> > Hmm..., I did not get this point. Over copy up, atleast st_ino will change
>> > for non-samefs case.
>> >
>> > I will spend more time on patch.
>> >
>>
>> Urgh! It took me a while to remember the reason why system wide uniqueness
>> is important for lower but less for upper.
>> An upper object has the same content as the "real" object and they have the
>> same st_ino/st_dev so its ok that diff will skip comparing them.
>> A copy-up object does not have the same content as the lower "real" object,
>> so if it has the same st_ino/st_dev as real object, diff will skip compare and
>> we have a problem.
>
> I am not sure I understand this.  So you are doing a diff between a file
> on overlayfs and same file accessed outside overlayfs?
>

Yes.

> If a file is on lower, then it has not been modified and diff skipping
> it makes perfect sense?

Yes.

But!
With constant st_ino/st_dev across copy up (the next patch)
the overlay object still has the lower inode st_dev/st_inode also *after*
copy up and modification. Now if you diff overlay file and lower
file diff will say they are equal, but in fact they have a different content.

This is how I phrased this in latest patch set per your request:

-----------------------
[V6 4/9] ovl: return anonymous st_dev for lower inodes

For non-samefs setup, to make sure that st_dev/st_ino pair
is unique across the system, we return a unique anonymous
st_dev for stat(2) of lower layer inode.

We need to make this change before fixing constant st_dev/st_ino across
copy up for non-samefs. Otherwise, we can end up with two objects
in the system, the real lower inode and the overlay inode which
have same st_dev/st_ino value, but different content.


Hope this is clear.

Amir.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 2/4] ovl: allocate anonymous devs for lowerdirs
  2017-11-02 12:47             ` Amir Goldstein
@ 2017-11-02 14:05               ` Vivek Goyal
  2017-11-02 14:38                 ` Amir Goldstein
  0 siblings, 1 reply; 22+ messages in thread
From: Vivek Goyal @ 2017-11-02 14:05 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, Chandan Rajendra, zhangyi, overlayfs

On Thu, Nov 02, 2017 at 02:47:07PM +0200, Amir Goldstein wrote:
> On Thu, Nov 2, 2017 at 2:27 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > On Wed, Nov 01, 2017 at 06:41:51PM +0200, Amir Goldstein wrote:
> >
> > [..]
> >> >> I can't remember if there was a reason for not allocating anonymous bdev
> >> >> for upper
> >> >
> >> > That's a good point.
> >> >
> >> >> or if it just because we did not need it to guaranty uniqueness
> >> >> of st_dev/st_ino *among* overlay inodes
> >> >
> >> > Even for lower, st_dev will be unique for different lower on non same-fs,
> >> > right. IOW, when it come to uniqueness of st_dev/st_ino pair, among
> >> > overlay inodes, lower and upper should have same requirements.
> >> >
> >> >> while guarantying constant
> >> >> st_dev/st_ino across copy up.
> >> >
> >> > Hmm..., I did not get this point. Over copy up, atleast st_ino will change
> >> > for non-samefs case.
> >> >
> >> > I will spend more time on patch.
> >> >
> >>
> >> Urgh! It took me a while to remember the reason why system wide uniqueness
> >> is important for lower but less for upper.
> >> An upper object has the same content as the "real" object and they have the
> >> same st_ino/st_dev so its ok that diff will skip comparing them.
> >> A copy-up object does not have the same content as the lower "real" object,
> >> so if it has the same st_ino/st_dev as real object, diff will skip compare and
> >> we have a problem.
> >
> > I am not sure I understand this.  So you are doing a diff between a file
> > on overlayfs and same file accessed outside overlayfs?
> >
> 
> Yes.
> 
> > If a file is on lower, then it has not been modified and diff skipping
> > it makes perfect sense?
> 
> Yes.
> 
> But!
> With constant st_ino/st_dev across copy up (the next patch)
> the overlay object still has the lower inode st_dev/st_inode also *after*
> copy up and modification. Now if you diff overlay file and lower
> file diff will say they are equal, but in fact they have a different content.
> 
> This is how I phrased this in latest patch set per your request:

Ok, I think I am beginning to understand it. Here is my understanding.
Please correct me if something is not right.

So we basically have 4 core requirements.

A. contstant st_dev/st_ino over copy up.
B. Persistent st_ino
C. unique st_dev/st_ino
D. Diff works fine even after copy up.

As of today, for non-samefs case, ovl_getattr() reports st_dev/st_ino of
*real* file. This meets requirement B and C and D but not requirement A. 

To meet requirement A, one could make use of ORIGIN xattr and report
st_dev/st_ino of lower (even after file got copied up). This will
meet requirement A, B and C but not D. 

So to make all 4 work for non-samefs case, we don't report real st_dev
of lower and instead create a pseudo dev and report that. IOW, for
non-same fs case, report pseudo_st_dev/real_st_ino of lower. And
that should meet all the 4 core requirements.

And this patch series implements this.

If this description is correct, I feel some of this should be used
in changelog somewhere to make it easier to understand the rationale
behind the change.


Thanks
Vivek



> 
> -----------------------
> [V6 4/9] ovl: return anonymous st_dev for lower inodes
> 
> For non-samefs setup, to make sure that st_dev/st_ino pair
> is unique across the system, we return a unique anonymous
> st_dev for stat(2) of lower layer inode.
> 
> We need to make this change before fixing constant st_dev/st_ino across
> copy up for non-samefs. Otherwise, we can end up with two objects
> in the system, the real lower inode and the overlay inode which
> have same st_dev/st_ino value, but different content.
> 
> 
> Hope this is clear.
> 
> Amir.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 2/4] ovl: allocate anonymous devs for lowerdirs
  2017-11-02 14:05               ` Vivek Goyal
@ 2017-11-02 14:38                 ` Amir Goldstein
  0 siblings, 0 replies; 22+ messages in thread
From: Amir Goldstein @ 2017-11-02 14:38 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Miklos Szeredi, Chandan Rajendra, zhangyi, overlayfs

On Thu, Nov 2, 2017 at 4:05 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Thu, Nov 02, 2017 at 02:47:07PM +0200, Amir Goldstein wrote:
>> On Thu, Nov 2, 2017 at 2:27 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> > On Wed, Nov 01, 2017 at 06:41:51PM +0200, Amir Goldstein wrote:
>> >
>> > [..]
>> >> >> I can't remember if there was a reason for not allocating anonymous bdev
>> >> >> for upper
>> >> >
>> >> > That's a good point.
>> >> >
>> >> >> or if it just because we did not need it to guaranty uniqueness
>> >> >> of st_dev/st_ino *among* overlay inodes
>> >> >
>> >> > Even for lower, st_dev will be unique for different lower on non same-fs,
>> >> > right. IOW, when it come to uniqueness of st_dev/st_ino pair, among
>> >> > overlay inodes, lower and upper should have same requirements.
>> >> >
>> >> >> while guarantying constant
>> >> >> st_dev/st_ino across copy up.
>> >> >
>> >> > Hmm..., I did not get this point. Over copy up, atleast st_ino will change
>> >> > for non-samefs case.
>> >> >
>> >> > I will spend more time on patch.
>> >> >
>> >>
>> >> Urgh! It took me a while to remember the reason why system wide uniqueness
>> >> is important for lower but less for upper.
>> >> An upper object has the same content as the "real" object and they have the
>> >> same st_ino/st_dev so its ok that diff will skip comparing them.
>> >> A copy-up object does not have the same content as the lower "real" object,
>> >> so if it has the same st_ino/st_dev as real object, diff will skip compare and
>> >> we have a problem.
>> >
>> > I am not sure I understand this.  So you are doing a diff between a file
>> > on overlayfs and same file accessed outside overlayfs?
>> >
>>
>> Yes.
>>
>> > If a file is on lower, then it has not been modified and diff skipping
>> > it makes perfect sense?
>>
>> Yes.
>>
>> But!
>> With constant st_ino/st_dev across copy up (the next patch)
>> the overlay object still has the lower inode st_dev/st_inode also *after*
>> copy up and modification. Now if you diff overlay file and lower
>> file diff will say they are equal, but in fact they have a different content.
>>
>> This is how I phrased this in latest patch set per your request:
>
> Ok, I think I am beginning to understand it. Here is my understanding.
> Please correct me if something is not right.
>
> So we basically have 4 core requirements.
>
> A. contstant st_dev/st_ino over copy up.
> B. Persistent st_ino
> C. unique st_dev/st_ino
> D. Diff works fine even after copy up.
>
> As of today, for non-samefs case, ovl_getattr() reports st_dev/st_ino of
> *real* file. This meets requirement B and C and D but not requirement A.
>
> To meet requirement A, one could make use of ORIGIN xattr and report
> st_dev/st_ino of lower (even after file got copied up). This will
> meet requirement A, B and C but not D.
>
> So to make all 4 work for non-samefs case, we don't report real st_dev
> of lower and instead create a pseudo dev and report that. IOW, for
> non-same fs case, report pseudo_st_dev/real_st_ino of lower. And
> that should meet all the 4 core requirements.
>

Yes. Almost accurate.

Requirement B is not met for directory inodes in non-samefs case.
it wasn't met before this change and it is not met by this change.


> And this patch series implements this.
>
> If this description is correct, I feel some of this should be used
> in changelog somewhere to make it easier to understand the rationale
> behind the change.
>
>

I'll see where I can plug this description.
BTW, found some bugs in V6 patch set.
Stay tuned for V7.

Thanks for review!
Amir.

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2017-11-02 14:38 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-30 20:27 [PATCH v5 0/4] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
2017-10-30 20:27 ` [PATCH v5 1/4] ovl: move include of ovl_entry.h into overlayfs.h Amir Goldstein
2017-10-31 13:14   ` Miklos Szeredi
2017-10-31 13:22     ` Amir Goldstein
2017-10-30 20:27 ` [PATCH v5 2/4] ovl: allocate anonymous devs for lowerdirs Amir Goldstein
2017-10-31 13:43   ` Miklos Szeredi
2017-10-31 13:53     ` Amir Goldstein
2017-10-31 23:01       ` Amir Goldstein
2017-11-01 13:17         ` Chandan Rajendra
2017-11-01 13:34           ` Amir Goldstein
2017-11-01 14:42   ` Vivek Goyal
2017-11-01 15:02     ` Amir Goldstein
2017-11-01 15:47       ` Vivek Goyal
2017-11-01 16:41         ` Amir Goldstein
2017-11-02 12:27           ` Vivek Goyal
2017-11-02 12:47             ` Amir Goldstein
2017-11-02 14:05               ` Vivek Goyal
2017-11-02 14:38                 ` Amir Goldstein
2017-11-01 15:41   ` Vivek Goyal
2017-11-01 16:30     ` Amir Goldstein
2017-10-30 20:27 ` [PATCH v5 3/4] ovl: relax same fs constrain for constant st_ino Amir Goldstein
2017-10-30 20:27 ` [PATCH v5 4/4] ovl: relax same fs constraint for constant d_ino Amir Goldstein

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.