linux-unionfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/6] Sort out overlay layers and fs arrays
@ 2019-11-17 15:43 Amir Goldstein
  2019-11-17 15:43 ` [PATCH 1/6] ovl: fix corner case of non-unique st_dev;st_ino Amir Goldstein
                   ` (6 more replies)
  0 siblings, 7 replies; 12+ messages in thread
From: Amir Goldstein @ 2019-11-17 15:43 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Colin Ian King, linux-unionfs

Miklos,

When I started generalizing the lower_layers/lower_fs arrays
I noticed a bug that was introduced in v4.17 with xino.

In the case of lower layer on upper fs, we do not have a pseudo_dev
assigned to lower layer and we expose the real lower st_dev;st_ino.
This happens on non-samefs when xino is disabled (default).
This is a very real bug, not really a corner case and I have an
an xfstest [1] for it that I will post later.

In the mean while, I also pushed a fix to unionmount-testsuite devel
branch [2] to demonstrate the issue.

With upstream kernel, this test ends up with a copied up file
from middle layer, whose on same fs as upper and its exposed
st_dev;st_ino are invalid:

 ./run --ov=1 --verify hard-link
 ...
 /mnt/a/no_foo110: File unexpectedly on upper layer

Patch 1 in the series is a small fix for stable that fixes the
v4.17 regression in favor of a different, less severe regression.
The new regression can be demonstrated with:

 ./run --ov=1 --verify --xino hard-link
 ...
 /mnt/a/no_foo110: inode number/layer changed on copy up
 (got 39:24707, was 39:24700)

Patches 2-4 generalize the lower_{layer/fs} arrays to layer/fs arrays
and get rid of some special casing of upper layer.

Patches 5-6 use the cleanup to solve the corner case that you pointed
out with bas_uuid [3] and to fix the regression introduced by patch 1.

After patch 6, both unionmount-testsuite configurations
above pass the test st_dev;st_ino verifications.

I doubt if patches 2-6 are stable material, because not sure the
corner cases they fix are worth the trouble.

The series depends on the bad_uuid patch v5 that I posted on Thursday.

I was also considering setting xino=on by default if xino_auto
is enabled, because what have we got to loose?

The inodes whose st_ino fit in lower bits (by far more common) will
use overlay st_dev and the inodes whose st_ino overflow the lower bits
will use pseudo_dev. Seems like a win-win situation, but I wanted to
get your feedback on this before sending out a patch.

Thanks,
Amir.

[1] https://github.com/amir73il/xfstests/commit/c667f26839ae487c509b95abae670fdca1c535c8
[2] https://github.com/amir73il/unionmount-testsuite/commit/1724ef2245c5e56f73e436b37407d00ef498f9bc
[3] https://lore.kernel.org/lkml/CAJfpegufS=OGcvFbWEVumNSCPO_JXyEuJNAbmO5ubscSarVtRQ@mail.gmail.com/

Amir Goldstein (6):
  ovl: fix corner case of non-unique st_dev;st_ino
  ovl: generalize the lower_layers[] array
  ovl: simplify ovl_same_sb() helper
  ovl: generalize the lower_fs[] array
  ovl: fix corner case of conflicting lower layer uuid
  ovl: fix corner case of non-constant st_dev;st_ino

 fs/overlayfs/export.c    |   6 +-
 fs/overlayfs/inode.c     |  35 +++++------
 fs/overlayfs/namei.c     |  10 ++--
 fs/overlayfs/overlayfs.h |  23 ++++++-
 fs/overlayfs/ovl_entry.h |  14 +++--
 fs/overlayfs/readdir.c   |  11 ++--
 fs/overlayfs/super.c     | 125 ++++++++++++++++++++++-----------------
 fs/overlayfs/util.c      |  18 ++----
 8 files changed, 132 insertions(+), 110 deletions(-)

-- 
2.17.1

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/6] ovl: fix corner case of non-unique st_dev;st_ino
  2019-11-17 15:43 [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
@ 2019-11-17 15:43 ` Amir Goldstein
  2019-11-17 15:43 ` [PATCH 2/6] ovl: generalize the lower_layers[] array Amir Goldstein
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Amir Goldstein @ 2019-11-17 15:43 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Colin Ian King, linux-unionfs

On non-samefs overlay without xino, non pure upper inodes should use a
pseudo_dev assigned to each unique lower fs and pure upper inodes use
the real upper st_dev.

It is fine for an overlay pure upper inode to use the same st_dev;st_ino
values as the real upper inode, because the content of those two
different filesystem objects is always the same.

In this case, however:
- two filesystems, A and B
- upper layer is on A
- lower layer 1 is also on A
- lower layer 2 is on B

Non pure upper overlay inode, whose origin is in layer 1 will have the
same st_dev;st_ino values as the real lower inode. This may result with
a false positive results of 'diff' between the real lower and copied up
overlay inode.

Fix this by using the upper st_dev;st_ino values in this case.
This breaks the property of constant st_dev;st_ino across copy up
of this case. This breakage will be fixed by a later patch.

Fixes: 5148626b806a ("ovl: allocate anon bdev per unique lower fs")
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index bc14781886bf..b045cf1826fc 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -200,8 +200,14 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 			if (ovl_test_flag(OVL_INDEX, d_inode(dentry)) ||
 			    (!ovl_verify_lower(dentry->d_sb) &&
 			     (is_dir || lowerstat.nlink == 1))) {
-				stat->ino = lowerstat.ino;
 				lower_layer = ovl_layer_lower(dentry);
+				/*
+				 * Cannot use origin st_dev;st_ino because
+				 * origin inode content may differ from overlay
+				 * inode content.
+				 */
+				if (samefs || lower_layer->fsid)
+					stat->ino = lowerstat.ino;
 			}
 
 			/*
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/6] ovl: generalize the lower_layers[] array
  2019-11-17 15:43 [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
  2019-11-17 15:43 ` [PATCH 1/6] ovl: fix corner case of non-unique st_dev;st_ino Amir Goldstein
@ 2019-11-17 15:43 ` Amir Goldstein
  2019-11-17 15:43 ` [PATCH 3/6] ovl: simplify ovl_same_sb() helper Amir Goldstein
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Amir Goldstein @ 2019-11-17 15:43 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Colin Ian King, linux-unionfs

Rename lower_layers[] array to layers[], extend its size by one
and initialize layers[0] with upper layer values.
Lower layers are now addressed with index 1..numlower.
layers[0] is reserved even with lower only overlay.

This gets rid of special casing upper layer in ovl_iterate_real().

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/export.c    |  6 ++----
 fs/overlayfs/inode.c     |  4 ++--
 fs/overlayfs/namei.c     | 10 +++++-----
 fs/overlayfs/overlayfs.h |  2 +-
 fs/overlayfs/ovl_entry.h |  2 +-
 fs/overlayfs/readdir.c   |  7 +++----
 fs/overlayfs/super.c     | 43 ++++++++++++++++++++++------------------
 fs/overlayfs/util.c      |  6 ++++--
 8 files changed, 42 insertions(+), 38 deletions(-)

diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
index 70e55588aedc..9128cbb3b198 100644
--- a/fs/overlayfs/export.c
+++ b/fs/overlayfs/export.c
@@ -424,7 +424,6 @@ static struct dentry *ovl_lookup_real_inode(struct super_block *sb,
 					    struct ovl_layer *layer)
 {
 	struct ovl_fs *ofs = sb->s_fs_info;
-	struct ovl_layer upper_layer = { .mnt = ofs->upper_mnt };
 	struct dentry *index = NULL;
 	struct dentry *this = NULL;
 	struct inode *inode;
@@ -466,7 +465,7 @@ static struct dentry *ovl_lookup_real_inode(struct super_block *sb,
 		 * recursive call walks back from indexed upper to the topmost
 		 * connected/hashed upper parent (or up to root).
 		 */
-		this = ovl_lookup_real(sb, upper, &upper_layer);
+		this = ovl_lookup_real(sb, upper, &ofs->layers[0]);
 		dput(upper);
 	}
 
@@ -646,8 +645,7 @@ static struct dentry *ovl_get_dentry(struct super_block *sb,
 				     struct dentry *index)
 {
 	struct ovl_fs *ofs = sb->s_fs_info;
-	struct ovl_layer upper_layer = { .mnt = ofs->upper_mnt };
-	struct ovl_layer *layer = upper ? &upper_layer : lowerpath->layer;
+	struct ovl_layer *layer = upper ? &ofs->layers[0] : lowerpath->layer;
 	struct dentry *real = upper ?: (index ?: lowerpath->dentry);
 
 	/*
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index b045cf1826fc..dab39933bbd4 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -170,7 +170,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 	 */
 	if (!is_dir || samefs || ovl_xino_bits(dentry->d_sb)) {
 		if (!OVL_TYPE_UPPER(type)) {
-			lower_layer = ovl_layer_lower(dentry);
+			lower_layer = ovl_dentry_layer(dentry);
 		} else if (OVL_TYPE_ORIGIN(type)) {
 			struct kstat lowerstat;
 			u32 lowermask = STATX_INO | STATX_BLOCKS |
@@ -200,7 +200,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 			if (ovl_test_flag(OVL_INDEX, d_inode(dentry)) ||
 			    (!ovl_verify_lower(dentry->d_sb) &&
 			     (is_dir || lowerstat.nlink == 1))) {
-				lower_layer = ovl_layer_lower(dentry);
+				lower_layer = ovl_dentry_layer(dentry);
 				/*
 				 * Cannot use origin st_dev;st_ino because
 				 * origin inode content may differ from overlay
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 80fbca5219d4..f16bf608eed7 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -324,16 +324,16 @@ int ovl_check_origin_fh(struct ovl_fs *ofs, struct ovl_fh *fh, bool connected,
 	struct dentry *origin = NULL;
 	int i;
 
-	for (i = 0; i < ofs->numlower; i++) {
+	for (i = 1; i <= ofs->numlower; i++) {
 		/*
 		 * If lower fs uuid is not unique among lower fs we cannot match
 		 * fh->uuid to layer.
 		 */
-		if (ofs->lower_layers[i].fsid &&
-		    ofs->lower_layers[i].fs->bad_uuid)
+		if (ofs->layers[i].fsid &&
+		    ofs->layers[i].fs->bad_uuid)
 			continue;
 
-		origin = ovl_decode_real_fh(fh, ofs->lower_layers[i].mnt,
+		origin = ovl_decode_real_fh(fh, ofs->layers[i].mnt,
 					    connected);
 		if (origin)
 			break;
@@ -356,7 +356,7 @@ int ovl_check_origin_fh(struct ovl_fs *ofs, struct ovl_fh *fh, bool connected,
 	}
 	**stackp = (struct ovl_path){
 		.dentry = origin,
-		.layer = &ofs->lower_layers[i]
+		.layer = &ofs->layers[i]
 	};
 
 	return 0;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index f283b1d69a9e..50d41a314308 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -237,7 +237,7 @@ enum ovl_path_type ovl_path_real(struct dentry *dentry, struct path *path);
 struct dentry *ovl_dentry_upper(struct dentry *dentry);
 struct dentry *ovl_dentry_lower(struct dentry *dentry);
 struct dentry *ovl_dentry_lowerdata(struct dentry *dentry);
-struct ovl_layer *ovl_layer_lower(struct dentry *dentry);
+struct ovl_layer *ovl_dentry_layer(struct dentry *dentry);
 struct dentry *ovl_dentry_real(struct dentry *dentry);
 struct dentry *ovl_i_dentry_upper(struct inode *inode);
 struct inode *ovl_inode_upper(struct inode *inode);
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 28348c44ea5b..ffaf7376f4ab 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -48,7 +48,7 @@ struct ovl_fs {
 	unsigned int numlower;
 	/* Number of unique lower sb that differ from upper sb */
 	unsigned int numlowerfs;
-	struct ovl_layer *lower_layers;
+	struct ovl_layer *layers;
 	struct ovl_sb *lower_fs;
 	/* workbasedir is the path at workdir= mount option */
 	struct dentry *workbasedir;
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 47a91c9733a5..32a7f8a38091 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -508,7 +508,7 @@ static int ovl_cache_update_ino(struct path *path, struct ovl_cache_entry *p)
 		ino = stat.ino;
 	} else if (xinobits && !OVL_TYPE_UPPER(type)) {
 		ino = ovl_remap_lower_ino(ino, xinobits,
-					  ovl_layer_lower(this)->fsid,
+					  ovl_dentry_layer(this)->fsid,
 					  p->name, p->len);
 	}
 
@@ -685,15 +685,14 @@ static int ovl_iterate_real(struct file *file, struct dir_context *ctx)
 	int err;
 	struct ovl_dir_file *od = file->private_data;
 	struct dentry *dir = file->f_path.dentry;
-	struct ovl_layer *lower_layer = ovl_layer_lower(dir);
 	struct ovl_readdir_translate rdt = {
 		.ctx.actor = ovl_fill_real,
 		.orig_ctx = ctx,
 		.xinobits = ovl_xino_bits(dir->d_sb),
 	};
 
-	if (rdt.xinobits && lower_layer)
-		rdt.fsid = lower_layer->fsid;
+	if (rdt.xinobits)
+		rdt.fsid = ovl_dentry_layer(dir)->fsid;
 
 	if (OVL_TYPE_MERGE(ovl_path_type(dir->d_parent))) {
 		struct kstat stat;
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index cfb236e37fd9..4ff6eede5f90 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -224,13 +224,13 @@ static void ovl_free_fs(struct ovl_fs *ofs)
 	if (ofs->upperdir_locked)
 		ovl_inuse_unlock(ofs->upper_mnt->mnt_root);
 	mntput(ofs->upper_mnt);
-	for (i = 0; i < ofs->numlower; i++) {
-		iput(ofs->lower_layers[i].trap);
-		mntput(ofs->lower_layers[i].mnt);
+	for (i = 1; i <= ofs->numlower; i++) {
+		iput(ofs->layers[i].trap);
+		mntput(ofs->layers[i].mnt);
 	}
 	for (i = 0; i < ofs->numlowerfs; i++)
 		free_anon_bdev(ofs->lower_fs[i].pseudo_dev);
-	kfree(ofs->lower_layers);
+	kfree(ofs->layers);
 	kfree(ofs->lower_fs);
 
 	kfree(ofs->config.lowerdir);
@@ -1318,16 +1318,16 @@ static int ovl_get_fsid(struct ovl_fs *ofs, const struct path *path)
 	return ofs->numlowerfs;
 }
 
-static int ovl_get_lower_layers(struct super_block *sb, struct ovl_fs *ofs,
-				struct path *stack, unsigned int numlower)
+static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
+			  struct path *stack, unsigned int numlower)
 {
 	int err;
 	unsigned int i;
 
 	err = -ENOMEM;
-	ofs->lower_layers = kcalloc(numlower, sizeof(struct ovl_layer),
-				    GFP_KERNEL);
-	if (ofs->lower_layers == NULL)
+	ofs->layers = kcalloc(numlower + 1, sizeof(struct ovl_layer),
+			      GFP_KERNEL);
+	if (ofs->layers == NULL)
 		goto out;
 
 	ofs->lower_fs = kcalloc(numlower, sizeof(struct ovl_sb),
@@ -1335,6 +1335,11 @@ static int ovl_get_lower_layers(struct super_block *sb, struct ovl_fs *ofs,
 	if (ofs->lower_fs == NULL)
 		goto out;
 
+	/* idx 0 is reserved for upper fs even with lower only overlay */
+	ofs->layers[0].mnt = ofs->upper_mnt;
+	ofs->layers[0].idx = 0;
+	ofs->layers[0].fsid = 0;
+
 	for (i = 0; i < numlower; i++) {
 		struct vfsmount *mnt;
 		struct inode *trap;
@@ -1368,15 +1373,15 @@ static int ovl_get_lower_layers(struct super_block *sb, struct ovl_fs *ofs,
 		 */
 		mnt->mnt_flags |= MNT_READONLY | MNT_NOATIME;
 
-		ofs->lower_layers[ofs->numlower].trap = trap;
-		ofs->lower_layers[ofs->numlower].mnt = mnt;
-		ofs->lower_layers[ofs->numlower].idx = i + 1;
-		ofs->lower_layers[ofs->numlower].fsid = fsid;
+		ofs->numlower++;
+		ofs->layers[ofs->numlower].trap = trap;
+		ofs->layers[ofs->numlower].mnt = mnt;
+		ofs->layers[ofs->numlower].idx = ofs->numlower;
+		ofs->layers[ofs->numlower].fsid = fsid;
 		if (fsid) {
-			ofs->lower_layers[ofs->numlower].fs =
+			ofs->layers[ofs->numlower].fs =
 				&ofs->lower_fs[fsid - 1];
 		}
-		ofs->numlower++;
 	}
 
 	/*
@@ -1463,7 +1468,7 @@ static struct ovl_entry *ovl_get_lowerstack(struct super_block *sb,
 		goto out_err;
 	}
 
-	err = ovl_get_lower_layers(sb, ofs, stack, numlower);
+	err = ovl_get_layers(sb, ofs, stack, numlower);
 	if (err)
 		goto out_err;
 
@@ -1474,7 +1479,7 @@ static struct ovl_entry *ovl_get_lowerstack(struct super_block *sb,
 
 	for (i = 0; i < numlower; i++) {
 		oe->lowerstack[i].dentry = dget(stack[i].dentry);
-		oe->lowerstack[i].layer = &ofs->lower_layers[i];
+		oe->lowerstack[i].layer = &ofs->layers[i+1];
 	}
 
 	if (remote)
@@ -1555,9 +1560,9 @@ static int ovl_check_overlapping_layers(struct super_block *sb,
 			return err;
 	}
 
-	for (i = 0; i < ofs->numlower; i++) {
+	for (i = 1; i <= ofs->numlower; i++) {
 		err = ovl_check_layer(sb, ofs,
-				      ofs->lower_layers[i].mnt->mnt_root,
+				      ofs->layers[i].mnt->mnt_root,
 				      "lowerdir");
 		if (err)
 			return err;
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index f5678a3f8350..3fa1ca8ddd48 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -198,11 +198,13 @@ struct dentry *ovl_dentry_lower(struct dentry *dentry)
 	return oe->numlower ? oe->lowerstack[0].dentry : NULL;
 }
 
-struct ovl_layer *ovl_layer_lower(struct dentry *dentry)
+/* Either top most lower layer or upper layer for pure upper */
+struct ovl_layer *ovl_dentry_layer(struct dentry *dentry)
 {
+	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
 	struct ovl_entry *oe = dentry->d_fsdata;
 
-	return oe->numlower ? oe->lowerstack[0].layer : NULL;
+	return oe->numlower ? oe->lowerstack[0].layer : &ofs->layers[0];
 }
 
 /*
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/6] ovl: simplify ovl_same_sb() helper
  2019-11-17 15:43 [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
  2019-11-17 15:43 ` [PATCH 1/6] ovl: fix corner case of non-unique st_dev;st_ino Amir Goldstein
  2019-11-17 15:43 ` [PATCH 2/6] ovl: generalize the lower_layers[] array Amir Goldstein
@ 2019-11-17 15:43 ` Amir Goldstein
  2019-11-17 15:43 ` [PATCH 4/6] ovl: generalize the lower_fs[] array Amir Goldstein
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Amir Goldstein @ 2019-11-17 15:43 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Colin Ian King, linux-unionfs

No code uses the sb returned from this helper, so make it retrun
a boolean and rename it to ovl_same_fs().

The xino mode is irrelevant when all layers are on same fs, so
instead of describing samefs with mode OVL_XINO_OFF, use a new mode
OVL_XINO_SAME_FS, which is different than the case of non-samefs
with xino=off.

Create a new helper ovl_same_dev(), to use instead of the common check
for (ovl_same_fs() || xinobits).

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c     |  7 +++----
 fs/overlayfs/overlayfs.h | 21 ++++++++++++++++++++-
 fs/overlayfs/ovl_entry.h |  5 +++++
 fs/overlayfs/readdir.c   |  4 ++--
 fs/overlayfs/super.c     | 13 +++++--------
 fs/overlayfs/util.c      | 12 ------------
 6 files changed, 35 insertions(+), 27 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index dab39933bbd4..717337f2b3fb 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -78,7 +78,7 @@ int ovl_setattr(struct dentry *dentry, struct iattr *attr)
 static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat,
 			   struct ovl_layer *lower_layer)
 {
-	bool samefs = ovl_same_sb(dentry->d_sb);
+	bool samefs = ovl_same_fs(dentry->d_sb);
 	unsigned int xinobits = ovl_xino_bits(dentry->d_sb);
 
 	if (samefs) {
@@ -146,7 +146,6 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 	struct path realpath;
 	const struct cred *old_cred;
 	bool is_dir = S_ISDIR(dentry->d_inode->i_mode);
-	bool samefs = ovl_same_sb(dentry->d_sb);
 	struct ovl_layer *lower_layer = NULL;
 	int err;
 	bool metacopy_blocks = false;
@@ -168,7 +167,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 	 * If lower filesystem supports NFS file handles, this also guaranties
 	 * persistent st_ino across mount cycle.
 	 */
-	if (!is_dir || samefs || ovl_xino_bits(dentry->d_sb)) {
+	if (!is_dir || ovl_same_dev(dentry->d_sb)) {
 		if (!OVL_TYPE_UPPER(type)) {
 			lower_layer = ovl_dentry_layer(dentry);
 		} else if (OVL_TYPE_ORIGIN(type)) {
@@ -565,7 +564,7 @@ static void ovl_fill_inode(struct inode *inode, umode_t mode, dev_t rdev,
 	 * ovl_new_inode(), ino arg is 0, so i_ino will be updated to real
 	 * upper inode i_ino on ovl_inode_init() or ovl_inode_update().
 	 */
-	if (ovl_same_sb(inode->i_sb) || xinobits) {
+	if (ovl_same_dev(inode->i_sb)) {
 		inode->i_ino = ino;
 		if (xinobits && fsid && !(ino >> (64 - xinobits)))
 			inode->i_ino |= (unsigned long)fsid << (64 - xinobits);
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 50d41a314308..f7d01c06cdaf 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -45,6 +45,14 @@ enum ovl_entry_flag {
 	OVL_E_CONNECTED,
 };
 
+enum {
+	OVL_XINO_OFF,
+	OVL_XINO_AUTO,
+	OVL_XINO_ON,
+	/* With samefs, xino is irrelevant */
+	OVL_XINO_SAME_FS,
+};
+
 /*
  * The tuple (fh,uuid) is a universal unique identifier for a copy up origin,
  * where:
@@ -221,7 +229,6 @@ int ovl_want_write(struct dentry *dentry);
 void ovl_drop_write(struct dentry *dentry);
 struct dentry *ovl_workdir(struct dentry *dentry);
 const struct cred *ovl_override_creds(struct super_block *sb);
-struct super_block *ovl_same_sb(struct super_block *sb);
 int ovl_can_decode_fh(struct super_block *sb);
 struct dentry *ovl_indexdir(struct super_block *sb);
 bool ovl_index_all(struct super_block *sb);
@@ -306,6 +313,18 @@ static inline unsigned int ovl_xino_bits(struct super_block *sb)
 	return ofs->xino_bits;
 }
 
+/* All layers on same fs? */
+static inline bool ovl_same_fs(struct super_block *sb)
+{
+	return OVL_FS(sb)->config.xino == OVL_XINO_SAME_FS;
+}
+
+/* All overlay inodes have same st_dev? */
+static inline bool ovl_same_dev(struct super_block *sb)
+{
+	return OVL_FS(sb)->config.xino != OVL_XINO_OFF;
+}
+
 static inline int ovl_inode_lock(struct inode *inode)
 {
 	return mutex_lock_interruptible(&OVL_I(inode)->lock);
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index ffaf7376f4ab..ef05817d8d89 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -75,6 +75,11 @@ struct ovl_fs {
 	unsigned int xino_bits;
 };
 
+static inline struct ovl_fs *OVL_FS(struct super_block *sb)
+{
+	return (struct ovl_fs *)sb->s_fs_info;
+}
+
 /* private information held for every overlayfs dentry */
 struct ovl_entry {
 	union {
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 32a7f8a38091..56f13e9ccbe6 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -469,7 +469,7 @@ static int ovl_cache_update_ino(struct path *path, struct ovl_cache_entry *p)
 	int xinobits = ovl_xino_bits(dir->d_sb);
 	int err = 0;
 
-	if (!ovl_same_sb(dir->d_sb) && !xinobits)
+	if (!ovl_same_dev(dir->d_sb))
 		goto out;
 
 	if (p->name[0] == '.') {
@@ -737,7 +737,7 @@ static int ovl_iterate(struct file *file, struct dir_context *ctx)
 		 * entries.
 		 */
 		if (ovl_xino_bits(dentry->d_sb) ||
-		    (ovl_same_sb(dentry->d_sb) &&
+		    (ovl_same_fs(dentry->d_sb) &&
 		     (ovl_is_impure_dir(file) ||
 		      OVL_TYPE_MERGE(ovl_path_type(dentry->d_parent))))) {
 			return ovl_iterate_real(file, ctx);
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 4ff6eede5f90..ae22c3363c33 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -316,12 +316,6 @@ static const char *ovl_redirect_mode_def(void)
 	return ovl_redirect_dir_def ? "on" : "off";
 }
 
-enum {
-	OVL_XINO_OFF,
-	OVL_XINO_AUTO,
-	OVL_XINO_ON,
-};
-
 static const char * const ovl_xino_str[] = {
 	"off",
 	"auto",
@@ -358,7 +352,8 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
 	if (ofs->config.nfs_export != ovl_nfs_export_def)
 		seq_printf(m, ",nfs_export=%s", ofs->config.nfs_export ?
 						"on" : "off");
-	if (ofs->config.xino != ovl_xino_def())
+	if (ofs->config.xino != ovl_xino_def() &&
+	    ofs->config.xino != OVL_XINO_SAME_FS)
 		seq_printf(m, ",xino=%s", ovl_xino_str[ofs->config.xino]);
 	if (ofs->config.metacopy != ovl_metacopy_def)
 		seq_printf(m, ",metacopy=%s",
@@ -1393,8 +1388,10 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
 	 * inode number.
 	 */
 	if (!ofs->numlowerfs || (ofs->numlowerfs == 1 && !ofs->upper_mnt)) {
+		if (ofs->config.xino == OVL_XINO_ON)
+			pr_info("overlayfs: \"xino=on\" is useless with all layers on same fs, ignore.\n");
 		ofs->xino_bits = 0;
-		ofs->config.xino = OVL_XINO_OFF;
+		ofs->config.xino = OVL_XINO_SAME_FS;
 	} else if (ofs->config.xino == OVL_XINO_ON && !ofs->xino_bits) {
 		/*
 		 * This is a roundup of number of bits needed for numlowerfs+1
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 3fa1ca8ddd48..256f166b4a17 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -40,18 +40,6 @@ const struct cred *ovl_override_creds(struct super_block *sb)
 	return override_creds(ofs->creator_cred);
 }
 
-struct super_block *ovl_same_sb(struct super_block *sb)
-{
-	struct ovl_fs *ofs = sb->s_fs_info;
-
-	if (!ofs->numlowerfs)
-		return ofs->upper_mnt->mnt_sb;
-	else if (ofs->numlowerfs == 1 && !ofs->upper_mnt)
-		return ofs->lower_fs[0].sb;
-	else
-		return NULL;
-}
-
 /*
  * Check if underlying fs supports file handles and try to determine encoding
  * type, in order to deduce maximum inode number used by fs.
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/6] ovl: generalize the lower_fs[] array
  2019-11-17 15:43 [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
                   ` (2 preceding siblings ...)
  2019-11-17 15:43 ` [PATCH 3/6] ovl: simplify ovl_same_sb() helper Amir Goldstein
@ 2019-11-17 15:43 ` Amir Goldstein
  2019-11-18 17:01   ` Amir Goldstein
  2019-11-17 15:43 ` [PATCH 5/6] ovl: fix corner case of conflicting lower layer uuid Amir Goldstein
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: Amir Goldstein @ 2019-11-17 15:43 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Colin Ian King, linux-unionfs

Rename lower_fs[] array to fs[], extend its size by one and initialize
fs[0] with upper fs values. fsid 0 is already reserved even with lower
only overlay, so fs[0] remains uninitialized in this case as well.

Rename numlowerfs to maxfsid and use index 0..maxfsid to access the
fs[] array.

This gets rid of special casing upper layer in ovl_map_dev_ino().

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c     | 31 +++++++++----------
 fs/overlayfs/ovl_entry.h |  5 ++-
 fs/overlayfs/super.c     | 66 +++++++++++++++++++++-------------------
 3 files changed, 51 insertions(+), 51 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 717337f2b3fb..9e894f5e19b4 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -75,8 +75,7 @@ int ovl_setattr(struct dentry *dentry, struct iattr *attr)
 	return err;
 }
 
-static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat,
-			   struct ovl_layer *lower_layer)
+static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat, int fsid)
 {
 	bool samefs = ovl_same_fs(dentry->d_sb);
 	unsigned int xinobits = ovl_xino_bits(dentry->d_sb);
@@ -103,9 +102,7 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat,
 			pr_warn_ratelimited("overlayfs: inode number too big (%pd2, ino=%llu, xinobits=%d)\n",
 					    dentry, stat->ino, xinobits);
 		} else {
-			if (lower_layer)
-				stat->ino |= ((u64)lower_layer->fsid) << shift;
-
+			stat->ino |= ((u64)fsid) << shift;
 			stat->dev = dentry->d_sb->s_dev;
 			return 0;
 		}
@@ -124,15 +121,15 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat,
 		 */
 		stat->dev = dentry->d_sb->s_dev;
 		stat->ino = dentry->d_inode->i_ino;
-	} else if (lower_layer && lower_layer->fsid) {
+	} else {
 		/*
 		 * For non-samefs setup, if we cannot map all layers st_ino
 		 * to a unified address space, we need to make sure that st_dev
-		 * is unique per lower fs. Upper layer uses real st_dev and
-		 * lower layers use the unique anonymous bdev assigned to the
-		 * lower fs.
+		 * is unique per lower fs. Layers that are on the same fs as
+		 * upper layer use real upper st_dev and other lower layers use
+		 * the unique anonymous bdev assigned to the lower fs.
 		 */
-		stat->dev = lower_layer->fs->pseudo_dev;
+		stat->dev = OVL_FS(dentry->d_sb)->fs[fsid].pseudo_dev;
 	}
 
 	return 0;
@@ -146,7 +143,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 	struct path realpath;
 	const struct cred *old_cred;
 	bool is_dir = S_ISDIR(dentry->d_inode->i_mode);
-	struct ovl_layer *lower_layer = NULL;
+	int fsid;
 	int err;
 	bool metacopy_blocks = false;
 
@@ -168,9 +165,8 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 	 * persistent st_ino across mount cycle.
 	 */
 	if (!is_dir || ovl_same_dev(dentry->d_sb)) {
-		if (!OVL_TYPE_UPPER(type)) {
-			lower_layer = ovl_dentry_layer(dentry);
-		} else if (OVL_TYPE_ORIGIN(type)) {
+		fsid = ovl_dentry_layer(dentry)->fsid;
+		if (OVL_TYPE_ORIGIN(type)) {
 			struct kstat lowerstat;
 			u32 lowermask = STATX_INO | STATX_BLOCKS |
 					(!is_dir ? STATX_NLINK : 0);
@@ -199,14 +195,15 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 			if (ovl_test_flag(OVL_INDEX, d_inode(dentry)) ||
 			    (!ovl_verify_lower(dentry->d_sb) &&
 			     (is_dir || lowerstat.nlink == 1))) {
-				lower_layer = ovl_dentry_layer(dentry);
 				/*
 				 * Cannot use origin st_dev;st_ino because
 				 * origin inode content may differ from overlay
 				 * inode content.
 				 */
-				if (samefs || lower_layer->fsid)
+				if (samefs || fsid)
 					stat->ino = lowerstat.ino;
+			} else {
+				fsid = 0;
 			}
 
 			/*
@@ -240,7 +237,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 		}
 	}
 
-	err = ovl_map_dev_ino(dentry, stat, lower_layer);
+	err = ovl_map_dev_ino(dentry, stat, fsid);
 	if (err)
 		goto out;
 
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index ef05817d8d89..93e9fd5fba9d 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -46,10 +46,9 @@ struct ovl_path {
 struct ovl_fs {
 	struct vfsmount *upper_mnt;
 	unsigned int numlower;
-	/* Number of unique lower sb that differ from upper sb */
-	unsigned int numlowerfs;
+	unsigned int maxfsid;
 	struct ovl_layer *layers;
-	struct ovl_sb *lower_fs;
+	struct ovl_sb *fs;
 	/* workbasedir is the path at workdir= mount option */
 	struct dentry *workbasedir;
 	/* workdir is the 'work' directory under workbasedir */
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index ae22c3363c33..09e53780d9bc 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -228,10 +228,13 @@ static void ovl_free_fs(struct ovl_fs *ofs)
 		iput(ofs->layers[i].trap);
 		mntput(ofs->layers[i].mnt);
 	}
-	for (i = 0; i < ofs->numlowerfs; i++)
-		free_anon_bdev(ofs->lower_fs[i].pseudo_dev);
 	kfree(ofs->layers);
-	kfree(ofs->lower_fs);
+	if (ofs->fs) {
+		/* fs[0].pseudo_dev is either null or real upper st_dev */
+		for (i = 1; i <= ofs->maxfsid; i++)
+			free_anon_bdev(ofs->fs[i].pseudo_dev);
+		kfree(ofs->fs);
+	}
 
 	kfree(ofs->config.lowerdir);
 	kfree(ofs->config.upperdir);
@@ -1253,7 +1256,7 @@ static bool ovl_lower_uuid_ok(struct ovl_fs *ofs, const uuid_t *uuid)
 	if (!ofs->config.nfs_export && !ofs->upper_mnt)
 		return true;
 
-	for (i = 0; i < ofs->numlowerfs; i++) {
+	for (i = 1; i <= ofs->maxfsid; i++) {
 		/*
 		 * We use uuid to associate an overlay lower file handle with a
 		 * lower layer, so we can accept lower fs with null uuid as long
@@ -1261,8 +1264,8 @@ static bool ovl_lower_uuid_ok(struct ovl_fs *ofs, const uuid_t *uuid)
 		 * if we detect multiple lower fs with the same uuid, we
 		 * disable lower file handle decoding on all of them.
 		 */
-		if (uuid_equal(&ofs->lower_fs[i].sb->s_uuid, uuid)) {
-			ofs->lower_fs[i].bad_uuid = true;
+		if (uuid_equal(&ofs->fs[i].sb->s_uuid, uuid)) {
+			ofs->fs[i].bad_uuid = true;
 			return false;
 		}
 	}
@@ -1278,13 +1281,9 @@ static int ovl_get_fsid(struct ovl_fs *ofs, const struct path *path)
 	int err;
 	bool bad_uuid = false;
 
-	/* fsid 0 is reserved for upper fs even with non upper overlay */
-	if (ofs->upper_mnt && ofs->upper_mnt->mnt_sb == sb)
-		return 0;
-
-	for (i = 0; i < ofs->numlowerfs; i++) {
-		if (ofs->lower_fs[i].sb == sb)
-			return i + 1;
+	for (i = 0; i <= ofs->maxfsid; i++) {
+		if (ofs->fs[i].sb == sb)
+			return i;
 	}
 
 	if (!ovl_lower_uuid_ok(ofs, &sb->s_uuid)) {
@@ -1305,12 +1304,12 @@ static int ovl_get_fsid(struct ovl_fs *ofs, const struct path *path)
 		return err;
 	}
 
-	ofs->lower_fs[ofs->numlowerfs].sb = sb;
-	ofs->lower_fs[ofs->numlowerfs].pseudo_dev = dev;
-	ofs->lower_fs[ofs->numlowerfs].bad_uuid = bad_uuid;
-	ofs->numlowerfs++;
+	ofs->maxfsid++;
+	ofs->fs[ofs->maxfsid].sb = sb;
+	ofs->fs[ofs->maxfsid].pseudo_dev = dev;
+	ofs->fs[ofs->maxfsid].bad_uuid = bad_uuid;
 
-	return ofs->numlowerfs;
+	return ofs->maxfsid;
 }
 
 static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
@@ -1325,16 +1324,24 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
 	if (ofs->layers == NULL)
 		goto out;
 
-	ofs->lower_fs = kcalloc(numlower, sizeof(struct ovl_sb),
-				GFP_KERNEL);
-	if (ofs->lower_fs == NULL)
+	ofs->fs = kcalloc(numlower + 1, sizeof(struct ovl_sb), GFP_KERNEL);
+	if (ofs->fs == NULL)
 		goto out;
 
-	/* idx 0 is reserved for upper fs even with lower only overlay */
+	/* idx/fsid 0 are reserved for upper fs even with lower only overlay */
 	ofs->layers[0].mnt = ofs->upper_mnt;
 	ofs->layers[0].idx = 0;
 	ofs->layers[0].fsid = 0;
 
+	/*
+	 * All lower layers that share the same fs as upper layer, use the real
+	 * upper st_dev.
+	 */
+	if (ofs->upper_mnt) {
+		ofs->fs[0].sb = ofs->upper_mnt->mnt_sb;
+		ofs->fs[0].pseudo_dev = ofs->upper_mnt->mnt_sb->s_dev;
+	}
+
 	for (i = 0; i < numlower; i++) {
 		struct vfsmount *mnt;
 		struct inode *trap;
@@ -1373,10 +1380,7 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
 		ofs->layers[ofs->numlower].mnt = mnt;
 		ofs->layers[ofs->numlower].idx = ofs->numlower;
 		ofs->layers[ofs->numlower].fsid = fsid;
-		if (fsid) {
-			ofs->layers[ofs->numlower].fs =
-				&ofs->lower_fs[fsid - 1];
-		}
+		ofs->layers[ofs->numlower].fs = &ofs->fs[fsid];
 	}
 
 	/*
@@ -1387,19 +1391,19 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
 	 * bits reserved for fsid, it emits a warning and uses the original
 	 * inode number.
 	 */
-	if (!ofs->numlowerfs || (ofs->numlowerfs == 1 && !ofs->upper_mnt)) {
+	if (ofs->maxfsid == 0 || (ofs->maxfsid == 1 && !ofs->upper_mnt)) {
 		if (ofs->config.xino == OVL_XINO_ON)
 			pr_info("overlayfs: \"xino=on\" is useless with all layers on same fs, ignore.\n");
 		ofs->xino_bits = 0;
 		ofs->config.xino = OVL_XINO_SAME_FS;
 	} else if (ofs->config.xino == OVL_XINO_ON && !ofs->xino_bits) {
 		/*
-		 * This is a roundup of number of bits needed for numlowerfs+1
-		 * (i.e. ilog2(numlowerfs+1 - 1) + 1). fsid 0 is reserved for
-		 * upper fs even with non upper overlay.
+		 * This is a roundup of number of bits needed for encoding
+		 * fsid, where fsid 0 is reserved for upper fs even with
+		 * lower only overlay.
 		 */
 		BUILD_BUG_ON(ilog2(OVL_MAX_STACK) > 31);
-		ofs->xino_bits = ilog2(ofs->numlowerfs) + 1;
+		ofs->xino_bits = ilog2(ofs->maxfsid) + 1;
 	}
 
 	if (ofs->xino_bits) {
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 5/6] ovl: fix corner case of conflicting lower layer uuid
  2019-11-17 15:43 [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
                   ` (3 preceding siblings ...)
  2019-11-17 15:43 ` [PATCH 4/6] ovl: generalize the lower_fs[] array Amir Goldstein
@ 2019-11-17 15:43 ` Amir Goldstein
  2019-11-17 15:43 ` [PATCH 6/6] ovl: fix corner case of non-constant st_dev;st_ino Amir Goldstein
  2019-11-18  6:03 ` [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
  6 siblings, 0 replies; 12+ messages in thread
From: Amir Goldstein @ 2019-11-17 15:43 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Colin Ian King, linux-unionfs

This fixes ovl_lower_uuid_ok() to correctly detect the corner case:
- two filesystems, A and B, both have null uuid
- upper layer is on A
- lower layer 1 is also on A
- lower layer 2 is on B

In this case, bad_uuid would not have been set for B, because the check
only involved the list of lower fs.  Hence we'll try to decode a layer 2
origin on layer 1 and fail.

We check for conflicting (and null) uuid among all lower layers,
including those layers that are on the same fs as the upper layer.

Reported-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/ovl_entry.h | 2 ++
 fs/overlayfs/super.c     | 8 ++++++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 93e9fd5fba9d..6c04b6a4e6bf 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -24,6 +24,8 @@ struct ovl_sb {
 	dev_t pseudo_dev;
 	/* Unusable (conflicting) uuid */
 	bool bad_uuid;
+	/* Used as a lower layer (but maybe also as upper) */
+	bool is_lower;
 };
 
 struct ovl_layer {
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 09e53780d9bc..e80f79bb8a4e 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1256,7 +1256,7 @@ static bool ovl_lower_uuid_ok(struct ovl_fs *ofs, const uuid_t *uuid)
 	if (!ofs->config.nfs_export && !ofs->upper_mnt)
 		return true;
 
-	for (i = 1; i <= ofs->maxfsid; i++) {
+	for (i = 0; i <= ofs->maxfsid; i++) {
 		/*
 		 * We use uuid to associate an overlay lower file handle with a
 		 * lower layer, so we can accept lower fs with null uuid as long
@@ -1264,7 +1264,8 @@ static bool ovl_lower_uuid_ok(struct ovl_fs *ofs, const uuid_t *uuid)
 		 * if we detect multiple lower fs with the same uuid, we
 		 * disable lower file handle decoding on all of them.
 		 */
-		if (uuid_equal(&ofs->fs[i].sb->s_uuid, uuid)) {
+		if (ofs->fs[i].is_lower &&
+		    uuid_equal(&ofs->fs[i].sb->s_uuid, uuid)) {
 			ofs->fs[i].bad_uuid = true;
 			return false;
 		}
@@ -1336,10 +1337,12 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
 	/*
 	 * All lower layers that share the same fs as upper layer, use the real
 	 * upper st_dev.
+	 * is_lower will be set if upper fs is shared with a lower layer.
 	 */
 	if (ofs->upper_mnt) {
 		ofs->fs[0].sb = ofs->upper_mnt->mnt_sb;
 		ofs->fs[0].pseudo_dev = ofs->upper_mnt->mnt_sb->s_dev;
+		ofs->fs[0].is_lower = false;
 	}
 
 	for (i = 0; i < numlower; i++) {
@@ -1381,6 +1384,7 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
 		ofs->layers[ofs->numlower].idx = ofs->numlower;
 		ofs->layers[ofs->numlower].fsid = fsid;
 		ofs->layers[ofs->numlower].fs = &ofs->fs[fsid];
+		ofs->fs[fsid].is_lower = true;
 	}
 
 	/*
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 6/6] ovl: fix corner case of non-constant st_dev;st_ino
  2019-11-17 15:43 [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
                   ` (4 preceding siblings ...)
  2019-11-17 15:43 ` [PATCH 5/6] ovl: fix corner case of conflicting lower layer uuid Amir Goldstein
@ 2019-11-17 15:43 ` Amir Goldstein
  2019-11-18  6:03 ` [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
  6 siblings, 0 replies; 12+ messages in thread
From: Amir Goldstein @ 2019-11-17 15:43 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Colin Ian King, linux-unionfs

On non-samefs overlay without xino, non pure upper inodes should use a
pseudo_dev assigned to each unique lower fs, but if lower layer is on
the same fs and upper layer, it has no pseudo_dev assigned.

In this overlay layers setup:
- two filesystems, A and B
- upper layer is on A
- lower layer 1 is also on A
- lower layer 2 is on B

Non pure upper overlay inode, whose origin is in layer 1 will have the
st_dev;st_ino values of the real lower inode before copy up and the
st_dev;st_ino values of the real upper inode after copy up.

Fix this inconsitency by assigning a unique pseudo_dev also for upper fs,
that will be used as st_dev value along with the lower inode st_dev for
overlay inodes in the case above.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c | 13 +++----------
 fs/overlayfs/super.c | 15 ++++++++++-----
 2 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 9e894f5e19b4..00c4e9c17492 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -125,9 +125,8 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat, int fsid)
 		/*
 		 * For non-samefs setup, if we cannot map all layers st_ino
 		 * to a unified address space, we need to make sure that st_dev
-		 * is unique per lower fs. Layers that are on the same fs as
-		 * upper layer use real upper st_dev and other lower layers use
-		 * the unique anonymous bdev assigned to the lower fs.
+		 * is unique per underlying fs, so we use the unique anonymous
+		 * bdev assigned to the underlying fs.
 		 */
 		stat->dev = OVL_FS(dentry->d_sb)->fs[fsid].pseudo_dev;
 	}
@@ -195,13 +194,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 			if (ovl_test_flag(OVL_INDEX, d_inode(dentry)) ||
 			    (!ovl_verify_lower(dentry->d_sb) &&
 			     (is_dir || lowerstat.nlink == 1))) {
-				/*
-				 * Cannot use origin st_dev;st_ino because
-				 * origin inode content may differ from overlay
-				 * inode content.
-				 */
-				if (samefs || fsid)
-					stat->ino = lowerstat.ino;
+				stat->ino = lowerstat.ino;
 			} else {
 				fsid = 0;
 			}
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index e80f79bb8a4e..9270e059eb9b 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -230,8 +230,7 @@ static void ovl_free_fs(struct ovl_fs *ofs)
 	}
 	kfree(ofs->layers);
 	if (ofs->fs) {
-		/* fs[0].pseudo_dev is either null or real upper st_dev */
-		for (i = 1; i <= ofs->maxfsid; i++)
+		for (i = 0; i <= ofs->maxfsid; i++)
 			free_anon_bdev(ofs->fs[i].pseudo_dev);
 		kfree(ofs->fs);
 	}
@@ -1335,13 +1334,19 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
 	ofs->layers[0].fsid = 0;
 
 	/*
-	 * All lower layers that share the same fs as upper layer, use the real
-	 * upper st_dev.
+	 * All lower layers that share the same fs as upper layer, use the same
+	 * pseudo_dev as upper layer.  Allocate fs[0].pseudo_dev even for lower
+	 * only overlay to simplify ovl_fs_free().
 	 * is_lower will be set if upper fs is shared with a lower layer.
 	 */
+	err = get_anon_bdev(&ofs->fs[0].pseudo_dev);
+	if (err) {
+		pr_err("overlayfs: failed to get anonymous bdev for upper fs\n");
+		goto out;
+	}
+
 	if (ofs->upper_mnt) {
 		ofs->fs[0].sb = ofs->upper_mnt->mnt_sb;
-		ofs->fs[0].pseudo_dev = ofs->upper_mnt->mnt_sb->s_dev;
 		ofs->fs[0].is_lower = false;
 	}
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/6] Sort out overlay layers and fs arrays
  2019-11-17 15:43 [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
                   ` (5 preceding siblings ...)
  2019-11-17 15:43 ` [PATCH 6/6] ovl: fix corner case of non-constant st_dev;st_ino Amir Goldstein
@ 2019-11-18  6:03 ` Amir Goldstein
  2019-11-18  7:57   ` Amir Goldstein
  6 siblings, 1 reply; 12+ messages in thread
From: Amir Goldstein @ 2019-11-18  6:03 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Colin Ian King, overlayfs

On Sun, Nov 17, 2019 at 5:43 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> Miklos,
>
> When I started generalizing the lower_layers/lower_fs arrays
> I noticed a bug that was introduced in v4.17 with xino.
>
> In the case of lower layer on upper fs, we do not have a pseudo_dev
> assigned to lower layer and we expose the real lower st_dev;st_ino.
> This happens on non-samefs when xino is disabled (default).
> This is a very real bug, not really a corner case and I have an
> an xfstest [1] for it that I will post later.
>
> In the mean while, I also pushed a fix to unionmount-testsuite devel
> branch [2] to demonstrate the issue.
>
> With upstream kernel, this test ends up with a copied up file
> from middle layer, whose on same fs as upper and its exposed
> st_dev;st_ino are invalid:
>
>  ./run --ov=1 --verify hard-link
>  ...
>  /mnt/a/no_foo110: File unexpectedly on upper layer
>
> Patch 1 in the series is a small fix for stable that fixes the
> v4.17 regression in favor of a different, less severe regression.
> The new regression can be demonstrated with:
>
>  ./run --ov=1 --verify --xino hard-link
>  ...
>  /mnt/a/no_foo110: inode number/layer changed on copy up
>  (got 39:24707, was 39:24700)
>
> Patches 2-4 generalize the lower_{layer/fs} arrays to layer/fs arrays
> and get rid of some special casing of upper layer.
>
> Patches 5-6 use the cleanup to solve the corner case that you pointed
> out with bas_uuid [3] and to fix the regression introduced by patch 1.
>
> After patch 6, both unionmount-testsuite configurations
> above pass the test st_dev;st_ino verifications.
>
> I doubt if patches 2-6 are stable material, because not sure the
> corner cases they fix are worth the trouble.
>
> The series depends on the bad_uuid patch v5 that I posted on Thursday.
>
> I was also considering setting xino=on by default if xino_auto
> is enabled, because what have we got to loose?
>
> The inodes whose st_ino fit in lower bits (by far more common) will
> use overlay st_dev and the inodes whose st_ino overflow the lower bits
> will use pseudo_dev. Seems like a win-win situation, but I wanted to
> get your feedback on this before sending out a patch.
>

Arrr.. yes, there is a catch.
Overflowing lower bits has a price beyond just using pseudo_dev.
It introduces the possibility of inode number conflicts on directories,
because directories never use pseudo_dev.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/6] Sort out overlay layers and fs arrays
  2019-11-18  6:03 ` [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
@ 2019-11-18  7:57   ` Amir Goldstein
  2019-11-22  9:31     ` Amir Goldstein
  0 siblings, 1 reply; 12+ messages in thread
From: Amir Goldstein @ 2019-11-18  7:57 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Colin Ian King, overlayfs

On Mon, Nov 18, 2019 at 8:03 AM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Sun, Nov 17, 2019 at 5:43 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > Miklos,
> >
> > When I started generalizing the lower_layers/lower_fs arrays
> > I noticed a bug that was introduced in v4.17 with xino.
> >
> > In the case of lower layer on upper fs, we do not have a pseudo_dev
> > assigned to lower layer and we expose the real lower st_dev;st_ino.
> > This happens on non-samefs when xino is disabled (default).
> > This is a very real bug, not really a corner case and I have an
> > an xfstest [1] for it that I will post later.
> >
> > In the mean while, I also pushed a fix to unionmount-testsuite devel
> > branch [2] to demonstrate the issue.
> >
> > With upstream kernel, this test ends up with a copied up file
> > from middle layer, whose on same fs as upper and its exposed
> > st_dev;st_ino are invalid:
> >
> >  ./run --ov=1 --verify hard-link
> >  ...
> >  /mnt/a/no_foo110: File unexpectedly on upper layer
> >
> > Patch 1 in the series is a small fix for stable that fixes the
> > v4.17 regression in favor of a different, less severe regression.
> > The new regression can be demonstrated with:
> >
> >  ./run --ov=1 --verify --xino hard-link
> >  ...
> >  /mnt/a/no_foo110: inode number/layer changed on copy up
> >  (got 39:24707, was 39:24700)
> >
> > Patches 2-4 generalize the lower_{layer/fs} arrays to layer/fs arrays
> > and get rid of some special casing of upper layer.
> >
> > Patches 5-6 use the cleanup to solve the corner case that you pointed
> > out with bas_uuid [3] and to fix the regression introduced by patch 1.
> >
> > After patch 6, both unionmount-testsuite configurations
> > above pass the test st_dev;st_ino verifications.
> >
> > I doubt if patches 2-6 are stable material, because not sure the
> > corner cases they fix are worth the trouble.
> >
> > The series depends on the bad_uuid patch v5 that I posted on Thursday.
> >
> > I was also considering setting xino=on by default if xino_auto
> > is enabled, because what have we got to loose?
> >
> > The inodes whose st_ino fit in lower bits (by far more common) will
> > use overlay st_dev and the inodes whose st_ino overflow the lower bits
> > will use pseudo_dev. Seems like a win-win situation, but I wanted to
> > get your feedback on this before sending out a patch.
> >
>
> Arrr.. yes, there is a catch.
> Overflowing lower bits has a price beyond just using pseudo_dev.
> It introduces the possibility of inode number conflicts on directories,
> because directories never use pseudo_dev.
>

But we could mitigate that problem if we reserve an fsid for volatile
directory inode numbers. get_next_ino is 32bit anyway.
I am going to take a swing at having xino=auto always enabling xino.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 4/6] ovl: generalize the lower_fs[] array
  2019-11-17 15:43 ` [PATCH 4/6] ovl: generalize the lower_fs[] array Amir Goldstein
@ 2019-11-18 17:01   ` Amir Goldstein
  0 siblings, 0 replies; 12+ messages in thread
From: Amir Goldstein @ 2019-11-18 17:01 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Colin Ian King, overlayfs

On Sun, Nov 17, 2019 at 5:44 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> Rename lower_fs[] array to fs[], extend its size by one and initialize
> fs[0] with upper fs values. fsid 0 is already reserved even with lower
> only overlay, so fs[0] remains uninitialized in this case as well.
>
> Rename numlowerfs to maxfsid and use index 0..maxfsid to access the
> fs[] array.
>

FYI, I don't know what I was thinking with the conversion to maxfsid.
I don't like that and intend to change it back to numfs.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/6] Sort out overlay layers and fs arrays
  2019-11-18  7:57   ` Amir Goldstein
@ 2019-11-22  9:31     ` Amir Goldstein
  2019-11-25 14:45       ` Amir Goldstein
  0 siblings, 1 reply; 12+ messages in thread
From: Amir Goldstein @ 2019-11-22  9:31 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Colin Ian King, overlayfs

> > > I was also considering setting xino=on by default if xino_auto
> > > is enabled, because what have we got to loose?
> > >
> > > The inodes whose st_ino fit in lower bits (by far more common) will
> > > use overlay st_dev and the inodes whose st_ino overflow the lower bits
> > > will use pseudo_dev. Seems like a win-win situation, but I wanted to
> > > get your feedback on this before sending out a patch.
> > >
> >
> > Arrr.. yes, there is a catch.
> > Overflowing lower bits has a price beyond just using pseudo_dev.
> > It introduces the possibility of inode number conflicts on directories,
> > because directories never use pseudo_dev.
> >
>
> But we could mitigate that problem if we reserve an fsid for volatile
> directory inode numbers. get_next_ino is 32bit anyway.
> I am going to take a swing at having xino=auto always enabling xino.
>

FWIW, pushed WIP branch to:
https://github.com/amir73il/linux/commits/ovl-ino

It is based on an updated ovl-layers branch of the $SUBJECT series.

During cleanup, I've found several corner cases bugs of setting
i_ino value and fixed them.
None of those bugs are critical.
AFAIK, the only user that complained on inconsistent i_ino is
an xfstest that is checking ino in /proc/locks.

However, I do think that the cleanup I made simplifies the code
which was a bit spaghetti in that area and with some more TLC
we can get to enabling xino from xino=auto even for filesystems
that have seldom high ino bits.

That could be a real benefit, because it is unlikely that users
will have enough knowledge or certainty about underlying fs
to declare xino=on.

I need to clear up some time to test i_ino changes and
the collision avoidance code, but for now, at least the ovl-ino branch
passes the existing regression tests.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/6] Sort out overlay layers and fs arrays
  2019-11-22  9:31     ` Amir Goldstein
@ 2019-11-25 14:45       ` Amir Goldstein
  0 siblings, 0 replies; 12+ messages in thread
From: Amir Goldstein @ 2019-11-25 14:45 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Colin Ian King, overlayfs

On Fri, Nov 22, 2019 at 11:31 AM Amir Goldstein <amir73il@gmail.com> wrote:
>
> > > > I was also considering setting xino=on by default if xino_auto
> > > > is enabled, because what have we got to loose?
> > > >
> > > > The inodes whose st_ino fit in lower bits (by far more common) will
> > > > use overlay st_dev and the inodes whose st_ino overflow the lower bits
> > > > will use pseudo_dev. Seems like a win-win situation, but I wanted to
> > > > get your feedback on this before sending out a patch.
> > > >
> > >
> > > Arrr.. yes, there is a catch.
> > > Overflowing lower bits has a price beyond just using pseudo_dev.
> > > It introduces the possibility of inode number conflicts on directories,
> > > because directories never use pseudo_dev.
> > >
> >
> > But we could mitigate that problem if we reserve an fsid for volatile
> > directory inode numbers. get_next_ino is 32bit anyway.
> > I am going to take a swing at having xino=auto always enabling xino.
> >
>
> FWIW, pushed WIP branch to:
> https://github.com/amir73il/linux/commits/ovl-ino
>
> It is based on an updated ovl-layers branch of the $SUBJECT series.
>
> During cleanup, I've found several corner cases bugs of setting
> i_ino value and fixed them.
> None of those bugs are critical.
> AFAIK, the only user that complained on inconsistent i_ino is
> an xfstest that is checking ino in /proc/locks.
>
> However, I do think that the cleanup I made simplifies the code
> which was a bit spaghetti in that area and with some more TLC
> we can get to enabling xino from xino=auto even for filesystems
> that have seldom high ino bits.
>
> That could be a real benefit, because it is unlikely that users
> will have enough knowledge or certainty about underlying fs
> to declare xino=on.
>
> I need to clear up some time to test i_ino changes and
> the collision avoidance code, but for now, at least the ovl-ino branch
> passes the existing regression tests.
>

Documenting all this in words has become too complex for me,
so I resorted to visual aids [1].

The table doesn't mention this explicitly, but xino=auto can now
(code in ovl-ino) provide all desired features for any non-samefs setup.

Tests are still WIP. Will post the work when they are done.
I am using a nested overlay [2] as a way to test xino overflow behavior.

Thanks,
Amir.

[1] https://github.com/amir73il/linux/blob/ovl-ino/Documentation/filesystems/overlayfs.rst#inode-properties
[2] https://github.com/amir73il/unionmount-testsuite/commits/ovl-nested

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-11-25 14:45 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-17 15:43 [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
2019-11-17 15:43 ` [PATCH 1/6] ovl: fix corner case of non-unique st_dev;st_ino Amir Goldstein
2019-11-17 15:43 ` [PATCH 2/6] ovl: generalize the lower_layers[] array Amir Goldstein
2019-11-17 15:43 ` [PATCH 3/6] ovl: simplify ovl_same_sb() helper Amir Goldstein
2019-11-17 15:43 ` [PATCH 4/6] ovl: generalize the lower_fs[] array Amir Goldstein
2019-11-18 17:01   ` Amir Goldstein
2019-11-17 15:43 ` [PATCH 5/6] ovl: fix corner case of conflicting lower layer uuid Amir Goldstein
2019-11-17 15:43 ` [PATCH 6/6] ovl: fix corner case of non-constant st_dev;st_ino Amir Goldstein
2019-11-18  6:03 ` [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
2019-11-18  7:57   ` Amir Goldstein
2019-11-22  9:31     ` Amir Goldstein
2019-11-25 14:45       ` Amir Goldstein

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).