All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v9 0/7] Overlayfs: constant st_ino/d_ino for non-samefs
@ 2018-03-29 14:18 Amir Goldstein
  2018-03-29 14:18 ` [PATCH v9 1/7] ovl: factor out ovl_map_dev_ino() helper Amir Goldstein
                   ` (7 more replies)
  0 siblings, 8 replies; 23+ messages in thread
From: Amir Goldstein @ 2018-03-29 14:18 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Miklos,

This series provides a solution for some interesting non-samefs cases:
- All the ext* family
- Many other fs with default encode_fh
- xfs and tmpfs with overlay "xino" mount option

The patches are also available on my ovl-xino branch [2]. They are based
on some earlier fix patches and they do not conflict with the NFS export
optimization patches (ovl-nfs-export branch).

I tested this with upstream overlay/nonsamefs xfstest group:
- Tests pass for ext4
- Tests fail for xfs
- Tests pass for xfs with OVERLAY_MOUNT_OPTIONS=-oxino

I also added --xino option to unionmount-testsuite [2], along with
the --verify option, the test verifies constant st_ino and that all
objects are on overlay st_dev.

Changes since v8:
- Use unique fsid instead of layer id
- Assign pseudo_dev per fsid instead of per layer
- Limit "xino" feature to 64bit systems
- Assign xino value i_ino as well for NFSv3 readdir
- Add "xino" documentation patch

Thanks,
Amir.

[1] https://github.com/amir73il/linux/commits/ovl-xino
[2] https://github.com/amir73il/unionmount-testsuite/commits/overlayfs-devel

Amir Goldstein (7):
  ovl: factor out ovl_map_dev_ino() helper
  ovl: allocate anon bdev per unique lower fs
  ovl: constant st_ino for non-samefs with xino
  ovl: consistent i_ino for non-samefs with xino
  ovl: consistent d_ino for non-samefs with xino
  ovl: add support for "xino" mount option
  ovl: update documentation w.r.t "xino" feature

 Documentation/filesystems/overlayfs.txt |  39 +++++++--
 fs/overlayfs/export.c                   |   2 +-
 fs/overlayfs/inode.c                    | 140 +++++++++++++++++++++-----------
 fs/overlayfs/namei.c                    |   4 +-
 fs/overlayfs/overlayfs.h                |   6 +-
 fs/overlayfs/ovl_entry.h                |  21 +++--
 fs/overlayfs/readdir.c                  |  45 ++++++++--
 fs/overlayfs/super.c                    | 120 ++++++++++++++++++++++-----
 fs/overlayfs/util.c                     |  38 ++++++++-
 9 files changed, 318 insertions(+), 97 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v9 1/7] ovl: factor out ovl_map_dev_ino() helper
  2018-03-29 14:18 [PATCH v9 0/7] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
@ 2018-03-29 14:18 ` Amir Goldstein
  2018-03-29 14:18 ` [PATCH v9 2/7] ovl: allocate anon bdev per unique lower fs Amir Goldstein
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Amir Goldstein @ 2018-03-29 14:18 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

A helper for ovl_getattr() to map the values of st_dev and st_ino
according to constant st_ino rules.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c     | 83 +++++++++++++++++++++++++-----------------------
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/util.c      |  7 ++++
 3 files changed, 52 insertions(+), 39 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 4689716f23d8..1ae53357b228 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -16,13 +16,6 @@
 #include "overlayfs.h"
 
 
-static dev_t ovl_get_pseudo_dev(struct dentry *dentry)
-{
-	struct ovl_entry *oe = dentry->d_fsdata;
-
-	return oe->lowerstack[0].layer->pseudo_dev;
-}
-
 int ovl_setattr(struct dentry *dentry, struct iattr *attr)
 {
 	int err;
@@ -66,6 +59,43 @@ int ovl_setattr(struct dentry *dentry, struct iattr *attr)
 	return err;
 }
 
+static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat)
+{
+	struct ovl_layer *lower_layer = ovl_layer_lower(dentry);
+	bool samefs = ovl_same_sb(dentry->d_sb);
+
+	if (samefs) {
+		/*
+		 * When all layers are on the same fs, all real inode
+		 * number are unique, so we use the overlay st_dev,
+		 * which is friendly to du -x.
+		 */
+		stat->dev = dentry->d_sb->s_dev;
+	} else if (S_ISDIR(dentry->d_inode->i_mode)) {
+		/*
+		 * Always use the overlay st_dev for directories, so 'find
+		 * -xdev' will scan the entire overlay mount and won't cross the
+		 * overlay mount boundaries.
+		 *
+		 * If not all layers are on the same fs the pair {real st_ino;
+		 * overlay st_dev} is not unique, so use the non persistent
+		 * overlay st_ino for directories.
+		 */
+		stat->dev = dentry->d_sb->s_dev;
+		stat->ino = dentry->d_inode->i_ino;
+	} else if (lower_layer) {
+		/*
+		 * For non-samefs setup, if we cannot map all layers st_ino
+		 * to a unified address space, we need to make sure that st_dev
+		 * is unique per layer. Upper layer uses real st_dev and lower
+		 * layers use the unique anonymous bdev.
+		 */
+		stat->dev = lower_layer->pseudo_dev;
+	}
+
+	return 0;
+}
+
 int ovl_getattr(const struct path *path, struct kstat *stat,
 		u32 request_mask, unsigned int flags)
 {
@@ -84,10 +114,10 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 		goto out;
 
 	/*
-	 * For non-dir or same fs, we use st_ino of the copy up origin, if we
-	 * know it. This guaranties constant st_dev/st_ino across copy up.
+	 * For non-dir or same fs, we use st_ino of the copy up origin.
+	 * This guaranties constant st_dev/st_ino across copy up.
 	 *
-	 * If filesystem supports NFS export ops, this also guaranties
+	 * If lower filesystem supports NFS file handles, this also guaranties
 	 * persistent st_ino across mount cycle.
 	 */
 	if (!is_dir || samefs) {
@@ -123,38 +153,13 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 
 			if (samefs)
 				WARN_ON_ONCE(stat->dev != lowerstat.dev);
-			else
-				stat->dev = ovl_get_pseudo_dev(dentry);
-		}
-		if (samefs) {
-			/*
-			 * When all layers are on the same fs, all real inode
-			 * number are unique, so we use the overlay st_dev,
-			 * which is friendly to du -x.
-			 */
-			stat->dev = dentry->d_sb->s_dev;
-		} else if (!OVL_TYPE_UPPER(type)) {
-			/*
-			 * For non-samefs setup, to make sure that st_dev/st_ino
-			 * pair is unique across the system, we use a unique
-			 * anonymous st_dev for lower layer inode.
-			 */
-			stat->dev = ovl_get_pseudo_dev(dentry);
 		}
-	} else {
-		/*
-		 * Always use the overlay st_dev for directories, so 'find
-		 * -xdev' will scan the entire overlay mount and won't cross the
-		 * overlay mount boundaries.
-		 *
-		 * If not all layers are on the same fs the pair {real st_ino;
-		 * overlay st_dev} is not unique, so use the non persistent
-		 * overlay st_ino for directories.
-		 */
-		stat->dev = dentry->d_sb->s_dev;
-		stat->ino = dentry->d_inode->i_ino;
 	}
 
+	err = ovl_map_dev_ino(dentry, stat);
+	if (err)
+		goto out;
+
 	/*
 	 * It's probably not worth it to count subdirs to get the
 	 * correct link count.  nlink=1 seems to pacify 'find' and
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 225ff1171147..4432599ecbc6 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -215,6 +215,7 @@ void ovl_path_lower(struct dentry *dentry, struct path *path);
 enum ovl_path_type ovl_path_real(struct dentry *dentry, struct path *path);
 struct dentry *ovl_dentry_upper(struct dentry *dentry);
 struct dentry *ovl_dentry_lower(struct dentry *dentry);
+struct ovl_layer *ovl_layer_lower(struct dentry *dentry);
 struct dentry *ovl_dentry_real(struct dentry *dentry);
 struct dentry *ovl_i_dentry_upper(struct inode *inode);
 struct inode *ovl_inode_upper(struct inode *inode);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 493f9b76fbf6..5042293d2078 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -172,6 +172,13 @@ struct dentry *ovl_dentry_lower(struct dentry *dentry)
 	return oe->numlower ? oe->lowerstack[0].dentry : NULL;
 }
 
+struct ovl_layer *ovl_layer_lower(struct dentry *dentry)
+{
+	struct ovl_entry *oe = dentry->d_fsdata;
+
+	return oe->numlower ? oe->lowerstack[0].layer : NULL;
+}
+
 struct dentry *ovl_dentry_real(struct dentry *dentry)
 {
 	return ovl_dentry_upper(dentry) ?: ovl_dentry_lower(dentry);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v9 2/7] ovl: allocate anon bdev per unique lower fs
  2018-03-29 14:18 [PATCH v9 0/7] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
  2018-03-29 14:18 ` [PATCH v9 1/7] ovl: factor out ovl_map_dev_ino() helper Amir Goldstein
@ 2018-03-29 14:18 ` Amir Goldstein
  2018-03-29 14:18 ` [PATCH v9 3/7] ovl: constant st_ino for non-samefs with xino Amir Goldstein
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Amir Goldstein @ 2018-03-29 14:18 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Instead of allocating an anonymous bdev per lower layer, allocate
one anonymous bdev per every unique lower fs that is different than
upper fs.

Every unique lower fs is assigned an fsid > 0 and the number of
unique lower fs are stored in ofs->numlowerfs.

The assigned fsid is stored in the lower layer struct and will be
used also for inode number multiplexing.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c     |  9 ++++---
 fs/overlayfs/ovl_entry.h | 18 +++++++++----
 fs/overlayfs/super.c     | 66 +++++++++++++++++++++++++++++++++++-------------
 fs/overlayfs/util.c      |  7 ++++-
 4 files changed, 72 insertions(+), 28 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 1ae53357b228..89dfab20fe0e 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -83,14 +83,15 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat)
 		 */
 		stat->dev = dentry->d_sb->s_dev;
 		stat->ino = dentry->d_inode->i_ino;
-	} else if (lower_layer) {
+	} else if (lower_layer && lower_layer->fsid) {
 		/*
 		 * For non-samefs setup, if we cannot map all layers st_ino
 		 * to a unified address space, we need to make sure that st_dev
-		 * is unique per layer. Upper layer uses real st_dev and lower
-		 * layers use the unique anonymous bdev.
+		 * is unique per lower fs. Upper layer uses real st_dev and
+		 * lower layers use the unique anonymous bdev assigned to the
+		 * lower fs.
 		 */
-		stat->dev = lower_layer->pseudo_dev;
+		stat->dev = lower_layer->fs->pseudo_dev;
 	}
 
 	return 0;
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index bfef6edcc111..e1c838c27a74 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -20,11 +20,18 @@ struct ovl_config {
 	bool nfs_export;
 };
 
+struct ovl_sb {
+	struct super_block *sb;
+	dev_t pseudo_dev;
+};
+
 struct ovl_layer {
 	struct vfsmount *mnt;
-	dev_t pseudo_dev;
-	/* Index of this layer in fs root (upper == 0) */
+	struct ovl_sb *fs;
+	/* Index of this layer in fs root (upper idx == 0) */
 	int idx;
+	/* One fsid per unique underlying sb (upper fsid == 0) */
+	int fsid;
 };
 
 struct ovl_path {
@@ -35,8 +42,11 @@ struct ovl_path {
 /* private information held for overlayfs's superblock */
 struct ovl_fs {
 	struct vfsmount *upper_mnt;
-	unsigned numlower;
+	unsigned int numlower;
+	/* Number of unique lower sb that differ from upper sb */
+	unsigned int numlowerfs;
 	struct ovl_layer *lower_layers;
+	struct ovl_sb *lower_fs;
 	/* workbasedir is the path at workdir= mount option */
 	struct dentry *workbasedir;
 	/* workdir is the 'work' directory under workbasedir */
@@ -50,8 +60,6 @@ struct ovl_fs {
 	const struct cred *creator_cred;
 	bool tmpfile;
 	bool noxattr;
-	/* sb common to all layers */
-	struct super_block *same_sb;
 	/* Did we take the inuse lock? */
 	bool upperdir_locked;
 	bool workdir_locked;
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 7c24619ae7fc..7d97d30cad39 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -236,11 +236,12 @@ static void ovl_free_fs(struct ovl_fs *ofs)
 	if (ofs->upperdir_locked)
 		ovl_inuse_unlock(ofs->upper_mnt->mnt_root);
 	mntput(ofs->upper_mnt);
-	for (i = 0; i < ofs->numlower; i++) {
+	for (i = 0; i < ofs->numlower; i++)
 		mntput(ofs->lower_layers[i].mnt);
-		free_anon_bdev(ofs->lower_layers[i].pseudo_dev);
-	}
+	for (i = 0; i < ofs->numlowerfs; i++)
+		free_anon_bdev(ofs->lower_fs[i].pseudo_dev);
 	kfree(ofs->lower_layers);
+	kfree(ofs->lower_fs);
 
 	kfree(ofs->config.lowerdir);
 	kfree(ofs->config.upperdir);
@@ -1108,6 +1109,35 @@ static int ovl_get_indexdir(struct ovl_fs *ofs, struct ovl_entry *oe,
 	return err;
 }
 
+/* Get a unique fsid for the layer */
+static int ovl_get_fsid(struct ovl_fs *ofs, struct super_block *sb)
+{
+	unsigned int i;
+	dev_t dev;
+	int err;
+
+	/* fsid 0 is reserved for upper fs even with non upper overlay */
+	if (ofs->upper_mnt && ofs->upper_mnt->mnt_sb == sb)
+		return 0;
+
+	for (i = 0; i < ofs->numlowerfs; i++) {
+		if (ofs->lower_fs[i].sb == sb)
+			return i + 1;
+	}
+
+	err = get_anon_bdev(&dev);
+	if (err) {
+		pr_err("overlayfs: failed to get anonymous bdev for lowerpath\n");
+		return err;
+	}
+
+	ofs->lower_fs[ofs->numlowerfs].sb = sb;
+	ofs->lower_fs[ofs->numlowerfs].pseudo_dev = dev;
+	ofs->numlowerfs++;
+
+	return ofs->numlowerfs;
+}
+
 static int ovl_get_lower_layers(struct ovl_fs *ofs, struct path *stack,
 				unsigned int numlower)
 {
@@ -1119,23 +1149,27 @@ static int ovl_get_lower_layers(struct ovl_fs *ofs, struct path *stack,
 				    GFP_KERNEL);
 	if (ofs->lower_layers == NULL)
 		goto out;
+
+	ofs->lower_fs = kcalloc(numlower, sizeof(struct ovl_sb),
+				GFP_KERNEL);
+	if (ofs->lower_fs == NULL)
+		goto out;
+
 	for (i = 0; i < numlower; i++) {
 		struct vfsmount *mnt;
-		dev_t dev;
+		int fsid;
 
-		err = get_anon_bdev(&dev);
-		if (err) {
-			pr_err("overlayfs: failed to get anonymous bdev for lowerpath\n");
+		err = fsid = ovl_get_fsid(ofs, stack[i].mnt->mnt_sb);
+		if (err < 0)
 			goto out;
-		}
 
 		mnt = clone_private_mount(&stack[i]);
 		err = PTR_ERR(mnt);
 		if (IS_ERR(mnt)) {
 			pr_err("overlayfs: failed to clone lowerpath\n");
-			free_anon_bdev(dev);
 			goto out;
 		}
+
 		/*
 		 * Make lower layers R/O.  That way fchmod/fchown on lower file
 		 * will fail instead of modifying lower fs.
@@ -1143,15 +1177,13 @@ static int ovl_get_lower_layers(struct ovl_fs *ofs, struct path *stack,
 		mnt->mnt_flags |= MNT_READONLY | MNT_NOATIME;
 
 		ofs->lower_layers[ofs->numlower].mnt = mnt;
-		ofs->lower_layers[ofs->numlower].pseudo_dev = dev;
 		ofs->lower_layers[ofs->numlower].idx = i + 1;
+		ofs->lower_layers[ofs->numlower].fsid = fsid;
+		if (fsid) {
+			ofs->lower_layers[ofs->numlower].fs =
+				&ofs->lower_fs[fsid - 1];
+		}
 		ofs->numlower++;
-
-		/* Check if all lower layers are on same sb */
-		if (i == 0)
-			ofs->same_sb = mnt->mnt_sb;
-		else if (ofs->same_sb != mnt->mnt_sb)
-			ofs->same_sb = NULL;
 	}
 	err = 0;
 out:
@@ -1305,8 +1337,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	/* If the upper fs is nonexistent, we mark overlayfs r/o too */
 	if (!ofs->upper_mnt)
 		sb->s_flags |= SB_RDONLY;
-	else if (ofs->upper_mnt->mnt_sb != ofs->same_sb)
-		ofs->same_sb = NULL;
 
 	if (!(ovl_force_readonly(ofs)) && ofs->config.index) {
 		err = ovl_get_indexdir(ofs, oe, &upperpath);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 5042293d2078..9dfec74ec77d 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -47,7 +47,12 @@ struct super_block *ovl_same_sb(struct super_block *sb)
 {
 	struct ovl_fs *ofs = sb->s_fs_info;
 
-	return ofs->same_sb;
+	if (!ofs->numlowerfs)
+		return ofs->upper_mnt->mnt_sb;
+	else if (ofs->numlowerfs == 1 && !ofs->upper_mnt)
+		return ofs->lower_fs[0].sb;
+	else
+		return NULL;
 }
 
 bool ovl_can_decode_fh(struct super_block *sb)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v9 3/7] ovl: constant st_ino for non-samefs with xino
  2018-03-29 14:18 [PATCH v9 0/7] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
  2018-03-29 14:18 ` [PATCH v9 1/7] ovl: factor out ovl_map_dev_ino() helper Amir Goldstein
  2018-03-29 14:18 ` [PATCH v9 2/7] ovl: allocate anon bdev per unique lower fs Amir Goldstein
@ 2018-03-29 14:18 ` Amir Goldstein
  2018-03-29 15:58   ` Miklos Szeredi
  2018-03-29 14:18 ` [PATCH v9 4/7] ovl: consistent i_ino " Amir Goldstein
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Amir Goldstein @ 2018-03-29 14:18 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

On 64bit systems, when overlay layers are not all on the same fs, but
all inode numbers of underlying fs are not using the high bits, use the
high bits to partition the overlay st_ino address space.  The high bits
hold the fsid (upper fsid is 0).  This way overlay inode numbers are unique
and all inodes use overlay st_dev.  Inode numbers are also persistent
for a given layer configuration.

Currently, our only indication for available high ino bits is from a
filesystem that supports file handles and uses the default encode_fh()
operation, which encodes a 32bit inode number.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c     | 31 +++++++++++++++++++++++++++++--
 fs/overlayfs/overlayfs.h |  3 ++-
 fs/overlayfs/ovl_entry.h |  2 ++
 fs/overlayfs/super.c     | 26 ++++++++++++++++++++++----
 fs/overlayfs/util.c      | 24 +++++++++++++++++++++---
 5 files changed, 76 insertions(+), 10 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 89dfab20fe0e..7fc9c83bf2ff 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -63,6 +63,7 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat)
 {
 	struct ovl_layer *lower_layer = ovl_layer_lower(dentry);
 	bool samefs = ovl_same_sb(dentry->d_sb);
+	int xinobits = ovl_xino_bits(dentry->d_sb);
 
 	if (samefs) {
 		/*
@@ -71,7 +72,31 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat)
 		 * which is friendly to du -x.
 		 */
 		stat->dev = dentry->d_sb->s_dev;
-	} else if (S_ISDIR(dentry->d_inode->i_mode)) {
+		return 0;
+	} else if (xinobits) {
+		/*
+		 * All inode numbers of underlying fs should not be using the
+		 * high xinobits, so we use high xinobits to partition the
+		 * overlay st_ino address space. The high bits holds the fsid
+		 * (upper fsid is 0). This way overlay inode numbers are unique
+		 * and all inodes use overlay st_dev. Inode numbers are also
+		 * persistent for a given layer configuration.
+		 */
+		if (stat->ino >> (64 - xinobits)) {
+			pr_warn_ratelimited("overlayfs: inode number too big (%pd2, ino=%llu, xinobits=%d)\n",
+					    dentry, stat->ino, xinobits);
+		} else {
+			if (lower_layer) {
+				stat->ino |= ((u64)lower_layer->fsid)
+					     << (64 - xinobits);
+			}
+			stat->dev = dentry->d_sb->s_dev;
+			return 0;
+		}
+	}
+
+	/* The inode could not be mapped to a unified st_ino address space */
+	if (S_ISDIR(dentry->d_inode->i_mode)) {
 		/*
 		 * Always use the overlay st_dev for directories, so 'find
 		 * -xdev' will scan the entire overlay mount and won't cross the
@@ -117,11 +142,13 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 	/*
 	 * For non-dir or same fs, we use st_ino of the copy up origin.
 	 * This guaranties constant st_dev/st_ino across copy up.
+	 * With xino feature and non-samefs, we use st_ino of the copy up
+	 * origin masked with high bits that represent the layer id.
 	 *
 	 * If lower filesystem supports NFS file handles, this also guaranties
 	 * persistent st_ino across mount cycle.
 	 */
-	if (!is_dir || samefs) {
+	if (!is_dir || samefs || ovl_xino_bits(dentry->d_sb)) {
 		if (OVL_TYPE_ORIGIN(type)) {
 			struct kstat lowerstat;
 			u32 lowermask = STATX_INO | (!is_dir ? STATX_NLINK : 0);
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 4432599ecbc6..09f5b8ce70aa 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -202,7 +202,8 @@ void ovl_drop_write(struct dentry *dentry);
 struct dentry *ovl_workdir(struct dentry *dentry);
 const struct cred *ovl_override_creds(struct super_block *sb);
 struct super_block *ovl_same_sb(struct super_block *sb);
-bool ovl_can_decode_fh(struct super_block *sb);
+int ovl_xino_bits(struct super_block *sb);
+int ovl_can_decode_fh(struct super_block *sb);
 struct dentry *ovl_indexdir(struct super_block *sb);
 bool ovl_index_all(struct super_block *sb);
 bool ovl_verify_lower(struct super_block *sb);
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index e1c838c27a74..6a077fb2a75f 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -63,6 +63,8 @@ struct ovl_fs {
 	/* Did we take the inuse lock? */
 	bool upperdir_locked;
 	bool workdir_locked;
+	/* Inode numbers in all layers do not use the high xino_bits */
+	int xino_bits;
 };
 
 /* private information held for every overlayfs dentry */
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 7d97d30cad39..d7284444f404 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -17,6 +17,7 @@
 #include <linux/statfs.h>
 #include <linux/seq_file.h>
 #include <linux/posix_acl_xattr.h>
+#include <linux/exportfs.h>
 #include "overlayfs.h"
 
 MODULE_AUTHOR("Miklos Szeredi <miklos@szeredi.hu>");
@@ -701,6 +702,7 @@ static int ovl_check_namelen(struct path *path, struct ovl_fs *ofs,
 static int ovl_lower_dir(const char *name, struct path *path,
 			 struct ovl_fs *ofs, int *stack_depth, bool *remote)
 {
+	int fh_type;
 	int err;
 
 	err = ovl_mount_dir_noesc(name, path);
@@ -720,15 +722,19 @@ static int ovl_lower_dir(const char *name, struct path *path,
 	 * The inodes index feature and NFS export need to encode and decode
 	 * file handles, so they require that all layers support them.
 	 */
+	fh_type = ovl_can_decode_fh(path->dentry->d_sb);
 	if ((ofs->config.nfs_export ||
-	     (ofs->config.index && ofs->config.upperdir)) &&
-	    !ovl_can_decode_fh(path->dentry->d_sb)) {
+	     (ofs->config.index && ofs->config.upperdir)) && !fh_type) {
 		ofs->config.index = false;
 		ofs->config.nfs_export = false;
 		pr_warn("overlayfs: fs on '%s' does not support file handles, falling back to index=off,nfs_export=off.\n",
 			name);
 	}
 
+	/* Check if lower fs has 32bit inode numbers */
+	if (fh_type != FILEID_INO32_GEN)
+		ofs->xino_bits = 0;
+
 	return 0;
 
 out_put:
@@ -952,6 +958,7 @@ static int ovl_make_workdir(struct ovl_fs *ofs, struct path *workpath)
 {
 	struct vfsmount *mnt = ofs->upper_mnt;
 	struct dentry *temp;
+	int fh_type;
 	int err;
 
 	err = mnt_want_write(mnt);
@@ -1001,12 +1008,16 @@ static int ovl_make_workdir(struct ovl_fs *ofs, struct path *workpath)
 	}
 
 	/* Check if upper/work fs supports file handles */
-	if (ofs->config.index &&
-	    !ovl_can_decode_fh(ofs->workdir->d_sb)) {
+	fh_type = ovl_can_decode_fh(ofs->workdir->d_sb);
+	if (ofs->config.index && !fh_type) {
 		ofs->config.index = false;
 		pr_warn("overlayfs: upper fs does not support file handles, falling back to index=off.\n");
 	}
 
+	/* Check if upper fs has 32bit inode numbers */
+	if (fh_type != FILEID_INO32_GEN)
+		ofs->xino_bits = 0;
+
 	/* NFS export of r/w mount depends on index */
 	if (ofs->config.nfs_export && !ofs->config.index) {
 		pr_warn("overlayfs: NFS export requires \"index=on\", falling back to nfs_export=off.\n");
@@ -1185,6 +1196,11 @@ static int ovl_get_lower_layers(struct ovl_fs *ofs, struct path *stack,
 		}
 		ofs->numlower++;
 	}
+
+	/* When all layers on same fs, overlay can use real inode numbers */
+	if (!ofs->numlowerfs || (ofs->numlowerfs == 1 && !ofs->upper_mnt))
+		ofs->xino_bits = 0;
+
 	err = 0;
 out:
 	return err;
@@ -1308,6 +1324,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 
 	sb->s_stack_depth = 0;
 	sb->s_maxbytes = MAX_LFS_FILESIZE;
+	/* Assume underlaying fs uses 32bit inodes unless proven otherwise */
+	ofs->xino_bits = BITS_PER_LONG - 32;
 	if (ofs->config.upperdir) {
 		if (!ofs->config.workdir) {
 			pr_err("overlayfs: missing 'workdir'\n");
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 9dfec74ec77d..3bdba8b332f4 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -55,10 +55,28 @@ struct super_block *ovl_same_sb(struct super_block *sb)
 		return NULL;
 }
 
-bool ovl_can_decode_fh(struct super_block *sb)
+int ovl_xino_bits(struct super_block *sb)
 {
-	return (sb->s_export_op && sb->s_export_op->fh_to_dentry &&
-		!uuid_is_null(&sb->s_uuid));
+	struct ovl_fs *ofs = sb->s_fs_info;
+
+	return ofs->xino_bits;
+}
+
+/*
+ * Check if underlying fs supports file handles and try to determine encoding
+ * type, in order to deduce maximum inode number used by fs.
+ *
+ * Return 0 if file handles are not supported.
+ * Return 1 (FILEID_INO32_GEN) if fs uses the default 32bit inode encoding.
+ * Return -1 if fs uses a non default encoding with unknown inode size.
+ */
+int ovl_can_decode_fh(struct super_block *sb)
+{
+	if (!sb->s_export_op || !sb->s_export_op->fh_to_dentry ||
+	    uuid_is_null(&sb->s_uuid))
+		return 0;
+
+	return sb->s_export_op->encode_fh ? -1 : FILEID_INO32_GEN;
 }
 
 struct dentry *ovl_indexdir(struct super_block *sb)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v9 4/7] ovl: consistent i_ino for non-samefs with xino
  2018-03-29 14:18 [PATCH v9 0/7] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
                   ` (2 preceding siblings ...)
  2018-03-29 14:18 ` [PATCH v9 3/7] ovl: constant st_ino for non-samefs with xino Amir Goldstein
@ 2018-03-29 14:18 ` Amir Goldstein
  2018-03-29 14:18 ` [PATCH v9 5/7] ovl: consistent d_ino " Amir Goldstein
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Amir Goldstein @ 2018-03-29 14:18 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

When overlay layers are not all on the same fs, but all inode numbers
of underlying fs do not use the high 'xino' bits, overlay st_ino values
are constant and persistent.

In that case, set i_ino value to the same value as st_ino for nfsd
readdirplus validator.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/export.c    |  2 +-
 fs/overlayfs/inode.c     | 27 ++++++++++++++++++---------
 fs/overlayfs/namei.c     |  4 +---
 fs/overlayfs/overlayfs.h |  2 +-
 4 files changed, 21 insertions(+), 14 deletions(-)

diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
index 87bd4148f4fb..e53c1f58aef1 100644
--- a/fs/overlayfs/export.c
+++ b/fs/overlayfs/export.c
@@ -305,7 +305,7 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
 	if (d_is_dir(upper ?: lower))
 		return ERR_PTR(-EIO);
 
-	inode = ovl_get_inode(sb, dget(upper), lower, index, !!lower);
+	inode = ovl_get_inode(sb, dget(upper), lowerpath, index, !!lower);
 	if (IS_ERR(inode)) {
 		dput(upper);
 		return ERR_CAST(inode);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 7fc9c83bf2ff..71bf6650b284 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -493,19 +493,26 @@ static inline void ovl_lockdep_annotate_inode_mutex_key(struct inode *inode)
 }
 
 static void ovl_fill_inode(struct inode *inode, umode_t mode, dev_t rdev,
-			   unsigned long ino)
+			   unsigned long ino, int fsid)
 {
+	int xinobits = ovl_xino_bits(inode->i_sb);
+
 	/*
 	 * When NFS export is enabled and d_ino is consistent with st_ino
-	 * (samefs), set the same value to i_ino, because nfsd readdirplus
-	 * compares d_ino values to i_ino values of child entries. When called
-	 * from ovl_new_inode(), ino arg is 0, so i_ino will be updated to real
+	 * (samefs or i_ino has enough bits to encode layer), set the same
+	 * value used for d_ino to i_ino, because nfsd readdirplus compares
+	 * d_ino values to i_ino values of child entries. When called from
+	 * ovl_new_inode(), ino arg is 0, so i_ino will be updated to real
 	 * upper inode i_ino on ovl_inode_init() or ovl_inode_update().
 	 */
-	if (inode->i_sb->s_export_op && ovl_same_sb(inode->i_sb))
+	if (inode->i_sb->s_export_op &&
+	    (ovl_same_sb(inode->i_sb) || xinobits)) {
 		inode->i_ino = ino;
-	else
+		if (xinobits && fsid && !(ino >> (64 - xinobits)))
+			inode->i_ino |= (unsigned long)fsid << (64 - xinobits);
+	} else {
 		inode->i_ino = get_next_ino();
+	}
 	inode->i_mode = mode;
 	inode->i_flags |= S_NOCMTIME;
 #ifdef CONFIG_FS_POSIX_ACL
@@ -641,7 +648,7 @@ struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev)
 
 	inode = new_inode(sb);
 	if (inode)
-		ovl_fill_inode(inode, mode, rdev, 0);
+		ovl_fill_inode(inode, mode, rdev, 0, 0);
 
 	return inode;
 }
@@ -747,12 +754,14 @@ static bool ovl_hash_bylower(struct super_block *sb, struct dentry *upper,
 }
 
 struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
-			    struct dentry *lowerdentry, struct dentry *index,
+			    struct ovl_path *lowerpath, struct dentry *index,
 			    unsigned int numlower)
 {
 	struct inode *realinode = upperdentry ? d_inode(upperdentry) : NULL;
 	struct inode *inode;
+	struct dentry *lowerdentry = lowerpath ? lowerpath->dentry : NULL;
 	bool bylower = ovl_hash_bylower(sb, upperdentry, lowerdentry, index);
+	int fsid = bylower ? lowerpath->layer->fsid : 0;
 	bool is_dir;
 	unsigned long ino = 0;
 
@@ -800,7 +809,7 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
 		if (!inode)
 			goto out_nomem;
 	}
-	ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev, ino);
+	ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev, ino, fsid);
 	ovl_inode_init(inode, upperdentry, lowerdentry);
 
 	if (upperdentry && ovl_is_impuredir(upperdentry))
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 87d8bf2c09c6..d33f6213eb8f 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -986,9 +986,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		upperdentry = dget(index);
 
 	if (upperdentry || ctr) {
-		if (ctr)
-			origin = stack[0].dentry;
-		inode = ovl_get_inode(dentry->d_sb, upperdentry, origin, index,
+		inode = ovl_get_inode(dentry->d_sb, upperdentry, stack, index,
 				      ctr);
 		err = PTR_ERR(inode);
 		if (IS_ERR(inode))
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 09f5b8ce70aa..023e35ca6b5f 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -331,7 +331,7 @@ struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev);
 struct inode *ovl_lookup_inode(struct super_block *sb, struct dentry *real,
 			       bool is_upper);
 struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
-			    struct dentry *lowerdentry, struct dentry *index,
+			    struct ovl_path *lowerpath, struct dentry *index,
 			    unsigned int numlower);
 static inline void ovl_copyattr(struct inode *from, struct inode *to)
 {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v9 5/7] ovl: consistent d_ino for non-samefs with xino
  2018-03-29 14:18 [PATCH v9 0/7] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
                   ` (3 preceding siblings ...)
  2018-03-29 14:18 ` [PATCH v9 4/7] ovl: consistent i_ino " Amir Goldstein
@ 2018-03-29 14:18 ` Amir Goldstein
  2018-03-29 14:18 ` [PATCH v9 6/7] ovl: add support for "xino" mount option Amir Goldstein
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Amir Goldstein @ 2018-03-29 14:18 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

When overlay layers are not all on the same fs, but all inode numbers
of underlying fs do not use the high 'xino' bits, overlay st_ino values
are constant and persistent.

In that case, relax non-samefs constraint for consistent d_ino and always
iterate non-merge dir using ovl_fill_real() actor so we can remap lower
inode numbers to unique lower fs range.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/readdir.c | 45 +++++++++++++++++++++++++++++++++++++++------
 1 file changed, 39 insertions(+), 6 deletions(-)

diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index c11f5c0906c3..ef1fe42ff7bb 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -120,6 +120,10 @@ static bool ovl_calc_d_ino(struct ovl_readdir_data *rdd,
 	if (!rdd->dentry)
 		return false;
 
+	/* Always recalc d_ino when remapping lower inode numbers */
+	if (ovl_xino_bits(rdd->dentry->d_sb))
+		return true;
+
 	/* Always recalc d_ino for parent */
 	if (strcmp(p->name, "..") == 0)
 		return true;
@@ -435,6 +439,19 @@ static struct ovl_dir_cache *ovl_cache_get(struct dentry *dentry)
 	return cache;
 }
 
+/* Map inode number to lower fs unique range */
+static u64 ovl_remap_lower_ino(u64 ino, int xinobits, int fsid,
+			       const char *name, int namelen)
+{
+	if (ino >> (64 - xinobits)) {
+		pr_warn_ratelimited("overlayfs: d_ino too big (%.*s, ino=%llu, xinobits=%d)\n",
+				    namelen, name, ino, xinobits);
+		return ino;
+	}
+
+	return ino | ((u64)fsid) << (64 - xinobits);
+}
+
 /*
  * Set d_ino for upper entries. Non-upper entries should always report
  * the uppermost real inode ino and should not call this function.
@@ -452,9 +469,10 @@ static int ovl_cache_update_ino(struct path *path, struct ovl_cache_entry *p)
 	struct dentry *this = NULL;
 	enum ovl_path_type type;
 	u64 ino = p->real_ino;
+	int xinobits = ovl_xino_bits(dir->d_sb);
 	int err = 0;
 
-	if (!ovl_same_sb(dir->d_sb))
+	if (!ovl_same_sb(dir->d_sb) && !xinobits)
 		goto out;
 
 	if (p->name[0] == '.') {
@@ -491,6 +509,10 @@ static int ovl_cache_update_ino(struct path *path, struct ovl_cache_entry *p)
 
 		WARN_ON_ONCE(dir->d_sb->s_dev != stat.dev);
 		ino = stat.ino;
+	} else if (xinobits && !OVL_TYPE_UPPER(type)) {
+		ino = ovl_remap_lower_ino(ino, xinobits,
+					  ovl_layer_lower(this)->fsid,
+					  p->name, p->len);
 	}
 
 out:
@@ -618,6 +640,8 @@ struct ovl_readdir_translate {
 	struct ovl_dir_cache *cache;
 	struct dir_context ctx;
 	u64 parent_ino;
+	int fsid;
+	int xinobits;
 };
 
 static int ovl_fill_real(struct dir_context *ctx, const char *name,
@@ -628,14 +652,17 @@ static int ovl_fill_real(struct dir_context *ctx, const char *name,
 		container_of(ctx, struct ovl_readdir_translate, ctx);
 	struct dir_context *orig_ctx = rdt->orig_ctx;
 
-	if (rdt->parent_ino && strcmp(name, "..") == 0)
+	if (rdt->parent_ino && strcmp(name, "..") == 0) {
 		ino = rdt->parent_ino;
-	else if (rdt->cache) {
+	} else if (rdt->cache) {
 		struct ovl_cache_entry *p;
 
 		p = ovl_cache_entry_find(&rdt->cache->root, name, namelen);
 		if (p)
 			ino = p->ino;
+	} else if (rdt->xinobits) {
+		ino = ovl_remap_lower_ino(ino, rdt->xinobits, rdt->fsid,
+					  name, namelen);
 	}
 
 	return orig_ctx->actor(orig_ctx, name, namelen, offset, ino, d_type);
@@ -646,11 +673,16 @@ static int ovl_iterate_real(struct file *file, struct dir_context *ctx)
 	int err;
 	struct ovl_dir_file *od = file->private_data;
 	struct dentry *dir = file->f_path.dentry;
+	struct ovl_layer *lower_layer = ovl_layer_lower(dir);
 	struct ovl_readdir_translate rdt = {
 		.ctx.actor = ovl_fill_real,
 		.orig_ctx = ctx,
+		.xinobits = ovl_xino_bits(dir->d_sb),
 	};
 
+	if (rdt.xinobits && lower_layer)
+		rdt.fsid = lower_layer->fsid;
+
 	if (OVL_TYPE_MERGE(ovl_path_type(dir->d_parent))) {
 		struct kstat stat;
 		struct path statpath = file->f_path;
@@ -693,9 +725,10 @@ static int ovl_iterate(struct file *file, struct dir_context *ctx)
 		 * dir is impure then need to adjust d_ino for copied up
 		 * entries.
 		 */
-		if (ovl_same_sb(dentry->d_sb) &&
-		    (ovl_test_flag(OVL_IMPURE, d_inode(dentry)) ||
-		     OVL_TYPE_MERGE(ovl_path_type(dentry->d_parent)))) {
+		if (ovl_xino_bits(dentry->d_sb) ||
+		    (ovl_same_sb(dentry->d_sb) &&
+		     (ovl_test_flag(OVL_IMPURE, d_inode(dentry)) ||
+		      OVL_TYPE_MERGE(ovl_path_type(dentry->d_parent))))) {
 			return ovl_iterate_real(file, ctx);
 		}
 		return iterate_dir(od->realfile, ctx);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v9 6/7] ovl: add support for "xino" mount option
  2018-03-29 14:18 [PATCH v9 0/7] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
                   ` (4 preceding siblings ...)
  2018-03-29 14:18 ` [PATCH v9 5/7] ovl: consistent d_ino " Amir Goldstein
@ 2018-03-29 14:18 ` Amir Goldstein
  2018-03-29 15:28   ` Miklos Szeredi
  2018-04-25 14:49   ` J. R. Okajima
  2018-03-29 14:18 ` [PATCH v9 7/7] ovl: update documentation w.r.t "xino" feature Amir Goldstein
  2018-03-29 14:24 ` [PATCH v9 0/7] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
  7 siblings, 2 replies; 23+ messages in thread
From: Amir Goldstein @ 2018-03-29 14:18 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

With mount option "xino", mounter declares that there are enough
free high bits in underlying fs to hold the layer fsid.
If overlayfs does encounter underlying inodes using the high xino
bits reserved for layer fsid, a warning will be emitted and the original
inode number will be used.

The mount option name "xino" goes after a similar meaning mount option
of aufs, but in overlayfs case, the mapping is stateless.

An example for a use case of "xino" is when upper/lower is on an xfs
filesystem. xfs uses 64bit inode numbers, but it currently never uses the
upper 8bit for inode numbers exposed via stat(2) and that is not likely to
change in the future without user opting-in for a new xfs feature. The
actual number of unused upper bit is much larger and determined by the xfs
filesystem geometry (64 - agno_log - agblklog - inopblog). That means
that for all practical purpose, there are enough unused bits in xfs
inode numbers for more than OVL_MAX_STACK unique fsid's.

Another example for a use case of "xino" is when upper/lower is on tmpfs.
tmpfs inode numbers are allocated sequentially since boot, so they will
practially never use the high inode number bits.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/ovl_entry.h |  1 +
 fs/overlayfs/super.c     | 34 ++++++++++++++++++++++++++++++++--
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 6a077fb2a75f..e830470c77bd 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -18,6 +18,7 @@ struct ovl_config {
 	const char *redirect_mode;
 	bool index;
 	bool nfs_export;
+	bool xino;
 };
 
 struct ovl_sb {
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index d7284444f404..26a5db244081 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -352,6 +352,8 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
 	if (ofs->config.nfs_export != ovl_nfs_export_def)
 		seq_printf(m, ",nfs_export=%s", ofs->config.nfs_export ?
 						"on" : "off");
+	if (ofs->config.xino)
+		seq_puts(m, ",xino");
 	return 0;
 }
 
@@ -386,6 +388,7 @@ enum {
 	OPT_INDEX_OFF,
 	OPT_NFS_EXPORT_ON,
 	OPT_NFS_EXPORT_OFF,
+	OPT_XINO,
 	OPT_ERR,
 };
 
@@ -399,6 +402,7 @@ static const match_table_t ovl_tokens = {
 	{OPT_INDEX_OFF,			"index=off"},
 	{OPT_NFS_EXPORT_ON,		"nfs_export=on"},
 	{OPT_NFS_EXPORT_OFF,		"nfs_export=off"},
+	{OPT_XINO,			"xino"},
 	{OPT_ERR,			NULL}
 };
 
@@ -513,6 +517,10 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
 			config->nfs_export = false;
 			break;
 
+		case OPT_XINO:
+			config->xino = true;
+			break;
+
 		default:
 			pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p);
 			return -EINVAL;
@@ -1197,9 +1205,31 @@ static int ovl_get_lower_layers(struct ovl_fs *ofs, struct path *stack,
 		ofs->numlower++;
 	}
 
-	/* When all layers on same fs, overlay can use real inode numbers */
-	if (!ofs->numlowerfs || (ofs->numlowerfs == 1 && !ofs->upper_mnt))
+	/*
+	 * When all layers on same fs, overlay can use real inode numbers.
+	 * With mount option "xino", mounter declares that there are enough
+	 * free high bits in underlying fs to hold the unique fsid.
+	 * If overlayfs does encounter underlying inodes using the high xino
+	 * bits reserved for fsid, it emits a warning and uses the original
+	 * inode number.
+	 */
+	if (!ofs->numlowerfs || (ofs->numlowerfs == 1 && !ofs->upper_mnt)) {
 		ofs->xino_bits = 0;
+		ofs->config.xino = false;
+	} else if (ofs->config.xino && !ofs->xino_bits) {
+		/*
+		 * This is a roundup of number of bits needed for numlowerfs+1
+		 * (i.e. ilog2(numlowerfs+1 - 1) + 1). fsid 0 is reserved for
+		 * upper fs even with non upper overlay.
+		 */
+		BUILD_BUG_ON(ilog2(OVL_MAX_STACK) > 31);
+		ofs->xino_bits = ilog2(ofs->numlowerfs) + 1;
+	}
+
+	if (ofs->xino_bits) {
+		pr_info("overlayfs: \"xino\" feature enabled using %d upper inode bits.\n",
+			ofs->xino_bits);
+	}
 
 	err = 0;
 out:
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v9 7/7] ovl: update documentation w.r.t "xino" feature
  2018-03-29 14:18 [PATCH v9 0/7] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
                   ` (5 preceding siblings ...)
  2018-03-29 14:18 ` [PATCH v9 6/7] ovl: add support for "xino" mount option Amir Goldstein
@ 2018-03-29 14:18 ` Amir Goldstein
  2018-03-29 14:24 ` [PATCH v9 0/7] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
  7 siblings, 0 replies; 23+ messages in thread
From: Amir Goldstein @ 2018-03-29 14:18 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 Documentation/filesystems/overlayfs.txt | 39 ++++++++++++++++++++++++++++-----
 1 file changed, 33 insertions(+), 6 deletions(-)

diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
index 6ea1e64d1464..0cd4eab24899 100644
--- a/Documentation/filesystems/overlayfs.txt
+++ b/Documentation/filesystems/overlayfs.txt
@@ -14,9 +14,13 @@ The result will inevitably fail to look exactly like a normal
 filesystem for various technical reasons.  The expectation is that
 many use cases will be able to ignore these differences.
 
-This approach is 'hybrid' because the objects that appear in the
-filesystem do not all appear to belong to that filesystem.  In many
-cases an object accessed in the union will be indistinguishable
+
+Overlay objects
+---------------
+
+The overlay filesystem approach is 'hybrid', because the objects that
+appear in the filesystem do not always appear to belong to that filesystem.
+In many cases, an object accessed in the union will be indistinguishable
 from accessing the corresponding object from the original filesystem.
 This is most obvious from the 'st_dev' field returned by stat(2).
 
@@ -34,6 +38,19 @@ make the overlay mount more compliant with filesystem scanners and
 overlay objects will be distinguishable from the corresponding
 objects in the original filesystem.
 
+On 64bit systems, even if all overlay layers are not on the same
+underlying filesystem, the same compliant behavior could be achieved
+with the "xino" feature.  The "xino" feature composes a unique object
+identifier from the real object st_ino and an underlying fsid index.
+If all underlying filesystems support NFS file handles and export file
+handles with 32bit inode number encoding (e.g. ext4), overlay filesystem
+will use the high inode number bits for fsid.  Even when the underlying
+filesystem uses 64bit inode numbers, users can still enable the "xino"
+feature with the "-o xino" overlay mount option.  That is useful for the
+case of underlying filesystems like xfs and tmpfs, which use 64bit inode
+numbers, but are very unlikely to use the high inode number bit.
+
+
 Upper and Lower
 ---------------
 
@@ -290,10 +307,19 @@ Non-standard behavior
 ---------------------
 
 The copy_up operation essentially creates a new, identical file and
-moves it over to the old name.  The new file may be on a different
-filesystem, so both st_dev and st_ino of the file may change.
+moves it over to the old name.  Any open files referring to this inode
+will access the old data.
+
+The new file may be on a different filesystem, so both st_dev and st_ino
+of the real file may change.  The values of st_dev and st_ino returned by
+stat(2) on an overlay object are often not the same as the real file
+stat(2) values to prevent the values from changing on copy_up.
 
-Any open files referring to this inode will access the old data.
+Unless "xino" feature is enabled, when overlay layers are not all on the
+same underlying filesystem, the value of st_dev may be different for two
+non-directory objects in the same overlay filesystem and the value of
+st_ino for directory objects may be non persistent and could change even
+while the overlay filesystem is still mounted.
 
 Unless "inode index" feature is enabled, if a file with multiple hard
 links is copied up, then this will "break" the link.  Changes will not be
@@ -302,6 +328,7 @@ propagated to other names referring to the same inode.
 Unless "redirect_dir" feature is enabled, rename(2) on a lower or merged
 directory will fail with EXDEV.
 
+
 Changes to underlying filesystems
 ---------------------------------
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v9 0/7] Overlayfs: constant st_ino/d_ino for non-samefs
  2018-03-29 14:18 [PATCH v9 0/7] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
                   ` (6 preceding siblings ...)
  2018-03-29 14:18 ` [PATCH v9 7/7] ovl: update documentation w.r.t "xino" feature Amir Goldstein
@ 2018-03-29 14:24 ` Amir Goldstein
  7 siblings, 0 replies; 23+ messages in thread
From: Amir Goldstein @ 2018-03-29 14:24 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: overlayfs

On Thu, Mar 29, 2018 at 5:18 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> Miklos,
>
> This series provides a solution for some interesting non-samefs cases:
> - All the ext* family
> - Many other fs with default encode_fh
> - xfs and tmpfs with overlay "xino" mount option
>
> The patches are also available on my ovl-xino branch [2]. They are based
> on some earlier fix patches and they do not conflict with the NFS export
> optimization patches (ovl-nfs-export branch).
>
> I tested this with upstream overlay/nonsamefs xfstest group:
> - Tests pass for ext4
> - Tests fail for xfs
> - Tests pass for xfs with OVERLAY_MOUNT_OPTIONS=-oxino
>
> I also added --xino option to unionmount-testsuite [2], along with
> the --verify option, the test verifies constant st_ino and that all
> objects are on overlay st_dev.
>
> Changes since v8:
> - Use unique fsid instead of layer id
> - Assign pseudo_dev per fsid instead of per layer
> - Limit "xino" feature to 64bit systems
> - Assign xino value i_ino as well for NFSv3 readdir
> - Add "xino" documentation patch

And:
- Instead of -EOVERFLOW on stat(2) fallback to non-xino behavior

>
> Thanks,
> Amir.
>
> [1] https://github.com/amir73il/linux/commits/ovl-xino
> [2] https://github.com/amir73il/unionmount-testsuite/commits/overlayfs-devel
>
> Amir Goldstein (7):
>   ovl: factor out ovl_map_dev_ino() helper
>   ovl: allocate anon bdev per unique lower fs
>   ovl: constant st_ino for non-samefs with xino
>   ovl: consistent i_ino for non-samefs with xino
>   ovl: consistent d_ino for non-samefs with xino
>   ovl: add support for "xino" mount option
>   ovl: update documentation w.r.t "xino" feature
>
>  Documentation/filesystems/overlayfs.txt |  39 +++++++--
>  fs/overlayfs/export.c                   |   2 +-
>  fs/overlayfs/inode.c                    | 140 +++++++++++++++++++++-----------
>  fs/overlayfs/namei.c                    |   4 +-
>  fs/overlayfs/overlayfs.h                |   6 +-
>  fs/overlayfs/ovl_entry.h                |  21 +++--
>  fs/overlayfs/readdir.c                  |  45 ++++++++--
>  fs/overlayfs/super.c                    | 120 ++++++++++++++++++++++-----
>  fs/overlayfs/util.c                     |  38 ++++++++-
>  9 files changed, 318 insertions(+), 97 deletions(-)
>
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v9 6/7] ovl: add support for "xino" mount option
  2018-03-29 14:18 ` [PATCH v9 6/7] ovl: add support for "xino" mount option Amir Goldstein
@ 2018-03-29 15:28   ` Miklos Szeredi
  2018-03-29 16:37     ` Amir Goldstein
  2018-04-25 14:49   ` J. R. Okajima
  1 sibling, 1 reply; 23+ messages in thread
From: Miklos Szeredi @ 2018-03-29 15:28 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs

On Thu, Mar 29, 2018 at 4:18 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> With mount option "xino", mounter declares that there are enough
> free high bits in underlying fs to hold the layer fsid.
> If overlayfs does encounter underlying inodes using the high xino
> bits reserved for layer fsid, a warning will be emitted and the original
> inode number will be used.
>
> The mount option name "xino" goes after a similar meaning mount option
> of aufs, but in overlayfs case, the mapping is stateless.
>
> An example for a use case of "xino" is when upper/lower is on an xfs
> filesystem. xfs uses 64bit inode numbers, but it currently never uses the
> upper 8bit for inode numbers exposed via stat(2) and that is not likely to
> change in the future without user opting-in for a new xfs feature. The
> actual number of unused upper bit is much larger and determined by the xfs
> filesystem geometry (64 - agno_log - agblklog - inopblog). That means
> that for all practical purpose, there are enough unused bits in xfs
> inode numbers for more than OVL_MAX_STACK unique fsid's.
>
> Another example for a use case of "xino" is when upper/lower is on tmpfs.
> tmpfs inode numbers are allocated sequentially since boot, so they will
> practially never use the high inode number bits.
>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/overlayfs/ovl_entry.h |  1 +
>  fs/overlayfs/super.c     | 34 ++++++++++++++++++++++++++++++++--
>  2 files changed, 33 insertions(+), 2 deletions(-)
>
> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> index 6a077fb2a75f..e830470c77bd 100644
> --- a/fs/overlayfs/ovl_entry.h
> +++ b/fs/overlayfs/ovl_entry.h
> @@ -18,6 +18,7 @@ struct ovl_config {
>         const char *redirect_mode;
>         bool index;
>         bool nfs_export;
> +       bool xino;
>  };
>
>  struct ovl_sb {
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index d7284444f404..26a5db244081 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -352,6 +352,8 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
>         if (ofs->config.nfs_export != ovl_nfs_export_def)
>                 seq_printf(m, ",nfs_export=%s", ofs->config.nfs_export ?
>                                                 "on" : "off");
> +       if (ofs->config.xino)
> +               seq_puts(m, ",xino");
>         return 0;
>  }
>
> @@ -386,6 +388,7 @@ enum {
>         OPT_INDEX_OFF,
>         OPT_NFS_EXPORT_ON,
>         OPT_NFS_EXPORT_OFF,
> +       OPT_XINO,
>         OPT_ERR,
>  };
>
> @@ -399,6 +402,7 @@ static const match_table_t ovl_tokens = {
>         {OPT_INDEX_OFF,                 "index=off"},
>         {OPT_NFS_EXPORT_ON,             "nfs_export=on"},
>         {OPT_NFS_EXPORT_OFF,            "nfs_export=off"},
> +       {OPT_XINO,                      "xino"},
>         {OPT_ERR,                       NULL}
>  };
>
> @@ -513,6 +517,10 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
>                         config->nfs_export = false;
>                         break;
>
> +               case OPT_XINO:
> +                       config->xino = true;
> +                       break;
> +
>                 default:
>                         pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p);
>                         return -EINVAL;
> @@ -1197,9 +1205,31 @@ static int ovl_get_lower_layers(struct ovl_fs *ofs, struct path *stack,
>                 ofs->numlower++;
>         }
>
> -       /* When all layers on same fs, overlay can use real inode numbers */
> -       if (!ofs->numlowerfs || (ofs->numlowerfs == 1 && !ofs->upper_mnt))
> +       /*
> +        * When all layers on same fs, overlay can use real inode numbers.
> +        * With mount option "xino", mounter declares that there are enough
> +        * free high bits in underlying fs to hold the unique fsid.
> +        * If overlayfs does encounter underlying inodes using the high xino
> +        * bits reserved for fsid, it emits a warning and uses the original
> +        * inode number.
> +        */
> +       if (!ofs->numlowerfs || (ofs->numlowerfs == 1 && !ofs->upper_mnt)) {
>                 ofs->xino_bits = 0;
> +               ofs->config.xino = false;
> +       } else if (ofs->config.xino && !ofs->xino_bits) {
> +               /*
> +                * This is a roundup of number of bits needed for numlowerfs+1
> +                * (i.e. ilog2(numlowerfs+1 - 1) + 1). fsid 0 is reserved for
> +                * upper fs even with non upper overlay.
> +                */
> +               BUILD_BUG_ON(ilog2(OVL_MAX_STACK) > 31);
> +               ofs->xino_bits = ilog2(ofs->numlowerfs) + 1;

Shouldn't this be

  ilog2(ofs->numlowerfs + (ofs->upper_mnt ? 1 : 0))

?

Upper layer doesn't require a separate bit, just a separate fsid slot.

Thanks,
Miklos


> +       }
> +
> +       if (ofs->xino_bits) {
> +               pr_info("overlayfs: \"xino\" feature enabled using %d upper inode bits.\n",
> +                       ofs->xino_bits);
> +       }
>
>         err = 0;
>  out:
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v9 3/7] ovl: constant st_ino for non-samefs with xino
  2018-03-29 14:18 ` [PATCH v9 3/7] ovl: constant st_ino for non-samefs with xino Amir Goldstein
@ 2018-03-29 15:58   ` Miklos Szeredi
  2018-03-29 16:42     ` Amir Goldstein
  0 siblings, 1 reply; 23+ messages in thread
From: Miklos Szeredi @ 2018-03-29 15:58 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs

On Thu, Mar 29, 2018 at 4:18 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On 64bit systems, when overlay layers are not all on the same fs, but
> all inode numbers of underlying fs are not using the high bits, use the
> high bits to partition the overlay st_ino address space.  The high bits
> hold the fsid (upper fsid is 0).  This way overlay inode numbers are unique
> and all inodes use overlay st_dev.  Inode numbers are also persistent
> for a given layer configuration.
>
> Currently, our only indication for available high ino bits is from a
> filesystem that supports file handles and uses the default encode_fh()
> operation, which encodes a 32bit inode number.
>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/overlayfs/inode.c     | 31 +++++++++++++++++++++++++++++--
>  fs/overlayfs/overlayfs.h |  3 ++-
>  fs/overlayfs/ovl_entry.h |  2 ++
>  fs/overlayfs/super.c     | 26 ++++++++++++++++++++++----
>  fs/overlayfs/util.c      | 24 +++++++++++++++++++++---
>  5 files changed, 76 insertions(+), 10 deletions(-)
>
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 89dfab20fe0e..7fc9c83bf2ff 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -63,6 +63,7 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat)
>  {
>         struct ovl_layer *lower_layer = ovl_layer_lower(dentry);
>         bool samefs = ovl_same_sb(dentry->d_sb);
> +       int xinobits = ovl_xino_bits(dentry->d_sb);
>
>         if (samefs) {
>                 /*
> @@ -71,7 +72,31 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat)
>                  * which is friendly to du -x.
>                  */
>                 stat->dev = dentry->d_sb->s_dev;
> -       } else if (S_ISDIR(dentry->d_inode->i_mode)) {
> +               return 0;
> +       } else if (xinobits) {
> +               /*
> +                * All inode numbers of underlying fs should not be using the
> +                * high xinobits, so we use high xinobits to partition the
> +                * overlay st_ino address space. The high bits holds the fsid
> +                * (upper fsid is 0). This way overlay inode numbers are unique
> +                * and all inodes use overlay st_dev. Inode numbers are also
> +                * persistent for a given layer configuration.
> +                */
> +               if (stat->ino >> (64 - xinobits)) {
> +                       pr_warn_ratelimited("overlayfs: inode number too big (%pd2, ino=%llu, xinobits=%d)\n",
> +                                           dentry, stat->ino, xinobits);
> +               } else {
> +                       if (lower_layer) {
> +                               stat->ino |= ((u64)lower_layer->fsid)
> +                                            << (64 - xinobits);
> +                       }
> +                       stat->dev = dentry->d_sb->s_dev;
> +                       return 0;
> +               }
> +       }
> +
> +       /* The inode could not be mapped to a unified st_ino address space */
> +       if (S_ISDIR(dentry->d_inode->i_mode)) {
>                 /*
>                  * Always use the overlay st_dev for directories, so 'find
>                  * -xdev' will scan the entire overlay mount and won't cross the
> @@ -117,11 +142,13 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
>         /*
>          * For non-dir or same fs, we use st_ino of the copy up origin.
>          * This guaranties constant st_dev/st_ino across copy up.
> +        * With xino feature and non-samefs, we use st_ino of the copy up
> +        * origin masked with high bits that represent the layer id.
>          *
>          * If lower filesystem supports NFS file handles, this also guaranties
>          * persistent st_ino across mount cycle.
>          */
> -       if (!is_dir || samefs) {
> +       if (!is_dir || samefs || ovl_xino_bits(dentry->d_sb)) {
>                 if (OVL_TYPE_ORIGIN(type)) {
>                         struct kstat lowerstat;
>                         u32 lowermask = STATX_INO | (!is_dir ? STATX_NLINK : 0);
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 4432599ecbc6..09f5b8ce70aa 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -202,7 +202,8 @@ void ovl_drop_write(struct dentry *dentry);
>  struct dentry *ovl_workdir(struct dentry *dentry);
>  const struct cred *ovl_override_creds(struct super_block *sb);
>  struct super_block *ovl_same_sb(struct super_block *sb);
> -bool ovl_can_decode_fh(struct super_block *sb);
> +int ovl_xino_bits(struct super_block *sb);
> +int ovl_can_decode_fh(struct super_block *sb);
>  struct dentry *ovl_indexdir(struct super_block *sb);
>  bool ovl_index_all(struct super_block *sb);
>  bool ovl_verify_lower(struct super_block *sb);
> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> index e1c838c27a74..6a077fb2a75f 100644
> --- a/fs/overlayfs/ovl_entry.h
> +++ b/fs/overlayfs/ovl_entry.h
> @@ -63,6 +63,8 @@ struct ovl_fs {
>         /* Did we take the inuse lock? */
>         bool upperdir_locked;
>         bool workdir_locked;
> +       /* Inode numbers in all layers do not use the high xino_bits */
> +       int xino_bits;
>  };
>
>  /* private information held for every overlayfs dentry */
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 7d97d30cad39..d7284444f404 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -17,6 +17,7 @@
>  #include <linux/statfs.h>
>  #include <linux/seq_file.h>
>  #include <linux/posix_acl_xattr.h>
> +#include <linux/exportfs.h>
>  #include "overlayfs.h"
>
>  MODULE_AUTHOR("Miklos Szeredi <miklos@szeredi.hu>");
> @@ -701,6 +702,7 @@ static int ovl_check_namelen(struct path *path, struct ovl_fs *ofs,
>  static int ovl_lower_dir(const char *name, struct path *path,
>                          struct ovl_fs *ofs, int *stack_depth, bool *remote)
>  {
> +       int fh_type;
>         int err;
>
>         err = ovl_mount_dir_noesc(name, path);
> @@ -720,15 +722,19 @@ static int ovl_lower_dir(const char *name, struct path *path,
>          * The inodes index feature and NFS export need to encode and decode
>          * file handles, so they require that all layers support them.
>          */
> +       fh_type = ovl_can_decode_fh(path->dentry->d_sb);
>         if ((ofs->config.nfs_export ||
> -            (ofs->config.index && ofs->config.upperdir)) &&
> -           !ovl_can_decode_fh(path->dentry->d_sb)) {
> +            (ofs->config.index && ofs->config.upperdir)) && !fh_type) {
>                 ofs->config.index = false;
>                 ofs->config.nfs_export = false;
>                 pr_warn("overlayfs: fs on '%s' does not support file handles, falling back to index=off,nfs_export=off.\n",
>                         name);
>         }
>
> +       /* Check if lower fs has 32bit inode numbers */
> +       if (fh_type != FILEID_INO32_GEN)
> +               ofs->xino_bits = 0;
> +
>         return 0;
>
>  out_put:
> @@ -952,6 +958,7 @@ static int ovl_make_workdir(struct ovl_fs *ofs, struct path *workpath)
>  {
>         struct vfsmount *mnt = ofs->upper_mnt;
>         struct dentry *temp;
> +       int fh_type;
>         int err;
>
>         err = mnt_want_write(mnt);
> @@ -1001,12 +1008,16 @@ static int ovl_make_workdir(struct ovl_fs *ofs, struct path *workpath)
>         }
>
>         /* Check if upper/work fs supports file handles */
> -       if (ofs->config.index &&
> -           !ovl_can_decode_fh(ofs->workdir->d_sb)) {
> +       fh_type = ovl_can_decode_fh(ofs->workdir->d_sb);
> +       if (ofs->config.index && !fh_type) {
>                 ofs->config.index = false;
>                 pr_warn("overlayfs: upper fs does not support file handles, falling back to index=off.\n");
>         }
>
> +       /* Check if upper fs has 32bit inode numbers */
> +       if (fh_type != FILEID_INO32_GEN)
> +               ofs->xino_bits = 0;
> +
>         /* NFS export of r/w mount depends on index */
>         if (ofs->config.nfs_export && !ofs->config.index) {
>                 pr_warn("overlayfs: NFS export requires \"index=on\", falling back to nfs_export=off.\n");
> @@ -1185,6 +1196,11 @@ static int ovl_get_lower_layers(struct ovl_fs *ofs, struct path *stack,
>                 }
>                 ofs->numlower++;
>         }
> +
> +       /* When all layers on same fs, overlay can use real inode numbers */
> +       if (!ofs->numlowerfs || (ofs->numlowerfs == 1 && !ofs->upper_mnt))
> +               ofs->xino_bits = 0;
> +
>         err = 0;
>  out:
>         return err;
> @@ -1308,6 +1324,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>
>         sb->s_stack_depth = 0;
>         sb->s_maxbytes = MAX_LFS_FILESIZE;
> +       /* Assume underlaying fs uses 32bit inodes unless proven otherwise */
> +       ofs->xino_bits = BITS_PER_LONG - 32;

This disables xino for 32bit archs.  Which is probably the right thing
to do, otherwise there might be a regression in some cases since
kernel will return EOVERFLOW if st_ino would overflow. Well, this is
true for 32bit mode in 64bit kernel as well, so the above is not a
perfect solution.

Not sure if we need to worry.  For 32bit archs, I think disabling xino
is OK; it can be enabled explicitly if needed.  For 64bit archs, let's
hope it doesn't regress for anybody and if it does, we need to take
steps.

So if you agree, I'll just add these as a comment.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v9 6/7] ovl: add support for "xino" mount option
  2018-03-29 15:28   ` Miklos Szeredi
@ 2018-03-29 16:37     ` Amir Goldstein
  0 siblings, 0 replies; 23+ messages in thread
From: Amir Goldstein @ 2018-03-29 16:37 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: overlayfs

On Thu, Mar 29, 2018 at 6:28 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Thu, Mar 29, 2018 at 4:18 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>> With mount option "xino", mounter declares that there are enough
>> free high bits in underlying fs to hold the layer fsid.
>> If overlayfs does encounter underlying inodes using the high xino
>> bits reserved for layer fsid, a warning will be emitted and the original
>> inode number will be used.
>>
>> The mount option name "xino" goes after a similar meaning mount option
>> of aufs, but in overlayfs case, the mapping is stateless.
>>
>> An example for a use case of "xino" is when upper/lower is on an xfs
>> filesystem. xfs uses 64bit inode numbers, but it currently never uses the
>> upper 8bit for inode numbers exposed via stat(2) and that is not likely to
>> change in the future without user opting-in for a new xfs feature. The
>> actual number of unused upper bit is much larger and determined by the xfs
>> filesystem geometry (64 - agno_log - agblklog - inopblog). That means
>> that for all practical purpose, there are enough unused bits in xfs
>> inode numbers for more than OVL_MAX_STACK unique fsid's.
>>
>> Another example for a use case of "xino" is when upper/lower is on tmpfs.
>> tmpfs inode numbers are allocated sequentially since boot, so they will
>> practially never use the high inode number bits.
>>
>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>> ---
>>  fs/overlayfs/ovl_entry.h |  1 +
>>  fs/overlayfs/super.c     | 34 ++++++++++++++++++++++++++++++++--
>>  2 files changed, 33 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
>> index 6a077fb2a75f..e830470c77bd 100644
>> --- a/fs/overlayfs/ovl_entry.h
>> +++ b/fs/overlayfs/ovl_entry.h
>> @@ -18,6 +18,7 @@ struct ovl_config {
>>         const char *redirect_mode;
>>         bool index;
>>         bool nfs_export;
>> +       bool xino;
>>  };
>>
>>  struct ovl_sb {
>> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
>> index d7284444f404..26a5db244081 100644
>> --- a/fs/overlayfs/super.c
>> +++ b/fs/overlayfs/super.c
>> @@ -352,6 +352,8 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
>>         if (ofs->config.nfs_export != ovl_nfs_export_def)
>>                 seq_printf(m, ",nfs_export=%s", ofs->config.nfs_export ?
>>                                                 "on" : "off");
>> +       if (ofs->config.xino)
>> +               seq_puts(m, ",xino");
>>         return 0;
>>  }
>>
>> @@ -386,6 +388,7 @@ enum {
>>         OPT_INDEX_OFF,
>>         OPT_NFS_EXPORT_ON,
>>         OPT_NFS_EXPORT_OFF,
>> +       OPT_XINO,
>>         OPT_ERR,
>>  };
>>
>> @@ -399,6 +402,7 @@ static const match_table_t ovl_tokens = {
>>         {OPT_INDEX_OFF,                 "index=off"},
>>         {OPT_NFS_EXPORT_ON,             "nfs_export=on"},
>>         {OPT_NFS_EXPORT_OFF,            "nfs_export=off"},
>> +       {OPT_XINO,                      "xino"},
>>         {OPT_ERR,                       NULL}
>>  };
>>
>> @@ -513,6 +517,10 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
>>                         config->nfs_export = false;
>>                         break;
>>
>> +               case OPT_XINO:
>> +                       config->xino = true;
>> +                       break;
>> +
>>                 default:
>>                         pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p);
>>                         return -EINVAL;
>> @@ -1197,9 +1205,31 @@ static int ovl_get_lower_layers(struct ovl_fs *ofs, struct path *stack,
>>                 ofs->numlower++;
>>         }
>>
>> -       /* When all layers on same fs, overlay can use real inode numbers */
>> -       if (!ofs->numlowerfs || (ofs->numlowerfs == 1 && !ofs->upper_mnt))
>> +       /*
>> +        * When all layers on same fs, overlay can use real inode numbers.
>> +        * With mount option "xino", mounter declares that there are enough
>> +        * free high bits in underlying fs to hold the unique fsid.
>> +        * If overlayfs does encounter underlying inodes using the high xino
>> +        * bits reserved for fsid, it emits a warning and uses the original
>> +        * inode number.
>> +        */
>> +       if (!ofs->numlowerfs || (ofs->numlowerfs == 1 && !ofs->upper_mnt)) {
>>                 ofs->xino_bits = 0;
>> +               ofs->config.xino = false;
>> +       } else if (ofs->config.xino && !ofs->xino_bits) {
>> +               /*
>> +                * This is a roundup of number of bits needed for numlowerfs+1
>> +                * (i.e. ilog2(numlowerfs+1 - 1) + 1). fsid 0 is reserved for
>> +                * upper fs even with non upper overlay.
>> +                */
>> +               BUILD_BUG_ON(ilog2(OVL_MAX_STACK) > 31);
>> +               ofs->xino_bits = ilog2(ofs->numlowerfs) + 1;
>
> Shouldn't this be
>
>   ilog2(ofs->numlowerfs + (ofs->upper_mnt ? 1 : 0))
>
> ?
>
> Upper layer doesn't require a separate bit, just a separate fsid slot.
>

+1 is not for upper fs bit its for round up.
This is confusing hence the comment above.
ilog2(2^N+a) returns log2 or the "rounded down" value (i.e. N).
So for 2^N+a fsids we need N+1 bits.
The accurate expression is therefore:

 ilog2(ofs->numlowerfs + (ofs->upper_mnt ? 1 : 0) - 1) + 1

However, for simplicity, if there is no upper_mnt, first fsid is still 1
so I ommitted the condition and left with

 ilog2(ofs->numlowerfs + 1 - 1) + 1

I leave it to you as an exercise to see how hard it would be to get
rid of not reserving fsid 0 for upper fs (it makes reference into the
lower_fs array conditional on upper_mnt.
Maybe I just didn't try hard enough or wasn't creative enough.
Anyway, I did not think it was important not reserving 1 fsid for
non upper case.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v9 3/7] ovl: constant st_ino for non-samefs with xino
  2018-03-29 15:58   ` Miklos Szeredi
@ 2018-03-29 16:42     ` Amir Goldstein
  2018-03-29 18:26       ` Miklos Szeredi
  0 siblings, 1 reply; 23+ messages in thread
From: Amir Goldstein @ 2018-03-29 16:42 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: overlayfs

On Thu, Mar 29, 2018 at 6:58 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Thu, Mar 29, 2018 at 4:18 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>> On 64bit systems, when overlay layers are not all on the same fs, but
>> all inode numbers of underlying fs are not using the high bits, use the
>> high bits to partition the overlay st_ino address space.  The high bits
>> hold the fsid (upper fsid is 0).  This way overlay inode numbers are unique
>> and all inodes use overlay st_dev.  Inode numbers are also persistent
>> for a given layer configuration.
>>
>> Currently, our only indication for available high ino bits is from a
>> filesystem that supports file handles and uses the default encode_fh()
>> operation, which encodes a 32bit inode number.
>>
>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>> ---
>>  fs/overlayfs/inode.c     | 31 +++++++++++++++++++++++++++++--
>>  fs/overlayfs/overlayfs.h |  3 ++-
>>  fs/overlayfs/ovl_entry.h |  2 ++
>>  fs/overlayfs/super.c     | 26 ++++++++++++++++++++++----
>>  fs/overlayfs/util.c      | 24 +++++++++++++++++++++---
>>  5 files changed, 76 insertions(+), 10 deletions(-)
>>
>> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
>> index 89dfab20fe0e..7fc9c83bf2ff 100644
>> --- a/fs/overlayfs/inode.c
>> +++ b/fs/overlayfs/inode.c
>> @@ -63,6 +63,7 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat)
>>  {
>>         struct ovl_layer *lower_layer = ovl_layer_lower(dentry);
>>         bool samefs = ovl_same_sb(dentry->d_sb);
>> +       int xinobits = ovl_xino_bits(dentry->d_sb);
>>
>>         if (samefs) {
>>                 /*
>> @@ -71,7 +72,31 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat)
>>                  * which is friendly to du -x.
>>                  */
>>                 stat->dev = dentry->d_sb->s_dev;
>> -       } else if (S_ISDIR(dentry->d_inode->i_mode)) {
>> +               return 0;
>> +       } else if (xinobits) {
>> +               /*
>> +                * All inode numbers of underlying fs should not be using the
>> +                * high xinobits, so we use high xinobits to partition the
>> +                * overlay st_ino address space. The high bits holds the fsid
>> +                * (upper fsid is 0). This way overlay inode numbers are unique
>> +                * and all inodes use overlay st_dev. Inode numbers are also
>> +                * persistent for a given layer configuration.
>> +                */
>> +               if (stat->ino >> (64 - xinobits)) {
>> +                       pr_warn_ratelimited("overlayfs: inode number too big (%pd2, ino=%llu, xinobits=%d)\n",
>> +                                           dentry, stat->ino, xinobits);
>> +               } else {
>> +                       if (lower_layer) {
>> +                               stat->ino |= ((u64)lower_layer->fsid)
>> +                                            << (64 - xinobits);
>> +                       }
>> +                       stat->dev = dentry->d_sb->s_dev;
>> +                       return 0;
>> +               }
>> +       }
>> +
>> +       /* The inode could not be mapped to a unified st_ino address space */
>> +       if (S_ISDIR(dentry->d_inode->i_mode)) {
>>                 /*
>>                  * Always use the overlay st_dev for directories, so 'find
>>                  * -xdev' will scan the entire overlay mount and won't cross the
>> @@ -117,11 +142,13 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
>>         /*
>>          * For non-dir or same fs, we use st_ino of the copy up origin.
>>          * This guaranties constant st_dev/st_ino across copy up.
>> +        * With xino feature and non-samefs, we use st_ino of the copy up
>> +        * origin masked with high bits that represent the layer id.
>>          *
>>          * If lower filesystem supports NFS file handles, this also guaranties
>>          * persistent st_ino across mount cycle.
>>          */
>> -       if (!is_dir || samefs) {
>> +       if (!is_dir || samefs || ovl_xino_bits(dentry->d_sb)) {
>>                 if (OVL_TYPE_ORIGIN(type)) {
>>                         struct kstat lowerstat;
>>                         u32 lowermask = STATX_INO | (!is_dir ? STATX_NLINK : 0);
>> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
>> index 4432599ecbc6..09f5b8ce70aa 100644
>> --- a/fs/overlayfs/overlayfs.h
>> +++ b/fs/overlayfs/overlayfs.h
>> @@ -202,7 +202,8 @@ void ovl_drop_write(struct dentry *dentry);
>>  struct dentry *ovl_workdir(struct dentry *dentry);
>>  const struct cred *ovl_override_creds(struct super_block *sb);
>>  struct super_block *ovl_same_sb(struct super_block *sb);
>> -bool ovl_can_decode_fh(struct super_block *sb);
>> +int ovl_xino_bits(struct super_block *sb);
>> +int ovl_can_decode_fh(struct super_block *sb);
>>  struct dentry *ovl_indexdir(struct super_block *sb);
>>  bool ovl_index_all(struct super_block *sb);
>>  bool ovl_verify_lower(struct super_block *sb);
>> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
>> index e1c838c27a74..6a077fb2a75f 100644
>> --- a/fs/overlayfs/ovl_entry.h
>> +++ b/fs/overlayfs/ovl_entry.h
>> @@ -63,6 +63,8 @@ struct ovl_fs {
>>         /* Did we take the inuse lock? */
>>         bool upperdir_locked;
>>         bool workdir_locked;
>> +       /* Inode numbers in all layers do not use the high xino_bits */
>> +       int xino_bits;
>>  };
>>
>>  /* private information held for every overlayfs dentry */
>> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
>> index 7d97d30cad39..d7284444f404 100644
>> --- a/fs/overlayfs/super.c
>> +++ b/fs/overlayfs/super.c
>> @@ -17,6 +17,7 @@
>>  #include <linux/statfs.h>
>>  #include <linux/seq_file.h>
>>  #include <linux/posix_acl_xattr.h>
>> +#include <linux/exportfs.h>
>>  #include "overlayfs.h"
>>
>>  MODULE_AUTHOR("Miklos Szeredi <miklos@szeredi.hu>");
>> @@ -701,6 +702,7 @@ static int ovl_check_namelen(struct path *path, struct ovl_fs *ofs,
>>  static int ovl_lower_dir(const char *name, struct path *path,
>>                          struct ovl_fs *ofs, int *stack_depth, bool *remote)
>>  {
>> +       int fh_type;
>>         int err;
>>
>>         err = ovl_mount_dir_noesc(name, path);
>> @@ -720,15 +722,19 @@ static int ovl_lower_dir(const char *name, struct path *path,
>>          * The inodes index feature and NFS export need to encode and decode
>>          * file handles, so they require that all layers support them.
>>          */
>> +       fh_type = ovl_can_decode_fh(path->dentry->d_sb);
>>         if ((ofs->config.nfs_export ||
>> -            (ofs->config.index && ofs->config.upperdir)) &&
>> -           !ovl_can_decode_fh(path->dentry->d_sb)) {
>> +            (ofs->config.index && ofs->config.upperdir)) && !fh_type) {
>>                 ofs->config.index = false;
>>                 ofs->config.nfs_export = false;
>>                 pr_warn("overlayfs: fs on '%s' does not support file handles, falling back to index=off,nfs_export=off.\n",
>>                         name);
>>         }
>>
>> +       /* Check if lower fs has 32bit inode numbers */
>> +       if (fh_type != FILEID_INO32_GEN)
>> +               ofs->xino_bits = 0;
>> +
>>         return 0;
>>
>>  out_put:
>> @@ -952,6 +958,7 @@ static int ovl_make_workdir(struct ovl_fs *ofs, struct path *workpath)
>>  {
>>         struct vfsmount *mnt = ofs->upper_mnt;
>>         struct dentry *temp;
>> +       int fh_type;
>>         int err;
>>
>>         err = mnt_want_write(mnt);
>> @@ -1001,12 +1008,16 @@ static int ovl_make_workdir(struct ovl_fs *ofs, struct path *workpath)
>>         }
>>
>>         /* Check if upper/work fs supports file handles */
>> -       if (ofs->config.index &&
>> -           !ovl_can_decode_fh(ofs->workdir->d_sb)) {
>> +       fh_type = ovl_can_decode_fh(ofs->workdir->d_sb);
>> +       if (ofs->config.index && !fh_type) {
>>                 ofs->config.index = false;
>>                 pr_warn("overlayfs: upper fs does not support file handles, falling back to index=off.\n");
>>         }
>>
>> +       /* Check if upper fs has 32bit inode numbers */
>> +       if (fh_type != FILEID_INO32_GEN)
>> +               ofs->xino_bits = 0;
>> +
>>         /* NFS export of r/w mount depends on index */
>>         if (ofs->config.nfs_export && !ofs->config.index) {
>>                 pr_warn("overlayfs: NFS export requires \"index=on\", falling back to nfs_export=off.\n");
>> @@ -1185,6 +1196,11 @@ static int ovl_get_lower_layers(struct ovl_fs *ofs, struct path *stack,
>>                 }
>>                 ofs->numlower++;
>>         }
>> +
>> +       /* When all layers on same fs, overlay can use real inode numbers */
>> +       if (!ofs->numlowerfs || (ofs->numlowerfs == 1 && !ofs->upper_mnt))
>> +               ofs->xino_bits = 0;
>> +
>>         err = 0;
>>  out:
>>         return err;
>> @@ -1308,6 +1324,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>>
>>         sb->s_stack_depth = 0;
>>         sb->s_maxbytes = MAX_LFS_FILESIZE;
>> +       /* Assume underlaying fs uses 32bit inodes unless proven otherwise */
>> +       ofs->xino_bits = BITS_PER_LONG - 32;
>
> This disables xino for 32bit archs.  Which is probably the right thing
> to do, otherwise there might be a regression in some cases since
> kernel will return EOVERFLOW if st_ino would overflow. Well, this is
> true for 32bit mode in 64bit kernel as well, so the above is not a
> perfect solution.
>
> Not sure if we need to worry.  For 32bit archs, I think disabling xino
> is OK; it can be enabled explicitly if needed.  For 64bit archs, let's
> hope it doesn't regress for anybody and if it does, we need to take
> steps.

Perhaps, the steps would be to implement -o noxino.

>
> So if you agree, I'll just add these as a comment.
>

Ok.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v9 3/7] ovl: constant st_ino for non-samefs with xino
  2018-03-29 16:42     ` Amir Goldstein
@ 2018-03-29 18:26       ` Miklos Szeredi
  2018-03-29 19:02         ` Amir Goldstein
  0 siblings, 1 reply; 23+ messages in thread
From: Miklos Szeredi @ 2018-03-29 18:26 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs

On Thu, Mar 29, 2018 at 6:42 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Thu, Mar 29, 2018 at 6:58 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Thu, Mar 29, 2018 at 4:18 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>>> On 64bit systems, when overlay layers are not all on the same fs, but
>>> all inode numbers of underlying fs are not using the high bits, use the
>>> high bits to partition the overlay st_ino address space.  The high bits
>>> hold the fsid (upper fsid is 0).  This way overlay inode numbers are unique
>>> and all inodes use overlay st_dev.  Inode numbers are also persistent
>>> for a given layer configuration.
>>>
>>> Currently, our only indication for available high ino bits is from a
>>> filesystem that supports file handles and uses the default encode_fh()
>>> operation, which encodes a 32bit inode number.
>>>
>>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>>> ---
>>>  fs/overlayfs/inode.c     | 31 +++++++++++++++++++++++++++++--
>>>  fs/overlayfs/overlayfs.h |  3 ++-
>>>  fs/overlayfs/ovl_entry.h |  2 ++
>>>  fs/overlayfs/super.c     | 26 ++++++++++++++++++++++----
>>>  fs/overlayfs/util.c      | 24 +++++++++++++++++++++---
>>>  5 files changed, 76 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
>>> index 89dfab20fe0e..7fc9c83bf2ff 100644
>>> --- a/fs/overlayfs/inode.c
>>> +++ b/fs/overlayfs/inode.c
>>> @@ -63,6 +63,7 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat)
>>>  {
>>>         struct ovl_layer *lower_layer = ovl_layer_lower(dentry);
>>>         bool samefs = ovl_same_sb(dentry->d_sb);
>>> +       int xinobits = ovl_xino_bits(dentry->d_sb);
>>>
>>>         if (samefs) {
>>>                 /*
>>> @@ -71,7 +72,31 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat)
>>>                  * which is friendly to du -x.
>>>                  */
>>>                 stat->dev = dentry->d_sb->s_dev;
>>> -       } else if (S_ISDIR(dentry->d_inode->i_mode)) {
>>> +               return 0;
>>> +       } else if (xinobits) {
>>> +               /*
>>> +                * All inode numbers of underlying fs should not be using the
>>> +                * high xinobits, so we use high xinobits to partition the
>>> +                * overlay st_ino address space. The high bits holds the fsid
>>> +                * (upper fsid is 0). This way overlay inode numbers are unique
>>> +                * and all inodes use overlay st_dev. Inode numbers are also
>>> +                * persistent for a given layer configuration.
>>> +                */
>>> +               if (stat->ino >> (64 - xinobits)) {
>>> +                       pr_warn_ratelimited("overlayfs: inode number too big (%pd2, ino=%llu, xinobits=%d)\n",
>>> +                                           dentry, stat->ino, xinobits);
>>> +               } else {
>>> +                       if (lower_layer) {
>>> +                               stat->ino |= ((u64)lower_layer->fsid)
>>> +                                            << (64 - xinobits);
>>> +                       }
>>> +                       stat->dev = dentry->d_sb->s_dev;
>>> +                       return 0;
>>> +               }
>>> +       }
>>> +
>>> +       /* The inode could not be mapped to a unified st_ino address space */
>>> +       if (S_ISDIR(dentry->d_inode->i_mode)) {
>>>                 /*
>>>                  * Always use the overlay st_dev for directories, so 'find
>>>                  * -xdev' will scan the entire overlay mount and won't cross the
>>> @@ -117,11 +142,13 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
>>>         /*
>>>          * For non-dir or same fs, we use st_ino of the copy up origin.
>>>          * This guaranties constant st_dev/st_ino across copy up.
>>> +        * With xino feature and non-samefs, we use st_ino of the copy up
>>> +        * origin masked with high bits that represent the layer id.
>>>          *
>>>          * If lower filesystem supports NFS file handles, this also guaranties
>>>          * persistent st_ino across mount cycle.
>>>          */
>>> -       if (!is_dir || samefs) {
>>> +       if (!is_dir || samefs || ovl_xino_bits(dentry->d_sb)) {
>>>                 if (OVL_TYPE_ORIGIN(type)) {
>>>                         struct kstat lowerstat;
>>>                         u32 lowermask = STATX_INO | (!is_dir ? STATX_NLINK : 0);
>>> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
>>> index 4432599ecbc6..09f5b8ce70aa 100644
>>> --- a/fs/overlayfs/overlayfs.h
>>> +++ b/fs/overlayfs/overlayfs.h
>>> @@ -202,7 +202,8 @@ void ovl_drop_write(struct dentry *dentry);
>>>  struct dentry *ovl_workdir(struct dentry *dentry);
>>>  const struct cred *ovl_override_creds(struct super_block *sb);
>>>  struct super_block *ovl_same_sb(struct super_block *sb);
>>> -bool ovl_can_decode_fh(struct super_block *sb);
>>> +int ovl_xino_bits(struct super_block *sb);
>>> +int ovl_can_decode_fh(struct super_block *sb);
>>>  struct dentry *ovl_indexdir(struct super_block *sb);
>>>  bool ovl_index_all(struct super_block *sb);
>>>  bool ovl_verify_lower(struct super_block *sb);
>>> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
>>> index e1c838c27a74..6a077fb2a75f 100644
>>> --- a/fs/overlayfs/ovl_entry.h
>>> +++ b/fs/overlayfs/ovl_entry.h
>>> @@ -63,6 +63,8 @@ struct ovl_fs {
>>>         /* Did we take the inuse lock? */
>>>         bool upperdir_locked;
>>>         bool workdir_locked;
>>> +       /* Inode numbers in all layers do not use the high xino_bits */
>>> +       int xino_bits;
>>>  };
>>>
>>>  /* private information held for every overlayfs dentry */
>>> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
>>> index 7d97d30cad39..d7284444f404 100644
>>> --- a/fs/overlayfs/super.c
>>> +++ b/fs/overlayfs/super.c
>>> @@ -17,6 +17,7 @@
>>>  #include <linux/statfs.h>
>>>  #include <linux/seq_file.h>
>>>  #include <linux/posix_acl_xattr.h>
>>> +#include <linux/exportfs.h>
>>>  #include "overlayfs.h"
>>>
>>>  MODULE_AUTHOR("Miklos Szeredi <miklos@szeredi.hu>");
>>> @@ -701,6 +702,7 @@ static int ovl_check_namelen(struct path *path, struct ovl_fs *ofs,
>>>  static int ovl_lower_dir(const char *name, struct path *path,
>>>                          struct ovl_fs *ofs, int *stack_depth, bool *remote)
>>>  {
>>> +       int fh_type;
>>>         int err;
>>>
>>>         err = ovl_mount_dir_noesc(name, path);
>>> @@ -720,15 +722,19 @@ static int ovl_lower_dir(const char *name, struct path *path,
>>>          * The inodes index feature and NFS export need to encode and decode
>>>          * file handles, so they require that all layers support them.
>>>          */
>>> +       fh_type = ovl_can_decode_fh(path->dentry->d_sb);
>>>         if ((ofs->config.nfs_export ||
>>> -            (ofs->config.index && ofs->config.upperdir)) &&
>>> -           !ovl_can_decode_fh(path->dentry->d_sb)) {
>>> +            (ofs->config.index && ofs->config.upperdir)) && !fh_type) {
>>>                 ofs->config.index = false;
>>>                 ofs->config.nfs_export = false;
>>>                 pr_warn("overlayfs: fs on '%s' does not support file handles, falling back to index=off,nfs_export=off.\n",
>>>                         name);
>>>         }
>>>
>>> +       /* Check if lower fs has 32bit inode numbers */
>>> +       if (fh_type != FILEID_INO32_GEN)
>>> +               ofs->xino_bits = 0;
>>> +
>>>         return 0;
>>>
>>>  out_put:
>>> @@ -952,6 +958,7 @@ static int ovl_make_workdir(struct ovl_fs *ofs, struct path *workpath)
>>>  {
>>>         struct vfsmount *mnt = ofs->upper_mnt;
>>>         struct dentry *temp;
>>> +       int fh_type;
>>>         int err;
>>>
>>>         err = mnt_want_write(mnt);
>>> @@ -1001,12 +1008,16 @@ static int ovl_make_workdir(struct ovl_fs *ofs, struct path *workpath)
>>>         }
>>>
>>>         /* Check if upper/work fs supports file handles */
>>> -       if (ofs->config.index &&
>>> -           !ovl_can_decode_fh(ofs->workdir->d_sb)) {
>>> +       fh_type = ovl_can_decode_fh(ofs->workdir->d_sb);
>>> +       if (ofs->config.index && !fh_type) {
>>>                 ofs->config.index = false;
>>>                 pr_warn("overlayfs: upper fs does not support file handles, falling back to index=off.\n");
>>>         }
>>>
>>> +       /* Check if upper fs has 32bit inode numbers */
>>> +       if (fh_type != FILEID_INO32_GEN)
>>> +               ofs->xino_bits = 0;
>>> +
>>>         /* NFS export of r/w mount depends on index */
>>>         if (ofs->config.nfs_export && !ofs->config.index) {
>>>                 pr_warn("overlayfs: NFS export requires \"index=on\", falling back to nfs_export=off.\n");
>>> @@ -1185,6 +1196,11 @@ static int ovl_get_lower_layers(struct ovl_fs *ofs, struct path *stack,
>>>                 }
>>>                 ofs->numlower++;
>>>         }
>>> +
>>> +       /* When all layers on same fs, overlay can use real inode numbers */
>>> +       if (!ofs->numlowerfs || (ofs->numlowerfs == 1 && !ofs->upper_mnt))
>>> +               ofs->xino_bits = 0;
>>> +
>>>         err = 0;
>>>  out:
>>>         return err;
>>> @@ -1308,6 +1324,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>>>
>>>         sb->s_stack_depth = 0;
>>>         sb->s_maxbytes = MAX_LFS_FILESIZE;
>>> +       /* Assume underlaying fs uses 32bit inodes unless proven otherwise */
>>> +       ofs->xino_bits = BITS_PER_LONG - 32;
>>
>> This disables xino for 32bit archs.  Which is probably the right thing
>> to do, otherwise there might be a regression in some cases since
>> kernel will return EOVERFLOW if st_ino would overflow. Well, this is
>> true for 32bit mode in 64bit kernel as well, so the above is not a
>> perfect solution.
>>
>> Not sure if we need to worry.  For 32bit archs, I think disabling xino
>> is OK; it can be enabled explicitly if needed.  For 64bit archs, let's
>> hope it doesn't regress for anybody and if it does, we need to take
>> steps.
>
> Perhaps, the steps would be to implement -o noxino.

And that would need to be the default if selected by a kernel config
option.   Should we do that now, to prevent surprises?  Maybe we
should; it does't cost anything in complexity, and I think it's better
to let the user choose between backward compatibility and new feature.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v9 3/7] ovl: constant st_ino for non-samefs with xino
  2018-03-29 18:26       ` Miklos Szeredi
@ 2018-03-29 19:02         ` Amir Goldstein
  2018-03-29 19:41           ` Miklos Szeredi
  0 siblings, 1 reply; 23+ messages in thread
From: Amir Goldstein @ 2018-03-29 19:02 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: overlayfs

On Thu, Mar 29, 2018 at 9:26 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Thu, Mar 29, 2018 at 6:42 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>> On Thu, Mar 29, 2018 at 6:58 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>> On Thu, Mar 29, 2018 at 4:18 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>>>> On 64bit systems, when overlay layers are not all on the same fs, but
>>>> all inode numbers of underlying fs are not using the high bits, use the
>>>> high bits to partition the overlay st_ino address space.  The high bits
>>>> hold the fsid (upper fsid is 0).  This way overlay inode numbers are unique
>>>> and all inodes use overlay st_dev.  Inode numbers are also persistent
>>>> for a given layer configuration.
>>>>
>>>> Currently, our only indication for available high ino bits is from a
>>>> filesystem that supports file handles and uses the default encode_fh()
>>>> operation, which encodes a 32bit inode number.
>>>>
>>>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>>>> ---
>>>>  fs/overlayfs/inode.c     | 31 +++++++++++++++++++++++++++++--
>>>>  fs/overlayfs/overlayfs.h |  3 ++-
>>>>  fs/overlayfs/ovl_entry.h |  2 ++
>>>>  fs/overlayfs/super.c     | 26 ++++++++++++++++++++++----
>>>>  fs/overlayfs/util.c      | 24 +++++++++++++++++++++---
>>>>  5 files changed, 76 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
>>>> index 89dfab20fe0e..7fc9c83bf2ff 100644
>>>> --- a/fs/overlayfs/inode.c
>>>> +++ b/fs/overlayfs/inode.c
>>>> @@ -63,6 +63,7 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat)
>>>>  {
>>>>         struct ovl_layer *lower_layer = ovl_layer_lower(dentry);
>>>>         bool samefs = ovl_same_sb(dentry->d_sb);
>>>> +       int xinobits = ovl_xino_bits(dentry->d_sb);
>>>>
>>>>         if (samefs) {
>>>>                 /*
>>>> @@ -71,7 +72,31 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat)
>>>>                  * which is friendly to du -x.
>>>>                  */
>>>>                 stat->dev = dentry->d_sb->s_dev;
>>>> -       } else if (S_ISDIR(dentry->d_inode->i_mode)) {
>>>> +               return 0;
>>>> +       } else if (xinobits) {
>>>> +               /*
>>>> +                * All inode numbers of underlying fs should not be using the
>>>> +                * high xinobits, so we use high xinobits to partition the
>>>> +                * overlay st_ino address space. The high bits holds the fsid
>>>> +                * (upper fsid is 0). This way overlay inode numbers are unique
>>>> +                * and all inodes use overlay st_dev. Inode numbers are also
>>>> +                * persistent for a given layer configuration.
>>>> +                */
>>>> +               if (stat->ino >> (64 - xinobits)) {
>>>> +                       pr_warn_ratelimited("overlayfs: inode number too big (%pd2, ino=%llu, xinobits=%d)\n",
>>>> +                                           dentry, stat->ino, xinobits);
>>>> +               } else {
>>>> +                       if (lower_layer) {
>>>> +                               stat->ino |= ((u64)lower_layer->fsid)
>>>> +                                            << (64 - xinobits);
>>>> +                       }
>>>> +                       stat->dev = dentry->d_sb->s_dev;
>>>> +                       return 0;
>>>> +               }
>>>> +       }
>>>> +
>>>> +       /* The inode could not be mapped to a unified st_ino address space */
>>>> +       if (S_ISDIR(dentry->d_inode->i_mode)) {
>>>>                 /*
>>>>                  * Always use the overlay st_dev for directories, so 'find
>>>>                  * -xdev' will scan the entire overlay mount and won't cross the
>>>> @@ -117,11 +142,13 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
>>>>         /*
>>>>          * For non-dir or same fs, we use st_ino of the copy up origin.
>>>>          * This guaranties constant st_dev/st_ino across copy up.
>>>> +        * With xino feature and non-samefs, we use st_ino of the copy up
>>>> +        * origin masked with high bits that represent the layer id.
>>>>          *
>>>>          * If lower filesystem supports NFS file handles, this also guaranties
>>>>          * persistent st_ino across mount cycle.
>>>>          */
>>>> -       if (!is_dir || samefs) {
>>>> +       if (!is_dir || samefs || ovl_xino_bits(dentry->d_sb)) {
>>>>                 if (OVL_TYPE_ORIGIN(type)) {
>>>>                         struct kstat lowerstat;
>>>>                         u32 lowermask = STATX_INO | (!is_dir ? STATX_NLINK : 0);
>>>> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
>>>> index 4432599ecbc6..09f5b8ce70aa 100644
>>>> --- a/fs/overlayfs/overlayfs.h
>>>> +++ b/fs/overlayfs/overlayfs.h
>>>> @@ -202,7 +202,8 @@ void ovl_drop_write(struct dentry *dentry);
>>>>  struct dentry *ovl_workdir(struct dentry *dentry);
>>>>  const struct cred *ovl_override_creds(struct super_block *sb);
>>>>  struct super_block *ovl_same_sb(struct super_block *sb);
>>>> -bool ovl_can_decode_fh(struct super_block *sb);
>>>> +int ovl_xino_bits(struct super_block *sb);
>>>> +int ovl_can_decode_fh(struct super_block *sb);
>>>>  struct dentry *ovl_indexdir(struct super_block *sb);
>>>>  bool ovl_index_all(struct super_block *sb);
>>>>  bool ovl_verify_lower(struct super_block *sb);
>>>> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
>>>> index e1c838c27a74..6a077fb2a75f 100644
>>>> --- a/fs/overlayfs/ovl_entry.h
>>>> +++ b/fs/overlayfs/ovl_entry.h
>>>> @@ -63,6 +63,8 @@ struct ovl_fs {
>>>>         /* Did we take the inuse lock? */
>>>>         bool upperdir_locked;
>>>>         bool workdir_locked;
>>>> +       /* Inode numbers in all layers do not use the high xino_bits */
>>>> +       int xino_bits;
>>>>  };
>>>>
>>>>  /* private information held for every overlayfs dentry */
>>>> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
>>>> index 7d97d30cad39..d7284444f404 100644
>>>> --- a/fs/overlayfs/super.c
>>>> +++ b/fs/overlayfs/super.c
>>>> @@ -17,6 +17,7 @@
>>>>  #include <linux/statfs.h>
>>>>  #include <linux/seq_file.h>
>>>>  #include <linux/posix_acl_xattr.h>
>>>> +#include <linux/exportfs.h>
>>>>  #include "overlayfs.h"
>>>>
>>>>  MODULE_AUTHOR("Miklos Szeredi <miklos@szeredi.hu>");
>>>> @@ -701,6 +702,7 @@ static int ovl_check_namelen(struct path *path, struct ovl_fs *ofs,
>>>>  static int ovl_lower_dir(const char *name, struct path *path,
>>>>                          struct ovl_fs *ofs, int *stack_depth, bool *remote)
>>>>  {
>>>> +       int fh_type;
>>>>         int err;
>>>>
>>>>         err = ovl_mount_dir_noesc(name, path);
>>>> @@ -720,15 +722,19 @@ static int ovl_lower_dir(const char *name, struct path *path,
>>>>          * The inodes index feature and NFS export need to encode and decode
>>>>          * file handles, so they require that all layers support them.
>>>>          */
>>>> +       fh_type = ovl_can_decode_fh(path->dentry->d_sb);
>>>>         if ((ofs->config.nfs_export ||
>>>> -            (ofs->config.index && ofs->config.upperdir)) &&
>>>> -           !ovl_can_decode_fh(path->dentry->d_sb)) {
>>>> +            (ofs->config.index && ofs->config.upperdir)) && !fh_type) {
>>>>                 ofs->config.index = false;
>>>>                 ofs->config.nfs_export = false;
>>>>                 pr_warn("overlayfs: fs on '%s' does not support file handles, falling back to index=off,nfs_export=off.\n",
>>>>                         name);
>>>>         }
>>>>
>>>> +       /* Check if lower fs has 32bit inode numbers */
>>>> +       if (fh_type != FILEID_INO32_GEN)
>>>> +               ofs->xino_bits = 0;
>>>> +
>>>>         return 0;
>>>>
>>>>  out_put:
>>>> @@ -952,6 +958,7 @@ static int ovl_make_workdir(struct ovl_fs *ofs, struct path *workpath)
>>>>  {
>>>>         struct vfsmount *mnt = ofs->upper_mnt;
>>>>         struct dentry *temp;
>>>> +       int fh_type;
>>>>         int err;
>>>>
>>>>         err = mnt_want_write(mnt);
>>>> @@ -1001,12 +1008,16 @@ static int ovl_make_workdir(struct ovl_fs *ofs, struct path *workpath)
>>>>         }
>>>>
>>>>         /* Check if upper/work fs supports file handles */
>>>> -       if (ofs->config.index &&
>>>> -           !ovl_can_decode_fh(ofs->workdir->d_sb)) {
>>>> +       fh_type = ovl_can_decode_fh(ofs->workdir->d_sb);
>>>> +       if (ofs->config.index && !fh_type) {
>>>>                 ofs->config.index = false;
>>>>                 pr_warn("overlayfs: upper fs does not support file handles, falling back to index=off.\n");
>>>>         }
>>>>
>>>> +       /* Check if upper fs has 32bit inode numbers */
>>>> +       if (fh_type != FILEID_INO32_GEN)
>>>> +               ofs->xino_bits = 0;
>>>> +
>>>>         /* NFS export of r/w mount depends on index */
>>>>         if (ofs->config.nfs_export && !ofs->config.index) {
>>>>                 pr_warn("overlayfs: NFS export requires \"index=on\", falling back to nfs_export=off.\n");
>>>> @@ -1185,6 +1196,11 @@ static int ovl_get_lower_layers(struct ovl_fs *ofs, struct path *stack,
>>>>                 }
>>>>                 ofs->numlower++;
>>>>         }
>>>> +
>>>> +       /* When all layers on same fs, overlay can use real inode numbers */
>>>> +       if (!ofs->numlowerfs || (ofs->numlowerfs == 1 && !ofs->upper_mnt))
>>>> +               ofs->xino_bits = 0;
>>>> +
>>>>         err = 0;
>>>>  out:
>>>>         return err;
>>>> @@ -1308,6 +1324,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>>>>
>>>>         sb->s_stack_depth = 0;
>>>>         sb->s_maxbytes = MAX_LFS_FILESIZE;
>>>> +       /* Assume underlaying fs uses 32bit inodes unless proven otherwise */
>>>> +       ofs->xino_bits = BITS_PER_LONG - 32;
>>>
>>> This disables xino for 32bit archs.  Which is probably the right thing
>>> to do, otherwise there might be a regression in some cases since
>>> kernel will return EOVERFLOW if st_ino would overflow. Well, this is
>>> true for 32bit mode in 64bit kernel as well, so the above is not a
>>> perfect solution.
>>>
>>> Not sure if we need to worry.  For 32bit archs, I think disabling xino
>>> is OK; it can be enabled explicitly if needed.  For 64bit archs, let's
>>> hope it doesn't regress for anybody and if it does, we need to take
>>> steps.
>>
>> Perhaps, the steps would be to implement -o noxino.
>
> And that would need to be the default if selected by a kernel config
> option.   Should we do that now, to prevent surprises?  Maybe we
> should; it does't cost anything in complexity, and I think it's better
> to let the user choose between backward compatibility and new feature.
>

So you mean same as other features?

OVERLAY_FS_XINO defaults to n
and mount options xino=on/off.

Makes sense.
I can do that.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v9 3/7] ovl: constant st_ino for non-samefs with xino
  2018-03-29 19:02         ` Amir Goldstein
@ 2018-03-29 19:41           ` Miklos Szeredi
  2018-03-29 19:49             ` Amir Goldstein
  0 siblings, 1 reply; 23+ messages in thread
From: Miklos Szeredi @ 2018-03-29 19:41 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs

On Thu, Mar 29, 2018 at 9:02 PM, Amir Goldstein <amir73il@gmail.com> wrote:

> So you mean same as other features?
>
> OVERLAY_FS_XINO defaults to n
> and mount options xino=on/off.

Yes.

> Makes sense.
> I can do that.

Okay, thanks,

Miklos

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v9 3/7] ovl: constant st_ino for non-samefs with xino
  2018-03-29 19:41           ` Miklos Szeredi
@ 2018-03-29 19:49             ` Amir Goldstein
  2018-03-29 20:04               ` Miklos Szeredi
  0 siblings, 1 reply; 23+ messages in thread
From: Amir Goldstein @ 2018-03-29 19:49 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: overlayfs

On Thu, Mar 29, 2018 at 10:41 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Thu, Mar 29, 2018 at 9:02 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>
>> So you mean same as other features?
>>
>> OVERLAY_FS_XINO defaults to n
>> and mount options xino=on/off.
>
> Yes.
>

Wait a minute, but if OVERLAY_FS_XINO=y it doesn't mean
that all overlay mounts enforce xino, does it?
I think it should be something like
OVERLAY_FS_XINO_AUTO to determine only if the auto enabling
of xino (i.e. for 32bit file handle encoding) is enabled.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v9 3/7] ovl: constant st_ino for non-samefs with xino
  2018-03-29 19:49             ` Amir Goldstein
@ 2018-03-29 20:04               ` Miklos Szeredi
  2018-03-29 22:10                 ` Amir Goldstein
  0 siblings, 1 reply; 23+ messages in thread
From: Miklos Szeredi @ 2018-03-29 20:04 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs

On Thu, Mar 29, 2018 at 9:49 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Thu, Mar 29, 2018 at 10:41 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Thu, Mar 29, 2018 at 9:02 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>>
>>> So you mean same as other features?
>>>
>>> OVERLAY_FS_XINO defaults to n
>>> and mount options xino=on/off.
>>
>> Yes.
>>
>
> Wait a minute, but if OVERLAY_FS_XINO=y it doesn't mean
> that all overlay mounts enforce xino, does it?

Right.

> I think it should be something like
> OVERLAY_FS_XINO_AUTO to determine only if the auto enabling
> of xino (i.e. for 32bit file handle encoding) is enabled.

Okay.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v9 3/7] ovl: constant st_ino for non-samefs with xino
  2018-03-29 20:04               ` Miklos Szeredi
@ 2018-03-29 22:10                 ` Amir Goldstein
  0 siblings, 0 replies; 23+ messages in thread
From: Amir Goldstein @ 2018-03-29 22:10 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: overlayfs

On Thu, Mar 29, 2018 at 11:04 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Thu, Mar 29, 2018 at 9:49 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>> On Thu, Mar 29, 2018 at 10:41 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>> On Thu, Mar 29, 2018 at 9:02 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>>>
>>>> So you mean same as other features?
>>>>
>>>> OVERLAY_FS_XINO defaults to n
>>>> and mount options xino=on/off.
>>>
>>> Yes.
>>>
>>
>> Wait a minute, but if OVERLAY_FS_XINO=y it doesn't mean
>> that all overlay mounts enforce xino, does it?
>
> Right.
>
>> I think it should be something like
>> OVERLAY_FS_XINO_AUTO to determine only if the auto enabling
>> of xino (i.e. for 32bit file handle encoding) is enabled.
>

Pushed with this change to ovl-xino-v10

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v9 6/7] ovl: add support for "xino" mount option
  2018-03-29 14:18 ` [PATCH v9 6/7] ovl: add support for "xino" mount option Amir Goldstein
  2018-03-29 15:28   ` Miklos Szeredi
@ 2018-04-25 14:49   ` J. R. Okajima
  2018-04-25 15:00     ` Amir Goldstein
  1 sibling, 1 reply; 23+ messages in thread
From: J. R. Okajima @ 2018-04-25 14:49 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, linux-unionfs

Amir Goldstein:
> The mount option name "xino" goes after a similar meaning mount option
> of aufs, but in overlayfs case, the mapping is stateless.

The name "xino" is really confusing to me.
I know what aufs xino is, and also understand what problem overlayfs
wants to solve. For those who knows these two points,
- the same name with totaly different implementation
- very similar option usage
is very bad.

I never say that "xino" name is patented and only aufs can use it, but
I'd ask you to change the name to help (future) users from confusing.


J. R. Okajima

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v9 6/7] ovl: add support for "xino" mount option
  2018-04-25 14:49   ` J. R. Okajima
@ 2018-04-25 15:00     ` Amir Goldstein
  2018-04-25 15:26       ` J. R. Okajima
  0 siblings, 1 reply; 23+ messages in thread
From: Amir Goldstein @ 2018-04-25 15:00 UTC (permalink / raw)
  To: J. R. Okajima; +Cc: Miklos Szeredi, overlayfs

On Wed, Apr 25, 2018 at 7:49 AM, J. R. Okajima <hooanon05g@gmail.com> wrote:
> Amir Goldstein:
>> The mount option name "xino" goes after a similar meaning mount option
>> of aufs, but in overlayfs case, the mapping is stateless.
>
> The name "xino" is really confusing to me.
> I know what aufs xino is, and also understand what problem overlayfs
> wants to solve. For those who knows these two points,
> - the same name with totaly different implementation
> - very similar option usage
> is very bad.
>
> I never say that "xino" name is patented and only aufs can use it, but
> I'd ask you to change the name to help (future) users from confusing.
>

I certainly do not want to be confusing users, but why should users
care about implementation at all? Users that request a feature only
care about the outcome, which is persistent and consistent inode
numbers.

>From aufs documentation, 'xino' is described:
"Use external inode number bitmap and translation table"
The meaning of the feature is exactly the same in overlayfs
from user perspective, besides the fact that there is not actually
an external file to do the translation table. But that is a minor
technical detail that users should not be concerned with.

In my opinion, choosing the same name for the same outcome
is quite the opposite of confusing users.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v9 6/7] ovl: add support for "xino" mount option
  2018-04-25 15:00     ` Amir Goldstein
@ 2018-04-25 15:26       ` J. R. Okajima
  2018-04-25 15:35         ` Amir Goldstein
  0 siblings, 1 reply; 23+ messages in thread
From: J. R. Okajima @ 2018-04-25 15:26 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, overlayfs

Amir Goldstein:
> In my opinion, choosing the same name for the same outcome
> is quite the opposite of confusing users.

If the syntax (or the meaning) of the option was fully compatible, you
might be right.

- aufs: xino=<path> or noxino
- overlayfs: xino=on|off

I am afraid its confusing.


J. R. Okajima

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v9 6/7] ovl: add support for "xino" mount option
  2018-04-25 15:26       ` J. R. Okajima
@ 2018-04-25 15:35         ` Amir Goldstein
  0 siblings, 0 replies; 23+ messages in thread
From: Amir Goldstein @ 2018-04-25 15:35 UTC (permalink / raw)
  To: J. R. Okajima; +Cc: Miklos Szeredi, overlayfs

On Wed, Apr 25, 2018 at 8:26 AM, J. R. Okajima <hooanon05g@gmail.com> wrote:
> Amir Goldstein:
>> In my opinion, choosing the same name for the same outcome
>> is quite the opposite of confusing users.
>
> If the syntax (or the meaning) of the option was fully compatible, you
> might be right.
>
> - aufs: xino=<path> or noxino
> - overlayfs: xino=on|off
>
> I am afraid its confusing.
>

Ok, let's see if users complain.
I have no objection to also support the syntax
xino/noxino in addition to xino=on|off|auto

It's up to Miklos.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2018-04-25 15:35 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-29 14:18 [PATCH v9 0/7] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein
2018-03-29 14:18 ` [PATCH v9 1/7] ovl: factor out ovl_map_dev_ino() helper Amir Goldstein
2018-03-29 14:18 ` [PATCH v9 2/7] ovl: allocate anon bdev per unique lower fs Amir Goldstein
2018-03-29 14:18 ` [PATCH v9 3/7] ovl: constant st_ino for non-samefs with xino Amir Goldstein
2018-03-29 15:58   ` Miklos Szeredi
2018-03-29 16:42     ` Amir Goldstein
2018-03-29 18:26       ` Miklos Szeredi
2018-03-29 19:02         ` Amir Goldstein
2018-03-29 19:41           ` Miklos Szeredi
2018-03-29 19:49             ` Amir Goldstein
2018-03-29 20:04               ` Miklos Szeredi
2018-03-29 22:10                 ` Amir Goldstein
2018-03-29 14:18 ` [PATCH v9 4/7] ovl: consistent i_ino " Amir Goldstein
2018-03-29 14:18 ` [PATCH v9 5/7] ovl: consistent d_ino " Amir Goldstein
2018-03-29 14:18 ` [PATCH v9 6/7] ovl: add support for "xino" mount option Amir Goldstein
2018-03-29 15:28   ` Miklos Szeredi
2018-03-29 16:37     ` Amir Goldstein
2018-04-25 14:49   ` J. R. Okajima
2018-04-25 15:00     ` Amir Goldstein
2018-04-25 15:26       ` J. R. Okajima
2018-04-25 15:35         ` Amir Goldstein
2018-03-29 14:18 ` [PATCH v9 7/7] ovl: update documentation w.r.t "xino" feature Amir Goldstein
2018-03-29 14:24 ` [PATCH v9 0/7] Overlayfs: constant st_ino/d_ino for non-samefs Amir Goldstein

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.