linux-unionfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/7] Sort out overlay ino numbers
@ 2020-01-01 17:58 Amir Goldstein
  2020-01-01 17:58 ` [PATCH 1/7] ovl: fix value of i_ino for lower hardlink corner case Amir Goldstein
                   ` (6 more replies)
  0 siblings, 7 replies; 14+ messages in thread
From: Amir Goldstein @ 2020-01-01 17:58 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Miklos,

Most of this series is sorting out internal overlayfs messy code
having to do with i_ino.

The value of i_ino was very inconsistent in different setups, so this
series also sorts this out and adds documentation of expected values
for st_ino/d_ino/i_ino in different setups.

Patch #5 fixes a potential inode number collision bug (st_ino;st_dev).
It happens with the "less tested" case of xino bits overflow.
I have recently posted [1] some xfstests which also test overlay i_ino.
They require a /proc/locks fix that was merged to v5.5-rc4.
The test overlay/071 also tests a case of xino bits overflow.

Patch #6 includes a change of behavior, which auto enables xino for
tmpfs/xfs.

The series is available on branch ovl-ino on my github [2] and depends
on the previously posted ovl-layers pathces [3].

[1] https://lore.kernel.org/fstests/20191230141423.31695-1-amir73il@gmail.com
[2] https://github.com/amir73il/linux/commits/ovl-ino
[3] https://marc.info/?l=linux-unionfs&m=157700209100564&w=2

Amir Goldstein (7):
  ovl: fix value of i_ino for lower hardlink corner case
  ovl: fix out of date comment and unreachable code
  ovl: factor out helper ovl_get_root()
  ovl: simplify i_ino initialization
  ovl: avoid possible inode number collisions with xino=on
  ovl: enable xino automatically in more cases
  ovl: document xino expected behavior

 Documentation/filesystems/overlayfs.rst |  38 ++++++++-
 fs/overlayfs/inode.c                    | 101 ++++++++++++++++++------
 fs/overlayfs/overlayfs.h                |  21 ++++-
 fs/overlayfs/readdir.c                  |  17 ++--
 fs/overlayfs/super.c                    |  83 ++++++++++++-------
 fs/overlayfs/util.c                     |  20 -----
 6 files changed, 198 insertions(+), 82 deletions(-)

-- 
2.17.1

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/7] ovl: fix value of i_ino for lower hardlink corner case
  2020-01-01 17:58 [PATCH 0/7] Sort out overlay ino numbers Amir Goldstein
@ 2020-01-01 17:58 ` Amir Goldstein
  2020-01-01 17:58 ` [PATCH 2/7] ovl: fix out of date comment and unreachable code Amir Goldstein
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Amir Goldstein @ 2020-01-01 17:58 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Commit 6dde1e42f497 ("ovl: make i_ino consistent with st_ino in more
cases"), relaxed the condition nfs_export=on in order to set the value
of i_ino to xino map of real ino.

Specifically, it also relaxed the pre-condition that index=on for
consistent i_ino. This opened the corner case of lower hardlink in
ovl_get_inode(), which calls ovl_fill_inode() with ino=0 and then
ovl_init_inode() is called to set i_ino to lower real ino without
the xino mapping.

Pass the correct values of ino;fsid in this case to ovl_fill_inode(),
so it can initialize i_ino correctly.

Fixes: 6dde1e42f497 ("ovl: make i_ino consistent with st_ino in more ...")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 3afae2e2d0ea..9d2fff7d747d 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -891,7 +891,7 @@ struct inode *ovl_get_inode(struct super_block *sb,
 	struct dentry *lowerdentry = lowerpath ? lowerpath->dentry : NULL;
 	bool bylower = ovl_hash_bylower(sb, upperdentry, lowerdentry,
 					oip->index);
-	int fsid = bylower ? oip->lowerpath->layer->fsid : 0;
+	int fsid = bylower ? lowerpath->layer->fsid : 0;
 	bool is_dir, metacopy = false;
 	unsigned long ino = 0;
 	int err = oip->newinode ? -EEXIST : -ENOMEM;
@@ -941,6 +941,8 @@ struct inode *ovl_get_inode(struct super_block *sb,
 			err = -ENOMEM;
 			goto out_err;
 		}
+		ino = realinode->i_ino;
+		fsid = lowerpath->layer->fsid;
 	}
 	ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev, ino, fsid);
 	ovl_inode_init(inode, upperdentry, lowerdentry, oip->lowerdata);
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/7] ovl: fix out of date comment and unreachable code
  2020-01-01 17:58 [PATCH 0/7] Sort out overlay ino numbers Amir Goldstein
  2020-01-01 17:58 ` [PATCH 1/7] ovl: fix value of i_ino for lower hardlink corner case Amir Goldstein
@ 2020-01-01 17:58 ` Amir Goldstein
  2020-01-01 17:58 ` [PATCH 3/7] ovl: factor out helper ovl_get_root() Amir Goldstein
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Amir Goldstein @ 2020-01-01 17:58 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

ovl_inode_update() is no longer called from create object code path.

Fixes: 01b39dcc9568 ("ovl: use inode_insert5() to hash a newly...")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c | 8 +++++---
 fs/overlayfs/util.c  | 2 --
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 9d2fff7d747d..9ed94c70e3cb 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -571,9 +571,11 @@ static void ovl_fill_inode(struct inode *inode, umode_t mode, dev_t rdev,
 	 * bits to encode layer), set the same value used for st_ino to i_ino,
 	 * so inode number exposed via /proc/locks and a like will be
 	 * consistent with d_ino and st_ino values. An i_ino value inconsistent
-	 * with d_ino also causes nfsd readdirplus to fail.  When called from
-	 * ovl_new_inode(), ino arg is 0, so i_ino will be updated to real
-	 * upper inode i_ino on ovl_inode_init() or ovl_inode_update().
+	 * with d_ino also causes nfsd readdirplus to fail.
+	 *
+	 * When called from ovl_create_object() => ovl_new_inode(), with
+	 * ino = 0, i_ino will be updated to consistent value later on in
+	 * ovl_get_inode() => ovl_fill_inode().
 	 */
 	if (ovl_same_dev(inode->i_sb)) {
 		inode->i_ino = ino;
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 256f166b4a17..b28ccede1da9 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -418,8 +418,6 @@ void ovl_inode_update(struct inode *inode, struct dentry *upperdentry)
 	smp_wmb();
 	OVL_I(inode)->__upperdentry = upperdentry;
 	if (inode_unhashed(inode)) {
-		if (!inode->i_ino)
-			inode->i_ino = upperinode->i_ino;
 		inode->i_private = upperinode;
 		__insert_inode_hash(inode, (unsigned long) upperinode);
 	}
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/7] ovl: factor out helper ovl_get_root()
  2020-01-01 17:58 [PATCH 0/7] Sort out overlay ino numbers Amir Goldstein
  2020-01-01 17:58 ` [PATCH 1/7] ovl: fix value of i_ino for lower hardlink corner case Amir Goldstein
  2020-01-01 17:58 ` [PATCH 2/7] ovl: fix out of date comment and unreachable code Amir Goldstein
@ 2020-01-01 17:58 ` Amir Goldstein
  2020-01-01 17:58 ` [PATCH 4/7] ovl: simplify i_ino initialization Amir Goldstein
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Amir Goldstein @ 2020-01-01 17:58 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Allocates and initializes the root dentry and inode.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/super.c | 44 +++++++++++++++++++++++++++++---------------
 1 file changed, 29 insertions(+), 15 deletions(-)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 2afa60ab9133..18db065d73b3 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1581,6 +1581,34 @@ static int ovl_check_overlapping_layers(struct super_block *sb,
 	return 0;
 }
 
+static struct dentry *ovl_get_root(struct super_block *sb,
+				   struct dentry *upperdentry,
+				   struct ovl_entry *oe)
+{
+	struct dentry *root;
+
+	root = d_make_root(ovl_new_inode(sb, S_IFDIR, 0));
+	if (!root)
+		return NULL;
+
+	root->d_fsdata = oe;
+
+	if (upperdentry) {
+		ovl_dentry_set_upper_alias(root);
+		if (ovl_is_impuredir(upperdentry))
+			ovl_set_flag(OVL_IMPURE, d_inode(root));
+	}
+
+	/* Root is always merge -> can have whiteouts */
+	ovl_set_flag(OVL_WHITEOUTS, d_inode(root));
+	ovl_dentry_set_flag(OVL_E_CONNECTED, root);
+	ovl_set_upperdata(d_inode(root));
+	ovl_inode_init(d_inode(root), upperdentry, ovl_dentry_lower(root),
+		       NULL);
+
+	return root;
+}
+
 static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 {
 	struct path upperpath = { };
@@ -1697,25 +1725,11 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	sb->s_flags |= SB_POSIXACL;
 
 	err = -ENOMEM;
-	root_dentry = d_make_root(ovl_new_inode(sb, S_IFDIR, 0));
+	root_dentry = ovl_get_root(sb, upperpath.dentry, oe);
 	if (!root_dentry)
 		goto out_free_oe;
 
-	root_dentry->d_fsdata = oe;
-
 	mntput(upperpath.mnt);
-	if (upperpath.dentry) {
-		ovl_dentry_set_upper_alias(root_dentry);
-		if (ovl_is_impuredir(upperpath.dentry))
-			ovl_set_flag(OVL_IMPURE, d_inode(root_dentry));
-	}
-
-	/* Root is always merge -> can have whiteouts */
-	ovl_set_flag(OVL_WHITEOUTS, d_inode(root_dentry));
-	ovl_dentry_set_flag(OVL_E_CONNECTED, root_dentry);
-	ovl_set_upperdata(d_inode(root_dentry));
-	ovl_inode_init(d_inode(root_dentry), upperpath.dentry,
-		       ovl_dentry_lower(root_dentry), NULL);
 
 	sb->s_root = root_dentry;
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/7] ovl: simplify i_ino initialization
  2020-01-01 17:58 [PATCH 0/7] Sort out overlay ino numbers Amir Goldstein
                   ` (2 preceding siblings ...)
  2020-01-01 17:58 ` [PATCH 3/7] ovl: factor out helper ovl_get_root() Amir Goldstein
@ 2020-01-01 17:58 ` Amir Goldstein
  2020-01-01 17:58 ` [PATCH 5/7] ovl: avoid possible inode number collisions with xino=on Amir Goldstein
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Amir Goldstein @ 2020-01-01 17:58 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Move i_ino initialization to ovl_inode_init() to avoid the dance
of setting i_ino in ovl_fill_inode() sometimes on the first call
and sometimes on the seconds call.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c     | 35 ++++++++++++++++++++++++++---------
 fs/overlayfs/overlayfs.h |  4 ++--
 fs/overlayfs/super.c     | 13 +++++++++++--
 fs/overlayfs/util.c      | 18 ------------------
 4 files changed, 39 insertions(+), 31 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 9ed94c70e3cb..04e8e8de2012 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -561,8 +561,7 @@ static inline void ovl_lockdep_annotate_inode_mutex_key(struct inode *inode)
 #endif
 }
 
-static void ovl_fill_inode(struct inode *inode, umode_t mode, dev_t rdev,
-			   unsigned long ino, int fsid)
+static void ovl_map_ino(struct inode *inode, unsigned long ino, int fsid)
 {
 	int xinobits = ovl_xino_bits(inode->i_sb);
 
@@ -572,10 +571,6 @@ static void ovl_fill_inode(struct inode *inode, umode_t mode, dev_t rdev,
 	 * so inode number exposed via /proc/locks and a like will be
 	 * consistent with d_ino and st_ino values. An i_ino value inconsistent
 	 * with d_ino also causes nfsd readdirplus to fail.
-	 *
-	 * When called from ovl_create_object() => ovl_new_inode(), with
-	 * ino = 0, i_ino will be updated to consistent value later on in
-	 * ovl_get_inode() => ovl_fill_inode().
 	 */
 	if (ovl_same_dev(inode->i_sb)) {
 		inode->i_ino = ino;
@@ -584,6 +579,28 @@ static void ovl_fill_inode(struct inode *inode, umode_t mode, dev_t rdev,
 	} else {
 		inode->i_ino = get_next_ino();
 	}
+}
+
+void ovl_inode_init(struct inode *inode, struct ovl_inode_params *oip,
+		    unsigned long ino, int fsid)
+{
+	struct inode *realinode;
+
+	if (oip->upperdentry)
+		OVL_I(inode)->__upperdentry = oip->upperdentry;
+	if (oip->lowerpath && oip->lowerpath->dentry)
+		OVL_I(inode)->lower = igrab(d_inode(oip->lowerpath->dentry));
+	if (oip->lowerdata)
+		OVL_I(inode)->lowerdata = igrab(d_inode(oip->lowerdata));
+
+	realinode = ovl_inode_real(inode);
+	ovl_copyattr(realinode, inode);
+	ovl_copyflags(realinode, inode);
+	ovl_map_ino(inode, ino, fsid);
+}
+
+static void ovl_fill_inode(struct inode *inode, umode_t mode, dev_t rdev)
+{
 	inode->i_mode = mode;
 	inode->i_flags |= S_NOCMTIME;
 #ifdef CONFIG_FS_POSIX_ACL
@@ -721,7 +738,7 @@ struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev)
 
 	inode = new_inode(sb);
 	if (inode)
-		ovl_fill_inode(inode, mode, rdev, 0, 0);
+		ovl_fill_inode(inode, mode, rdev);
 
 	return inode;
 }
@@ -946,8 +963,8 @@ struct inode *ovl_get_inode(struct super_block *sb,
 		ino = realinode->i_ino;
 		fsid = lowerpath->layer->fsid;
 	}
-	ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev, ino, fsid);
-	ovl_inode_init(inode, upperdentry, lowerdentry, oip->lowerdata);
+	ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev);
+	ovl_inode_init(inode, oip, ino, fsid);
 
 	if (upperdentry && ovl_is_impuredir(upperdentry))
 		ovl_set_flag(OVL_IMPURE, inode);
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index f7d01c06cdaf..140510d24d9f 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -269,8 +269,6 @@ void ovl_set_upperdata(struct inode *inode);
 bool ovl_redirect_dir(struct super_block *sb);
 const char *ovl_dentry_get_redirect(struct dentry *dentry);
 void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
-void ovl_inode_init(struct inode *inode, struct dentry *upperdentry,
-		    struct dentry *lowerdentry, struct dentry *lowerdata);
 void ovl_inode_update(struct inode *inode, struct dentry *upperdentry);
 void ovl_dir_modified(struct dentry *dentry, bool impurity);
 u64 ovl_dentry_version_get(struct dentry *dentry);
@@ -412,6 +410,8 @@ struct ovl_inode_params {
 	char *redirect;
 	struct dentry *lowerdata;
 };
+void ovl_inode_init(struct inode *inode, struct ovl_inode_params *oip,
+		    unsigned long ino, int fsid);
 struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev);
 struct inode *ovl_lookup_inode(struct super_block *sb, struct dentry *real,
 			       bool is_upper);
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 18db065d73b3..d072f982d3de 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1586,6 +1586,13 @@ static struct dentry *ovl_get_root(struct super_block *sb,
 				   struct ovl_entry *oe)
 {
 	struct dentry *root;
+	struct ovl_path *lowerpath = &oe->lowerstack[0];
+	unsigned long ino = d_inode(lowerpath->dentry)->i_ino;
+	int fsid = lowerpath->layer->fsid;
+	struct ovl_inode_params oip = {
+		.upperdentry = upperdentry,
+		.lowerpath = lowerpath,
+	};
 
 	root = d_make_root(ovl_new_inode(sb, S_IFDIR, 0));
 	if (!root)
@@ -1594,6 +1601,9 @@ static struct dentry *ovl_get_root(struct super_block *sb,
 	root->d_fsdata = oe;
 
 	if (upperdentry) {
+		/* Root inode uses upper st_ino/i_ino */
+		ino = d_inode(upperdentry)->i_ino;
+		fsid = 0;
 		ovl_dentry_set_upper_alias(root);
 		if (ovl_is_impuredir(upperdentry))
 			ovl_set_flag(OVL_IMPURE, d_inode(root));
@@ -1603,8 +1613,7 @@ static struct dentry *ovl_get_root(struct super_block *sb,
 	ovl_set_flag(OVL_WHITEOUTS, d_inode(root));
 	ovl_dentry_set_flag(OVL_E_CONNECTED, root);
 	ovl_set_upperdata(d_inode(root));
-	ovl_inode_init(d_inode(root), upperdentry, ovl_dentry_lower(root),
-		       NULL);
+	ovl_inode_init(d_inode(root), &oip, ino, fsid);
 
 	return root;
 }
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index b28ccede1da9..d6e3f8f42647 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -388,24 +388,6 @@ void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect)
 	oi->redirect = redirect;
 }
 
-void ovl_inode_init(struct inode *inode, struct dentry *upperdentry,
-		    struct dentry *lowerdentry, struct dentry *lowerdata)
-{
-	struct inode *realinode = d_inode(upperdentry ?: lowerdentry);
-
-	if (upperdentry)
-		OVL_I(inode)->__upperdentry = upperdentry;
-	if (lowerdentry)
-		OVL_I(inode)->lower = igrab(d_inode(lowerdentry));
-	if (lowerdata)
-		OVL_I(inode)->lowerdata = igrab(d_inode(lowerdata));
-
-	ovl_copyattr(realinode, inode);
-	ovl_copyflags(realinode, inode);
-	if (!inode->i_ino)
-		inode->i_ino = realinode->i_ino;
-}
-
 void ovl_inode_update(struct inode *inode, struct dentry *upperdentry)
 {
 	struct inode *upperinode = d_inode(upperdentry);
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/7] ovl: avoid possible inode number collisions with xino=on
  2020-01-01 17:58 [PATCH 0/7] Sort out overlay ino numbers Amir Goldstein
                   ` (3 preceding siblings ...)
  2020-01-01 17:58 ` [PATCH 4/7] ovl: simplify i_ino initialization Amir Goldstein
@ 2020-01-01 17:58 ` Amir Goldstein
  2020-02-19 14:25   ` Miklos Szeredi
  2020-01-01 17:58 ` [PATCH 6/7] ovl: enable xino automatically in more cases Amir Goldstein
  2020-01-01 17:58 ` [PATCH 7/7] ovl: document xino expected behavior Amir Goldstein
  6 siblings, 1 reply; 14+ messages in thread
From: Amir Goldstein @ 2020-01-01 17:58 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

When xino feature is enabled and a real directory inode number
overflows the lower xino bits, we cannot map this directory inode
number to a unique and persistent inode number and we fall back to
the real inode st_ino and overlay st_dev.

The real inode st_ino with high bits may collide with a lower inode
number on overlay st_dev that was mapped using xino.

To avoid possible collision with legitimate xino values, map a non
persistent inode number to a dedicated range in the xino address space.
The dedicated range is created by adding one more bit to the number of
reserved high xino bits.  We could have added just one more fsid, but
that would have had the undesired effect of changing persistent overlay
inode numbers on kernel or require more complex xino mapping code.

The non persistent inode number is allocated with get_next_ino()
and stored in i_generation.  To reduce the burn rate of get_next_ino()
numbers in the system, we avoid calling get_next_ino() on non dir,
because we are not going to use it and we reuse i_generation from
recycled ovl_inode objects.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c     | 54 ++++++++++++++++++++++++++++++++--------
 fs/overlayfs/overlayfs.h |  7 ++++++
 fs/overlayfs/super.c     | 28 ++++++++++++---------
 3 files changed, 66 insertions(+), 23 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 04e8e8de2012..415d9efa4799 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -79,6 +79,7 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat, int fsid)
 {
 	bool samefs = ovl_same_fs(dentry->d_sb);
 	unsigned int xinobits = ovl_xino_bits(dentry->d_sb);
+	unsigned int shift = 64 - xinobits;
 
 	if (samefs) {
 		/*
@@ -89,7 +90,6 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat, int fsid)
 		stat->dev = dentry->d_sb->s_dev;
 		return 0;
 	} else if (xinobits) {
-		unsigned int shift = 64 - xinobits;
 		/*
 		 * All inode numbers of underlying fs should not be using the
 		 * high xinobits, so we use high xinobits to partition the
@@ -116,11 +116,25 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat, int fsid)
 		 * overlay mount boundaries.
 		 *
 		 * If not all layers are on the same fs the pair {real st_ino;
-		 * overlay st_dev} is not unique, so use the non persistent
-		 * overlay st_ino for directories.
+		 * overlay st_dev} is not unique, so use a non persistent
+		 * overlay st_ino for directories, which is allocated with
+		 * get_next_ino() and stored in i_generation.
+		 *
+		 * To avoid ino collision with legitimate xino values from upper
+		 * layer (fsid 0), use fsid 1 to map the non persistent inode
+		 * numbers to the unified st_ino address space.
+		 *
+		 * In this case (xino bits overflow on directory inode), the
+		 * value of overlay inode st_ino will not be consistent with
+		 * d_ino and i_ino. i_ino will have the value of the real inode
+		 * i_ino and d_ino will have either the value of i_ino or the
+		 * value of st_ino, depending on the directory iteration mode
+		 * that is used on the parent (i.e. real/merge/impure).
 		 */
 		stat->dev = dentry->d_sb->s_dev;
-		stat->ino = dentry->d_inode->i_ino;
+		stat->ino = dentry->d_inode->i_generation;
+		if (xinobits)
+			stat->ino |= ((u64)1) << shift;
 	} else {
 		/*
 		 * For non-samefs setup, if we cannot map all layers st_ino
@@ -128,7 +142,7 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat, int fsid)
 		 * is unique per underlying fs, so we use the unique anonymous
 		 * bdev assigned to the underlying fs.
 		 */
-		stat->dev = OVL_FS(dentry->d_sb)->fs[fsid].pseudo_dev;
+		stat->dev = ovl_dentry_fs(dentry, fsid)->pseudo_dev;
 	}
 
 	return 0;
@@ -564,6 +578,7 @@ static inline void ovl_lockdep_annotate_inode_mutex_key(struct inode *inode)
 static void ovl_map_ino(struct inode *inode, unsigned long ino, int fsid)
 {
 	int xinobits = ovl_xino_bits(inode->i_sb);
+	bool overflow = ino >> (64 - xinobits);
 
 	/*
 	 * When d_ino is consistent with st_ino (samefs or i_ino has enough
@@ -571,13 +586,30 @@ static void ovl_map_ino(struct inode *inode, unsigned long ino, int fsid)
 	 * so inode number exposed via /proc/locks and a like will be
 	 * consistent with d_ino and st_ino values. An i_ino value inconsistent
 	 * with d_ino also causes nfsd readdirplus to fail.
+	 *
+	 * When real inode bits overflow into xino bits, we leave i_ino value as
+	 * the real inode to be consitent with d_ino.  For directory inodes on
+	 * non-samefs with xino disabled or xino overflow, we reserve a unique
+	 * 32bit generation number, to be used for resolving st_ino collisions
+	 * in ovl_map_dev_ino(). With xino disabled, this 32bit number is also
+	 * used as i_ino.
 	 */
-	if (ovl_same_dev(inode->i_sb)) {
-		inode->i_ino = ino;
-		if (xinobits && fsid && !(ino >> (64 - xinobits)))
-			inode->i_ino |= (unsigned long)fsid << (64 - xinobits);
-	} else {
-		inode->i_ino = get_next_ino();
+	inode->i_ino = ino;
+	if (ovl_same_dev(inode->i_sb) && !overflow) {
+		inode->i_ino |= (unsigned long)fsid << (64 - xinobits);
+	} else if (S_ISDIR(inode->i_mode)) {
+		/*
+		 * Reuse unique 32bit ino from recycled ovl_inode object.
+		 * get_next_ino() wraps around at 32bit, but may be extended
+		 * to 64bit in the future, so be prepared.
+		 */
+		if (!inode->i_generation) {
+			inode->i_generation = (u32)get_next_ino();
+			if (unlikely(!inode->i_generation))
+				inode->i_generation = (u32)get_next_ino();
+		}
+		if (!ovl_same_dev(inode->i_sb))
+			inode->i_ino = inode->i_generation;
 	}
 }
 
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 140510d24d9f..c0b15fd2b395 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -311,6 +311,13 @@ static inline unsigned int ovl_xino_bits(struct super_block *sb)
 	return ofs->xino_bits;
 }
 
+static inline struct ovl_sb *ovl_dentry_fs(struct dentry *dentry, int fsid)
+{
+	/* fsid bit 1 is reserved for non persistent ino range */
+	WARN_ON_ONCE(fsid & 1);
+	return &OVL_FS(dentry->d_sb)->fs[fsid >> 1];
+}
+
 /* All layers on same fs? */
 static inline bool ovl_same_fs(struct super_block *sb)
 {
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index d072f982d3de..d636a23df541 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1272,8 +1272,8 @@ static bool ovl_lower_uuid_ok(struct ovl_fs *ofs, const uuid_t *uuid)
 	return true;
 }
 
-/* Get a unique fsid for the layer */
-static int ovl_get_fsid(struct ovl_fs *ofs, const struct path *path)
+/* Get a unique fs for the layer */
+static int ovl_get_fs(struct ovl_fs *ofs, const struct path *path)
 {
 	struct super_block *sb = path->mnt->mnt_sb;
 	unsigned int i;
@@ -1353,9 +1353,9 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
 	for (i = 0; i < numlower; i++) {
 		struct vfsmount *mnt;
 		struct inode *trap;
-		int fsid;
+		int n;
 
-		err = fsid = ovl_get_fsid(ofs, &stack[i]);
+		err = n = ovl_get_fs(ofs, &stack[i]);
 		if (err < 0)
 			goto out;
 
@@ -1387,9 +1387,10 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
 		ofs->layers[ofs->numlower].trap = trap;
 		ofs->layers[ofs->numlower].mnt = mnt;
 		ofs->layers[ofs->numlower].idx = ofs->numlower;
-		ofs->layers[ofs->numlower].fsid = fsid;
-		ofs->layers[ofs->numlower].fs = &ofs->fs[fsid];
-		ofs->fs[fsid].is_lower = true;
+		/* fsid bit 1 is reserved for non persistent ino range */
+		ofs->layers[ofs->numlower].fsid = n << 1;
+		ofs->layers[ofs->numlower].fs = &ofs->fs[n];
+		ofs->fs[n].is_lower = true;
 	}
 
 	/*
@@ -1398,7 +1399,8 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
 	 * free high bits in underlying fs to hold the unique fsid.
 	 * If overlayfs does encounter underlying inodes using the high xino
 	 * bits reserved for fsid, it emits a warning and uses the original
-	 * inode number.
+	 * inode number or a non persistent inode number allocated from a
+	 * dedicated range.
 	 */
 	if (ofs->numfs == 1 || (ofs->numfs == 2 && !ofs->upper_mnt)) {
 		if (ofs->config.xino == OVL_XINO_ON)
@@ -1408,11 +1410,13 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
 	} else if (ofs->config.xino == OVL_XINO_ON && !ofs->xino_bits) {
 		/*
 		 * This is a roundup of number of bits needed for encoding
-		 * fsid, where fsid 0 is reserved for upper fs even with
-		 * lower only overlay.
+		 * fsid, where fsid 0 is reserved for upper fs (even with
+		 * lower only overlay) and fsid bit 1 is reserved for the non
+		 * persistent inode number range that is used for resolving
+		 * xino lower bits overflow.
 		 */
-		BUILD_BUG_ON(ilog2(OVL_MAX_STACK) > 31);
-		ofs->xino_bits = ilog2(ofs->numfs - 1) + 1;
+		BUILD_BUG_ON(ilog2(OVL_MAX_STACK) > 30);
+		ofs->xino_bits = ilog2(ofs->numfs - 1) + 2;
 	}
 
 	if (ofs->xino_bits) {
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 6/7] ovl: enable xino automatically in more cases
  2020-01-01 17:58 [PATCH 0/7] Sort out overlay ino numbers Amir Goldstein
                   ` (4 preceding siblings ...)
  2020-01-01 17:58 ` [PATCH 5/7] ovl: avoid possible inode number collisions with xino=on Amir Goldstein
@ 2020-01-01 17:58 ` Amir Goldstein
  2020-01-01 17:58 ` [PATCH 7/7] ovl: document xino expected behavior Amir Goldstein
  6 siblings, 0 replies; 14+ messages in thread
From: Amir Goldstein @ 2020-01-01 17:58 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

So far, with xino=auto, we only enable xino if we know that all
underlying filesystem use 32bit inode numbers.

When users configure overlay with xino=auto, they already declare that
they are ready to handle 64bit inode number from overlay.

It is a very common case, that underlying filesystem uses 64bit ino,
but rarely or never uses the high inode number bits (e.g. tmpfs, xfs).
Leaving it for the users to declare high ino bits are unused with
xino=on is not a recipe for many users to enjoy the benefits of xino.

There appears to be very little reason not to enable xino when users
declare xino=auto even if we do not know how many bits underlying
filesystem uses for inode numbers.

In the worst case of xino bits overflow by real inode number, we
already fall back to the non-xino behavior - real inode number with
unique pseudo dev or to non persistent inode number and overlay st_dev
(for directories).

The only annoyance from auto enabling xino is that xino bits overflow
emits a warning to kmsg. Suppress those warnings unless users explicitly
asked for xino=on, suggesting that they expected high ino bits to be
unused by underlying filesystem.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c     |  8 ++++----
 fs/overlayfs/overlayfs.h | 10 ++++++++++
 fs/overlayfs/readdir.c   | 17 +++++++++++------
 fs/overlayfs/super.c     |  2 +-
 4 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 415d9efa4799..7b94f0338536 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -98,13 +98,13 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat, int fsid)
 		 * and all inodes use overlay st_dev. Inode numbers are also
 		 * persistent for a given layer configuration.
 		 */
-		if (stat->ino >> shift) {
-			pr_warn_ratelimited("overlayfs: inode number too big (%pd2, ino=%llu, xinobits=%d)\n",
-					    dentry, stat->ino, xinobits);
-		} else {
+		if (likely(!(stat->ino >> shift))) {
 			stat->ino |= ((u64)fsid) << shift;
 			stat->dev = dentry->d_sb->s_dev;
 			return 0;
+		} else if (ovl_xino_warn(dentry->d_sb)) {
+			pr_warn_ratelimited("overlayfs: inode number too big (%pd2, ino=%llu, xinobits=%d)\n",
+					    dentry, stat->ino, xinobits);
 		}
 	}
 
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index c0b15fd2b395..667e8096f56c 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -318,6 +318,16 @@ static inline struct ovl_sb *ovl_dentry_fs(struct dentry *dentry, int fsid)
 	return &OVL_FS(dentry->d_sb)->fs[fsid >> 1];
 }
 
+/*
+ * With xino=auto, we do best effort to keep all inodes on same st_dev and
+ * d_ino consistent with st_ino.
+ * With xino=on, we do the same effort but we warn if we failed.
+ */
+static inline bool ovl_xino_warn(struct super_block *sb)
+{
+	return OVL_FS(sb)->config.xino == OVL_XINO_ON;
+}
+
 /* All layers on same fs? */
 static inline bool ovl_same_fs(struct super_block *sb)
 {
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 0f5ab53b4184..7dbf3df99150 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -438,11 +438,13 @@ static struct ovl_dir_cache *ovl_cache_get(struct dentry *dentry)
 
 /* Map inode number to lower fs unique range */
 static u64 ovl_remap_lower_ino(u64 ino, int xinobits, int fsid,
-			       const char *name, int namelen)
+			       const char *name, int namelen, bool warn)
 {
-	if (ino >> (64 - xinobits)) {
-		pr_warn_ratelimited("overlayfs: d_ino too big (%.*s, ino=%llu, xinobits=%d)\n",
-				    namelen, name, ino, xinobits);
+	if (unlikely(ino >> (64 - xinobits))) {
+		if (warn) {
+			pr_warn_ratelimited("overlayfs: d_ino too big (%.*s, ino=%llu, xinobits=%d)\n",
+					    namelen, name, ino, xinobits);
+		}
 		return ino;
 	}
 
@@ -515,7 +517,8 @@ static int ovl_cache_update_ino(struct path *path, struct ovl_cache_entry *p)
 	} else if (xinobits && !OVL_TYPE_UPPER(type)) {
 		ino = ovl_remap_lower_ino(ino, xinobits,
 					  ovl_dentry_layer(this)->fsid,
-					  p->name, p->len);
+					  p->name, p->len,
+					  ovl_xino_warn(dir->d_sb));
 	}
 
 out:
@@ -645,6 +648,7 @@ struct ovl_readdir_translate {
 	u64 parent_ino;
 	int fsid;
 	int xinobits;
+	bool xinowarn;
 };
 
 static int ovl_fill_real(struct dir_context *ctx, const char *name,
@@ -665,7 +669,7 @@ static int ovl_fill_real(struct dir_context *ctx, const char *name,
 			ino = p->ino;
 	} else if (rdt->xinobits) {
 		ino = ovl_remap_lower_ino(ino, rdt->xinobits, rdt->fsid,
-					  name, namelen);
+					  name, namelen, rdt->xinowarn);
 	}
 
 	return orig_ctx->actor(orig_ctx, name, namelen, offset, ino, d_type);
@@ -695,6 +699,7 @@ static int ovl_iterate_real(struct file *file, struct dir_context *ctx)
 		.ctx.actor = ovl_fill_real,
 		.orig_ctx = ctx,
 		.xinobits = ovl_xino_bits(dir->d_sb),
+		.xinowarn = ovl_xino_warn(dir->d_sb),
 	};
 
 	if (rdt.xinobits)
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index d636a23df541..ca3204fe87bc 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1407,7 +1407,7 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
 			pr_info("overlayfs: \"xino=on\" is useless with all layers on same fs, ignore.\n");
 		ofs->xino_bits = 0;
 		ofs->config.xino = OVL_XINO_SAME_FS;
-	} else if (ofs->config.xino == OVL_XINO_ON && !ofs->xino_bits) {
+	} else if (ofs->config.xino != OVL_XINO_OFF && !ofs->xino_bits) {
 		/*
 		 * This is a roundup of number of bits needed for encoding
 		 * fsid, where fsid 0 is reserved for upper fs (even with
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 7/7] ovl: document xino expected behavior
  2020-01-01 17:58 [PATCH 0/7] Sort out overlay ino numbers Amir Goldstein
                   ` (5 preceding siblings ...)
  2020-01-01 17:58 ` [PATCH 6/7] ovl: enable xino automatically in more cases Amir Goldstein
@ 2020-01-01 17:58 ` Amir Goldstein
  6 siblings, 0 replies; 14+ messages in thread
From: Amir Goldstein @ 2020-01-01 17:58 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Summarize the inode properties of different configurations in a table.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 Documentation/filesystems/overlayfs.rst | 38 +++++++++++++++++++++++--
 1 file changed, 36 insertions(+), 2 deletions(-)

diff --git a/Documentation/filesystems/overlayfs.rst b/Documentation/filesystems/overlayfs.rst
index e443be7928db..fcd8537b8402 100644
--- a/Documentation/filesystems/overlayfs.rst
+++ b/Documentation/filesystems/overlayfs.rst
@@ -40,13 +40,46 @@ On 64bit systems, even if all overlay layers are not on the same
 underlying filesystem, the same compliant behavior could be achieved
 with the "xino" feature.  The "xino" feature composes a unique object
 identifier from the real object st_ino and an underlying fsid index.
+
 If all underlying filesystems support NFS file handles and export file
 handles with 32bit inode number encoding (e.g. ext4), overlay filesystem
 will use the high inode number bits for fsid.  Even when the underlying
 filesystem uses 64bit inode numbers, users can still enable the "xino"
 feature with the "-o xino=on" overlay mount option.  That is useful for the
 case of underlying filesystems like xfs and tmpfs, which use 64bit inode
-numbers, but are very unlikely to use the high inode number bit.
+numbers, but are very unlikely to use the high inode number bits.  In case
+the underlying inode number does overflow into the high xino bits, overlay
+filesystem will fall back to the non xino behavior for that inode.
+
+The following table summarizes what can be expected in different overlay
+configurations.
+
+Inode properties
+````````````````
+
++--------------+------------+------------+-----------------+----------------+
+|Configuration | Persistent | Uniform    | st_ino == d_ino | d_ino == i_ino |
+|              | st_ino     | st_dev     |                 | [*]            |
++==============+=====+======+=====+======+========+========+========+=======+
+|              | dir | !dir | dir | !dir |  dir   +  !dir  |  dir   | !dir  |
++--------------+-----+------+-----+------+--------+--------+--------+-------+
+| All layers   |  Y  |  Y   |  Y  |  Y   |  Y     |   Y    |  Y     |  Y    |
+| on same fs   |     |      |     |      |        |        |        |       |
++--------------+-----+------+-----+------+--------+--------+--------+-------+
+| Layers not   |  N  |  Y   |  Y  |  N   |  N     |   Y    |  N     |  Y    |
+| on same fs,  |     |      |     |      |        |        |        |       |
+| xino=off     |     |      |     |      |        |        |        |       |
++--------------+-----+------+-----+------+--------+--------+--------+-------+
+| xino=on/auto |  Y  |  Y   |  Y  |  Y   |  Y     |   Y    |  Y     |  Y    |
+|              |     |      |     |      |        |        |        |       |
++--------------+-----+------+-----+------+--------+--------+--------+-------+
+| xino=on/auto,|  N  |  Y   |  Y  |  N   |  N     |   Y    |  N     |  Y    |
+| ino overflow |     |      |     |      |        |        |        |       |
++--------------+-----+------+-----+------+--------+--------+--------+-------+
+
+[*] nfsd v3 readdirplus verifies d_ino == i_ino. i_ino is exposed via several
+/proc files, such as /proc/locks and /proc/self/fdinfo/<fd> of an inotify
+file descriptor.
 
 
 Upper and Lower
@@ -383,7 +416,8 @@ guarantee that the values of st_ino and st_dev returned by stat(2) and the
 value of d_ino returned by readdir(3) will act like on a normal filesystem.
 E.g. the value of st_dev may be different for two objects in the same
 overlay filesystem and the value of st_ino for directory objects may not be
-persistent and could change even while the overlay filesystem is mounted.
+persistent and could change even while the overlay filesystem is mounted, as
+summarized in the `Inode properties`_ table above.
 
 
 Changes to underlying filesystems
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/7] ovl: avoid possible inode number collisions with xino=on
  2020-01-01 17:58 ` [PATCH 5/7] ovl: avoid possible inode number collisions with xino=on Amir Goldstein
@ 2020-02-19 14:25   ` Miklos Szeredi
  2020-02-19 15:28     ` Amir Goldstein
  0 siblings, 1 reply; 14+ messages in thread
From: Miklos Szeredi @ 2020-02-19 14:25 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs

On Wed, Jan 1, 2020 at 6:58 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> When xino feature is enabled and a real directory inode number
> overflows the lower xino bits, we cannot map this directory inode
> number to a unique and persistent inode number and we fall back to
> the real inode st_ino and overlay st_dev.
>
> The real inode st_ino with high bits may collide with a lower inode
> number on overlay st_dev that was mapped using xino.
>
> To avoid possible collision with legitimate xino values, map a non
> persistent inode number to a dedicated range in the xino address space.
> The dedicated range is created by adding one more bit to the number of
> reserved high xino bits.  We could have added just one more fsid, but
> that would have had the undesired effect of changing persistent overlay
> inode numbers on kernel or require more complex xino mapping code.
>
> The non persistent inode number is allocated with get_next_ino()
> and stored in i_generation.  To reduce the burn rate of get_next_ino()
> numbers in the system, we avoid calling get_next_ino() on non dir,
> because we are not going to use it and we reuse i_generation from
> recycled ovl_inode objects.
>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/overlayfs/inode.c     | 54 ++++++++++++++++++++++++++++++++--------
>  fs/overlayfs/overlayfs.h |  7 ++++++
>  fs/overlayfs/super.c     | 28 ++++++++++++---------
>  3 files changed, 66 insertions(+), 23 deletions(-)
>
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 04e8e8de2012..415d9efa4799 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -79,6 +79,7 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat, int fsid)
>  {
>         bool samefs = ovl_same_fs(dentry->d_sb);
>         unsigned int xinobits = ovl_xino_bits(dentry->d_sb);
> +       unsigned int shift = 64 - xinobits;
>
>         if (samefs) {
>                 /*
> @@ -89,7 +90,6 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat, int fsid)
>                 stat->dev = dentry->d_sb->s_dev;
>                 return 0;
>         } else if (xinobits) {
> -               unsigned int shift = 64 - xinobits;
>                 /*
>                  * All inode numbers of underlying fs should not be using the
>                  * high xinobits, so we use high xinobits to partition the
> @@ -116,11 +116,25 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat, int fsid)
>                  * overlay mount boundaries.
>                  *
>                  * If not all layers are on the same fs the pair {real st_ino;
> -                * overlay st_dev} is not unique, so use the non persistent
> -                * overlay st_ino for directories.
> +                * overlay st_dev} is not unique, so use a non persistent
> +                * overlay st_ino for directories, which is allocated with
> +                * get_next_ino() and stored in i_generation.
> +                *
> +                * To avoid ino collision with legitimate xino values from upper
> +                * layer (fsid 0), use fsid 1 to map the non persistent inode
> +                * numbers to the unified st_ino address space.
> +                *
> +                * In this case (xino bits overflow on directory inode), the
> +                * value of overlay inode st_ino will not be consistent with
> +                * d_ino and i_ino. i_ino will have the value of the real inode
> +                * i_ino and d_ino will have either the value of i_ino or the
> +                * value of st_ino, depending on the directory iteration mode
> +                * that is used on the parent (i.e. real/merge/impure).
>                  */
>                 stat->dev = dentry->d_sb->s_dev;
> -               stat->ino = dentry->d_inode->i_ino;
> +               stat->ino = dentry->d_inode->i_generation;

I think we should avoid misusing i_generation for this.  It's
confusing and not woth it, IMO.

> +               if (xinobits)
> +                       stat->ino |= ((u64)1) << shift;

I don't like this magic shifting.


>         } else {
>                 /*
>                  * For non-samefs setup, if we cannot map all layers st_ino
> @@ -128,7 +142,7 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat, int fsid)
>                  * is unique per underlying fs, so we use the unique anonymous
>                  * bdev assigned to the underlying fs.
>                  */
> -               stat->dev = OVL_FS(dentry->d_sb)->fs[fsid].pseudo_dev;
> +               stat->dev = ovl_dentry_fs(dentry, fsid)->pseudo_dev;
>         }
>
>         return 0;
> @@ -564,6 +578,7 @@ static inline void ovl_lockdep_annotate_inode_mutex_key(struct inode *inode)
>  static void ovl_map_ino(struct inode *inode, unsigned long ino, int fsid)
>  {
>         int xinobits = ovl_xino_bits(inode->i_sb);
> +       bool overflow = ino >> (64 - xinobits);
>
>         /*
>          * When d_ino is consistent with st_ino (samefs or i_ino has enough
> @@ -571,13 +586,30 @@ static void ovl_map_ino(struct inode *inode, unsigned long ino, int fsid)
>          * so inode number exposed via /proc/locks and a like will be
>          * consistent with d_ino and st_ino values. An i_ino value inconsistent
>          * with d_ino also causes nfsd readdirplus to fail.
> +        *
> +        * When real inode bits overflow into xino bits, we leave i_ino value as
> +        * the real inode to be consitent with d_ino.  For directory inodes on
> +        * non-samefs with xino disabled or xino overflow, we reserve a unique
> +        * 32bit generation number, to be used for resolving st_ino collisions
> +        * in ovl_map_dev_ino(). With xino disabled, this 32bit number is also
> +        * used as i_ino.
>          */
> -       if (ovl_same_dev(inode->i_sb)) {
> -               inode->i_ino = ino;
> -               if (xinobits && fsid && !(ino >> (64 - xinobits)))
> -                       inode->i_ino |= (unsigned long)fsid << (64 - xinobits);
> -       } else {
> -               inode->i_ino = get_next_ino();
> +       inode->i_ino = ino;
> +       if (ovl_same_dev(inode->i_sb) && !overflow) {
> +               inode->i_ino |= (unsigned long)fsid << (64 - xinobits);

While this makes sense on 64bit arch, it's going to overflow on 32bit
(due to i_ino being "unsigned long").

> +       } else if (S_ISDIR(inode->i_mode)) {
> +               /*
> +                * Reuse unique 32bit ino from recycled ovl_inode object.
> +                * get_next_ino() wraps around at 32bit, but may be extended
> +                * to 64bit in the future, so be prepared.
> +                */
> +               if (!inode->i_generation) {
> +                       inode->i_generation = (u32)get_next_ino();
> +                       if (unlikely(!inode->i_generation))
> +                               inode->i_generation = (u32)get_next_ino();
> +               }

And this is really messy.   How about an atomic64_t counter in ovl_fs
instead?  Do we really care about the performance of inode allocation
for the overflow case?

> +               if (!ovl_same_dev(inode->i_sb))
> +                       inode->i_ino = inode->i_generation;
>         }
>  }
>
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 140510d24d9f..c0b15fd2b395 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -311,6 +311,13 @@ static inline unsigned int ovl_xino_bits(struct super_block *sb)
>         return ofs->xino_bits;
>  }
>
> +static inline struct ovl_sb *ovl_dentry_fs(struct dentry *dentry, int fsid)
> +{
> +       /* fsid bit 1 is reserved for non persistent ino range */
> +       WARN_ON_ONCE(fsid & 1);
> +       return &OVL_FS(dentry->d_sb)->fs[fsid >> 1];

Again, magic shifts.  Would be so much nicer to leave the fsid
definition as the index into the ->fs array and deal with this when
creating the st_ino/i_ino values.

> +}
> +
>  /* All layers on same fs? */
>  static inline bool ovl_same_fs(struct super_block *sb)
>  {
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index d072f982d3de..d636a23df541 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -1272,8 +1272,8 @@ static bool ovl_lower_uuid_ok(struct ovl_fs *ofs, const uuid_t *uuid)
>         return true;
>  }
>
> -/* Get a unique fsid for the layer */
> -static int ovl_get_fsid(struct ovl_fs *ofs, const struct path *path)
> +/* Get a unique fs for the layer */
> +static int ovl_get_fs(struct ovl_fs *ofs, const struct path *path)
>  {
>         struct super_block *sb = path->mnt->mnt_sb;
>         unsigned int i;
> @@ -1353,9 +1353,9 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
>         for (i = 0; i < numlower; i++) {
>                 struct vfsmount *mnt;
>                 struct inode *trap;
> -               int fsid;
> +               int n;
>
> -               err = fsid = ovl_get_fsid(ofs, &stack[i]);
> +               err = n = ovl_get_fs(ofs, &stack[i]);
>                 if (err < 0)
>                         goto out;
>
> @@ -1387,9 +1387,10 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
>                 ofs->layers[ofs->numlower].trap = trap;
>                 ofs->layers[ofs->numlower].mnt = mnt;
>                 ofs->layers[ofs->numlower].idx = ofs->numlower;
> -               ofs->layers[ofs->numlower].fsid = fsid;
> -               ofs->layers[ofs->numlower].fs = &ofs->fs[fsid];
> -               ofs->fs[fsid].is_lower = true;
> +               /* fsid bit 1 is reserved for non persistent ino range */
> +               ofs->layers[ofs->numlower].fsid = n << 1;
> +               ofs->layers[ofs->numlower].fs = &ofs->fs[n];
> +               ofs->fs[n].is_lower = true;
>         }
>
>         /*
> @@ -1398,7 +1399,8 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
>          * free high bits in underlying fs to hold the unique fsid.
>          * If overlayfs does encounter underlying inodes using the high xino
>          * bits reserved for fsid, it emits a warning and uses the original
> -        * inode number.
> +        * inode number or a non persistent inode number allocated from a
> +        * dedicated range.
>          */
>         if (ofs->numfs == 1 || (ofs->numfs == 2 && !ofs->upper_mnt)) {
>                 if (ofs->config.xino == OVL_XINO_ON)
> @@ -1408,11 +1410,13 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
>         } else if (ofs->config.xino == OVL_XINO_ON && !ofs->xino_bits) {
>                 /*
>                  * This is a roundup of number of bits needed for encoding
> -                * fsid, where fsid 0 is reserved for upper fs even with
> -                * lower only overlay.
> +                * fsid, where fsid 0 is reserved for upper fs (even with
> +                * lower only overlay) and fsid bit 1 is reserved for the non
> +                * persistent inode number range that is used for resolving
> +                * xino lower bits overflow.
>                  */
> -               BUILD_BUG_ON(ilog2(OVL_MAX_STACK) > 31);
> -               ofs->xino_bits = ilog2(ofs->numfs - 1) + 1;
> +               BUILD_BUG_ON(ilog2(OVL_MAX_STACK) > 30);
> +               ofs->xino_bits = ilog2(ofs->numfs - 1) + 2;


Just realized, this patch is obsolete (xino_bits ->xino_mode).

Anyway, I think the comments are valid for the updated patch as well.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/7] ovl: avoid possible inode number collisions with xino=on
  2020-02-19 14:25   ` Miklos Szeredi
@ 2020-02-19 15:28     ` Amir Goldstein
  2020-02-19 15:36       ` Miklos Szeredi
  0 siblings, 1 reply; 14+ messages in thread
From: Amir Goldstein @ 2020-02-19 15:28 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: overlayfs

On Wed, Feb 19, 2020 at 4:25 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Wed, Jan 1, 2020 at 6:58 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > When xino feature is enabled and a real directory inode number
> > overflows the lower xino bits, we cannot map this directory inode
> > number to a unique and persistent inode number and we fall back to
> > the real inode st_ino and overlay st_dev.
> >
> > The real inode st_ino with high bits may collide with a lower inode
> > number on overlay st_dev that was mapped using xino.
> >
> > To avoid possible collision with legitimate xino values, map a non
> > persistent inode number to a dedicated range in the xino address space.
> > The dedicated range is created by adding one more bit to the number of
> > reserved high xino bits.  We could have added just one more fsid, but
> > that would have had the undesired effect of changing persistent overlay
> > inode numbers on kernel or require more complex xino mapping code.
> >
> > The non persistent inode number is allocated with get_next_ino()
> > and stored in i_generation.  To reduce the burn rate of get_next_ino()
> > numbers in the system, we avoid calling get_next_ino() on non dir,
> > because we are not going to use it and we reuse i_generation from
> > recycled ovl_inode objects.
> >
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > ---
> >  fs/overlayfs/inode.c     | 54 ++++++++++++++++++++++++++++++++--------
> >  fs/overlayfs/overlayfs.h |  7 ++++++
> >  fs/overlayfs/super.c     | 28 ++++++++++++---------
> >  3 files changed, 66 insertions(+), 23 deletions(-)
> >
> > diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> > index 04e8e8de2012..415d9efa4799 100644
> > --- a/fs/overlayfs/inode.c
> > +++ b/fs/overlayfs/inode.c
> > @@ -79,6 +79,7 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat, int fsid)
> >  {
> >         bool samefs = ovl_same_fs(dentry->d_sb);
> >         unsigned int xinobits = ovl_xino_bits(dentry->d_sb);
> > +       unsigned int shift = 64 - xinobits;
> >
> >         if (samefs) {
> >                 /*
> > @@ -89,7 +90,6 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat, int fsid)
> >                 stat->dev = dentry->d_sb->s_dev;
> >                 return 0;
> >         } else if (xinobits) {
> > -               unsigned int shift = 64 - xinobits;
> >                 /*
> >                  * All inode numbers of underlying fs should not be using the
> >                  * high xinobits, so we use high xinobits to partition the
> > @@ -116,11 +116,25 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat, int fsid)
> >                  * overlay mount boundaries.
> >                  *
> >                  * If not all layers are on the same fs the pair {real st_ino;
> > -                * overlay st_dev} is not unique, so use the non persistent
> > -                * overlay st_ino for directories.
> > +                * overlay st_dev} is not unique, so use a non persistent
> > +                * overlay st_ino for directories, which is allocated with
> > +                * get_next_ino() and stored in i_generation.
> > +                *
> > +                * To avoid ino collision with legitimate xino values from upper
> > +                * layer (fsid 0), use fsid 1 to map the non persistent inode
> > +                * numbers to the unified st_ino address space.
> > +                *
> > +                * In this case (xino bits overflow on directory inode), the
> > +                * value of overlay inode st_ino will not be consistent with
> > +                * d_ino and i_ino. i_ino will have the value of the real inode
> > +                * i_ino and d_ino will have either the value of i_ino or the
> > +                * value of st_ino, depending on the directory iteration mode
> > +                * that is used on the parent (i.e. real/merge/impure).
> >                  */
> >                 stat->dev = dentry->d_sb->s_dev;
> > -               stat->ino = dentry->d_inode->i_ino;
> > +               stat->ino = dentry->d_inode->i_generation;
>
> I think we should avoid misusing i_generation for this.  It's
> confusing and not woth it, IMO.

OK, I though I could maintain i_ino -> d_ino consistency this way and
make nfsdv3 happy, but it's not true for all types of dir iteration,
so I'll drop it.

>
> > +               if (xinobits)
> > +                       stat->ino |= ((u64)1) << shift;
>
> I don't like this magic shifting.
>
>
> >         } else {
> >                 /*
> >                  * For non-samefs setup, if we cannot map all layers st_ino
> > @@ -128,7 +142,7 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat, int fsid)
> >                  * is unique per underlying fs, so we use the unique anonymous
> >                  * bdev assigned to the underlying fs.
> >                  */
> > -               stat->dev = OVL_FS(dentry->d_sb)->fs[fsid].pseudo_dev;
> > +               stat->dev = ovl_dentry_fs(dentry, fsid)->pseudo_dev;
> >         }
> >
> >         return 0;
> > @@ -564,6 +578,7 @@ static inline void ovl_lockdep_annotate_inode_mutex_key(struct inode *inode)
> >  static void ovl_map_ino(struct inode *inode, unsigned long ino, int fsid)
> >  {
> >         int xinobits = ovl_xino_bits(inode->i_sb);
> > +       bool overflow = ino >> (64 - xinobits);
> >
> >         /*
> >          * When d_ino is consistent with st_ino (samefs or i_ino has enough
> > @@ -571,13 +586,30 @@ static void ovl_map_ino(struct inode *inode, unsigned long ino, int fsid)
> >          * so inode number exposed via /proc/locks and a like will be
> >          * consistent with d_ino and st_ino values. An i_ino value inconsistent
> >          * with d_ino also causes nfsd readdirplus to fail.
> > +        *
> > +        * When real inode bits overflow into xino bits, we leave i_ino value as
> > +        * the real inode to be consitent with d_ino.  For directory inodes on
> > +        * non-samefs with xino disabled or xino overflow, we reserve a unique
> > +        * 32bit generation number, to be used for resolving st_ino collisions
> > +        * in ovl_map_dev_ino(). With xino disabled, this 32bit number is also
> > +        * used as i_ino.
> >          */
> > -       if (ovl_same_dev(inode->i_sb)) {
> > -               inode->i_ino = ino;
> > -               if (xinobits && fsid && !(ino >> (64 - xinobits)))
> > -                       inode->i_ino |= (unsigned long)fsid << (64 - xinobits);
> > -       } else {
> > -               inode->i_ino = get_next_ino();
> > +       inode->i_ino = ino;
> > +       if (ovl_same_dev(inode->i_sb) && !overflow) {
> > +               inode->i_ino |= (unsigned long)fsid << (64 - xinobits);
>
> While this makes sense on 64bit arch, it's going to overflow on 32bit
> (due to i_ino being "unsigned long").

It's not clear here, but on 32bit, xinobits is 0:

                ofs->xino_mode = BITS_PER_LONG - 32;

To the expression doesn't change i_ino.
Correct?
Want me to clarify that by comment or by code?

>
> > +       } else if (S_ISDIR(inode->i_mode)) {
> > +               /*
> > +                * Reuse unique 32bit ino from recycled ovl_inode object.
> > +                * get_next_ino() wraps around at 32bit, but may be extended
> > +                * to 64bit in the future, so be prepared.
> > +                */
> > +               if (!inode->i_generation) {
> > +                       inode->i_generation = (u32)get_next_ino();
> > +                       if (unlikely(!inode->i_generation))
> > +                               inode->i_generation = (u32)get_next_ino();
> > +               }
>
> And this is really messy.   How about an atomic64_t counter in ovl_fs
> instead?  Do we really care about the performance of inode allocation
> for the overflow case?

No, that sounds right.

>
> > +               if (!ovl_same_dev(inode->i_sb))
> > +                       inode->i_ino = inode->i_generation;
> >         }
> >  }
> >
> > diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> > index 140510d24d9f..c0b15fd2b395 100644
> > --- a/fs/overlayfs/overlayfs.h
> > +++ b/fs/overlayfs/overlayfs.h
> > @@ -311,6 +311,13 @@ static inline unsigned int ovl_xino_bits(struct super_block *sb)
> >         return ofs->xino_bits;
> >  }
> >
> > +static inline struct ovl_sb *ovl_dentry_fs(struct dentry *dentry, int fsid)
> > +{
> > +       /* fsid bit 1 is reserved for non persistent ino range */
> > +       WARN_ON_ONCE(fsid & 1);
> > +       return &OVL_FS(dentry->d_sb)->fs[fsid >> 1];
>
> Again, magic shifts.  Would be so much nicer to leave the fsid
> definition as the index into the ->fs array and deal with this when
> creating the st_ino/i_ino values.

OK.

>
> > +}
> > +
> >  /* All layers on same fs? */
> >  static inline bool ovl_same_fs(struct super_block *sb)
> >  {
> > diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> > index d072f982d3de..d636a23df541 100644
> > --- a/fs/overlayfs/super.c
> > +++ b/fs/overlayfs/super.c
> > @@ -1272,8 +1272,8 @@ static bool ovl_lower_uuid_ok(struct ovl_fs *ofs, const uuid_t *uuid)
> >         return true;
> >  }
> >
> > -/* Get a unique fsid for the layer */
> > -static int ovl_get_fsid(struct ovl_fs *ofs, const struct path *path)
> > +/* Get a unique fs for the layer */
> > +static int ovl_get_fs(struct ovl_fs *ofs, const struct path *path)
> >  {
> >         struct super_block *sb = path->mnt->mnt_sb;
> >         unsigned int i;
> > @@ -1353,9 +1353,9 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
> >         for (i = 0; i < numlower; i++) {
> >                 struct vfsmount *mnt;
> >                 struct inode *trap;
> > -               int fsid;
> > +               int n;
> >
> > -               err = fsid = ovl_get_fsid(ofs, &stack[i]);
> > +               err = n = ovl_get_fs(ofs, &stack[i]);
> >                 if (err < 0)
> >                         goto out;
> >
> > @@ -1387,9 +1387,10 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
> >                 ofs->layers[ofs->numlower].trap = trap;
> >                 ofs->layers[ofs->numlower].mnt = mnt;
> >                 ofs->layers[ofs->numlower].idx = ofs->numlower;
> > -               ofs->layers[ofs->numlower].fsid = fsid;
> > -               ofs->layers[ofs->numlower].fs = &ofs->fs[fsid];
> > -               ofs->fs[fsid].is_lower = true;
> > +               /* fsid bit 1 is reserved for non persistent ino range */
> > +               ofs->layers[ofs->numlower].fsid = n << 1;
> > +               ofs->layers[ofs->numlower].fs = &ofs->fs[n];
> > +               ofs->fs[n].is_lower = true;
> >         }
> >
> >         /*
> > @@ -1398,7 +1399,8 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
> >          * free high bits in underlying fs to hold the unique fsid.
> >          * If overlayfs does encounter underlying inodes using the high xino
> >          * bits reserved for fsid, it emits a warning and uses the original
> > -        * inode number.
> > +        * inode number or a non persistent inode number allocated from a
> > +        * dedicated range.
> >          */
> >         if (ofs->numfs == 1 || (ofs->numfs == 2 && !ofs->upper_mnt)) {
> >                 if (ofs->config.xino == OVL_XINO_ON)
> > @@ -1408,11 +1410,13 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
> >         } else if (ofs->config.xino == OVL_XINO_ON && !ofs->xino_bits) {
> >                 /*
> >                  * This is a roundup of number of bits needed for encoding
> > -                * fsid, where fsid 0 is reserved for upper fs even with
> > -                * lower only overlay.
> > +                * fsid, where fsid 0 is reserved for upper fs (even with
> > +                * lower only overlay) and fsid bit 1 is reserved for the non
> > +                * persistent inode number range that is used for resolving
> > +                * xino lower bits overflow.
> >                  */
> > -               BUILD_BUG_ON(ilog2(OVL_MAX_STACK) > 31);
> > -               ofs->xino_bits = ilog2(ofs->numfs - 1) + 1;
> > +               BUILD_BUG_ON(ilog2(OVL_MAX_STACK) > 30);
> > +               ofs->xino_bits = ilog2(ofs->numfs - 1) + 2;
>
>
> Just realized, this patch is obsolete (xino_bits ->xino_mode).
>
> Anyway, I think the comments are valid for the updated patch as well.
>

Yeh, it's mostly the same. Branch ovl-ino is already rebased.
If you have no other comments, I'll prepare v2 and test it with 5.6-rc2.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/7] ovl: avoid possible inode number collisions with xino=on
  2020-02-19 15:28     ` Amir Goldstein
@ 2020-02-19 15:36       ` Miklos Szeredi
  2020-02-19 15:59         ` Amir Goldstein
  2020-02-21  1:10         ` Amir Goldstein
  0 siblings, 2 replies; 14+ messages in thread
From: Miklos Szeredi @ 2020-02-19 15:36 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs

On Wed, Feb 19, 2020 at 4:28 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Wed, Feb 19, 2020 at 4:25 PM Miklos Szeredi <miklos@szeredi.hu> wrote:

> > While this makes sense on 64bit arch, it's going to overflow on 32bit
> > (due to i_ino being "unsigned long").
>
> It's not clear here, but on 32bit, xinobits is 0:
>
>                 ofs->xino_mode = BITS_PER_LONG - 32;
>
> To the expression doesn't change i_ino.
> Correct?
> Want me to clarify that by comment or by code?

Ah, missed that.  I think no need to clarify further.

> Yeh, it's mostly the same. Branch ovl-ino is already rebased.
> If you have no other comments, I'll prepare v2 and test it with 5.6-rc2.

Thanks.   I've already applied the patches leading up to this and just
pushed to #overlayfs-next.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/7] ovl: avoid possible inode number collisions with xino=on
  2020-02-19 15:36       ` Miklos Szeredi
@ 2020-02-19 15:59         ` Amir Goldstein
  2020-02-19 19:45           ` Miklos Szeredi
  2020-02-21  1:10         ` Amir Goldstein
  1 sibling, 1 reply; 14+ messages in thread
From: Amir Goldstein @ 2020-02-19 15:59 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: overlayfs

> > Yeh, it's mostly the same. Branch ovl-ino is already rebased.
> > If you have no other comments, I'll prepare v2 and test it with 5.6-rc2.
>
> Thanks.   I've already applied the patches leading up to this and just
> pushed to #overlayfs-next.
>

OK, I'll rebase the rest on top of that.
While you are here, what do you think about:
  ovl: enable xino automatically in more cases

Do you agree with that minor change of behavior?

BTW, I see that overlayfs-next allows all remote fs as upper,
without extra restrictions.
I guess you are not too worried about implications?
Or intend to fix that up before the merge window?

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/7] ovl: avoid possible inode number collisions with xino=on
  2020-02-19 15:59         ` Amir Goldstein
@ 2020-02-19 19:45           ` Miklos Szeredi
  0 siblings, 0 replies; 14+ messages in thread
From: Miklos Szeredi @ 2020-02-19 19:45 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs

On Wed, Feb 19, 2020 at 4:59 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> > > Yeh, it's mostly the same. Branch ovl-ino is already rebased.
> > > If you have no other comments, I'll prepare v2 and test it with 5.6-rc2.
> >
> > Thanks.   I've already applied the patches leading up to this and just
> > pushed to #overlayfs-next.
> >
>
> OK, I'll rebase the rest on top of that.
> While you are here, what do you think about:
>   ovl: enable xino automatically in more cases
>
> Do you agree with that minor change of behavior?

I haven't thought about that yet.

>
> BTW, I see that overlayfs-next allows all remote fs as upper,
> without extra restrictions.
> I guess you are not too worried about implications?
> Or intend to fix that up before the merge window?

No, I'm not too worried, but if you send a patch, I'm not against
restricting it either.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/7] ovl: avoid possible inode number collisions with xino=on
  2020-02-19 15:36       ` Miklos Szeredi
  2020-02-19 15:59         ` Amir Goldstein
@ 2020-02-21  1:10         ` Amir Goldstein
  1 sibling, 0 replies; 14+ messages in thread
From: Amir Goldstein @ 2020-02-21  1:10 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: overlayfs

On Wed, Feb 19, 2020 at 5:36 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Wed, Feb 19, 2020 at 4:28 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Wed, Feb 19, 2020 at 4:25 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> > > While this makes sense on 64bit arch, it's going to overflow on 32bit
> > > (due to i_ino being "unsigned long").
> >
> > It's not clear here, but on 32bit, xinobits is 0:
> >
> >                 ofs->xino_mode = BITS_PER_LONG - 32;
> >
> > To the expression doesn't change i_ino.
> > Correct?
> > Want me to clarify that by comment or by code?
>
> Ah, missed that.  I think no need to clarify further.
>

Mmm.. only it doesn't seem to be true.
Seems like with xino=on xino_mode won't be 0 on 32bit.
I think we do want to force disable xino on 32bit and send fix to stable -
if only to keep the code in ovl_map_ino() simpler.

What's more, I think there is a bug with xino_mode -
it is not initialized to -1 on the default xino=off mode,
so ovl_same_fs() could be wrong.

I will try to look at this tomorrow.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2020-02-21  1:11 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-01 17:58 [PATCH 0/7] Sort out overlay ino numbers Amir Goldstein
2020-01-01 17:58 ` [PATCH 1/7] ovl: fix value of i_ino for lower hardlink corner case Amir Goldstein
2020-01-01 17:58 ` [PATCH 2/7] ovl: fix out of date comment and unreachable code Amir Goldstein
2020-01-01 17:58 ` [PATCH 3/7] ovl: factor out helper ovl_get_root() Amir Goldstein
2020-01-01 17:58 ` [PATCH 4/7] ovl: simplify i_ino initialization Amir Goldstein
2020-01-01 17:58 ` [PATCH 5/7] ovl: avoid possible inode number collisions with xino=on Amir Goldstein
2020-02-19 14:25   ` Miklos Szeredi
2020-02-19 15:28     ` Amir Goldstein
2020-02-19 15:36       ` Miklos Szeredi
2020-02-19 15:59         ` Amir Goldstein
2020-02-19 19:45           ` Miklos Szeredi
2020-02-21  1:10         ` Amir Goldstein
2020-01-01 17:58 ` [PATCH 6/7] ovl: enable xino automatically in more cases Amir Goldstein
2020-01-01 17:58 ` [PATCH 7/7] ovl: document xino expected behavior Amir Goldstein

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).