All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC][PATCH 00/13] overlayfs stable inodes
@ 2017-04-16 23:59 Amir Goldstein
  2017-04-16 23:59 ` [RFC][PATCH 01/13] ovl: check if all layers are on the same fs Amir Goldstein
                   ` (13 more replies)
  0 siblings, 14 replies; 43+ messages in thread
From: Amir Goldstein @ 2017-04-16 23:59 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, linux-fsdevel

Overlayfs inodes are considered unstable in several aspects,
because on a copy up event:
1. st_ino can change
2. st_dev can change
3. hardlinks are broken
4. NFS handle would become stale
5. content of read-only file descriptor would become stale

This patch set 'stabilizes' overlayfs inodes w.r.t. st_ino/st_dev
and takes some big steps in the direction of stabilizing hardlinks
and NFS handles.

The full work is available at [1] and includes also an untested
WIP for NFS export support.

This work relies heavily on inode and dentry cache - so long as
the overlayfs dcache is fully populated, hardlinks should never be
broken and NFS handles should never become stale on copy up.

The missing piece of the puzzle is a persistent mapping from
lower entries to upper entries, so that overlay entries could be
reconstructed from their stable (lower) unique identifier.

But the work has merit even without the missing piece, because it
stabilizes st_ino/st_dev and prevents breaking hardlinks in many
use cases.

[1] https://github.com/amir73il/linux/commits/ovl-nfs-export

Amir Goldstein (13):
  ovl: check if all layers are on the same fs
  ovl: redirect dir by file handle on copy up
  ovl: lookup redirect by file handle
  ovl: store file handle of stable inode
  ovl: lookup stable inode by file handle
  ovl: move inode helpers to inode.c
  ovl: create helpers for initializing hashed inode
  ovl: allow hashing non upper inodes
  ovl: inherit overlay inode ino/generation from real inode
  ovl: hash overlay inodes by stable inode
  ovl: fix du --one-file-system on overlay mount
  ovl: constant ino across copy up
  ovl: try to hardlink upper on copy up of lower hardlinks

 fs/overlayfs/Kconfig     |  16 ++++
 fs/overlayfs/copy_up.c   | 130 ++++++++++++++++++++++++++-
 fs/overlayfs/dir.c       |   2 +-
 fs/overlayfs/inode.c     | 106 ++++++++++++++++++++--
 fs/overlayfs/namei.c     | 226 ++++++++++++++++++++++++++++++++++++++++++-----
 fs/overlayfs/overlayfs.h |  40 +++++++--
 fs/overlayfs/ovl_entry.h |   4 +-
 fs/overlayfs/readdir.c   |  85 ++++++++++++++++--
 fs/overlayfs/super.c     |  37 +++++++-
 fs/overlayfs/util.c      |  51 +++++++----
 10 files changed, 633 insertions(+), 64 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [RFC][PATCH 01/13] ovl: check if all layers are on the same fs
  2017-04-16 23:59 [RFC][PATCH 00/13] overlayfs stable inodes Amir Goldstein
@ 2017-04-16 23:59 ` Amir Goldstein
  2017-04-16 23:59 ` [RFC][PATCH 02/13] ovl: redirect dir by file handle on copy up Amir Goldstein
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: Amir Goldstein @ 2017-04-16 23:59 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, linux-fsdevel

Some features can only work when all layers are on the same fs.
Check that during mount time, so features can test it later.

Add helper ovl_same_sb() to return the underlying super block
in case all layers are on the same fs.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/overlayfs.h | 1 +
 fs/overlayfs/ovl_entry.h | 3 ++-
 fs/overlayfs/super.c     | 9 +++++++++
 fs/overlayfs/util.c      | 7 +++++++
 4 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 741dc0b..c851158 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -151,6 +151,7 @@ int ovl_want_write(struct dentry *dentry);
 void ovl_drop_write(struct dentry *dentry);
 struct dentry *ovl_workdir(struct dentry *dentry);
 const struct cred *ovl_override_creds(struct super_block *sb);
+struct super_block *ovl_same_sb(struct super_block *sb);
 struct ovl_entry *ovl_alloc_entry(unsigned int numlower);
 bool ovl_dentry_remote(struct dentry *dentry);
 bool ovl_dentry_weird(struct dentry *dentry);
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 59614fa..e294f22 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -27,7 +27,8 @@ struct ovl_fs {
 	struct ovl_config config;
 	/* creds of process who forced instantiation of super block */
 	const struct cred *creator_cred;
-	bool tmpfile;
+	bool tmpfile;			/* upper supports O_TMPFILE */
+	struct super_block *same_sb;	/* all layers on same sb */
 	wait_queue_head_t copyup_wq;
 };
 
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index c072a0c..79500d9 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -842,6 +842,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 		goto out_put_lowerpath;
 	}
 
+	ufs->same_sb = NULL;
 	if (ufs->config.upperdir) {
 		ufs->upper_mnt = clone_private_mount(&upperpath);
 		err = PTR_ERR(ufs->upper_mnt);
@@ -853,6 +854,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 		ufs->upper_mnt->mnt_flags &= ~(MNT_NOATIME | MNT_NODIRATIME | MNT_RELATIME);
 
 		sb->s_time_gran = ufs->upper_mnt->mnt_sb->s_time_gran;
+		ufs->same_sb = ufs->upper_mnt->mnt_sb;
 
 		ufs->workdir = ovl_workdir_create(ufs->upper_mnt, workpath.dentry);
 		err = PTR_ERR(ufs->workdir);
@@ -898,6 +900,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	ufs->lower_mnt = kcalloc(numlower, sizeof(struct vfsmount *), GFP_KERNEL);
 	if (ufs->lower_mnt == NULL)
 		goto out_put_workdir;
+
 	for (i = 0; i < numlower; i++) {
 		struct vfsmount *mnt = clone_private_mount(&stack[i]);
 
@@ -914,6 +917,12 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 
 		ufs->lower_mnt[ufs->numlower] = mnt;
 		ufs->numlower++;
+
+		/* Check if all layers on same sb */
+		if (ufs->same_sb && ufs->same_sb != mnt->mnt_sb)
+			ufs->same_sb = NULL;
+		else if (i == 0 && !ufs->same_sb)
+			ufs->same_sb = mnt->mnt_sb;
 	}
 
 	/* If the upper fs is nonexistent, we mark overlayfs r/o too */
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 1953986..dcebef0 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -41,6 +41,13 @@ const struct cred *ovl_override_creds(struct super_block *sb)
 	return override_creds(ofs->creator_cred);
 }
 
+struct super_block *ovl_same_sb(struct super_block *sb)
+{
+	struct ovl_fs *ofs = sb->s_fs_info;
+
+	return ofs->same_sb;
+}
+
 struct ovl_entry *ovl_alloc_entry(unsigned int numlower)
 {
 	size_t size = offsetof(struct ovl_entry, lowerstack[numlower]);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC][PATCH 02/13] ovl: redirect dir by file handle on copy up
  2017-04-16 23:59 [RFC][PATCH 00/13] overlayfs stable inodes Amir Goldstein
  2017-04-16 23:59 ` [RFC][PATCH 01/13] ovl: check if all layers are on the same fs Amir Goldstein
@ 2017-04-16 23:59 ` Amir Goldstein
  2017-04-17 13:33   ` Rock Lee
                     ` (2 more replies)
  2017-04-16 23:59 ` [RFC][PATCH 03/13] ovl: lookup redirect by file handle Amir Goldstein
                   ` (11 subsequent siblings)
  13 siblings, 3 replies; 43+ messages in thread
From: Amir Goldstein @ 2017-04-16 23:59 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, linux-fsdevel

When mounted with mount option redirect_dir=fh,
every copy up of lower directory stores the lower
dir file handle in overlay.fh xattr of upper dir.

This method has some advantages over absolute path redirect:
- it is more compact in stored xattr size
- it is not limited by lengths of full paths
- lookup redirect is more efficient for very nested directories

It also has some disadvantages over absolute path redirect:
- it requires that all lower layers are on the same file system,
  which support exportfs ops
- file handles will become stale if overlay lower directories
  were to be copied to another location

Therefore, redirect by file handle is considered an optimization
and file handle is stored on copy up in addition to setting
redirect by path on rename.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/Kconfig     | 16 +++++++++
 fs/overlayfs/copy_up.c   | 87 ++++++++++++++++++++++++++++++++++++++++++++++++
 fs/overlayfs/overlayfs.h | 15 +++++++++
 fs/overlayfs/ovl_entry.h |  1 +
 fs/overlayfs/super.c     | 28 ++++++++++++++--
 fs/overlayfs/util.c      | 14 ++++++++
 6 files changed, 159 insertions(+), 2 deletions(-)

diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
index 0daac51..351b61a 100644
--- a/fs/overlayfs/Kconfig
+++ b/fs/overlayfs/Kconfig
@@ -22,3 +22,19 @@ config OVERLAY_FS_REDIRECT_DIR
 	  Note, that redirects are not backward compatible.  That is, mounting
 	  an overlay which has redirects on a kernel that doesn't support this
 	  feature will have unexpected results.
+
+config OVERLAY_FS_REDIRECT_FH
+	bool "Overlayfs: turn on redirect by file handle by default"
+	depends on OVERLAY_FS_REDIRECT_DIR
+	help
+	  If this config option is enabled then overlay filesystems will use
+	  redirect by file handle for merged directories by default.  It is
+	  also possible to turn on redirect by file handle globally with the
+	  "redirect_fh=on" module option or on a filesystem instance basis
+	  with the "redirect_dir=fh" mount option.
+
+	  Redirect by file handle provides faster lookup compared to redirect
+	  by absolute path, but it only works when all layers are on the same
+	  underlying file system that supports NFS export ops.  Redirect by
+	  file handle breaks if layers are taken offline and copied to another
+	  location.  In that case, lookup falls back to redirect by path.
diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 906ea6c..428dc26 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -20,6 +20,7 @@
 #include <linux/namei.h>
 #include <linux/fdtable.h>
 #include <linux/ratelimit.h>
+#include <linux/exportfs.h>
 #include "overlayfs.h"
 #include "ovl_entry.h"
 
@@ -232,6 +233,80 @@ int ovl_set_attr(struct dentry *upperdentry, struct kstat *stat)
 	return err;
 }
 
+static struct ovl_fh *ovl_get_redirect_fh(struct dentry *lower)
+{
+	const struct export_operations *nop = lower->d_sb->s_export_op;
+	struct ovl_fh *fh;
+	int fh_type, fh_len, dwords;
+	void *buf, *ret;
+	int buflen = MAX_HANDLE_SZ;
+
+	/* Do not encode file handle if we cannot decode it later */
+	if (!nop || !nop->fh_to_dentry) {
+		pr_info("overlay: redirect by file handle not supported "
+			"by lower - turning off redirect\n");
+		return ERR_PTR(-EOPNOTSUPP);
+	}
+
+	buf = kmalloc(buflen, GFP_TEMPORARY);
+	if (!buf)
+		return ERR_PTR(-ENOMEM);
+
+	fh = buf;
+	dwords = (buflen - offsetof(struct ovl_fh, fid)) >> 2;
+	fh_type = exportfs_encode_fh(lower,
+				     (struct fid *)fh->fid,
+				     &dwords, 0);
+	fh_len = (dwords << 2) + offsetof(struct ovl_fh, fid);
+
+	ret = ERR_PTR(-EOVERFLOW);
+	if (fh_len > buflen || fh_type <= 0 || fh_type == FILEID_INVALID)
+		goto out;
+
+	fh->version = OVL_FH_VERSION;
+	fh->magic = OVL_FH_MAGIC;
+	fh->type = fh_type;
+	fh->len = fh_len;
+
+	ret = kmalloc(fh_len, GFP_KERNEL);
+	if (!ret) {
+		ret = ERR_PTR(-ENOMEM);
+		goto out;
+	}
+
+	memcpy(ret, buf, fh_len);
+out:
+	kfree(buf);
+	return ret;
+}
+
+static int ovl_set_redirect_fh(struct dentry *dentry, struct dentry *upper)
+{
+	int err;
+	const struct ovl_fh *fh;
+
+	fh = ovl_get_redirect_fh(ovl_dentry_lower(dentry));
+	err = PTR_ERR(fh);
+	if (IS_ERR(fh))
+		goto out_err;
+
+	err = ovl_do_setxattr(upper, OVL_XATTR_FH, fh, fh->len, 0);
+	if (err)
+		goto out_free;
+
+	return 0;
+
+out_free:
+	kfree(fh);
+out_err:
+	if (err == -EOPNOTSUPP) {
+		ovl_clear_redirect_fh(dentry->d_sb);
+		return 0;
+	}
+	pr_warn_ratelimited("overlay: failed to set redirect fh (%i)\n", err);
+	return err;
+}
+
 static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 			      struct dentry *dentry, struct path *lowerpath,
 			      struct kstat *stat, const char *link,
@@ -316,6 +391,18 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 	if (err)
 		goto out_cleanup;
 
+	if (S_ISDIR(stat->mode) &&
+	    ovl_redirect_dir(dentry->d_sb) &&
+	    ovl_redirect_fh(dentry->d_sb)) {
+		/*
+		 * Store file handle of lower dir in upper dir xattr to
+		 * create a chain of file handles for merged dir stack
+		 */
+		err = ovl_set_redirect_fh(dentry, temp);
+		if (err)
+			goto out_cleanup;
+	}
+
 	if (tmpfile)
 		err = ovl_do_link(temp, udir, upper, true);
 	else
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index c851158..49be199 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -20,6 +20,7 @@ enum ovl_path_type {
 #define OVL_XATTR_PREFIX XATTR_TRUSTED_PREFIX "overlay."
 #define OVL_XATTR_OPAQUE OVL_XATTR_PREFIX "opaque"
 #define OVL_XATTR_REDIRECT OVL_XATTR_PREFIX "redirect"
+#define OVL_XATTR_FH OVL_XATTR_PREFIX "fh"
 
 #define OVL_ISUPPER_MASK 1UL
 
@@ -146,6 +147,18 @@ static inline struct inode *ovl_inode_real(struct inode *inode, bool *is_upper)
 	return (struct inode *) (x & ~OVL_ISUPPER_MASK);
 }
 
+/* redirect data format for redirect by file handle */
+struct ovl_fh {
+	unsigned char version;	/* 0 */
+	unsigned char magic;	/* 0xfb */
+	unsigned char len;	/* size of this header + size of fid */
+	unsigned char type;	/* fid_type of fid */
+	unsigned char fid[0];	/* file identifier */
+} __packed;
+
+#define OVL_FH_VERSION	0
+#define OVL_FH_MAGIC	0xfb
+
 /* util.c */
 int ovl_want_write(struct dentry *dentry);
 void ovl_drop_write(struct dentry *dentry);
@@ -171,6 +184,8 @@ bool ovl_redirect_dir(struct super_block *sb);
 void ovl_clear_redirect_dir(struct super_block *sb);
 const char *ovl_dentry_get_redirect(struct dentry *dentry);
 void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
+bool ovl_redirect_fh(struct super_block *sb);
+void ovl_clear_redirect_fh(struct super_block *sb);
 void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry);
 void ovl_inode_init(struct inode *inode, struct inode *realinode,
 		    bool is_upper);
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index e294f22..cf22992 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -14,6 +14,7 @@ struct ovl_config {
 	char *workdir;
 	bool default_permissions;
 	bool redirect_dir;
+	bool redirect_fh;
 };
 
 /* private information held for overlayfs's superblock */
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 79500d9..3036b2d 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -17,6 +17,7 @@
 #include <linux/statfs.h>
 #include <linux/seq_file.h>
 #include <linux/posix_acl_xattr.h>
+#include <linux/exportfs.h>
 #include "overlayfs.h"
 #include "ovl_entry.h"
 
@@ -34,6 +35,12 @@ module_param_named(redirect_dir, ovl_redirect_dir_def, bool, 0644);
 MODULE_PARM_DESC(ovl_redirect_dir_def,
 		 "Default to on or off for the redirect_dir feature");
 
+static bool ovl_redirect_fh_def =
+	    IS_ENABLED(CONFIG_OVERLAY_FS_REDIRECT_FH);
+module_param_named(redirect_fh, ovl_redirect_fh_def, bool, 0644);
+MODULE_PARM_DESC(ovl_redirect_fh_def,
+		 "Default to on or off for redirect by file handle");
+
 static void ovl_dentry_release(struct dentry *dentry)
 {
 	struct ovl_entry *oe = dentry->d_fsdata;
@@ -246,9 +253,11 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
 	}
 	if (ufs->config.default_permissions)
 		seq_puts(m, ",default_permissions");
-	if (ufs->config.redirect_dir != ovl_redirect_dir_def)
+	if (ufs->config.redirect_dir != ovl_redirect_dir_def ||
+	    ufs->config.redirect_fh != ovl_redirect_fh_def)
 		seq_printf(m, ",redirect_dir=%s",
-			   ufs->config.redirect_dir ? "on" : "off");
+			   !ufs->config.redirect_dir ? "off" :
+			   ufs->config.redirect_fh ? "fh" : "on");
 	return 0;
 }
 
@@ -278,6 +287,7 @@ enum {
 	OPT_DEFAULT_PERMISSIONS,
 	OPT_REDIRECT_DIR_ON,
 	OPT_REDIRECT_DIR_OFF,
+	OPT_REDIRECT_DIR_FH,
 	OPT_ERR,
 };
 
@@ -288,6 +298,7 @@ static const match_table_t ovl_tokens = {
 	{OPT_DEFAULT_PERMISSIONS,	"default_permissions"},
 	{OPT_REDIRECT_DIR_ON,		"redirect_dir=on"},
 	{OPT_REDIRECT_DIR_OFF,		"redirect_dir=off"},
+	{OPT_REDIRECT_DIR_FH,		"redirect_dir=fh"},
 	{OPT_ERR,			NULL}
 };
 
@@ -354,10 +365,17 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
 
 		case OPT_REDIRECT_DIR_ON:
 			config->redirect_dir = true;
+			config->redirect_fh = false;
 			break;
 
 		case OPT_REDIRECT_DIR_OFF:
 			config->redirect_dir = false;
+			config->redirect_fh = false;
+			break;
+
+		case OPT_REDIRECT_DIR_FH:
+			config->redirect_dir = true;
+			config->redirect_fh = true;
 			break;
 
 		default:
@@ -754,6 +772,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 
 	init_waitqueue_head(&ufs->copyup_wq);
 	ufs->config.redirect_dir = ovl_redirect_dir_def;
+	ufs->config.redirect_fh = ovl_redirect_fh_def;
 	err = ovl_parse_opt((char *) data, &ufs->config);
 	if (err)
 		goto out_free_config;
@@ -934,6 +953,11 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	else
 		sb->s_d_op = &ovl_dentry_operations;
 
+	/* Redirect by fhandle only if all layers on same sb with export ops */
+	if (!ufs->same_sb || !ufs->same_sb->s_export_op ||
+	    !ufs->same_sb->s_export_op->fh_to_dentry)
+		ufs->config.redirect_fh = false;
+
 	ufs->creator_cred = cred = prepare_creds();
 	if (!cred)
 		goto out_put_lower_mnt;
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index dcebef0..17530c5 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -215,6 +215,20 @@ void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect)
 	oe->redirect = redirect;
 }
 
+bool ovl_redirect_fh(struct super_block *sb)
+{
+	struct ovl_fs *ofs = sb->s_fs_info;
+
+	return ofs->config.redirect_fh;
+}
+
+void ovl_clear_redirect_fh(struct super_block *sb)
+{
+	struct ovl_fs *ofs = sb->s_fs_info;
+
+	ofs->config.redirect_fh = false;
+}
+
 void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry)
 {
 	struct ovl_entry *oe = dentry->d_fsdata;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC][PATCH 03/13] ovl: lookup redirect by file handle
  2017-04-16 23:59 [RFC][PATCH 00/13] overlayfs stable inodes Amir Goldstein
  2017-04-16 23:59 ` [RFC][PATCH 01/13] ovl: check if all layers are on the same fs Amir Goldstein
  2017-04-16 23:59 ` [RFC][PATCH 02/13] ovl: redirect dir by file handle on copy up Amir Goldstein
@ 2017-04-16 23:59 ` Amir Goldstein
  2017-04-18 13:05   ` Vivek Goyal
  2017-04-16 23:59 ` [RFC][PATCH 04/13] ovl: store file handle of stable inode Amir Goldstein
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 43+ messages in thread
From: Amir Goldstein @ 2017-04-16 23:59 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, linux-fsdevel

When redirect_fh feature is enabled and when underlying
layers are all on the same fs, construct the dentry stack
of a merged dir by following a file handle chain of all
merged dir layers.

When overlay.fh xattr is found in upper inode, instead
of lookup of the dentry in next lower layer by name,
try to get it by calling exportfs_decode_fh().

On failure to follow file handle from upper to lower, fall back
to regular name lookup with or without path redirect.

On failure to decode file handle from middle layer, after
already following file handle from upper, there is no fall back
to lookup by name.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/namei.c     | 167 +++++++++++++++++++++++++++++++++++++++++++----
 fs/overlayfs/overlayfs.h |   1 +
 fs/overlayfs/util.c      |  14 ++++
 3 files changed, 168 insertions(+), 14 deletions(-)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index b8b0778..fcb7c15 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -9,9 +9,11 @@
 
 #include <linux/fs.h>
 #include <linux/cred.h>
+#include <linux/mount.h>
 #include <linux/namei.h>
 #include <linux/xattr.h>
 #include <linux/ratelimit.h>
+#include <linux/exportfs.h>
 #include "overlayfs.h"
 #include "ovl_entry.h"
 
@@ -21,7 +23,10 @@ struct ovl_lookup_data {
 	bool opaque;
 	bool stop;
 	bool last;
-	char *redirect;
+	bool by_path;		/* redirect by path */
+	bool by_fh;		/* redirect by file handle */
+	char *redirect;		/* path to follow */
+	struct ovl_fh *fh;	/* file handle to follow */
 };
 
 static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
@@ -81,6 +86,41 @@ static int ovl_check_redirect(struct dentry *dentry, struct ovl_lookup_data *d,
 	goto err_free;
 }
 
+static int ovl_check_redirect_fh(struct dentry *dentry, struct ovl_lookup_data *d)
+{
+	int res;
+	void *buf = NULL;
+
+	res = vfs_getxattr(dentry, OVL_XATTR_FH, NULL, 0);
+	if (res < 0) {
+		if (res == -ENODATA || res == -EOPNOTSUPP)
+			return 0;
+		goto fail;
+	}
+	buf = kzalloc(res, GFP_TEMPORARY);
+	if (!buf)
+		return -ENOMEM;
+
+	if (res == 0)
+		goto fail;
+
+	res = vfs_getxattr(dentry, OVL_XATTR_FH, buf, res);
+	if (res < 0 || !ovl_redirect_fh_ok(buf, res))
+		goto fail;
+
+	kfree(d->fh);
+	d->fh = buf;
+
+	return 0;
+
+err_free:
+	kfree(buf);
+	return 0;
+fail:
+	pr_warn_ratelimited("overlayfs: failed to get file handle (%i)\n", res);
+	goto err_free;
+}
+
 static bool ovl_is_opaquedir(struct dentry *dentry)
 {
 	int res;
@@ -96,22 +136,76 @@ static bool ovl_is_opaquedir(struct dentry *dentry)
 	return false;
 }
 
+/* Check if p1 is connected with a chain of hashed dentries to p2 */
+static bool ovl_is_lookable(struct dentry *p1, struct dentry *p2)
+{
+	struct dentry *p;
+
+	for (p = p2; !IS_ROOT(p); p = p->d_parent) {
+		if (d_unhashed(p))
+			return false;
+		if (p->d_parent == p1)
+			return true;
+	}
+	return false;
+}
+
+/* Check if dentry is reachable from mnt via path lookup */
+static int ovl_dentry_under_mnt(void *ctx, struct dentry *dentry)
+{
+	struct vfsmount *mnt = ctx;
+
+	return ovl_is_lookable(mnt->mnt_root, dentry);
+}
+
+static struct dentry *ovl_lookup_fh(struct vfsmount *mnt,
+				    const struct ovl_fh *fh)
+{
+	int bytes = (fh->len - offsetof(struct ovl_fh, fid));
+
+	/*
+	 * Several layers can be on the same fs and decoded dentry may be in
+	 * either one of those layers. We are looking for a match of dentry
+	 * and mnt to find out to which layer the decoded dentry belongs to.
+	 */
+	return exportfs_decode_fh(mnt, (struct fid *)fh->fid,
+				  bytes >> 2, (int)fh->type,
+				  ovl_dentry_under_mnt, mnt);
+}
+
 static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
 			     const char *name, unsigned int namelen,
 			     size_t prelen, const char *post,
-			     struct dentry **ret)
+			     struct vfsmount *mnt, struct dentry **ret)
 {
 	struct dentry *this;
 	int err;
 
-	this = lookup_one_len_unlocked(name, base, namelen);
+	/*
+	 * Lookup of upper is with null d->fh.
+	 * Lookup of lower is either by_fh with non-null d->fh
+	 * or by_path with null d->fh.
+	 */
+	if (d->fh)
+		this = ovl_lookup_fh(mnt, d->fh);
+	else
+		this = lookup_one_len_unlocked(name, base, namelen);
 	if (IS_ERR(this)) {
 		err = PTR_ERR(this);
 		this = NULL;
 		if (err == -ENOENT || err == -ENAMETOOLONG)
 			goto out;
+		if (d->fh && err == -ESTALE)
+			goto out;
 		goto out_err;
 	}
+
+	/* Lower found by file handle - don't follow that handle again */
+	if (d->fh) {
+		kfree(d->fh);
+		d->fh = NULL;
+	}
+
 	if (!this->d_inode)
 		goto put_and_out;
 
@@ -135,9 +229,18 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
 		d->stop = d->opaque = true;
 		goto out;
 	}
-	err = ovl_check_redirect(this, d, prelen, post);
-	if (err)
-		goto out_err;
+	if (d->last)
+		goto out;
+	if (d->by_fh) {
+		err = ovl_check_redirect_fh(this, d);
+		if (err)
+			goto out_err;
+	}
+	if (d->by_path) {
+		err = ovl_check_redirect(this, d, prelen, post);
+		if (err)
+			goto out_err;
+	}
 out:
 	*ret = this;
 	return 0;
@@ -152,6 +255,12 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
 	return err;
 }
 
+static int ovl_lookup_layer_fh(struct path *path, struct ovl_lookup_data *d,
+			       struct dentry **ret)
+{
+	return ovl_lookup_single(path->dentry, d, "", 0, 0, "", path->mnt, ret);
+}
+
 static int ovl_lookup_layer(struct dentry *base, struct ovl_lookup_data *d,
 			    struct dentry **ret)
 {
@@ -162,7 +271,7 @@ static int ovl_lookup_layer(struct dentry *base, struct ovl_lookup_data *d,
 
 	if (d->name.name[0] != '/')
 		return ovl_lookup_single(base, d, d->name.name, d->name.len,
-					 0, "", ret);
+					 0, "", NULL, ret);
 
 	while (!IS_ERR_OR_NULL(base) && d_can_lookup(base)) {
 		const char *s = d->name.name + d->name.len - rem;
@@ -175,7 +284,7 @@ static int ovl_lookup_layer(struct dentry *base, struct ovl_lookup_data *d,
 			return -EIO;
 
 		err = ovl_lookup_single(base, d, s, thislen,
-					d->name.len - rem, next, &base);
+					d->name.len - rem, next, NULL, &base);
 		dput(dentry);
 		if (err)
 			return err;
@@ -220,6 +329,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	const struct cred *old_cred;
 	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
 	struct ovl_entry *poe = dentry->d_parent->d_fsdata;
+	struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata;
 	struct path *stack = NULL;
 	struct dentry *upperdir, *upperdentry = NULL;
 	unsigned int ctr = 0;
@@ -235,7 +345,10 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		.opaque = false,
 		.stop = false,
 		.last = !poe->numlower,
+		.by_path = true,
+		.by_fh = ofs->config.redirect_fh,
 		.redirect = NULL,
+		.fh = NULL,
 	};
 
 	if (dentry->d_name.len > ofs->namelen)
@@ -259,12 +372,12 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 			if (!upperredirect)
 				goto out_put_upper;
 			if (d.redirect[0] == '/')
-				poe = dentry->d_sb->s_root->d_fsdata;
+				poe = roe;
 		}
 		upperopaque = d.opaque;
 	}
 
-	if (!d.stop && poe->numlower) {
+	if (!d.stop && (poe->numlower || d.fh)) {
 		err = -ENOMEM;
 		stack = kcalloc(ofs->numlower, sizeof(struct path),
 				GFP_TEMPORARY);
@@ -272,6 +385,33 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 			goto out_put_upper;
 	}
 
+	/* Try to lookup lower layers by file handle (at root) */
+	d.by_path = false;
+	for (i = 0; !d.stop && d.fh && i < roe->numlower; i++) {
+		struct path lowerpath = roe->lowerstack[i];
+
+		d.last = i == roe->numlower - 1;
+		err = ovl_lookup_layer_fh(&lowerpath, &d, &this);
+		if (err)
+			goto out_put;
+
+		if (!this)
+			continue;
+
+		stack[ctr].dentry = this;
+		stack[ctr].mnt = lowerpath.mnt;
+		ctr++;
+	}
+
+	/* Fallback to lookup lower layers by name (at parent) */
+	if (ctr) {
+		d.stop = true;
+	} else {
+		d.by_path = true;
+		d.by_fh = false;
+		kfree(d.fh);
+		d.fh = NULL;
+	}
 	for (i = 0; !d.stop && i < poe->numlower; i++) {
 		struct path lowerpath = poe->lowerstack[i];
 
@@ -290,10 +430,8 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		if (d.stop)
 			break;
 
-		if (d.redirect &&
-		    d.redirect[0] == '/' &&
-		    poe != dentry->d_sb->s_root->d_fsdata) {
-			poe = dentry->d_sb->s_root->d_fsdata;
+		if (d.redirect && d.redirect[0] == '/' && poe != roe) {
+			poe = roe;
 
 			/* Find the current layer on the root dentry */
 			for (i = 0; i < poe->numlower; i++)
@@ -352,6 +490,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	dput(upperdentry);
 	kfree(upperredirect);
 out:
+	kfree(d.fh);
 	kfree(d.redirect);
 	revert_creds(old_cred);
 	return ERR_PTR(err);
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 49be199..6257c5b 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -186,6 +186,7 @@ const char *ovl_dentry_get_redirect(struct dentry *dentry);
 void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
 bool ovl_redirect_fh(struct super_block *sb);
 void ovl_clear_redirect_fh(struct super_block *sb);
+bool ovl_redirect_fh_ok(const char *redirect, size_t size);
 void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry);
 void ovl_inode_init(struct inode *inode, struct inode *realinode,
 		    bool is_upper);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 17530c5..6538ec5 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -229,6 +229,20 @@ void ovl_clear_redirect_fh(struct super_block *sb)
 	ofs->config.redirect_fh = false;
 }
 
+bool ovl_redirect_fh_ok(const char *redirect, size_t size)
+{
+	struct ovl_fh *fh = (void *)redirect;
+
+	if (size < sizeof(struct ovl_fh) || size < fh->len)
+		return false;
+
+	if (fh->version > OVL_FH_VERSION ||
+	    fh->magic != OVL_FH_MAGIC)
+		return false;
+
+	return true;
+}
+
 void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry)
 {
 	struct ovl_entry *oe = dentry->d_fsdata;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC][PATCH 04/13] ovl: store file handle of stable inode
  2017-04-16 23:59 [RFC][PATCH 00/13] overlayfs stable inodes Amir Goldstein
                   ` (2 preceding siblings ...)
  2017-04-16 23:59 ` [RFC][PATCH 03/13] ovl: lookup redirect by file handle Amir Goldstein
@ 2017-04-16 23:59 ` Amir Goldstein
  2017-04-16 23:59 ` [RFC][PATCH 05/13] ovl: lookup stable inode by file handle Amir Goldstein
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: Amir Goldstein @ 2017-04-16 23:59 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, linux-fsdevel

Sometimes it is interesting to know if an upper file is pure
upper or a copy up target, and if it is a copy up target, it
may be interesting to know which is the copy up origin.

One such case is nfs export, where we need to have persistent
and consistent file handles across copy up.

Store the lower inode file handle in upper inode xattr on copy up
to use it later for these cases.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 428dc26..77a8715 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -391,12 +391,10 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 	if (err)
 		goto out_cleanup;
 
-	if (S_ISDIR(stat->mode) &&
-	    ovl_redirect_dir(dentry->d_sb) &&
-	    ovl_redirect_fh(dentry->d_sb)) {
+	if (ovl_redirect_fh(dentry->d_sb)) {
 		/*
-		 * Store file handle of lower dir in upper dir xattr to
-		 * create a chain of file handles for merged dir stack
+		 * Store file handle of lower inode in upper inode xattr to
+		 * allow lookup of stable (pre copy up) inode
 		 */
 		err = ovl_set_redirect_fh(dentry, temp);
 		if (err)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC][PATCH 05/13] ovl: lookup stable inode by file handle
  2017-04-16 23:59 [RFC][PATCH 00/13] overlayfs stable inodes Amir Goldstein
                   ` (3 preceding siblings ...)
  2017-04-16 23:59 ` [RFC][PATCH 04/13] ovl: store file handle of stable inode Amir Goldstein
@ 2017-04-16 23:59 ` Amir Goldstein
  2017-04-16 23:59 ` [RFC][PATCH 06/13] ovl: move inode helpers to inode.c Amir Goldstein
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: Amir Goldstein @ 2017-04-16 23:59 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, linux-fsdevel

When non directory upper has overlay.fh xattr, lookup that
file handle in lower layers to find the stable inode it refers to.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/namei.c | 30 ++++++++++++++++++++++--------
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index fcb7c15..42b6030 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -204,6 +204,9 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
 	if (d->fh) {
 		kfree(d->fh);
 		d->fh = NULL;
+		/* Follow once by file handle for non-dir */
+		if (!d->is_dir)
+			d->by_fh = false;
 	}
 
 	if (!this->d_inode)
@@ -219,15 +222,21 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
 		goto put_and_out;
 	}
 	if (!d_can_lookup(this)) {
-		d->stop = true;
-		if (d->is_dir)
+		if (d->is_dir) {
+			d->stop = true;
 			goto put_and_out;
-		goto out;
-	}
-	d->is_dir = true;
-	if (!d->last && ovl_is_opaquedir(this)) {
-		d->stop = d->opaque = true;
-		goto out;
+		}
+		/* Lookup stable inode of non-dir by file handle */
+		if (!d->by_fh) {
+			d->stop = true;
+			goto out;
+		}
+	} else {
+		d->is_dir = true;
+		if (!d->last && ovl_is_opaquedir(this)) {
+			d->stop = d->opaque = true;
+			goto out;
+		}
 	}
 	if (d->last)
 		goto out;
@@ -236,6 +245,11 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
 		if (err)
 			goto out_err;
 	}
+	/* No redirect fh for non-dir means pure upper */
+	if (!d->is_dir) {
+		d->stop = !d->fh;
+		goto out;
+	}
 	if (d->by_path) {
 		err = ovl_check_redirect(this, d, prelen, post);
 		if (err)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC][PATCH 06/13] ovl: move inode helpers to inode.c
  2017-04-16 23:59 [RFC][PATCH 00/13] overlayfs stable inodes Amir Goldstein
                   ` (4 preceding siblings ...)
  2017-04-16 23:59 ` [RFC][PATCH 05/13] ovl: lookup stable inode by file handle Amir Goldstein
@ 2017-04-16 23:59 ` Amir Goldstein
  2017-04-16 23:59 ` [RFC][PATCH 07/13] ovl: create helpers for initializing hashed inode Amir Goldstein
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: Amir Goldstein @ 2017-04-16 23:59 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, linux-fsdevel

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c     | 16 ++++++++++++++++
 fs/overlayfs/overlayfs.h |  7 ++++---
 fs/overlayfs/util.c      | 16 ----------------
 3 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 17b8418..370a015 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -381,6 +381,22 @@ struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev)
 	return inode;
 }
 
+void ovl_inode_init(struct inode *inode, struct inode *realinode, bool is_upper)
+{
+	WRITE_ONCE(inode->i_private, (unsigned long) realinode |
+		   (is_upper ? OVL_ISUPPER_MASK : 0));
+}
+
+void ovl_inode_update(struct inode *inode, struct inode *upperinode)
+{
+	WARN_ON(!upperinode);
+	WARN_ON(!inode_unhashed(inode));
+	WRITE_ONCE(inode->i_private,
+		   (unsigned long) upperinode | OVL_ISUPPER_MASK);
+	if (!S_ISDIR(upperinode->i_mode))
+		__insert_inode_hash(inode, (unsigned long) upperinode);
+}
+
 static int ovl_inode_test(struct inode *inode, void *data)
 {
 	return ovl_inode_real(inode, NULL) == data;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 6257c5b..f9d2d77 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -188,9 +188,6 @@ bool ovl_redirect_fh(struct super_block *sb);
 void ovl_clear_redirect_fh(struct super_block *sb);
 bool ovl_redirect_fh_ok(const char *redirect, size_t size);
 void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry);
-void ovl_inode_init(struct inode *inode, struct inode *realinode,
-		    bool is_upper);
-void ovl_inode_update(struct inode *inode, struct inode *upperinode);
 void ovl_dentry_version_inc(struct dentry *dentry);
 u64 ovl_dentry_version_get(struct dentry *dentry);
 bool ovl_is_whiteout(struct dentry *dentry);
@@ -226,7 +223,11 @@ int ovl_update_time(struct inode *inode, struct timespec *ts, int flags);
 bool ovl_is_private_xattr(const char *name);
 
 struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev);
+void ovl_inode_init(struct inode *inode, struct inode *realinode,
+		    bool is_upper);
+void ovl_inode_update(struct inode *inode, struct inode *upperinode);
 struct inode *ovl_get_inode(struct super_block *sb, struct inode *realinode);
+
 static inline void ovl_copyattr(struct inode *from, struct inode *to)
 {
 	to->i_uid = from->i_uid;
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 6538ec5..d202f28 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -257,22 +257,6 @@ void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry)
 	oe->__upperdentry = upperdentry;
 }
 
-void ovl_inode_init(struct inode *inode, struct inode *realinode, bool is_upper)
-{
-	WRITE_ONCE(inode->i_private, (unsigned long) realinode |
-		   (is_upper ? OVL_ISUPPER_MASK : 0));
-}
-
-void ovl_inode_update(struct inode *inode, struct inode *upperinode)
-{
-	WARN_ON(!upperinode);
-	WARN_ON(!inode_unhashed(inode));
-	WRITE_ONCE(inode->i_private,
-		   (unsigned long) upperinode | OVL_ISUPPER_MASK);
-	if (!S_ISDIR(upperinode->i_mode))
-		__insert_inode_hash(inode, (unsigned long) upperinode);
-}
-
 void ovl_dentry_version_inc(struct dentry *dentry)
 {
 	struct ovl_entry *oe = dentry->d_fsdata;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC][PATCH 07/13] ovl: create helpers for initializing hashed inode
  2017-04-16 23:59 [RFC][PATCH 00/13] overlayfs stable inodes Amir Goldstein
                   ` (5 preceding siblings ...)
  2017-04-16 23:59 ` [RFC][PATCH 06/13] ovl: move inode helpers to inode.c Amir Goldstein
@ 2017-04-16 23:59 ` Amir Goldstein
  2017-04-16 23:59 ` [RFC][PATCH 08/13] ovl: allow hashing non upper inodes Amir Goldstein
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: Amir Goldstein @ 2017-04-16 23:59 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, linux-fsdevel

In preparation for hashing all overlay inodes, modify the helper
ovl_inode_init() to hash a new non-dir upper overlay inode.

Use this helper for initializing new upper inode in ovl_instantiate()
instead of ovl_inode_update(), because the implementations for new inode
case and copy up case are going to diverge.

Factor out helpers ovl_inode_data() and ovl_set_inode_data(), that
are going to be used for initializing and hashing overlay inodes.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/dir.c   |  2 +-
 fs/overlayfs/inode.c | 31 ++++++++++++++++++++++++-------
 2 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 6515796..d8b8982 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -175,7 +175,7 @@ static void ovl_instantiate(struct dentry *dentry, struct inode *inode,
 	ovl_dentry_version_inc(dentry->d_parent);
 	ovl_dentry_update(dentry, newdentry);
 	if (!hardlink) {
-		ovl_inode_update(inode, d_inode(newdentry));
+		ovl_inode_init(inode, d_inode(newdentry), true);
 		ovl_copyattr(newdentry->d_inode, inode);
 	} else {
 		WARN_ON(ovl_inode_real(inode, NULL) != d_inode(newdentry));
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 370a015..a6d7c44 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -381,20 +381,37 @@ struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev)
 	return inode;
 }
 
+static void *ovl_inode_data(struct inode *realinode, bool is_upper)
+{
+	return (void *)((unsigned long) realinode |
+			(is_upper ? OVL_ISUPPER_MASK : 0));
+}
+
+static void ovl_set_inode_data(struct inode *inode, struct inode *realinode,
+			       bool is_upper)
+{
+	WRITE_ONCE(inode->i_private, ovl_inode_data(realinode, is_upper));
+}
+
+static void ovl_insert_inode_hash(struct inode *inode, struct inode *realinode)
+{
+	WARN_ON(!inode_unhashed(inode));
+	__insert_inode_hash(inode, (unsigned long) realinode);
+}
+
 void ovl_inode_init(struct inode *inode, struct inode *realinode, bool is_upper)
 {
-	WRITE_ONCE(inode->i_private, (unsigned long) realinode |
-		   (is_upper ? OVL_ISUPPER_MASK : 0));
+	ovl_set_inode_data(inode, realinode, is_upper);
+	if (is_upper && !S_ISDIR(realinode->i_mode))
+		ovl_insert_inode_hash(inode, realinode);
 }
 
 void ovl_inode_update(struct inode *inode, struct inode *upperinode)
 {
 	WARN_ON(!upperinode);
-	WARN_ON(!inode_unhashed(inode));
-	WRITE_ONCE(inode->i_private,
-		   (unsigned long) upperinode | OVL_ISUPPER_MASK);
+	ovl_set_inode_data(inode, upperinode, true);
 	if (!S_ISDIR(upperinode->i_mode))
-		__insert_inode_hash(inode, (unsigned long) upperinode);
+		ovl_insert_inode_hash(inode, upperinode);
 }
 
 static int ovl_inode_test(struct inode *inode, void *data)
@@ -404,7 +421,7 @@ static int ovl_inode_test(struct inode *inode, void *data)
 
 static int ovl_inode_set(struct inode *inode, void *data)
 {
-	inode->i_private = (void *) (((unsigned long) data) | OVL_ISUPPER_MASK);
+	ovl_set_inode_data(inode, data, true);
 	return 0;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC][PATCH 08/13] ovl: allow hashing non upper inodes
  2017-04-16 23:59 [RFC][PATCH 00/13] overlayfs stable inodes Amir Goldstein
                   ` (6 preceding siblings ...)
  2017-04-16 23:59 ` [RFC][PATCH 07/13] ovl: create helpers for initializing hashed inode Amir Goldstein
@ 2017-04-16 23:59 ` Amir Goldstein
  2017-04-16 23:59 ` [RFC][PATCH 09/13] ovl: inherit overlay inode ino/generation from real inode Amir Goldstein
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: Amir Goldstein @ 2017-04-16 23:59 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, linux-fsdevel

As preparation for hashing all overlay inodes, pass is_upper to
ovl_get_inode() and the iget5_locked() callbacks, instead of assuming
that the hashed inode is upper.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c     | 11 ++++++-----
 fs/overlayfs/namei.c     |  2 +-
 fs/overlayfs/overlayfs.h |  7 +++++--
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index a6d7c44..9a429ed 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -416,22 +416,23 @@ void ovl_inode_update(struct inode *inode, struct inode *upperinode)
 
 static int ovl_inode_test(struct inode *inode, void *data)
 {
-	return ovl_inode_real(inode, NULL) == data;
+	return ovl_inode_real(inode, NULL) == OVL_INODE_REAL(data);
 }
 
 static int ovl_inode_set(struct inode *inode, void *data)
 {
-	ovl_set_inode_data(inode, data, true);
+	WRITE_ONCE(inode->i_private, data);
 	return 0;
 }
 
-struct inode *ovl_get_inode(struct super_block *sb, struct inode *realinode)
-
+struct inode *ovl_get_inode(struct super_block *sb, struct inode *realinode,
+			    bool is_upper)
 {
 	struct inode *inode;
 
 	inode = iget5_locked(sb, (unsigned long) realinode,
-			     ovl_inode_test, ovl_inode_set, realinode);
+			     ovl_inode_test, ovl_inode_set,
+			     ovl_inode_data(realinode, is_upper));
 	if (inode && inode->i_state & I_NEW) {
 		ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev);
 		set_nlink(inode, realinode->i_nlink);
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 42b6030..9569ded 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -470,7 +470,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 
 		err = -ENOMEM;
 		if (upperdentry && !d_is_dir(upperdentry)) {
-			inode = ovl_get_inode(dentry->d_sb, realinode);
+			inode = ovl_get_inode(dentry->d_sb, realinode, true);
 		} else {
 			inode = ovl_new_inode(dentry->d_sb, realinode->i_mode,
 					      realinode->i_rdev);
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index f9d2d77..dacee9e 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -23,6 +23,8 @@ enum ovl_path_type {
 #define OVL_XATTR_FH OVL_XATTR_PREFIX "fh"
 
 #define OVL_ISUPPER_MASK 1UL
+#define OVL_INODE_REAL(x) ((struct inode *) \
+			   ((unsigned long)x & ~OVL_ISUPPER_MASK))
 
 static inline int ovl_do_rmdir(struct inode *dir, struct dentry *dentry)
 {
@@ -144,7 +146,7 @@ static inline struct inode *ovl_inode_real(struct inode *inode, bool *is_upper)
 	if (is_upper)
 		*is_upper = x & OVL_ISUPPER_MASK;
 
-	return (struct inode *) (x & ~OVL_ISUPPER_MASK);
+	return OVL_INODE_REAL(x);
 }
 
 /* redirect data format for redirect by file handle */
@@ -226,7 +228,8 @@ struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev);
 void ovl_inode_init(struct inode *inode, struct inode *realinode,
 		    bool is_upper);
 void ovl_inode_update(struct inode *inode, struct inode *upperinode);
-struct inode *ovl_get_inode(struct super_block *sb, struct inode *realinode);
+struct inode *ovl_get_inode(struct super_block *sb, struct inode *realinode,
+			    bool is_upper);
 
 static inline void ovl_copyattr(struct inode *from, struct inode *to)
 {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC][PATCH 09/13] ovl: inherit overlay inode ino/generation from real inode
  2017-04-16 23:59 [RFC][PATCH 00/13] overlayfs stable inodes Amir Goldstein
                   ` (7 preceding siblings ...)
  2017-04-16 23:59 ` [RFC][PATCH 08/13] ovl: allow hashing non upper inodes Amir Goldstein
@ 2017-04-16 23:59 ` Amir Goldstein
  2017-04-16 23:59 ` [RFC][PATCH 10/13] ovl: hash overlay inodes by stable inode Amir Goldstein
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: Amir Goldstein @ 2017-04-16 23:59 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, linux-fsdevel

When all layers are on same fs, initialize overlay inode with
ino/generation of a real inode.

Move initialization of overlay inode ino from ovl_fill_inode()
to when real inode is available, but before a new inode is inserted
to hash.

This does not affect the result of stat(2) of an overlay file,
which always returns the current real inode number.

The overlay ino is going to be used for overlay inodes lookup
by stable inode.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 9a429ed..f7d89cf 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -340,7 +340,6 @@ static inline void ovl_lockdep_annotate_inode_mutex_key(struct inode *inode)
 
 static void ovl_fill_inode(struct inode *inode, umode_t mode, dev_t rdev)
 {
-	inode->i_ino = get_next_ino();
 	inode->i_mode = mode;
 	inode->i_flags |= S_NOCMTIME;
 #ifdef CONFIG_FS_POSIX_ACL
@@ -393,6 +392,16 @@ static void ovl_set_inode_data(struct inode *inode, struct inode *realinode,
 	WRITE_ONCE(inode->i_private, ovl_inode_data(realinode, is_upper));
 }
 
+static void ovl_inode_init_ino(struct inode *inode, struct inode *realinode)
+{
+	if (ovl_same_sb(inode->i_sb)) {
+		inode->i_ino = realinode->i_ino;
+		inode->i_generation = realinode->i_generation;
+	} else {
+		inode->i_ino = get_next_ino();
+	}
+}
+
 static void ovl_insert_inode_hash(struct inode *inode, struct inode *realinode)
 {
 	WARN_ON(!inode_unhashed(inode));
@@ -401,6 +410,7 @@ static void ovl_insert_inode_hash(struct inode *inode, struct inode *realinode)
 
 void ovl_inode_init(struct inode *inode, struct inode *realinode, bool is_upper)
 {
+	ovl_inode_init_ino(inode, realinode);
 	ovl_set_inode_data(inode, realinode, is_upper);
 	if (is_upper && !S_ISDIR(realinode->i_mode))
 		ovl_insert_inode_hash(inode, realinode);
@@ -421,6 +431,7 @@ static int ovl_inode_test(struct inode *inode, void *data)
 
 static int ovl_inode_set(struct inode *inode, void *data)
 {
+	ovl_inode_init_ino(inode, OVL_INODE_REAL(data));
 	WRITE_ONCE(inode->i_private, data);
 	return 0;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC][PATCH 10/13] ovl: hash overlay inodes by stable inode
  2017-04-16 23:59 [RFC][PATCH 00/13] overlayfs stable inodes Amir Goldstein
                   ` (8 preceding siblings ...)
  2017-04-16 23:59 ` [RFC][PATCH 09/13] ovl: inherit overlay inode ino/generation from real inode Amir Goldstein
@ 2017-04-16 23:59 ` Amir Goldstein
  2017-04-16 23:59 ` [RFC][PATCH 11/13] ovl: fix du --one-file-system on overlay mount Amir Goldstein
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: Amir Goldstein @ 2017-04-16 23:59 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, linux-fsdevel

Non directory pure upper overlay inodes are hashed
by the address of the upper real inode.

When redirect_fh feature is enabled, hash all overlay inodes
by the address of the 'stable' real inode.
The stable real inode is the pre copy up inode if it stored
in overlay.fh xattr or the pure upper real inode otherwise.

This is going to be used for looking up an overlay inode
from a real stable inode for exportfs operations.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c | 18 +++++++++++++++---
 fs/overlayfs/namei.c | 13 ++++++++++++-
 2 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index f7d89cf..42fa243 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -412,7 +412,8 @@ void ovl_inode_init(struct inode *inode, struct inode *realinode, bool is_upper)
 {
 	ovl_inode_init_ino(inode, realinode);
 	ovl_set_inode_data(inode, realinode, is_upper);
-	if (is_upper && !S_ISDIR(realinode->i_mode))
+	if (ovl_redirect_fh(inode->i_sb) ||
+	    (is_upper && !S_ISDIR(realinode->i_mode)))
 		ovl_insert_inode_hash(inode, realinode);
 }
 
@@ -420,13 +421,24 @@ void ovl_inode_update(struct inode *inode, struct inode *upperinode)
 {
 	WARN_ON(!upperinode);
 	ovl_set_inode_data(inode, upperinode, true);
-	if (!S_ISDIR(upperinode->i_mode))
+	if (!S_ISDIR(upperinode->i_mode) && !ovl_redirect_fh(inode->i_sb))
 		ovl_insert_inode_hash(inode, upperinode);
 }
 
 static int ovl_inode_test(struct inode *inode, void *data)
 {
-	return ovl_inode_real(inode, NULL) == OVL_INODE_REAL(data);
+	struct inode *realinode = OVL_INODE_REAL(data);
+
+	/*
+	 * When all layers on same fs, compare by ino/generation of stable
+	 * real inode, because i_private may get updated on copy up, but
+	 * overlay inode ino/generation does not get updated.
+	 */
+	if (ovl_same_sb(inode->i_sb))
+		return (inode->i_ino == realinode->i_ino &&
+			inode->i_generation == realinode->i_generation);
+	else
+		return ovl_inode_real(inode, NULL) == realinode;
 }
 
 static int ovl_inode_set(struct inode *inode, void *data)
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 9569ded..7aaff83 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -469,7 +469,18 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		realinode = d_inode(realdentry);
 
 		err = -ENOMEM;
-		if (upperdentry && !d_is_dir(upperdentry)) {
+		/* When redirect_fh is enabled, hash inodes by stable inode */
+		if (ofs->config.redirect_fh) {
+			struct dentry *stable = ctr ? stack[0].dentry :
+						upperdentry;
+
+			inode = ovl_get_inode(dentry->d_sb, d_inode(stable),
+					      !!upperdentry);
+			/* ovl_inode_real() may not be the stable inode */
+			if (realinode != d_inode(stable))
+				ovl_inode_update(inode, realinode);
+
+		} else if (upperdentry && !d_is_dir(upperdentry)) {
 			inode = ovl_get_inode(dentry->d_sb, realinode, true);
 		} else {
 			inode = ovl_new_inode(dentry->d_sb, realinode->i_mode,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC][PATCH 11/13] ovl: fix du --one-file-system on overlay mount
  2017-04-16 23:59 [RFC][PATCH 00/13] overlayfs stable inodes Amir Goldstein
                   ` (9 preceding siblings ...)
  2017-04-16 23:59 ` [RFC][PATCH 10/13] ovl: hash overlay inodes by stable inode Amir Goldstein
@ 2017-04-16 23:59 ` Amir Goldstein
  2017-04-16 23:59 ` [RFC][PATCH 12/13] ovl: constant ino across copy up Amir Goldstein
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: Amir Goldstein @ 2017-04-16 23:59 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, linux-fsdevel

Overlay directory inodes report overlay bdev and overlay ino to stat(2).
Overlay regular file inodes report real bdev and real ino to stat(2).

This results in wrong results from command du -x on an overlay mount,
because the regular files are not accounted for the overlay bdev usage.

The reasons for this inconsistecy is:
1. The overlay ino is not persistent, so real ino is used
2. The tupple overlay bdev and real ino is not unique, so real bdev is
   used

In case all overlay layers are on the same underlying fs, the tupple
from reason 2 above is unique, so use this tupple for regular files and
symlinks to produce the correct result from du -x.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 42fa243..1951865 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -69,6 +69,17 @@ static int ovl_getattr(const struct path *path, struct kstat *stat,
 	old_cred = ovl_override_creds(dentry->d_sb);
 	err = vfs_getattr(&realpath, stat, request_mask, flags);
 	revert_creds(old_cred);
+	if (err)
+		return err;
+
+	/*
+	 * When all layers are on same fs, the tupple overlay bdev
+	 * and real inode ino is unique, so it is preferred to expose
+	 * overlay bdev for overlay inodes for things like du -x.
+	 */
+	if (ovl_same_sb(dentry->d_sb))
+		stat->dev = dentry->d_sb->s_dev;
+
 	return err;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC][PATCH 12/13] ovl: constant ino across copy up
  2017-04-16 23:59 [RFC][PATCH 00/13] overlayfs stable inodes Amir Goldstein
                   ` (10 preceding siblings ...)
  2017-04-16 23:59 ` [RFC][PATCH 11/13] ovl: fix du --one-file-system on overlay mount Amir Goldstein
@ 2017-04-16 23:59 ` Amir Goldstein
  2017-04-16 23:59 ` [RFC][PATCH 13/13] ovl: try to hardlink upper on copy up of lower hardlinks Amir Goldstein
  2017-04-18 18:37 ` [RFC][PATCH 00/13] overlayfs stable inodes Amir Goldstein
  13 siblings, 0 replies; 43+ messages in thread
From: Amir Goldstein @ 2017-04-16 23:59 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, linux-fsdevel

This patch is based on an earlier POC by Miklos Szeredi.

When redirect_fh is enabled, export the overlay inode ino to stat(2)
and readdir(3)/getdents(2) instead of the real upper inode ino.

The overlay inode ino is inherited from the uppermost lower real inode
(a.k.a. stable inode) and therefore remains unmodified after copy up
as well as after mount cycle.

There is an overhead of lookup per upper entry in readdir.
That overhead is a waste for pure upper dir with only pure upper
entries (i.e. no redirects), but that can be optimized later.

The overhead is minimal if the listed entries are already in dcache.
It is also quite useful for the common case of 'ls -l' that readdir()
pre populates the dcache with the listed entries, making the following
stat() calls faster.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c     |  6 ++++
 fs/overlayfs/namei.c     |  7 ++--
 fs/overlayfs/overlayfs.h |  2 +-
 fs/overlayfs/readdir.c   | 85 ++++++++++++++++++++++++++++++++++++++++++++----
 4 files changed, 91 insertions(+), 9 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 1951865..0324d1c 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -79,6 +79,12 @@ static int ovl_getattr(const struct path *path, struct kstat *stat,
 	 */
 	if (ovl_same_sb(dentry->d_sb))
 		stat->dev = dentry->d_sb->s_dev;
+	/*
+	 * When redirect_fh is enabled, return the overlay inode ino, which is
+	 * inherited from the uppermost lower real inode (a.k.a. stable inode).
+	 */
+	if (ovl_redirect_fh(dentry->d_sb))
+		stat->ino = dentry->d_inode->i_ino;
 
 	return err;
 }
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 7aaff83..d7f3a15 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -319,18 +319,21 @@ static int ovl_lookup_layer(struct dentry *base, struct ovl_lookup_data *d,
  * Returns next layer in stack starting from top.
  * Returns -1 if this is the last layer.
  */
-int ovl_path_next(int idx, struct dentry *dentry, struct path *path)
+int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp)
 {
 	struct ovl_entry *oe = dentry->d_fsdata;
 
 	BUG_ON(idx < 0);
 	if (idx == 0) {
 		ovl_path_upper(dentry, path);
-		if (path->dentry)
+		if (path->dentry) {
+			*idxp = 0;
 			return oe->numlower ? 1 : -1;
+		}
 		idx++;
 	}
 	BUG_ON(idx > oe->numlower);
+	*idxp = idx;
 	*path = oe->lowerstack[idx - 1];
 
 	return (idx < oe->numlower) ? idx + 1 : -1;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index dacee9e..8092aae 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -198,7 +198,7 @@ int ovl_copy_up_start(struct dentry *dentry);
 void ovl_copy_up_end(struct dentry *dentry);
 
 /* namei.c */
-int ovl_path_next(int idx, struct dentry *dentry, struct path *path);
+int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp);
 struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags);
 bool ovl_lower_positive(struct dentry *dentry);
 
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index f241b4e..ebe15ea 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -20,10 +20,12 @@
 struct ovl_cache_entry {
 	unsigned int len;
 	unsigned int type;
+	u64 real_ino;
 	u64 ino;
 	struct list_head l_node;
 	struct rb_node node;
 	struct ovl_cache_entry *next_maybe_whiteout;
+	int idx;
 	bool is_whiteout;
 	char name[];
 };
@@ -43,6 +45,7 @@ struct ovl_readdir_data {
 	struct list_head middle;
 	struct ovl_cache_entry *first_maybe_whiteout;
 	int count;
+	int idx;
 	int err;
 	bool d_type_supported;
 };
@@ -97,8 +100,11 @@ static struct ovl_cache_entry *ovl_cache_entry_new(struct ovl_readdir_data *rdd,
 	p->name[len] = '\0';
 	p->len = len;
 	p->type = d_type;
-	p->ino = ino;
+	p->real_ino = ino;
+	/* Defer setting d_ino for upper entry to ovl_iterate() */
+	p->ino = rdd->idx ? ino : 0;
 	p->is_whiteout = false;
+	p->idx = rdd->idx;
 
 	if (d_type == DT_CHR) {
 		p->next_maybe_whiteout = rdd->first_maybe_whiteout;
@@ -225,6 +231,7 @@ static int ovl_check_whiteouts(struct dentry *dir, struct ovl_readdir_data *rdd)
 	}
 	revert_creds(old_cred);
 
+
 	return err;
 }
 
@@ -256,21 +263,37 @@ static inline int ovl_dir_read(struct path *realpath,
 	return err;
 }
 
+/* Can we iterate real dir directly? */
+static bool ovl_dir_is_real(struct super_block *sb, enum ovl_path_type type)
+{
+	if (OVL_TYPE_MERGE(type))
+		return false;
+	/* Upper dir may contain copied up entries that were moved into it */
+	if (ovl_redirect_fh(sb))
+		return !OVL_TYPE_UPPER(type);
+	return true;
+}
+
 static void ovl_dir_reset(struct file *file)
 {
 	struct ovl_dir_file *od = file->private_data;
 	struct ovl_dir_cache *cache = od->cache;
 	struct dentry *dentry = file->f_path.dentry;
 	enum ovl_path_type type = ovl_path_type(dentry);
+	bool is_real;
 
 	if (cache && ovl_dentry_version_get(dentry) != cache->version) {
 		ovl_cache_put(od, dentry);
 		od->cache = NULL;
 		od->cursor = NULL;
 	}
-	WARN_ON(!od->is_real && !OVL_TYPE_MERGE(type));
-	if (od->is_real && OVL_TYPE_MERGE(type))
+	is_real = ovl_dir_is_real(dentry->d_sb, type);
+	if (od->is_real != is_real) {
+		/* is_real can only become false (after dir copy up) */
+		if (WARN_ON(is_real))
+			return;
 		od->is_real = false;
+	}
 }
 
 static int ovl_dir_read_merged(struct dentry *dentry, struct list_head *list)
@@ -287,7 +310,7 @@ static int ovl_dir_read_merged(struct dentry *dentry, struct list_head *list)
 	int idx, next;
 
 	for (idx = 0; idx != -1; idx = next) {
-		next = ovl_path_next(idx, dentry, &realpath);
+		next = ovl_path_next(idx, dentry, &realpath, &rdd.idx);
 
 		if (next != -1) {
 			err = ovl_dir_read(&realpath, &rdd);
@@ -353,11 +376,55 @@ static struct ovl_dir_cache *ovl_cache_get(struct dentry *dentry)
 	return cache;
 }
 
+/*
+ * Set d_ino for upper entries. Non-upper entries should always report
+ * the uppermost real inode ino and should not call this function.
+ * When redirect_fh is disabled, report real ino also for upper.
+ * When redirect_fh is enabled, lookup the overlay inode of p->name
+ * under dir and report ino of the overlay inode. The overlay inode ino
+ * is inherited from the uppermost lower real inode (a.k.a. stable inode).
+ */
+static int ovl_cache_entry_update_ino(struct dentry *dir,
+				      struct ovl_cache_entry *p)
+
+{
+	struct dentry *this;
+
+	if (!ovl_redirect_fh(dir->d_sb) || WARN_ON(p->idx)) {
+		p->ino = p->real_ino;
+		return 0;
+	}
+
+	if (p->name[0] == '.') {
+		if (p->len == 1) {
+			this = dget(dir);
+			goto get;
+		}
+		if (p->len == 2 && p->name[1] == '.') {
+			/* we shall not be moved */
+			this = dget(dir->d_parent);
+			goto get;
+		}
+	}
+	this = lookup_one_len(p->name, dir, p->len);
+	if (IS_ERR_OR_NULL(this)) {
+		pr_err("overlay: failed to look up (%s) for ino (%i)\n",
+		       p->name, (int) PTR_ERR(this));
+		return -EIO;
+	}
+get:
+	p->ino = this->d_inode->i_ino;
+	dput(this);
+
+	return 0;
+}
+
 static int ovl_iterate(struct file *file, struct dir_context *ctx)
 {
 	struct ovl_dir_file *od = file->private_data;
 	struct dentry *dentry = file->f_path.dentry;
 	struct ovl_cache_entry *p;
+	int err;
 
 	if (!ctx->pos)
 		ovl_dir_reset(file);
@@ -378,9 +445,15 @@ static int ovl_iterate(struct file *file, struct dir_context *ctx)
 
 	while (od->cursor != &od->cache->entries) {
 		p = list_entry(od->cursor, struct ovl_cache_entry, l_node);
-		if (!p->is_whiteout)
+		if (!p->is_whiteout) {
+			if (!p->ino) {
+				err = ovl_cache_entry_update_ino(dentry, p);
+				if (err)
+					return err;
+			}
 			if (!dir_emit(ctx, p->name, p->len, p->ino, p->type))
 				break;
+		}
 		od->cursor = p->l_node.next;
 		ctx->pos++;
 	}
@@ -502,7 +575,7 @@ static int ovl_dir_open(struct inode *inode, struct file *file)
 		return PTR_ERR(realfile);
 	}
 	od->realfile = realfile;
-	od->is_real = !OVL_TYPE_MERGE(type);
+	od->is_real = ovl_dir_is_real(inode->i_sb, type);
 	od->is_upper = OVL_TYPE_UPPER(type);
 	file->private_data = od;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC][PATCH 13/13] ovl: try to hardlink upper on copy up of lower hardlinks
  2017-04-16 23:59 [RFC][PATCH 00/13] overlayfs stable inodes Amir Goldstein
                   ` (11 preceding siblings ...)
  2017-04-16 23:59 ` [RFC][PATCH 12/13] ovl: constant ino across copy up Amir Goldstein
@ 2017-04-16 23:59 ` Amir Goldstein
  2017-04-18 18:37 ` [RFC][PATCH 00/13] overlayfs stable inodes Amir Goldstein
  13 siblings, 0 replies; 43+ messages in thread
From: Amir Goldstein @ 2017-04-16 23:59 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, linux-fsdevel

When redirect_fh is enabled, overlay inodes are hashed by the
pointer to stable (pre copy up) inode.

This lets us address the issue of broken hardlinks on copy up.

On copy up of nlink > 1 inode, check if the overlay inode has
an alias that has already been copied up and link to that upper
alias instead of copying up.

At this time, there is no way to find the upper aliases unless
they are already in dentry cache from previous lookups, so this
is a best effort approach, which cannot fully prevent breaking
hardlinks.

Also, with the broken hardlink bug fixed, the consistent ro/rw
fd bug is manifested with upper and lower aliases, e.g.:

$ echo -n a > /lower/foo
$ ln /lower/foo /lower/bar
$ cd /mnt
$ tail foo bar # both aliases are ro lower
==> foo <==
a
==> bar <==
a

$ echo -n b >> foo
$ tail foo bar # foo is rw upper, bar is ro lower
==> foo <==
ab
==> bar <==
a

$ echo -n c >> bar
$ tail foo bar # both aliases are rw upper
==> foo <==
abc
==> bar <==
abc

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c   | 45 ++++++++++++++++++++++++++++++++++++++++++---
 fs/overlayfs/inode.c     | 24 ++++++++++++++++++++++--
 fs/overlayfs/namei.c     | 25 +++++++++++++++++++------
 fs/overlayfs/overlayfs.h |  9 ++++++++-
 4 files changed, 91 insertions(+), 12 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 77a8715..c2ea82f 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -326,6 +326,16 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 		.rdev = stat->rdev,
 		.link = link
 	};
+	struct inode *upperinode;
+	struct dentry *hardlink = NULL;
+
+	/* Another alias already copied up? */
+	upperinode = ovl_inode_upper(d_inode(dentry));
+	if (upperinode) {
+		hardlink = d_find_alias(upperinode);
+		if (WARN_ON(!hardlink || tmpfile))
+			return -ESTALE;
+	}
 
 	upper = lookup_one_len(dentry->d_name.name, upperdir,
 			       dentry->d_name.len);
@@ -350,7 +360,7 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 
 	err = 0;
 	if (!tmpfile)
-		err = ovl_create_real(wdir, temp, &cattr, NULL, true);
+		err = ovl_create_real(wdir, temp, &cattr, hardlink, true);
 
 	if (new_creds) {
 		revert_creds(old_creds);
@@ -360,6 +370,9 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 	if (err)
 		goto out2;
 
+	if (hardlink)
+		goto temp_ready;
+
 	if (S_ISREG(stat->mode)) {
 		struct path upperpath;
 
@@ -401,6 +414,7 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 			goto out_cleanup;
 	}
 
+temp_ready:
 	if (tmpfile)
 		err = ovl_do_link(temp, udir, upper, true);
 	else
@@ -410,7 +424,24 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 
 	newdentry = dget(tmpfile ? upper : temp);
 	ovl_dentry_update(dentry, newdentry);
-	ovl_inode_update(d_inode(dentry), d_inode(newdentry));
+	if (!hardlink) {
+		upperinode = ovl_inode_update(d_inode(dentry),
+					      d_inode(newdentry));
+		/* Raced with copy up or lookup of an alias? */
+		if (upperinode) {
+			/* Try to undo this copy up and restart */
+			err = ovl_do_rename(udir, upper, wdir, temp, 0);
+			if (!err) {
+				err = -ERESTARTSYS;
+				goto out_cleanup;
+			}
+			/* Keep broken hardlink and invalidate dentry */
+			pr_warn_ratelimited("overlayfs: copy up broken hardlink (%lu:%lu)\n",
+					    d_inode(dentry)->i_ino,
+					    upperinode->i_ino);
+			d_drop(dentry);
+		}
+	}
 
 	/* Restore timestamps on parent (best effort) */
 	ovl_set_timestamps(upperdir, pstat);
@@ -419,6 +450,7 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 out1:
 	dput(upper);
 out:
+	dput(hardlink);
 	return err;
 
 out_cleanup:
@@ -448,6 +480,7 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 	struct dentry *upperdir;
 	const char *link = NULL;
 	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
+	int retries = 0;
 
 	if (WARN_ON(!workdir))
 		return -EROFS;
@@ -469,7 +502,7 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 	}
 
 	/* Should we copyup with O_TMPFILE or with workdir? */
-	if (S_ISREG(stat->mode) && ofs->tmpfile) {
+	if (S_ISREG(stat->mode) && stat->nlink == 1 && ofs->tmpfile) {
 		err = ovl_copy_up_start(dentry);
 		/* err < 0: interrupted, err > 0: raced with another copy-up */
 		if (unlikely(err)) {
@@ -498,8 +531,14 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 		goto out_unlock;
 	}
 
+retry:
 	err = ovl_copy_up_locked(workdir, upperdir, dentry, lowerpath,
 				 stat, link, &pstat, false);
+	if (err == -ERESTARTSYS) {
+		if (!retries++)
+			goto retry;
+		err = -ESTALE;
+	}
 out_unlock:
 	unlock_rename(workdir, upperdir);
 out_done:
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 0324d1c..7a56ff4 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -434,12 +434,32 @@ void ovl_inode_init(struct inode *inode, struct inode *realinode, bool is_upper)
 		ovl_insert_inode_hash(inode, realinode);
 }
 
-void ovl_inode_update(struct inode *inode, struct inode *upperinode)
+/*
+ * Update inode private data to upper inode after copy up or lookup.
+ * Take care not to race with lookup or copy up of another hardlink
+ * of the same stable inode, which uses the same overlay inode.
+ * If overlay inode should be hashed on copy up, insert inode to hash.
+ *
+ * If private date was already set to another upper inode, don't
+ * change private date and return the existing real inode value.
+ * Return NULL if private data was updated to upper inode.
+ */
+struct inode *ovl_inode_update(struct inode *inode, struct inode *upperinode)
 {
+	struct inode *realinode;
+	bool is_upper;
+
 	WARN_ON(!upperinode);
-	ovl_set_inode_data(inode, upperinode, true);
+	spin_lock(&inode->i_lock);
+	realinode = ovl_inode_real(inode, &is_upper);
+	if (!is_upper)
+		ovl_set_inode_data(inode, upperinode, true);
+	spin_unlock(&inode->i_lock);
+	if (is_upper)
+		return realinode;
 	if (!S_ISDIR(upperinode->i_mode) && !ovl_redirect_fh(inode->i_sb))
 		ovl_insert_inode_hash(inode, upperinode);
+	return NULL;
 }
 
 static int ovl_inode_test(struct inode *inode, void *data)
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index d7f3a15..345be9a 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -474,14 +474,27 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		err = -ENOMEM;
 		/* When redirect_fh is enabled, hash inodes by stable inode */
 		if (ofs->config.redirect_fh) {
-			struct dentry *stable = ctr ? stack[0].dentry :
-						upperdentry;
+			struct inode *stable = d_inode(ctr ? stack[0].dentry :
+							     upperdentry);
 
-			inode = ovl_get_inode(dentry->d_sb, d_inode(stable),
+			inode = ovl_get_inode(dentry->d_sb, stable,
 					      !!upperdentry);
-			/* ovl_inode_real() may not be the stable inode */
-			if (realinode != d_inode(stable))
-				ovl_inode_update(inode, realinode);
+			/*
+			 * For non-pure-upper we need to update realinode.
+			 * For stable inode with nlink > 1, another alias could
+			 * have been copied up to a different upper inode.
+			 * In that case, fall back to hashing a new overlay
+			 * inode by this upper inode (and not by stable inode).
+			 */
+			if (inode && upperdentry && ctr &&
+			    !!ovl_inode_update(inode, realinode)) {
+				pr_warn_ratelimited("overlayfs: found broken hardlink (%lu:%lu)\n",
+						    inode->i_ino,
+						    realinode->i_ino);
+				iput(inode);
+				inode = ovl_get_inode(dentry->d_sb, realinode,
+						      true);
+			}
 
 		} else if (upperdentry && !d_is_dir(upperdentry)) {
 			inode = ovl_get_inode(dentry->d_sb, realinode, true);
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 8092aae..bb1d72a 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -149,6 +149,13 @@ static inline struct inode *ovl_inode_real(struct inode *inode, bool *is_upper)
 	return OVL_INODE_REAL(x);
 }
 
+static inline struct inode *ovl_inode_upper(struct inode *inode)
+{
+	unsigned long x = (unsigned long) READ_ONCE(inode->i_private);
+
+	return (x & OVL_ISUPPER_MASK) ? OVL_INODE_REAL(x) : NULL;
+}
+
 /* redirect data format for redirect by file handle */
 struct ovl_fh {
 	unsigned char version;	/* 0 */
@@ -227,7 +234,7 @@ bool ovl_is_private_xattr(const char *name);
 struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev);
 void ovl_inode_init(struct inode *inode, struct inode *realinode,
 		    bool is_upper);
-void ovl_inode_update(struct inode *inode, struct inode *upperinode);
+struct inode *ovl_inode_update(struct inode *inode, struct inode *upperinode);
 struct inode *ovl_get_inode(struct super_block *sb, struct inode *realinode,
 			    bool is_upper);
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 02/13] ovl: redirect dir by file handle on copy up
  2017-04-16 23:59 ` [RFC][PATCH 02/13] ovl: redirect dir by file handle on copy up Amir Goldstein
@ 2017-04-17 13:33   ` Rock Lee
  2017-04-17 14:03     ` Amir Goldstein
  2017-04-17 19:49   ` Vivek Goyal
  2017-04-19 15:16   ` Miklos Szeredi
  2 siblings, 1 reply; 43+ messages in thread
From: Rock Lee @ 2017-04-17 13:33 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, Al Viro, linux-unionfs, linux-fsdevel

Hi, Amir:

> +static int ovl_set_redirect_fh(struct dentry *dentry, struct dentry *upper)
> +{
> +       int err;
> +       const struct ovl_fh *fh;
> +
> +       fh = ovl_get_redirect_fh(ovl_dentry_lower(dentry));
> +       err = PTR_ERR(fh);
> +       if (IS_ERR(fh))
> +               goto out_err;
> +
> +       err = ovl_do_setxattr(upper, OVL_XATTR_FH, fh, fh->len, 0);
> +       if (err)
> +               goto out_free;

I'm not quite sure, maybe you missed a 'free(fh)' here.

> +       return 0;
> +
> +out_free:
> +       kfree(fh);
> +out_err:
> +       if (err == -EOPNOTSUPP) {
> +               ovl_clear_redirect_fh(dentry->d_sb);
> +               return 0;
> +       }
> +       pr_warn_ratelimited("overlay: failed to set redirect fh (%i)\n", err);
> +       return err;
> +}
> +






-- 
Cheers,
Rock

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 02/13] ovl: redirect dir by file handle on copy up
  2017-04-17 13:33   ` Rock Lee
@ 2017-04-17 14:03     ` Amir Goldstein
  0 siblings, 0 replies; 43+ messages in thread
From: Amir Goldstein @ 2017-04-17 14:03 UTC (permalink / raw)
  To: Rock Lee; +Cc: Miklos Szeredi, Al Viro, linux-unionfs, linux-fsdevel

On Mon, Apr 17, 2017 at 3:33 PM, Rock Lee <rockdotlee@gmail.com> wrote:
> Hi, Amir:
>
>> +static int ovl_set_redirect_fh(struct dentry *dentry, struct dentry *upper)
>> +{
>> +       int err;
>> +       const struct ovl_fh *fh;
>> +
>> +       fh = ovl_get_redirect_fh(ovl_dentry_lower(dentry));
>> +       err = PTR_ERR(fh);
>> +       if (IS_ERR(fh))
>> +               goto out_err;
>> +
>> +       err = ovl_do_setxattr(upper, OVL_XATTR_FH, fh, fh->len, 0);
>> +       if (err)
>> +               goto out_free;
>
> I'm not quite sure, maybe you missed a 'free(fh)' here.

It's right there at out_free :)

Thanks for reviewing!

>
>> +       return 0;
>> +
>> +out_free:
>> +       kfree(fh);
>> +out_err:
>> +       if (err == -EOPNOTSUPP) {
>> +               ovl_clear_redirect_fh(dentry->d_sb);
>> +               return 0;
>> +       }
>> +       pr_warn_ratelimited("overlay: failed to set redirect fh (%i)\n", err);
>> +       return err;
>> +}
>> +
>
>
>
>
>
>
> --
> Cheers,
> Rock

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 02/13] ovl: redirect dir by file handle on copy up
  2017-04-16 23:59 ` [RFC][PATCH 02/13] ovl: redirect dir by file handle on copy up Amir Goldstein
  2017-04-17 13:33   ` Rock Lee
@ 2017-04-17 19:49   ` Vivek Goyal
  2017-04-17 21:14     ` Amir Goldstein
  2017-04-19 15:16   ` Miklos Szeredi
  2 siblings, 1 reply; 43+ messages in thread
From: Vivek Goyal @ 2017-04-17 19:49 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, Al Viro, linux-unionfs, linux-fsdevel

On Mon, Apr 17, 2017 at 02:59:32AM +0300, Amir Goldstein wrote:
> When mounted with mount option redirect_dir=fh,
> every copy up of lower directory stores the lower
> dir file handle in overlay.fh xattr of upper dir.
> 
> This method has some advantages over absolute path redirect:
> - it is more compact in stored xattr size
> - it is not limited by lengths of full paths
> - lookup redirect is more efficient for very nested directories
> 
> It also has some disadvantages over absolute path redirect:
> - it requires that all lower layers are on the same file system,
>   which support exportfs ops
> - file handles will become stale if overlay lower directories
>   were to be copied to another location
> 
> Therefore, redirect by file handle is considered an optimization
> and file handle is stored on copy up in addition to setting
> redirect by path on rename.

Hi Amir, 

If this is just an optimization to redirect, then why not store it
only during rename (like redirct path). 

Given you are storing it all the time for merged dir during copy up, 
looks like you plan to use it for other use cases too. May be it
makes sense to either mention those use cases here or in this patch
just create file handle over rename and then in later patches change
it with explanation.

Or I missed the intent completely?

> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/overlayfs/Kconfig     | 16 +++++++++
>  fs/overlayfs/copy_up.c   | 87 ++++++++++++++++++++++++++++++++++++++++++++++++
>  fs/overlayfs/overlayfs.h | 15 +++++++++
>  fs/overlayfs/ovl_entry.h |  1 +
>  fs/overlayfs/super.c     | 28 ++++++++++++++--
>  fs/overlayfs/util.c      | 14 ++++++++
>  6 files changed, 159 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
> index 0daac51..351b61a 100644
> --- a/fs/overlayfs/Kconfig
> +++ b/fs/overlayfs/Kconfig
> @@ -22,3 +22,19 @@ config OVERLAY_FS_REDIRECT_DIR
>  	  Note, that redirects are not backward compatible.  That is, mounting
>  	  an overlay which has redirects on a kernel that doesn't support this
>  	  feature will have unexpected results.
> +
> +config OVERLAY_FS_REDIRECT_FH
> +	bool "Overlayfs: turn on redirect by file handle by default"
> +	depends on OVERLAY_FS_REDIRECT_DIR
> +	help
> +	  If this config option is enabled then overlay filesystems will use
> +	  redirect by file handle for merged directories by default.  It is
> +	  also possible to turn on redirect by file handle globally with the
> +	  "redirect_fh=on" module option or on a filesystem instance basis
> +	  with the "redirect_dir=fh" mount option.
> +
> +	  Redirect by file handle provides faster lookup compared to redirect
> +	  by absolute path, but it only works when all layers are on the same
> +	  underlying file system that supports NFS export ops.  Redirect by
> +	  file handle breaks if layers are taken offline and copied to another
> +	  location.  In that case, lookup falls back to redirect by path.
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index 906ea6c..428dc26 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -20,6 +20,7 @@
>  #include <linux/namei.h>
>  #include <linux/fdtable.h>
>  #include <linux/ratelimit.h>
> +#include <linux/exportfs.h>
>  #include "overlayfs.h"
>  #include "ovl_entry.h"
>  
> @@ -232,6 +233,80 @@ int ovl_set_attr(struct dentry *upperdentry, struct kstat *stat)
>  	return err;
>  }
>  
> +static struct ovl_fh *ovl_get_redirect_fh(struct dentry *lower)
> +{
> +	const struct export_operations *nop = lower->d_sb->s_export_op;
> +	struct ovl_fh *fh;
> +	int fh_type, fh_len, dwords;
> +	void *buf, *ret;
> +	int buflen = MAX_HANDLE_SZ;
> +
> +	/* Do not encode file handle if we cannot decode it later */
> +	if (!nop || !nop->fh_to_dentry) {
> +		pr_info("overlay: redirect by file handle not supported "
> +			"by lower - turning off redirect\n");

This message probably should go where you are actually turning off the
feature?

> +		return ERR_PTR(-EOPNOTSUPP);
> +	}
> +
> +	buf = kmalloc(buflen, GFP_TEMPORARY);
> +	if (!buf)
> +		return ERR_PTR(-ENOMEM);
> +
> +	fh = buf;
> +	dwords = (buflen - offsetof(struct ovl_fh, fid)) >> 2;
> +	fh_type = exportfs_encode_fh(lower,
> +				     (struct fid *)fh->fid,
> +				     &dwords, 0);
> +	fh_len = (dwords << 2) + offsetof(struct ovl_fh, fid);
> +
> +	ret = ERR_PTR(-EOVERFLOW);
> +	if (fh_len > buflen || fh_type <= 0 || fh_type == FILEID_INVALID)
> +		goto out;
> +
> +	fh->version = OVL_FH_VERSION;
> +	fh->magic = OVL_FH_MAGIC;
> +	fh->type = fh_type;
> +	fh->len = fh_len;
> +
> +	ret = kmalloc(fh_len, GFP_KERNEL);
> +	if (!ret) {
> +		ret = ERR_PTR(-ENOMEM);
> +		goto out;
> +	}
> +
> +	memcpy(ret, buf, fh_len);
> +out:
> +	kfree(buf);
> +	return ret;
> +}
> +
> +static int ovl_set_redirect_fh(struct dentry *dentry, struct dentry *upper)
> +{
> +	int err;
> +	const struct ovl_fh *fh;
> +
> +	fh = ovl_get_redirect_fh(ovl_dentry_lower(dentry));
> +	err = PTR_ERR(fh);
> +	if (IS_ERR(fh))
> +		goto out_err;
> +
> +	err = ovl_do_setxattr(upper, OVL_XATTR_FH, fh, fh->len, 0);
> +	if (err)
> +		goto out_free;
> +
> +	return 0;
> +
> +out_free:
> +	kfree(fh);
> +out_err:
> +	if (err == -EOPNOTSUPP) {
> +		ovl_clear_redirect_fh(dentry->d_sb);
> +		return 0;
> +	}
> +	pr_warn_ratelimited("overlay: failed to set redirect fh (%i)\n", err);
> +	return err;
> +}
> +
>  static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
>  			      struct dentry *dentry, struct path *lowerpath,
>  			      struct kstat *stat, const char *link,
> @@ -316,6 +391,18 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
>  	if (err)
>  		goto out_cleanup;
>  
> +	if (S_ISDIR(stat->mode) &&
> +	    ovl_redirect_dir(dentry->d_sb) &&
> +	    ovl_redirect_fh(dentry->d_sb)) {

Do we need to check for redirect_dir. Looks like redirect_dir has to
be enabled if redirect_fh is enabled?

> +		/*
> +		 * Store file handle of lower dir in upper dir xattr to
> +		 * create a chain of file handles for merged dir stack
> +		 */
> +		err = ovl_set_redirect_fh(dentry, temp);
> +		if (err)
> +			goto out_cleanup;
> +	}
> +
>  	if (tmpfile)
>  		err = ovl_do_link(temp, udir, upper, true);
>  	else
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index c851158..49be199 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -20,6 +20,7 @@ enum ovl_path_type {
>  #define OVL_XATTR_PREFIX XATTR_TRUSTED_PREFIX "overlay."
>  #define OVL_XATTR_OPAQUE OVL_XATTR_PREFIX "opaque"
>  #define OVL_XATTR_REDIRECT OVL_XATTR_PREFIX "redirect"
> +#define OVL_XATTR_FH OVL_XATTR_PREFIX "fh"
>  
>  #define OVL_ISUPPER_MASK 1UL
>  
> @@ -146,6 +147,18 @@ static inline struct inode *ovl_inode_real(struct inode *inode, bool *is_upper)
>  	return (struct inode *) (x & ~OVL_ISUPPER_MASK);
>  }
>  
> +/* redirect data format for redirect by file handle */
> +struct ovl_fh {
> +	unsigned char version;	/* 0 */
> +	unsigned char magic;	/* 0xfb */
> +	unsigned char len;	/* size of this header + size of fid */
> +	unsigned char type;	/* fid_type of fid */
> +	unsigned char fid[0];	/* file identifier */
> +} __packed;
> +
> +#define OVL_FH_VERSION	0
> +#define OVL_FH_MAGIC	0xfb
> +
>  /* util.c */
>  int ovl_want_write(struct dentry *dentry);
>  void ovl_drop_write(struct dentry *dentry);
> @@ -171,6 +184,8 @@ bool ovl_redirect_dir(struct super_block *sb);
>  void ovl_clear_redirect_dir(struct super_block *sb);
>  const char *ovl_dentry_get_redirect(struct dentry *dentry);
>  void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
> +bool ovl_redirect_fh(struct super_block *sb);
> +void ovl_clear_redirect_fh(struct super_block *sb);
>  void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry);
>  void ovl_inode_init(struct inode *inode, struct inode *realinode,
>  		    bool is_upper);
> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> index e294f22..cf22992 100644
> --- a/fs/overlayfs/ovl_entry.h
> +++ b/fs/overlayfs/ovl_entry.h
> @@ -14,6 +14,7 @@ struct ovl_config {
>  	char *workdir;
>  	bool default_permissions;
>  	bool redirect_dir;
> +	bool redirect_fh;
>  };
>  
>  /* private information held for overlayfs's superblock */
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 79500d9..3036b2d 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -17,6 +17,7 @@
>  #include <linux/statfs.h>
>  #include <linux/seq_file.h>
>  #include <linux/posix_acl_xattr.h>
> +#include <linux/exportfs.h>
>  #include "overlayfs.h"
>  #include "ovl_entry.h"
>  
> @@ -34,6 +35,12 @@ module_param_named(redirect_dir, ovl_redirect_dir_def, bool, 0644);
>  MODULE_PARM_DESC(ovl_redirect_dir_def,
>  		 "Default to on or off for the redirect_dir feature");
>  
> +static bool ovl_redirect_fh_def =
> +	    IS_ENABLED(CONFIG_OVERLAY_FS_REDIRECT_FH);
> +module_param_named(redirect_fh, ovl_redirect_fh_def, bool, 0644);
> +MODULE_PARM_DESC(ovl_redirect_fh_def,
> +		 "Default to on or off for redirect by file handle");
> +
>  static void ovl_dentry_release(struct dentry *dentry)
>  {
>  	struct ovl_entry *oe = dentry->d_fsdata;
> @@ -246,9 +253,11 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
>  	}
>  	if (ufs->config.default_permissions)
>  		seq_puts(m, ",default_permissions");
> -	if (ufs->config.redirect_dir != ovl_redirect_dir_def)
> +	if (ufs->config.redirect_dir != ovl_redirect_dir_def ||
> +	    ufs->config.redirect_fh != ovl_redirect_fh_def)
>  		seq_printf(m, ",redirect_dir=%s",
> -			   ufs->config.redirect_dir ? "on" : "off");
> +			   !ufs->config.redirect_dir ? "off" :
> +			   ufs->config.redirect_fh ? "fh" : "on");
>  	return 0;
>  }
>  
> @@ -278,6 +287,7 @@ enum {
>  	OPT_DEFAULT_PERMISSIONS,
>  	OPT_REDIRECT_DIR_ON,
>  	OPT_REDIRECT_DIR_OFF,
> +	OPT_REDIRECT_DIR_FH,
>  	OPT_ERR,
>  };
>  
> @@ -288,6 +298,7 @@ static const match_table_t ovl_tokens = {
>  	{OPT_DEFAULT_PERMISSIONS,	"default_permissions"},
>  	{OPT_REDIRECT_DIR_ON,		"redirect_dir=on"},
>  	{OPT_REDIRECT_DIR_OFF,		"redirect_dir=off"},
> +	{OPT_REDIRECT_DIR_FH,		"redirect_dir=fh"},
>  	{OPT_ERR,			NULL}
>  };
>  
> @@ -354,10 +365,17 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
>  
>  		case OPT_REDIRECT_DIR_ON:
>  			config->redirect_dir = true;
> +			config->redirect_fh = false;
>  			break;

Is this ->redirect_fh=false declarations needed? We do ufs allocation
using kzalloc() and that means all the fields are 0 value.
>  
>  		case OPT_REDIRECT_DIR_OFF:
>  			config->redirect_dir = false;
> +			config->redirect_fh = false;
> +			break;
> +
> +		case OPT_REDIRECT_DIR_FH:
> +			config->redirect_dir = true;
> +			config->redirect_fh = true;
>  			break;
>  
>  		default:
> @@ -754,6 +772,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  
>  	init_waitqueue_head(&ufs->copyup_wq);
>  	ufs->config.redirect_dir = ovl_redirect_dir_def;
> +	ufs->config.redirect_fh = ovl_redirect_fh_def;
>  	err = ovl_parse_opt((char *) data, &ufs->config);
>  	if (err)
>  		goto out_free_config;
> @@ -934,6 +953,11 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  	else
>  		sb->s_d_op = &ovl_dentry_operations;
>  
> +	/* Redirect by fhandle only if all layers on same sb with export ops */
> +	if (!ufs->same_sb || !ufs->same_sb->s_export_op ||
> +	    !ufs->same_sb->s_export_op->fh_to_dentry)
> +		ufs->config.redirect_fh = false;
> +
>  	ufs->creator_cred = cred = prepare_creds();
>  	if (!cred)
>  		goto out_put_lower_mnt;
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index dcebef0..17530c5 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -215,6 +215,20 @@ void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect)
>  	oe->redirect = redirect;
>  }
>  
> +bool ovl_redirect_fh(struct super_block *sb)
> +{
> +	struct ovl_fs *ofs = sb->s_fs_info;
> +
> +	return ofs->config.redirect_fh;
> +}
> +
> +void ovl_clear_redirect_fh(struct super_block *sb)
> +{
> +	struct ovl_fs *ofs = sb->s_fs_info;
> +
> +	ofs->config.redirect_fh = false;
> +}
> +
>  void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry)
>  {
>  	struct ovl_entry *oe = dentry->d_fsdata;
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 02/13] ovl: redirect dir by file handle on copy up
  2017-04-17 19:49   ` Vivek Goyal
@ 2017-04-17 21:14     ` Amir Goldstein
  0 siblings, 0 replies; 43+ messages in thread
From: Amir Goldstein @ 2017-04-17 21:14 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Miklos Szeredi, Al Viro, linux-unionfs, linux-fsdevel

On Mon, Apr 17, 2017 at 10:49 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Mon, Apr 17, 2017 at 02:59:32AM +0300, Amir Goldstein wrote:
>> When mounted with mount option redirect_dir=fh,
>> every copy up of lower directory stores the lower
>> dir file handle in overlay.fh xattr of upper dir.
>>
>> This method has some advantages over absolute path redirect:
>> - it is more compact in stored xattr size
>> - it is not limited by lengths of full paths
>> - lookup redirect is more efficient for very nested directories
>>
>> It also has some disadvantages over absolute path redirect:
>> - it requires that all lower layers are on the same file system,
>>   which support exportfs ops
>> - file handles will become stale if overlay lower directories
>>   were to be copied to another location
>>
>> Therefore, redirect by file handle is considered an optimization
>> and file handle is stored on copy up in addition to setting
>> redirect by path on rename.
>
> Hi Amir,
>
> If this is just an optimization to redirect, then why not store it
> only during rename (like redirct path).
>
> Given you are storing it all the time for merged dir during copy up,
> looks like you plan to use it for other use cases too. May be it
> makes sense to either mention those use cases here or in this patch
> just create file handle over rename and then in later patches change
> it with explanation.
>
> Or I missed the intent completely?
>


Hi Vivek,

Patches 2-3 (redirect fh for dirs) are resend of patches I have been
carrying for a while now for redirect dir of snapshots, so the commit
message is a bit out dated.
I guess it makes some sense to store fh on rename in this patch
and change it to store on copy up in patch 4, but I'll only make
this cosmetic effort if Miklos insists ;-)
Probably best if I just squash patch 4 into patch 2 and describe
the config option/module param/mount option for what they
are used for in this patch set and not only the redirect dir optimization.

>>
>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>> ---
>>  fs/overlayfs/Kconfig     | 16 +++++++++
>>  fs/overlayfs/copy_up.c   | 87 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  fs/overlayfs/overlayfs.h | 15 +++++++++
>>  fs/overlayfs/ovl_entry.h |  1 +
>>  fs/overlayfs/super.c     | 28 ++++++++++++++--
>>  fs/overlayfs/util.c      | 14 ++++++++
>>  6 files changed, 159 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
>> index 0daac51..351b61a 100644
>> --- a/fs/overlayfs/Kconfig
>> +++ b/fs/overlayfs/Kconfig
>> @@ -22,3 +22,19 @@ config OVERLAY_FS_REDIRECT_DIR
>>         Note, that redirects are not backward compatible.  That is, mounting
>>         an overlay which has redirects on a kernel that doesn't support this
>>         feature will have unexpected results.
>> +
>> +config OVERLAY_FS_REDIRECT_FH
>> +     bool "Overlayfs: turn on redirect by file handle by default"
>> +     depends on OVERLAY_FS_REDIRECT_DIR
>> +     help
>> +       If this config option is enabled then overlay filesystems will use
>> +       redirect by file handle for merged directories by default.  It is
>> +       also possible to turn on redirect by file handle globally with the
>> +       "redirect_fh=on" module option or on a filesystem instance basis
>> +       with the "redirect_dir=fh" mount option.
>> +
>> +       Redirect by file handle provides faster lookup compared to redirect
>> +       by absolute path, but it only works when all layers are on the same
>> +       underlying file system that supports NFS export ops.  Redirect by
>> +       file handle breaks if layers are taken offline and copied to another
>> +       location.  In that case, lookup falls back to redirect by path.
>> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
>> index 906ea6c..428dc26 100644
>> --- a/fs/overlayfs/copy_up.c
>> +++ b/fs/overlayfs/copy_up.c
>> @@ -20,6 +20,7 @@
>>  #include <linux/namei.h>
>>  #include <linux/fdtable.h>
>>  #include <linux/ratelimit.h>
>> +#include <linux/exportfs.h>
>>  #include "overlayfs.h"
>>  #include "ovl_entry.h"
>>
>> @@ -232,6 +233,80 @@ int ovl_set_attr(struct dentry *upperdentry, struct kstat *stat)
>>       return err;
>>  }
>>
>> +static struct ovl_fh *ovl_get_redirect_fh(struct dentry *lower)
>> +{
>> +     const struct export_operations *nop = lower->d_sb->s_export_op;
>> +     struct ovl_fh *fh;
>> +     int fh_type, fh_len, dwords;
>> +     void *buf, *ret;
>> +     int buflen = MAX_HANDLE_SZ;
>> +
>> +     /* Do not encode file handle if we cannot decode it later */
>> +     if (!nop || !nop->fh_to_dentry) {
>> +             pr_info("overlay: redirect by file handle not supported "
>> +                     "by lower - turning off redirect\n");
>
> This message probably should go where you are actually turning off the
> feature?
>

This test should probably go away, or turn into a WARN_ON, because
I added a check for export ops in all layers at mount time.

>> +             return ERR_PTR(-EOPNOTSUPP);
>> +     }
>> +
>> +     buf = kmalloc(buflen, GFP_TEMPORARY);
>> +     if (!buf)
>> +             return ERR_PTR(-ENOMEM);
>> +
>> +     fh = buf;
>> +     dwords = (buflen - offsetof(struct ovl_fh, fid)) >> 2;
>> +     fh_type = exportfs_encode_fh(lower,
>> +                                  (struct fid *)fh->fid,
>> +                                  &dwords, 0);
>> +     fh_len = (dwords << 2) + offsetof(struct ovl_fh, fid);
>> +
>> +     ret = ERR_PTR(-EOVERFLOW);
>> +     if (fh_len > buflen || fh_type <= 0 || fh_type == FILEID_INVALID)
>> +             goto out;
>> +
>> +     fh->version = OVL_FH_VERSION;
>> +     fh->magic = OVL_FH_MAGIC;
>> +     fh->type = fh_type;
>> +     fh->len = fh_len;
>> +
>> +     ret = kmalloc(fh_len, GFP_KERNEL);
>> +     if (!ret) {
>> +             ret = ERR_PTR(-ENOMEM);
>> +             goto out;
>> +     }
>> +
>> +     memcpy(ret, buf, fh_len);
>> +out:
>> +     kfree(buf);
>> +     return ret;
>> +}
>> +
>> +static int ovl_set_redirect_fh(struct dentry *dentry, struct dentry *upper)
>> +{
>> +     int err;
>> +     const struct ovl_fh *fh;
>> +
>> +     fh = ovl_get_redirect_fh(ovl_dentry_lower(dentry));
>> +     err = PTR_ERR(fh);
>> +     if (IS_ERR(fh))
>> +             goto out_err;
>> +
>> +     err = ovl_do_setxattr(upper, OVL_XATTR_FH, fh, fh->len, 0);
>> +     if (err)
>> +             goto out_free;
>> +
>> +     return 0;
>> +
>> +out_free:
>> +     kfree(fh);
>> +out_err:
>> +     if (err == -EOPNOTSUPP) {
>> +             ovl_clear_redirect_fh(dentry->d_sb);
>> +             return 0;
>> +     }
>> +     pr_warn_ratelimited("overlay: failed to set redirect fh (%i)\n", err);
>> +     return err;
>> +}
>> +
>>  static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
>>                             struct dentry *dentry, struct path *lowerpath,
>>                             struct kstat *stat, const char *link,
>> @@ -316,6 +391,18 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
>>       if (err)
>>               goto out_cleanup;
>>
>> +     if (S_ISDIR(stat->mode) &&
>> +         ovl_redirect_dir(dentry->d_sb) &&
>> +         ovl_redirect_fh(dentry->d_sb)) {
>
> Do we need to check for redirect_dir. Looks like redirect_dir has to
> be enabled if redirect_fh is enabled?
>

No, we don't. I removed it in patch 4.
I am waiting for feedback from Miklos on my choice of module parameter
name/type and the mount options redirect_dir=off/on/fh
(it made sense when I wrote this patch but maybe not anymore)
I'll sort this out after we agree on these details.

>> +             /*
>> +              * Store file handle of lower dir in upper dir xattr to
>> +              * create a chain of file handles for merged dir stack
>> +              */
>> +             err = ovl_set_redirect_fh(dentry, temp);
>> +             if (err)
>> +                     goto out_cleanup;
>> +     }
>> +
>>       if (tmpfile)
>>               err = ovl_do_link(temp, udir, upper, true);
>>       else
>> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
>> index c851158..49be199 100644
>> --- a/fs/overlayfs/overlayfs.h
>> +++ b/fs/overlayfs/overlayfs.h
>> @@ -20,6 +20,7 @@ enum ovl_path_type {
>>  #define OVL_XATTR_PREFIX XATTR_TRUSTED_PREFIX "overlay."
>>  #define OVL_XATTR_OPAQUE OVL_XATTR_PREFIX "opaque"
>>  #define OVL_XATTR_REDIRECT OVL_XATTR_PREFIX "redirect"
>> +#define OVL_XATTR_FH OVL_XATTR_PREFIX "fh"
>>
>>  #define OVL_ISUPPER_MASK 1UL
>>
>> @@ -146,6 +147,18 @@ static inline struct inode *ovl_inode_real(struct inode *inode, bool *is_upper)
>>       return (struct inode *) (x & ~OVL_ISUPPER_MASK);
>>  }
>>
>> +/* redirect data format for redirect by file handle */
>> +struct ovl_fh {
>> +     unsigned char version;  /* 0 */
>> +     unsigned char magic;    /* 0xfb */
>> +     unsigned char len;      /* size of this header + size of fid */
>> +     unsigned char type;     /* fid_type of fid */
>> +     unsigned char fid[0];   /* file identifier */
>> +} __packed;
>> +
>> +#define OVL_FH_VERSION       0
>> +#define OVL_FH_MAGIC 0xfb
>> +
>>  /* util.c */
>>  int ovl_want_write(struct dentry *dentry);
>>  void ovl_drop_write(struct dentry *dentry);
>> @@ -171,6 +184,8 @@ bool ovl_redirect_dir(struct super_block *sb);
>>  void ovl_clear_redirect_dir(struct super_block *sb);
>>  const char *ovl_dentry_get_redirect(struct dentry *dentry);
>>  void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
>> +bool ovl_redirect_fh(struct super_block *sb);
>> +void ovl_clear_redirect_fh(struct super_block *sb);
>>  void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry);
>>  void ovl_inode_init(struct inode *inode, struct inode *realinode,
>>                   bool is_upper);
>> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
>> index e294f22..cf22992 100644
>> --- a/fs/overlayfs/ovl_entry.h
>> +++ b/fs/overlayfs/ovl_entry.h
>> @@ -14,6 +14,7 @@ struct ovl_config {
>>       char *workdir;
>>       bool default_permissions;
>>       bool redirect_dir;
>> +     bool redirect_fh;
>>  };
>>
>>  /* private information held for overlayfs's superblock */
>> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
>> index 79500d9..3036b2d 100644
>> --- a/fs/overlayfs/super.c
>> +++ b/fs/overlayfs/super.c
>> @@ -17,6 +17,7 @@
>>  #include <linux/statfs.h>
>>  #include <linux/seq_file.h>
>>  #include <linux/posix_acl_xattr.h>
>> +#include <linux/exportfs.h>
>>  #include "overlayfs.h"
>>  #include "ovl_entry.h"
>>
>> @@ -34,6 +35,12 @@ module_param_named(redirect_dir, ovl_redirect_dir_def, bool, 0644);
>>  MODULE_PARM_DESC(ovl_redirect_dir_def,
>>                "Default to on or off for the redirect_dir feature");
>>
>> +static bool ovl_redirect_fh_def =
>> +         IS_ENABLED(CONFIG_OVERLAY_FS_REDIRECT_FH);
>> +module_param_named(redirect_fh, ovl_redirect_fh_def, bool, 0644);
>> +MODULE_PARM_DESC(ovl_redirect_fh_def,
>> +              "Default to on or off for redirect by file handle");
>> +
>>  static void ovl_dentry_release(struct dentry *dentry)
>>  {
>>       struct ovl_entry *oe = dentry->d_fsdata;
>> @@ -246,9 +253,11 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
>>       }
>>       if (ufs->config.default_permissions)
>>               seq_puts(m, ",default_permissions");
>> -     if (ufs->config.redirect_dir != ovl_redirect_dir_def)
>> +     if (ufs->config.redirect_dir != ovl_redirect_dir_def ||
>> +         ufs->config.redirect_fh != ovl_redirect_fh_def)
>>               seq_printf(m, ",redirect_dir=%s",
>> -                        ufs->config.redirect_dir ? "on" : "off");
>> +                        !ufs->config.redirect_dir ? "off" :
>> +                        ufs->config.redirect_fh ? "fh" : "on");
>>       return 0;
>>  }
>>
>> @@ -278,6 +287,7 @@ enum {
>>       OPT_DEFAULT_PERMISSIONS,
>>       OPT_REDIRECT_DIR_ON,
>>       OPT_REDIRECT_DIR_OFF,
>> +     OPT_REDIRECT_DIR_FH,
>>       OPT_ERR,
>>  };
>>
>> @@ -288,6 +298,7 @@ static const match_table_t ovl_tokens = {
>>       {OPT_DEFAULT_PERMISSIONS,       "default_permissions"},
>>       {OPT_REDIRECT_DIR_ON,           "redirect_dir=on"},
>>       {OPT_REDIRECT_DIR_OFF,          "redirect_dir=off"},
>> +     {OPT_REDIRECT_DIR_FH,           "redirect_dir=fh"},
>>       {OPT_ERR,                       NULL}
>>  };
>>
>> @@ -354,10 +365,17 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
>>
>>               case OPT_REDIRECT_DIR_ON:
>>                       config->redirect_dir = true;
>> +                     config->redirect_fh = false;
>>                       break;
>
> Is this ->redirect_fh=false declarations needed? We do ufs allocation
> using kzalloc() and that means all the fields are 0 value.

Unless some bozo comes along and does

mount -t overlay -o redirect_dir=fh,redirect_dir=on ...

We are not preventing this, we don't need to.

>>
>>               case OPT_REDIRECT_DIR_OFF:
>>                       config->redirect_dir = false;
>> +                     config->redirect_fh = false;
>> +                     break;
>> +
>> +             case OPT_REDIRECT_DIR_FH:
>> +                     config->redirect_dir = true;
>> +                     config->redirect_fh = true;
>>                       break;
>>
>>               default:
>> @@ -754,6 +772,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>>
>>       init_waitqueue_head(&ufs->copyup_wq);
>>       ufs->config.redirect_dir = ovl_redirect_dir_def;
>> +     ufs->config.redirect_fh = ovl_redirect_fh_def;
>>       err = ovl_parse_opt((char *) data, &ufs->config);
>>       if (err)
>>               goto out_free_config;
>> @@ -934,6 +953,11 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>>       else
>>               sb->s_d_op = &ovl_dentry_operations;
>>
>> +     /* Redirect by fhandle only if all layers on same sb with export ops */
>> +     if (!ufs->same_sb || !ufs->same_sb->s_export_op ||
>> +         !ufs->same_sb->s_export_op->fh_to_dentry)
>> +             ufs->config.redirect_fh = false;
>> +
>>       ufs->creator_cred = cred = prepare_creds();
>>       if (!cred)
>>               goto out_put_lower_mnt;
>> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
>> index dcebef0..17530c5 100644
>> --- a/fs/overlayfs/util.c
>> +++ b/fs/overlayfs/util.c
>> @@ -215,6 +215,20 @@ void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect)
>>       oe->redirect = redirect;
>>  }
>>
>> +bool ovl_redirect_fh(struct super_block *sb)
>> +{
>> +     struct ovl_fs *ofs = sb->s_fs_info;
>> +
>> +     return ofs->config.redirect_fh;
>> +}
>> +
>> +void ovl_clear_redirect_fh(struct super_block *sb)
>> +{
>> +     struct ovl_fs *ofs = sb->s_fs_info;
>> +
>> +     ofs->config.redirect_fh = false;
>> +}
>> +
>>  void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry)
>>  {
>>       struct ovl_entry *oe = dentry->d_fsdata;
>> --
>> 2.7.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 03/13] ovl: lookup redirect by file handle
  2017-04-16 23:59 ` [RFC][PATCH 03/13] ovl: lookup redirect by file handle Amir Goldstein
@ 2017-04-18 13:05   ` Vivek Goyal
  2017-04-18 14:05     ` Amir Goldstein
  0 siblings, 1 reply; 43+ messages in thread
From: Vivek Goyal @ 2017-04-18 13:05 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, Al Viro, linux-unionfs, linux-fsdevel

On Mon, Apr 17, 2017 at 02:59:33AM +0300, Amir Goldstein wrote:

[..]
> @@ -272,6 +385,33 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  			goto out_put_upper;
>  	}
>  
> +	/* Try to lookup lower layers by file handle (at root) */
> +	d.by_path = false;
> +	for (i = 0; !d.stop && d.fh && i < roe->numlower; i++) {
> +		struct path lowerpath = roe->lowerstack[i];
> +
> +		d.last = i == roe->numlower - 1;
> +		err = ovl_lookup_layer_fh(&lowerpath, &d, &this);
> +		if (err)
> +			goto out_put;
> +
> +		if (!this)
> +			continue;
> +
> +		stack[ctr].dentry = this;
> +		stack[ctr].mnt = lowerpath.mnt;
> +		ctr++;
> +	}
> +
> +	/* Fallback to lookup lower layers by name (at parent) */
> +	if (ctr) {
> +		d.stop = true;

Hi Amir,

Got a very basic question. So say I two lower layers and a directory is
in present in both, say lower1/dir1 and lower2/dir1. Now this directory
gets copied up to upper/dir1. Assume lower1/dir1 is being copied up. So
upper/dir1 will save file handle of lower1/dir1 right? This file handle
does not represent lower2/dir1?

IOW, later when I am doing lookup, then using file handle I will find
dentry of lower1/dir1 but not lower2/dir1. And looks like we will not
path based lookup as ctr will be non-zero (as we found one dentry in
one path).

What am I missing.

(This ovl_lookup() logic has become really twisted now. I wished it was
 little easier to read.)

Vivek

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 03/13] ovl: lookup redirect by file handle
  2017-04-18 13:05   ` Vivek Goyal
@ 2017-04-18 14:05     ` Amir Goldstein
  2017-04-18 18:32       ` Vivek Goyal
  0 siblings, 1 reply; 43+ messages in thread
From: Amir Goldstein @ 2017-04-18 14:05 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Miklos Szeredi, Al Viro, linux-unionfs, linux-fsdevel

On Tue, Apr 18, 2017 at 4:05 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Mon, Apr 17, 2017 at 02:59:33AM +0300, Amir Goldstein wrote:
>
> [..]
>> @@ -272,6 +385,33 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>>                       goto out_put_upper;
>>       }
>>
>> +     /* Try to lookup lower layers by file handle (at root) */
>> +     d.by_path = false;
>> +     for (i = 0; !d.stop && d.fh && i < roe->numlower; i++) {
>> +             struct path lowerpath = roe->lowerstack[i];
>> +
>> +             d.last = i == roe->numlower - 1;
>> +             err = ovl_lookup_layer_fh(&lowerpath, &d, &this);
>> +             if (err)
>> +                     goto out_put;
>> +
>> +             if (!this)
>> +                     continue;
>> +
>> +             stack[ctr].dentry = this;
>> +             stack[ctr].mnt = lowerpath.mnt;
>> +             ctr++;
>> +     }
>> +
>> +     /* Fallback to lookup lower layers by name (at parent) */
>> +     if (ctr) {
>> +             d.stop = true;
>
> Hi Amir,
>
> Got a very basic question. So say I two lower layers and a directory is
> in present in both, say lower1/dir1 and lower2/dir1. Now this directory
> gets copied up to upper/dir1. Assume lower1/dir1 is being copied up. So
> upper/dir1 will save file handle of lower1/dir1 right? This file handle
> does not represent lower2/dir1?
>
> IOW, later when I am doing lookup, then using file handle I will find
> dentry of lower1/dir1 but not lower2/dir1. And looks like we will not
> path based lookup as ctr will be non-zero (as we found one dentry in
> one path).
>
> What am I missing.
>

You are not missing anything - I am.

The reason I did not bump into this is because I have only 2 testing modes
with unionmount testsuite:
- upper layers recycled to lower layers without redirect fh (./run --ov=N)
- upper layers recycled to lower layers with redirect fh (./run --ov=N --samefs)

I have no tests that compose lower layers without redirect_fh and
upper layer with redirect_fh.
So in my tests, lower1/dir1 will always have a redirect_fh to lower2/dir1.

One way I consider adressing this issue is:
When mounting redirect_fh, check that all lowerstack[i] has a redirect_fh
to lowerstack[i+1] and set a redirect_fh to lowerstack[0] from upperdir.
If any of the redirect_fh are missing or stale (because lowerdirs were copied)
disable redirect_fh.

I think that any attempt to fallback from lookup by fh to lookup by path
is going to be risky and I wish to eliminate the per lookup fallback.
When redirect_fh does not fallback to lookup by path, then the semantics
of "implicit opaque" are always correct and there is no longer a need to
check directories for opaque xattr explicitly (the lack of redirect fh xattr
means opaque).

> (This ovl_lookup() logic has become really twisted now. I wished it was
>  little easier to read.)
>

Me too. IMO, most of the complexity is in the fallbacks from redirect
by fh to full path to relative path. Eliminating some of these fallbacks
and maybe having separate lookup op per mode, may simplify the code.

Amir.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 03/13] ovl: lookup redirect by file handle
  2017-04-18 14:05     ` Amir Goldstein
@ 2017-04-18 18:32       ` Vivek Goyal
  2017-04-18 18:57         ` Amir Goldstein
  0 siblings, 1 reply; 43+ messages in thread
From: Vivek Goyal @ 2017-04-18 18:32 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, Al Viro, linux-unionfs, linux-fsdevel

On Tue, Apr 18, 2017 at 05:05:12PM +0300, Amir Goldstein wrote:

[..]
> > (This ovl_lookup() logic has become really twisted now. I wished it was
> >  little easier to read.)
> >
> 
> Me too. IMO, most of the complexity is in the fallbacks from redirect
> by fh to full path to relative path. Eliminating some of these fallbacks
> and maybe having separate lookup op per mode, may simplify the code.

In general, why are we falling back to path based lookup. If lower dirs
were copied or something else, that's a configuration error and overlayfs
will have undefined behavior?

Vivek

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 00/13] overlayfs stable inodes
  2017-04-16 23:59 [RFC][PATCH 00/13] overlayfs stable inodes Amir Goldstein
                   ` (12 preceding siblings ...)
  2017-04-16 23:59 ` [RFC][PATCH 13/13] ovl: try to hardlink upper on copy up of lower hardlinks Amir Goldstein
@ 2017-04-18 18:37 ` Amir Goldstein
  2017-04-19  9:16   ` Miklos Szeredi
  13 siblings, 1 reply; 43+ messages in thread
From: Amir Goldstein @ 2017-04-18 18:37 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, linux-fsdevel, Vivek Goyal

On Mon, Apr 17, 2017 at 2:59 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> Overlayfs inodes are considered unstable in several aspects,
> because on a copy up event:
> 1. st_ino can change
> 2. st_dev can change
> 3. hardlinks are broken
> 4. NFS handle would become stale
> 5. content of read-only file descriptor would become stale
>
> This patch set 'stabilizes' overlayfs inodes w.r.t. st_ino/st_dev
> and takes some big steps in the direction of stabilizing hardlinks
> and NFS handles.
>

I realized I forgot to mention in the cover letter that stable inodes
are only available for the overlay configuration where all layers
are on the same underlying fs and that underlying fs support
NFS export (I think all eligible upper fs support NFS export anyway).

Following some questions from Vivek, here is another thing worth
mentioning - when overlayfs layers are copied to another location
(or another machine) the old redirect file handles become stale.
In that case:
- All overlayfs content should still be available (i.e. merge dirs)
- Copy up of hardlinks that already have upper aliases will end
  up with broken hardlinks
- Except for the hardlink case above, NFS handles from the
  new overlay mount cannot become stale on copy up
- Inode numbers on the new overlay mount will be constant


> The full work is available at [1] and includes also an untested
> WIP for NFS export support.
>
> This work relies heavily on inode and dentry cache - so long as
> the overlayfs dcache is fully populated, hardlinks should never be
> broken and NFS handles should never become stale on copy up.
>
> The missing piece of the puzzle is a persistent mapping from
> lower entries to upper entries, so that overlay entries could be
> reconstructed from their stable (lower) unique identifier.
>
> But the work has merit even without the missing piece, because it
> stabilizes st_ino/st_dev and prevents breaking hardlinks in many
> use cases.
>
> [1] https://github.com/amir73il/linux/commits/ovl-nfs-export
>
> Amir Goldstein (13):
>   ovl: check if all layers are on the same fs
>   ovl: redirect dir by file handle on copy up
>   ovl: lookup redirect by file handle
>   ovl: store file handle of stable inode
>   ovl: lookup stable inode by file handle
>   ovl: move inode helpers to inode.c
>   ovl: create helpers for initializing hashed inode
>   ovl: allow hashing non upper inodes
>   ovl: inherit overlay inode ino/generation from real inode
>   ovl: hash overlay inodes by stable inode
>   ovl: fix du --one-file-system on overlay mount
>   ovl: constant ino across copy up
>   ovl: try to hardlink upper on copy up of lower hardlinks
>
>  fs/overlayfs/Kconfig     |  16 ++++
>  fs/overlayfs/copy_up.c   | 130 ++++++++++++++++++++++++++-
>  fs/overlayfs/dir.c       |   2 +-
>  fs/overlayfs/inode.c     | 106 ++++++++++++++++++++--
>  fs/overlayfs/namei.c     | 226 ++++++++++++++++++++++++++++++++++++++++++-----
>  fs/overlayfs/overlayfs.h |  40 +++++++--
>  fs/overlayfs/ovl_entry.h |   4 +-
>  fs/overlayfs/readdir.c   |  85 ++++++++++++++++--
>  fs/overlayfs/super.c     |  37 +++++++-
>  fs/overlayfs/util.c      |  51 +++++++----
>  10 files changed, 633 insertions(+), 64 deletions(-)
>
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 03/13] ovl: lookup redirect by file handle
  2017-04-18 18:32       ` Vivek Goyal
@ 2017-04-18 18:57         ` Amir Goldstein
  2017-04-19 15:21           ` Miklos Szeredi
  0 siblings, 1 reply; 43+ messages in thread
From: Amir Goldstein @ 2017-04-18 18:57 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Miklos Szeredi, Al Viro, linux-unionfs, linux-fsdevel

On Tue, Apr 18, 2017 at 9:32 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Tue, Apr 18, 2017 at 05:05:12PM +0300, Amir Goldstein wrote:
>
> [..]
>> > (This ovl_lookup() logic has become really twisted now. I wished it was
>> >  little easier to read.)
>> >
>>
>> Me too. IMO, most of the complexity is in the fallbacks from redirect
>> by fh to full path to relative path. Eliminating some of these fallbacks
>> and maybe having separate lookup op per mode, may simplify the code.
>
> In general, why are we falling back to path based lookup. If lower dirs
> were copied or something else, that's a configuration error and overlayfs
> will have undefined behavior?
>

Heh, I just sent out a summary of what happens when lower dirs are copied.
The fallback is needed because when lower/upper dirs are copied
(which is a perfectly valid practice), the xattr are copied with them.
So the fallback is needed because ovl_lookup() find the fh on upper dir
and tries to follow it only to realize that it is stale - then it assumes that
layers were copied and resorts to path lookup.

My point earlier is that I wish to move this "was I copied?" test to mount
time instead of doing it on every lookup. But then user won't be able
to enjoy all the benefits of stable inodes after layers were copied, so
this is not ideal.

I guess what we could do is disable the redirect_dir by fh optimization
for directories if layers were copied and always follow non-dirs by fh.
That will eliminate the fallback and none of the stable inode features
would miss anything.
This should work well with the copied layers use case.

Amir.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 00/13] overlayfs stable inodes
  2017-04-18 18:37 ` [RFC][PATCH 00/13] overlayfs stable inodes Amir Goldstein
@ 2017-04-19  9:16   ` Miklos Szeredi
  2017-04-19 10:37     ` Amir Goldstein
  0 siblings, 1 reply; 43+ messages in thread
From: Miklos Szeredi @ 2017-04-19  9:16 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Al Viro, linux-unionfs, linux-fsdevel, Vivek Goyal

On Tue, Apr 18, 2017 at 8:37 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Mon, Apr 17, 2017 at 2:59 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> > Overlayfs inodes are considered unstable in several aspects,
> > because on a copy up event:
> > 1. st_ino can change
> > 2. st_dev can change
> > 3. hardlinks are broken
> > 4. NFS handle would become stale
> > 5. content of read-only file descriptor would become stale
> >
> > This patch set 'stabilizes' overlayfs inodes w.r.t. st_ino/st_dev
> > and takes some big steps in the direction of stabilizing hardlinks
> > and NFS handles.
> >
>
> I realized I forgot to mention in the cover letter that stable inodes
> are only available for the overlay configuration where all layers
> are on the same underlying fs and that underlying fs support
> NFS export (I think all eligible upper fs support NFS export anyway).

Hmm,  we could keep inode numbers stable across copy up even if layers
are on different filesystems:  just need to use a separate st_dev for
lower layers and keep st_dev and st_inode constant.  The only extra
thing needed compared to the samefs case is the allocation of dummy
device numbers for lower layers.  Of course "find -xdev" and the like
still won't work properly, and we wouldn't be able to provide sane
d_ino values in readdir.  But NFS export should actually work, since
we can encode the device number in the file handle.

Lets keep things simple for now, so those are just things to keep in
mind but the actual implementation can wait.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 00/13] overlayfs stable inodes
  2017-04-19  9:16   ` Miklos Szeredi
@ 2017-04-19 10:37     ` Amir Goldstein
  2017-04-19 13:52       ` Miklos Szeredi
  0 siblings, 1 reply; 43+ messages in thread
From: Amir Goldstein @ 2017-04-19 10:37 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, linux-fsdevel, Vivek Goyal

On Wed, Apr 19, 2017 at 12:16 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Tue, Apr 18, 2017 at 8:37 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>>
>> On Mon, Apr 17, 2017 at 2:59 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>> > Overlayfs inodes are considered unstable in several aspects,
>> > because on a copy up event:
>> > 1. st_ino can change
>> > 2. st_dev can change
>> > 3. hardlinks are broken
>> > 4. NFS handle would become stale
>> > 5. content of read-only file descriptor would become stale
>> >
>> > This patch set 'stabilizes' overlayfs inodes w.r.t. st_ino/st_dev
>> > and takes some big steps in the direction of stabilizing hardlinks
>> > and NFS handles.
>> >
>>
>> I realized I forgot to mention in the cover letter that stable inodes
>> are only available for the overlay configuration where all layers
>> are on the same underlying fs and that underlying fs support
>> NFS export (I think all eligible upper fs support NFS export anyway).
>
> Hmm,  we could keep inode numbers stable across copy up even if layers
> are on different filesystems:  just need to use a separate st_dev for
> lower layers and keep st_dev and st_inode constant.  The only extra
> thing needed compared to the samefs case is the allocation of dummy
> device numbers for lower layers.  Of course "find -xdev" and the like
> still won't work properly, and we wouldn't be able to provide sane
> d_ino values in readdir.

Not sure that is going to be worth the effort, but we'll see.
Anyway, not sure if you already read far enough into the series,
but the fact that overlay inodes are hashed by stable inode ino
helps solving a lot of the problems with minimal code changes,
so in the grand scheme of things, I think it would be easier to
say: same_fs can give you POSIX. non same_fs cannot.

> But NFS export should actually work, since
> we can encode the device number in the file handle.

That's true.
But it comes down to the question of are there any users for this.
I mean if we provide full POSIX and NFS export to same_fs case,
would it be interesting to provide NFS export and partial
constant inode numbers (not for d_ino) for non same_fs case?

>
> Lets keep things simple for now, so those are just things to keep in
> mind but the actual implementation can wait.
>
> Thanks,
> Miklos

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 00/13] overlayfs stable inodes
  2017-04-19 10:37     ` Amir Goldstein
@ 2017-04-19 13:52       ` Miklos Szeredi
  2017-04-19 14:46         ` Amir Goldstein
  0 siblings, 1 reply; 43+ messages in thread
From: Miklos Szeredi @ 2017-04-19 13:52 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Al Viro, linux-unionfs, linux-fsdevel, Vivek Goyal

On Wed, Apr 19, 2017 at 12:37 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Wed, Apr 19, 2017 at 12:16 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Tue, Apr 18, 2017 at 8:37 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>>>
>>> On Mon, Apr 17, 2017 at 2:59 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>>> > Overlayfs inodes are considered unstable in several aspects,
>>> > because on a copy up event:
>>> > 1. st_ino can change
>>> > 2. st_dev can change
>>> > 3. hardlinks are broken
>>> > 4. NFS handle would become stale
>>> > 5. content of read-only file descriptor would become stale
>>> >
>>> > This patch set 'stabilizes' overlayfs inodes w.r.t. st_ino/st_dev
>>> > and takes some big steps in the direction of stabilizing hardlinks
>>> > and NFS handles.
>>> >
>>>
>>> I realized I forgot to mention in the cover letter that stable inodes
>>> are only available for the overlay configuration where all layers
>>> are on the same underlying fs and that underlying fs support
>>> NFS export (I think all eligible upper fs support NFS export anyway).
>>
>> Hmm,  we could keep inode numbers stable across copy up even if layers
>> are on different filesystems:  just need to use a separate st_dev for
>> lower layers and keep st_dev and st_inode constant.  The only extra
>> thing needed compared to the samefs case is the allocation of dummy
>> device numbers for lower layers.  Of course "find -xdev" and the like
>> still won't work properly, and we wouldn't be able to provide sane
>> d_ino values in readdir.
>
> Not sure that is going to be worth the effort, but we'll see.
> Anyway, not sure if you already read far enough into the series,
> but the fact that overlay inodes are hashed by stable inode ino
> helps solving a lot of the problems with minimal code changes,
> so in the grand scheme of things, I think it would be easier to
> say: same_fs can give you POSIX. non same_fs cannot.

The effort should be small, and the reward is substantially less weird behavior.

In fact I looked and SUSv4
(http://pubs.opengroup.org/onlinepubs/9699919799/) only talks about
"mount point" in the context of directories.  It does *not* require
st_dev to be the as the st_dev of the parent directory for
non-directories.  The only requirement is that st_ino and st_dev
together uniquely identify a file in the system, which is why we need
to generate a dummy st_dev for lower files in this case.  It also
explicitly only talks about directories in the context of "-xdev" and
the like.

So even in the non-samefs case we could stamp it with "POSIX
compliant" because strictly speaking it is.

If that's not enough, I think we *can* do unified ino space even in
most non-samefs cases.  And here's why: look at the inode numbers of
any filesystem; they will always be "small" so we can just partition
the 64 bit ino space between layers and map inode numbers into its own
partition.  This does not work in the general case, and it is a hack.
But it's a very simple hack and it probably works fine. Similar thing
is assumed by the 32bit compat code, which just returns EOVERFLOW if
the ino happens to be too large, which I guess doesn't happen too
often for most filesystems...

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 00/13] overlayfs stable inodes
  2017-04-19 13:52       ` Miklos Szeredi
@ 2017-04-19 14:46         ` Amir Goldstein
  2017-04-19 15:01           ` Miklos Szeredi
  0 siblings, 1 reply; 43+ messages in thread
From: Amir Goldstein @ 2017-04-19 14:46 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, linux-fsdevel, Vivek Goyal

On Wed, Apr 19, 2017 at 4:52 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Wed, Apr 19, 2017 at 12:37 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>> On Wed, Apr 19, 2017 at 12:16 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>> On Tue, Apr 18, 2017 at 8:37 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>>>>
>>>> On Mon, Apr 17, 2017 at 2:59 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>>>> > Overlayfs inodes are considered unstable in several aspects,
>>>> > because on a copy up event:
>>>> > 1. st_ino can change
>>>> > 2. st_dev can change
>>>> > 3. hardlinks are broken
>>>> > 4. NFS handle would become stale
>>>> > 5. content of read-only file descriptor would become stale
>>>> >
>>>> > This patch set 'stabilizes' overlayfs inodes w.r.t. st_ino/st_dev
>>>> > and takes some big steps in the direction of stabilizing hardlinks
>>>> > and NFS handles.
>>>> >
>>>>
>>>> I realized I forgot to mention in the cover letter that stable inodes
>>>> are only available for the overlay configuration where all layers
>>>> are on the same underlying fs and that underlying fs support
>>>> NFS export (I think all eligible upper fs support NFS export anyway).
>>>
>>> Hmm,  we could keep inode numbers stable across copy up even if layers
>>> are on different filesystems:  just need to use a separate st_dev for
>>> lower layers and keep st_dev and st_inode constant.  The only extra
>>> thing needed compared to the samefs case is the allocation of dummy
>>> device numbers for lower layers.  Of course "find -xdev" and the like
>>> still won't work properly, and we wouldn't be able to provide sane
>>> d_ino values in readdir.
>>
>> Not sure that is going to be worth the effort, but we'll see.
>> Anyway, not sure if you already read far enough into the series,
>> but the fact that overlay inodes are hashed by stable inode ino
>> helps solving a lot of the problems with minimal code changes,
>> so in the grand scheme of things, I think it would be easier to
>> say: same_fs can give you POSIX. non same_fs cannot.
>
> The effort should be small, and the reward is substantially less weird behavior.
>
> In fact I looked and SUSv4
> (http://pubs.opengroup.org/onlinepubs/9699919799/) only talks about
> "mount point" in the context of directories.  It does *not* require
> st_dev to be the as the st_dev of the parent directory for
> non-directories.  The only requirement is that st_ino and st_dev
> together uniquely identify a file in the system, which is why we need
> to generate a dummy st_dev for lower files in this case.  It also
> explicitly only talks about directories in the context of "-xdev" and
> the like.
>

I did see that 'find' does list files from overlay which have differnt
st_dev than parent, but 'du -x' does not count the files, which should
be very annoying to users. I'm surprised I haven't heard about this.

> So even in the non-samefs case we could stamp it with "POSIX
> compliant" because strictly speaking it is.
>
> If that's not enough, I think we *can* do unified ino space even in
> most non-samefs cases.  And here's why: look at the inode numbers of
> any filesystem; they will always be "small" so we can just partition
> the 64 bit ino space between layers and map inode numbers into its own
> partition.  This does not work in the general case, and it is a hack.
> But it's a very simple hack and it probably works fine. Similar thing
> is assumed by the 32bit compat code, which just returns EOVERFLOW if
> the ino happens to be too large, which I guess doesn't happen too
> often for most filesystems...
>

Well, if you are lucky you can run into a filesystem that exports
a file handle of type FILEID_INO32_GEN, then you *know* you're
good to go. ext* will do that and xfs that was forever mounted with
-o inode32.
Even with xfs -o inode64, it will not use the MSB ino bits unless
you are in the exabytes fs sizes.

Anyway, I will keep that in the back on my mind when working
on stable inode to keep the implementation open for such
improvements in the future.

Amir.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 00/13] overlayfs stable inodes
  2017-04-19 14:46         ` Amir Goldstein
@ 2017-04-19 15:01           ` Miklos Szeredi
  2017-04-19 15:17             ` Amir Goldstein
  0 siblings, 1 reply; 43+ messages in thread
From: Miklos Szeredi @ 2017-04-19 15:01 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Al Viro, linux-unionfs, linux-fsdevel, Vivek Goyal

On Wed, Apr 19, 2017 at 4:46 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> Well, if you are lucky you can run into a filesystem that exports
> a file handle of type FILEID_INO32_GEN, then you *know* you're
> good to go. ext* will do that and xfs that was forever mounted with
> -o inode32.
> Even with xfs -o inode64, it will not use the MSB ino bits unless
> you are in the exabytes fs sizes.

Could filesystems export a max-ino property in their sb?  That would
help with doing this properly.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 02/13] ovl: redirect dir by file handle on copy up
  2017-04-16 23:59 ` [RFC][PATCH 02/13] ovl: redirect dir by file handle on copy up Amir Goldstein
  2017-04-17 13:33   ` Rock Lee
  2017-04-17 19:49   ` Vivek Goyal
@ 2017-04-19 15:16   ` Miklos Szeredi
  2017-04-19 15:27     ` Amir Goldstein
  2 siblings, 1 reply; 43+ messages in thread
From: Miklos Szeredi @ 2017-04-19 15:16 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Al Viro, linux-unionfs, linux-fsdevel

On Mon, Apr 17, 2017 at 1:59 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> When mounted with mount option redirect_dir=fh,
> every copy up of lower directory stores the lower
> dir file handle in overlay.fh xattr of upper dir.

Is there a disadvantage to doing this unconditionally? (Same goes for patch 4)

Is there a disadvantage to doing this even for the non-samefs case?

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 00/13] overlayfs stable inodes
  2017-04-19 15:01           ` Miklos Szeredi
@ 2017-04-19 15:17             ` Amir Goldstein
  2017-04-19 22:58               ` Darrick J. Wong
  0 siblings, 1 reply; 43+ messages in thread
From: Amir Goldstein @ 2017-04-19 15:17 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Al Viro, linux-unionfs, linux-fsdevel, Vivek Goyal, Darrick J. Wong

On Wed, Apr 19, 2017 at 6:01 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Wed, Apr 19, 2017 at 4:46 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>> Well, if you are lucky you can run into a filesystem that exports
>> a file handle of type FILEID_INO32_GEN, then you *know* you're
>> good to go. ext* will do that and xfs that was forever mounted with
>> -o inode32.
>> Even with xfs -o inode64, it will not use the MSB ino bits unless
>> you are in the exabytes fs sizes.
>
> Could filesystems export a max-ino property in their sb?  That would
> help with doing this properly.
>

Sounds reasonable, but as max-ino usually derived from filesystem size
and filesystems can grow size online, you will need to query both the
'soft' ino limit (without growing fs) and the 'hard' ino limit.

Darrick,

Are there bits in GETFSMAP to provide this info?

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 03/13] ovl: lookup redirect by file handle
  2017-04-18 18:57         ` Amir Goldstein
@ 2017-04-19 15:21           ` Miklos Szeredi
  2017-04-19 15:35             ` Amir Goldstein
  0 siblings, 1 reply; 43+ messages in thread
From: Miklos Szeredi @ 2017-04-19 15:21 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Vivek Goyal, Al Viro, linux-unionfs, linux-fsdevel

On Tue, Apr 18, 2017 at 8:57 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Tue, Apr 18, 2017 at 9:32 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> On Tue, Apr 18, 2017 at 05:05:12PM +0300, Amir Goldstein wrote:
>>
>> [..]
>>> > (This ovl_lookup() logic has become really twisted now. I wished it was
>>> >  little easier to read.)
>>> >
>>>
>>> Me too. IMO, most of the complexity is in the fallbacks from redirect
>>> by fh to full path to relative path. Eliminating some of these fallbacks
>>> and maybe having separate lookup op per mode, may simplify the code.
>>
>> In general, why are we falling back to path based lookup. If lower dirs
>> were copied or something else, that's a configuration error and overlayfs
>> will have undefined behavior?
>>
>
> Heh, I just sent out a summary of what happens when lower dirs are copied.
> The fallback is needed because when lower/upper dirs are copied
> (which is a perfectly valid practice), the xattr are copied with them.
> So the fallback is needed because ovl_lookup() find the fh on upper dir
> and tries to follow it only to realize that it is stale - then it assumes that
> layers were copied and resorts to path lookup.
>
> My point earlier is that I wish to move this "was I copied?" test to mount
> time instead of doing it on every lookup. But then user won't be able
> to enjoy all the benefits of stable inodes after layers were copied, so
> this is not ideal.

How so?  Nobody is expecting the copy to have the same inode numbers
as the original.

So it's fine if the relation to the original lower layer file is
broken, we don't care about that anymore.  Except for the hard link
case, but that relation can be preserved in the reverse mapping.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 02/13] ovl: redirect dir by file handle on copy up
  2017-04-19 15:16   ` Miklos Szeredi
@ 2017-04-19 15:27     ` Amir Goldstein
  2017-04-19 15:33       ` Miklos Szeredi
  2017-04-20  8:55       ` Amir Goldstein
  0 siblings, 2 replies; 43+ messages in thread
From: Amir Goldstein @ 2017-04-19 15:27 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, linux-fsdevel

On Wed, Apr 19, 2017 at 6:16 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Mon, Apr 17, 2017 at 1:59 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>> When mounted with mount option redirect_dir=fh,
>> every copy up of lower directory stores the lower
>> dir file handle in overlay.fh xattr of upper dir.
>
> Is there a disadvantage to doing this unconditionally? (Same goes for patch 4)
>
> Is there a disadvantage to doing this even for the non-samefs case?
>

Not that I can think of. It's just that all the features that stable
inode brings
to the table only work for same_fs right now, but we can store overlay.fh
anyway.

When you get a chance please let me know your preference wrt
configuration options.
I don't see a reason to let the user turn off storing overlay.fh?
Is there a reason to let user turn off lookup by fh? for files?
(i.e. to preserve old weird behavior?)

Amir.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 02/13] ovl: redirect dir by file handle on copy up
  2017-04-19 15:27     ` Amir Goldstein
@ 2017-04-19 15:33       ` Miklos Szeredi
  2017-04-19 15:43         ` Amir Goldstein
  2017-04-20  8:55       ` Amir Goldstein
  1 sibling, 1 reply; 43+ messages in thread
From: Miklos Szeredi @ 2017-04-19 15:33 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Al Viro, linux-unionfs, linux-fsdevel

On Wed, Apr 19, 2017 at 5:27 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Wed, Apr 19, 2017 at 6:16 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Mon, Apr 17, 2017 at 1:59 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>>> When mounted with mount option redirect_dir=fh,
>>> every copy up of lower directory stores the lower
>>> dir file handle in overlay.fh xattr of upper dir.
>>
>> Is there a disadvantage to doing this unconditionally? (Same goes for patch 4)
>>
>> Is there a disadvantage to doing this even for the non-samefs case?
>>
>
> Not that I can think of. It's just that all the features that stable
> inode brings
> to the table only work for same_fs right now, but we can store overlay.fh
> anyway.
>
> When you get a chance please let me know your preference wrt
> configuration options.
> I don't see a reason to let the user turn off storing overlay.fh?
> Is there a reason to let user turn off lookup by fh? for files?
> (i.e. to preserve old weird behavior?)

The less config options the better.

We needed the redirect_dir=on/off thing because it's a backward
incompatible format change and the alternatives sucked more (i.e.
Linus didn't like them).

I don't see why we'd need to allow turning off backward compatible
changes (of course, there's a minimal overhead, but lets assume that
nobody cares about that, and if it turns out to be a problem, we'll
fix it).

Thanks,
Miklos

>
> Amir.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 03/13] ovl: lookup redirect by file handle
  2017-04-19 15:21           ` Miklos Szeredi
@ 2017-04-19 15:35             ` Amir Goldstein
  0 siblings, 0 replies; 43+ messages in thread
From: Amir Goldstein @ 2017-04-19 15:35 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Vivek Goyal, Al Viro, linux-unionfs, linux-fsdevel

On Wed, Apr 19, 2017 at 6:21 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Tue, Apr 18, 2017 at 8:57 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>> On Tue, Apr 18, 2017 at 9:32 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>>> On Tue, Apr 18, 2017 at 05:05:12PM +0300, Amir Goldstein wrote:
>>>
>>> [..]
>>>> > (This ovl_lookup() logic has become really twisted now. I wished it was
>>>> >  little easier to read.)
>>>> >
>>>>
>>>> Me too. IMO, most of the complexity is in the fallbacks from redirect
>>>> by fh to full path to relative path. Eliminating some of these fallbacks
>>>> and maybe having separate lookup op per mode, may simplify the code.
>>>
>>> In general, why are we falling back to path based lookup. If lower dirs
>>> were copied or something else, that's a configuration error and overlayfs
>>> will have undefined behavior?
>>>
>>
>> Heh, I just sent out a summary of what happens when lower dirs are copied.
>> The fallback is needed because when lower/upper dirs are copied
>> (which is a perfectly valid practice), the xattr are copied with them.
>> So the fallback is needed because ovl_lookup() find the fh on upper dir
>> and tries to follow it only to realize that it is stale - then it assumes that
>> layers were copied and resorts to path lookup.
>>
>> My point earlier is that I wish to move this "was I copied?" test to mount
>> time instead of doing it on every lookup. But then user won't be able
>> to enjoy all the benefits of stable inodes after layers were copied, so
>> this is not ideal.
>
> How so?  Nobody is expecting the copy to have the same inode numbers
> as the original.
>
> So it's fine if the relation to the original lower layer file is
> broken, we don't care about that anymore.  Except for the hard link
> case, but that relation can be preserved in the reverse mapping.
>

I was addressing Vivek's concern about the complexity of the
"fallback from fh to path" code.
I managed to get the following use case wrong:
- 2 or more lower layers that have some dir redirects by path
- upper layer that now creates redirect_fh
After following from upper to lower1 by fh, I failed to follow by
path to lower2.

So I contemplated detecting this mixed mode at mount time
and disabling redirect_fh.

Nevermind, I'll just fix the fallback in mid layers case.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 02/13] ovl: redirect dir by file handle on copy up
  2017-04-19 15:33       ` Miklos Szeredi
@ 2017-04-19 15:43         ` Amir Goldstein
  0 siblings, 0 replies; 43+ messages in thread
From: Amir Goldstein @ 2017-04-19 15:43 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, linux-fsdevel

On Wed, Apr 19, 2017 at 6:33 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Wed, Apr 19, 2017 at 5:27 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>> On Wed, Apr 19, 2017 at 6:16 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>> On Mon, Apr 17, 2017 at 1:59 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>>>> When mounted with mount option redirect_dir=fh,
>>>> every copy up of lower directory stores the lower
>>>> dir file handle in overlay.fh xattr of upper dir.
>>>
>>> Is there a disadvantage to doing this unconditionally? (Same goes for patch 4)
>>>
>>> Is there a disadvantage to doing this even for the non-samefs case?
>>>
>>
>> Not that I can think of. It's just that all the features that stable
>> inode brings
>> to the table only work for same_fs right now, but we can store overlay.fh
>> anyway.
>>
>> When you get a chance please let me know your preference wrt
>> configuration options.
>> I don't see a reason to let the user turn off storing overlay.fh?
>> Is there a reason to let user turn off lookup by fh? for files?
>> (i.e. to preserve old weird behavior?)
>
> The less config options the better.
>
> We needed the redirect_dir=on/off thing because it's a backward
> incompatible format change and the alternatives sucked more (i.e.
> Linus didn't like them).
>
> I don't see why we'd need to allow turning off backward compatible
> changes (of course, there's a minimal overhead, but lets assume that
> nobody cares about that, and if it turns out to be a problem, we'll
> fix it).
>

There is one slightly hackish reason for which I find redirect_dir=fh
mount option useful.
redirect_fh it automatically turned on|off depending on same_fs
configuration.
When I look at the mount options at /proc/mounts I can see
redirect_dir=fh|on which gives me an indication if I can
expect the stable inodes behavior.

This sort of info can be useful for support calls, but I guess it
belongs somewhere under sysfs or debugfs...

Amir.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 00/13] overlayfs stable inodes
  2017-04-19 15:17             ` Amir Goldstein
@ 2017-04-19 22:58               ` Darrick J. Wong
  2017-04-19 23:15                 ` Andreas Dilger
  0 siblings, 1 reply; 43+ messages in thread
From: Darrick J. Wong @ 2017-04-19 22:58 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Miklos Szeredi, Al Viro, linux-unionfs, linux-fsdevel, Vivek Goyal

On Wed, Apr 19, 2017 at 06:17:15PM +0300, Amir Goldstein wrote:
> On Wed, Apr 19, 2017 at 6:01 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > On Wed, Apr 19, 2017 at 4:46 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> >> Well, if you are lucky you can run into a filesystem that exports
> >> a file handle of type FILEID_INO32_GEN, then you *know* you're
> >> good to go. ext* will do that and xfs that was forever mounted with
> >> -o inode32.
> >> Even with xfs -o inode64, it will not use the MSB ino bits unless
> >> you are in the exabytes fs sizes.

I think it only takes really big AGs for it to start using the >32 bit parts.

> >
> > Could filesystems export a max-ino property in their sb?  That would
> > help with doing this properly.
> >
> 
> Sounds reasonable, but as max-ino usually derived from filesystem size
> and filesystems can grow size online, you will need to query both the
> 'soft' ino limit (without growing fs) and the 'hard' ino limit.
> 
> Darrick,
> 
> Are there bits in GETFSMAP to provide this info?

Nope.  I suppose there could be a way to find out the theoretical
maximum inode number for a filesystem (statvfsx, etc.) but on the other
hand I can also see the other fs developers not wanting to expose that
information for fear that someone will start using the upper bits (inode
numbers should just be a 64-bit cookie we hand to users, right?) and
then they'll have to resort to all sorts of trickery to avoid breaking
things if they ever /do/ want to use those high bits that have been
claimed by someone else.

/me wonders what you're trying to accomplish?

--D

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 00/13] overlayfs stable inodes
  2017-04-19 22:58               ` Darrick J. Wong
@ 2017-04-19 23:15                 ` Andreas Dilger
  2017-04-20  5:43                   ` Amir Goldstein
  2017-04-20  8:45                   ` Miklos Szeredi
  0 siblings, 2 replies; 43+ messages in thread
From: Andreas Dilger @ 2017-04-19 23:15 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Amir Goldstein, Miklos Szeredi, Al Viro, linux-unionfs,
	linux-fsdevel, Vivek Goyal

[-- Attachment #1: Type: text/plain, Size: 3106 bytes --]

On Apr 19, 2017, at 4:58 PM, Darrick J. Wong <darrick.wong@oracle.com> wrote:
> 
> On Wed, Apr 19, 2017 at 06:17:15PM +0300, Amir Goldstein wrote:
>> On Wed, Apr 19, 2017 at 6:01 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>> On Wed, Apr 19, 2017 at 4:46 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>>>> Well, if you are lucky you can run into a filesystem that exports
>>>> a file handle of type FILEID_INO32_GEN, then you *know* you're
>>>> good to go. ext* will do that and xfs that was forever mounted with
>>>> -o inode32.
>>>> Even with xfs -o inode64, it will not use the MSB ino bits unless
>>>> you are in the exabytes fs sizes.
> 
> I think it only takes really big AGs for it to start using the >32 bit parts.
> 
>>> 
>>> Could filesystems export a max-ino property in their sb?  That would
>>> help with doing this properly.
>>> 
>> 
>> Sounds reasonable, but as max-ino usually derived from filesystem size
>> and filesystems can grow size online, you will need to query both the
>> 'soft' ino limit (without growing fs) and the 'hard' ino limit.
>> 
>> Darrick,
>> 
>> Are there bits in GETFSMAP to provide this info?
> 
> Nope.  I suppose there could be a way to find out the theoretical
> maximum inode number for a filesystem (statvfsx, etc.) but on the other
> hand I can also see the other fs developers not wanting to expose that
> information for fear that someone will start using the upper bits (inode
> numbers should just be a 64-bit cookie we hand to users, right?) and
> then they'll have to resort to all sorts of trickery to avoid breaking
> things if they ever /do/ want to use those high bits that have been
> claimed by someone else.

I recall there was a similar issue with GlusterFS assuming only 32-bit
readdir cookies on ext4, and stashing some information in the high bits,
but that broke when ext4 moved to 64-bit readdir cookies to avoid hash
collisions on "normal sized" directories (above ~32k entries).

I'd agree that it is the filesystem's prerogative to use any/all of the
64-bit inode number when it wants, and stacking filesystems shouldn't
try to usurp those bits for something else, only to suffer later on.

There is already some interest to add 64-bit inode numbers for ext4, and
it may allocate inode numbers sparsely, so just because the filesystem has
2^33 inodes in it doesn't imply that the highest possible inum is 2^33,
but could instead be 2^48 or something else entirely.

> /me wonders what you're trying to accomplish?

That is mentioned upthread:

>>>> On Wed, Apr 19, 2017 at 4:52 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:

>>>>> I think we *can* do unified ino space even in
>>>>> most non-samefs cases.  And here's why: look at the inode numbers of
>>>>> any filesystem; they will always be "small" so we can just partition
>>>>> the 64 bit ino space between layers and map inode numbers into its own
>>>>> partition.  This does not work in the general case, and it is a hack.
>>>>> But it's a very simple hack and it probably works fine.

Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 00/13] overlayfs stable inodes
  2017-04-19 23:15                 ` Andreas Dilger
@ 2017-04-20  5:43                   ` Amir Goldstein
  2017-04-20  8:45                   ` Miklos Szeredi
  1 sibling, 0 replies; 43+ messages in thread
From: Amir Goldstein @ 2017-04-20  5:43 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Darrick J. Wong, Al Viro, linux-unionfs, linux-fsdevel,
	Vivek Goyal, Andreas Dilger

On Thu, Apr 20, 2017 at 2:15 AM, Andreas Dilger <adilger@dilger.ca> wrote:
> On Apr 19, 2017, at 4:58 PM, Darrick J. Wong <darrick.wong@oracle.com> wrote:
>>
>> On Wed, Apr 19, 2017 at 06:17:15PM +0300, Amir Goldstein wrote:
>>> On Wed, Apr 19, 2017 at 6:01 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>> On Wed, Apr 19, 2017 at 4:46 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>>>>> Well, if you are lucky you can run into a filesystem that exports
>>>>> a file handle of type FILEID_INO32_GEN, then you *know* you're
>>>>> good to go. ext* will do that and xfs that was forever mounted with
>>>>> -o inode32.
>>>>> Even with xfs -o inode64, it will not use the MSB ino bits unless
>>>>> you are in the exabytes fs sizes.
>>
>> I think it only takes really big AGs for it to start using the >32 bit parts.
>>
>>>>
>>>> Could filesystems export a max-ino property in their sb?  That would
>>>> help with doing this properly.
>>>>
>>>
>>> Sounds reasonable, but as max-ino usually derived from filesystem size
>>> and filesystems can grow size online, you will need to query both the
>>> 'soft' ino limit (without growing fs) and the 'hard' ino limit.
>>>
>>> Darrick,
>>>
>>> Are there bits in GETFSMAP to provide this info?
>>
>> Nope.  I suppose there could be a way to find out the theoretical
>> maximum inode number for a filesystem (statvfsx, etc.) but on the other
>> hand I can also see the other fs developers not wanting to expose that
>> information for fear that someone will start using the upper bits (inode
>> numbers should just be a 64-bit cookie we hand to users, right?) and
>> then they'll have to resort to all sorts of trickery to avoid breaking
>> things if they ever /do/ want to use those high bits that have been
>> claimed by someone else.
>
> I recall there was a similar issue with GlusterFS assuming only 32-bit
> readdir cookies on ext4, and stashing some information in the high bits,
> but that broke when ext4 moved to 64-bit readdir cookies to avoid hash
> collisions on "normal sized" directories (above ~32k entries).
>
> I'd agree that it is the filesystem's prerogative to use any/all of the
> 64-bit inode number when it wants, and stacking filesystems shouldn't
> try to usurp those bits for something else, only to suffer later on.
>
> There is already some interest to add 64-bit inode numbers for ext4, and
> it may allocate inode numbers sparsely, so just because the filesystem has
> 2^33 inodes in it doesn't imply that the highest possible inum is 2^33,
> but could instead be 2^48 or something else entirely.
>

Miklos,

As you can see, fs developers are quite possessive of their reserved bits ;-)
and probably for good reasons too.

I think our best value solution would be to go with virtual dev ids
for lowers layers in the non-same-fs case.

Then we can add an opt-in mount/config option 'masqino' that will use
as few MSB as needed to masquerade 64bit overlay inode numbers
and return EOVERFLOW for stat() if those bits turn up used by
underlying fs, as you proposed.

For the special case of handle file type FILEID_INO32_GEN, we
can automatically turn on 'masqino'. This special case is still quite
common (i.e. ext4). Even if and when ext4 adds 64bit inode support,
like with the case of xfs, legacy on-disk formatted fs, would still
return  handle type FILEID_INO32_GEN.

Amir.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 00/13] overlayfs stable inodes
  2017-04-19 23:15                 ` Andreas Dilger
  2017-04-20  5:43                   ` Amir Goldstein
@ 2017-04-20  8:45                   ` Miklos Szeredi
  2017-04-20  8:47                     ` Miklos Szeredi
  1 sibling, 1 reply; 43+ messages in thread
From: Miklos Szeredi @ 2017-04-20  8:45 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: Darrick J. Wong, Amir Goldstein, Al Viro, linux-unionfs,
	linux-fsdevel, Vivek Goyal

On Thu, Apr 20, 2017 at 1:15 AM, Andreas Dilger <adilger@dilger.ca> wrote:

> I recall there was a similar issue with GlusterFS assuming only 32-bit
> readdir cookies on ext4, and stashing some information in the high bits,
> but that broke when ext4 moved to 64-bit readdir cookies to avoid hash
> collisions on "normal sized" directories (above ~32k entries).
>
> I'd agree that it is the filesystem's prerogative to use any/all of the
> 64-bit inode number when it wants, and stacking filesystems shouldn't
> try to usurp those bits for something else, only to suffer later on.

Thought experiment:  we extend statx with 128bit inode numbers; is it
now the filesystems' prerogative to use all of the 128 bits for what
it wants?

See the flaw with that reasoning?

Ideally we'd want dynamically sized inode numbers and we have that:
file handles.  But file handles do more than inode numbers and they
are more heavyweight.


> There is already some interest to add 64-bit inode numbers for ext4, and
> it may allocate inode numbers sparsely, so just because the filesystem has
> 2^33 inodes in it doesn't imply that the highest possible inum is 2^33,
> but could instead be 2^48 or something else entirely.

Exactly.

I'm not saying we should limit filesystems use of MSB's but that
filesystems should be able to say that they will use at max say 56
bits or whatever.   They would be free to say they use all 64, but
that would limit their use in stacking filesystems so they may not
want to do that.

Is that not reasonable?

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 00/13] overlayfs stable inodes
  2017-04-20  8:45                   ` Miklos Szeredi
@ 2017-04-20  8:47                     ` Miklos Szeredi
  0 siblings, 0 replies; 43+ messages in thread
From: Miklos Szeredi @ 2017-04-20  8:47 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: Darrick J. Wong, Amir Goldstein, Al Viro, linux-unionfs,
	linux-fsdevel, Vivek Goyal

On Thu, Apr 20, 2017 at 10:45 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> Ideally we'd want dynamically sized inode numbers and we have that:
> file handles.  But file handles do more than inode numbers and they
> are more heavyweight.

And even more importantly, we can't force userspace to *use* file
handles instead of st_ino.  So that doesn't work.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 02/13] ovl: redirect dir by file handle on copy up
  2017-04-19 15:27     ` Amir Goldstein
  2017-04-19 15:33       ` Miklos Szeredi
@ 2017-04-20  8:55       ` Amir Goldstein
  2017-04-21 15:02         ` Miklos Szeredi
  1 sibling, 1 reply; 43+ messages in thread
From: Amir Goldstein @ 2017-04-20  8:55 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, linux-fsdevel

On Wed, Apr 19, 2017 at 6:27 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Wed, Apr 19, 2017 at 6:16 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Mon, Apr 17, 2017 at 1:59 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>>> When mounted with mount option redirect_dir=fh,
>>> every copy up of lower directory stores the lower
>>> dir file handle in overlay.fh xattr of upper dir.
>>
>> Is there a disadvantage to doing this unconditionally? (Same goes for patch 4)
>>
>> Is there a disadvantage to doing this even for the non-samefs case?
>>
>
> Not that I can think of. It's just that all the features that stable
> inode brings
> to the table only work for same_fs right now, but we can store overlay.fh
> anyway.
>

Sorry, I remembered what the reason was.

File handles are only unique for a single fs namespace.
Right now, the copy up stores the lower handle in overlay.fh
without storing lower layer number and/or lower fs device id.

The overlay.fh format has a header with version and magic:
struct ovl_fh {
        unsigned char version;  /* 0 */
        unsigned char magic;    /* 0xfb */

So we can easily extend this format later on to support non
same fs redirect_fh.

That said, to answer your question, there is no disadvantage to
storing the lower handle for non-samefs case, there is only
a problem with lookup in the case of several lower layers
no on the same fs.

So we can actually relax the samefs constrain for redirect_fh
to exclude upper layer and specifically, we can always use redirect_fh
in the single lower layer case.

There is one more advantage to storing overlay.fh unconditionally.
Even with multi non-samefs lower layers, the overlay.fh on upper
can be used to distinguish between "merge" and "pure" upper
for non dirs.

I would like to extend the meaning of OVL_TYPE_MERGE
to represent an upper non-dir that has been copied up.
Patch #5 already takes the first step and sets numlower = 1
for non-dirs that have been copied up on samefs.

The next steps would be:
1. remove d_is_dir() check from ovl_path_type()
    BTW, can you please explain this comment.
    I don't understand how this could be:

                /*
                 * Non-dir dentry can hold lower dentry from previous
                 * location.
                 */
                if (oe->numlower && d_is_dir(dentry))
                        type |= __OVL_PATH_MERGE;

2. check merge_or_lower() in ovl_rename() also for non-dir case
    this will cause setting overlay.redirect on non-dir
    and then hardlinks could work better when copying layers
    even without reverse mapping and recovery

3. set __OVL_PATH_MERGE flag during lookup for non-samefs
    In non-samefs case, __OVL_PATH_MERGE may not always be
    deduced from numlower, so need to use my patch that add
    ovl_update_type() [1]

[1] https://marc.info/?l=linux-unionfs&m=149079818122017&w=2

Amir.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 02/13] ovl: redirect dir by file handle on copy up
  2017-04-20  8:55       ` Amir Goldstein
@ 2017-04-21 15:02         ` Miklos Szeredi
  2017-04-21 15:29           ` Amir Goldstein
  0 siblings, 1 reply; 43+ messages in thread
From: Miklos Szeredi @ 2017-04-21 15:02 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Al Viro, linux-unionfs, linux-fsdevel

On Thu, Apr 20, 2017 at 10:55 AM, Amir Goldstein <amir73il@gmail.com> wrote:

> File handles are only unique for a single fs namespace.
> Right now, the copy up stores the lower handle in overlay.fh
> without storing lower layer number and/or lower fs device id.

Hmm, would need to be careful with the security implications of
decoding fh on the wrong fs.

>
> The overlay.fh format has a header with version and magic:
> struct ovl_fh {
>         unsigned char version;  /* 0 */
>         unsigned char magic;    /* 0xfb */
>
> So we can easily extend this format later on to support non
> same fs redirect_fh.
>
> That said, to answer your question, there is no disadvantage to
> storing the lower handle for non-samefs case, there is only
> a problem with lookup in the case of several lower layers
> no on the same fs.

Why?

> So we can actually relax the samefs constrain for redirect_fh
> to exclude upper layer and specifically, we can always use redirect_fh
> in the single lower layer case.
>
> There is one more advantage to storing overlay.fh unconditionally.
> Even with multi non-samefs lower layers, the overlay.fh on upper
> can be used to distinguish between "merge" and "pure" upper
> for non dirs.
>
> I would like to extend the meaning of OVL_TYPE_MERGE
> to represent an upper non-dir that has been copied up.
> Patch #5 already takes the first step and sets numlower = 1
> for non-dirs that have been copied up on samefs.
>
> The next steps would be:
> 1. remove d_is_dir() check from ovl_path_type()
>     BTW, can you please explain this comment.
>     I don't understand how this could be:
>
>                 /*
>                  * Non-dir dentry can hold lower dentry from previous
>                  * location.
>                  */
>                 if (oe->numlower && d_is_dir(dentry))
>                         type |= __OVL_PATH_MERGE;

I guess the comment says, that when a non-dir is copied up it will
have positive __upperdentry and a positive numlower (actually, always
one).

If you remove the d_is_dir() then it will change the meaning from
"merge" to "copied up".  The two meanings are distinct: consider the
ovl_clear_empty() case.  It starts with a merged dir and ends up with
an opaque one.  But the opaque one is still the same object as the
original, lower one, so both need to have the same overlay.fh
attribute.

>
> 2. check merge_or_lower() in ovl_rename() also for non-dir case
>     this will cause setting overlay.redirect on non-dir
>     and then hardlinks could work better when copying layers
>     even without reverse mapping and recovery
>
> 3. set __OVL_PATH_MERGE flag during lookup for non-samefs
>     In non-samefs case, __OVL_PATH_MERGE may not always be
>     deduced from numlower, so need to use my patch that add
>     ovl_update_type() [1]

So lets not confuse "merge" with "copied up".  The two are similar for
directories, but not the same, and "merge" doesn't make sense for
regular files.

Otherwise I think the above would be fine.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC][PATCH 02/13] ovl: redirect dir by file handle on copy up
  2017-04-21 15:02         ` Miklos Szeredi
@ 2017-04-21 15:29           ` Amir Goldstein
  0 siblings, 0 replies; 43+ messages in thread
From: Amir Goldstein @ 2017-04-21 15:29 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, linux-fsdevel

On Fri, Apr 21, 2017 at 6:02 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Thu, Apr 20, 2017 at 10:55 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>
>> File handles are only unique for a single fs namespace.
>> Right now, the copy up stores the lower handle in overlay.fh
>> without storing lower layer number and/or lower fs device id.
>
> Hmm, would need to be careful with the security implications of
> decoding fh on the wrong fs.
>
>>
>> The overlay.fh format has a header with version and magic:
>> struct ovl_fh {
>>         unsigned char version;  /* 0 */
>>         unsigned char magic;    /* 0xfb */
>>
>> So we can easily extend this format later on to support non
>> same fs redirect_fh.
>>
>> That said, to answer your question, there is no disadvantage to
>> storing the lower handle for non-samefs case, there is only
>> a problem with lookup in the case of several lower layers
>> no on the same fs.
>
> Why?
>

Because decoding an fh takes a vfsmnt as input arg, looking up
by fh is implemented:

    for (i = 0; i < numlower; i++) {
          this = ovl_lookup_fh(lowerstack[i].mnt, d.fh);
          if (this == NULL)
               continue;
    }

and ovl_lookup_fh() even verifies ovl_dentry_under_mnt() for
accepting the decoded dentry.

So unless we are in the "same lower sb" case this lookup
can return a dentry from the wrong layer.

>> So we can actually relax the samefs constrain for redirect_fh
>> to exclude upper layer and specifically, we can always use redirect_fh
>> in the single lower layer case.
>>
>> There is one more advantage to storing overlay.fh unconditionally.
>> Even with multi non-samefs lower layers, the overlay.fh on upper
>> can be used to distinguish between "merge" and "pure" upper
>> for non dirs.
>>
>> I would like to extend the meaning of OVL_TYPE_MERGE
>> to represent an upper non-dir that has been copied up.
>> Patch #5 already takes the first step and sets numlower = 1
>> for non-dirs that have been copied up on samefs.
>>
>> The next steps would be:
>> 1. remove d_is_dir() check from ovl_path_type()
>>     BTW, can you please explain this comment.
>>     I don't understand how this could be:
>>
>>                 /*
>>                  * Non-dir dentry can hold lower dentry from previous
>>                  * location.
>>                  */
>>                 if (oe->numlower && d_is_dir(dentry))
>>                         type |= __OVL_PATH_MERGE;
>
> I guess the comment says, that when a non-dir is copied up it will
> have positive __upperdentry and a positive numlower (actually, always
> one).

Ok, I guess I wasn't sure what "previous location" was referring to
it implied rename.

>
> If you remove the d_is_dir() then it will change the meaning from
> "merge" to "copied up".  The two meanings are distinct: consider the
> ovl_clear_empty() case.  It starts with a merged dir and ends up with
> an opaque one.  But the opaque one is still the same object as the
> original, lower one, so both need to have the same overlay.fh
> attribute.
>

Well, technically, an opaque object should not have overlay.fh
but I see that it can happen with an unhashed just-deleted entry.
so yes, need to be able to make that distinction.

>>
>> 2. check merge_or_lower() in ovl_rename() also for non-dir case
>>     this will cause setting overlay.redirect on non-dir
>>     and then hardlinks could work better when copying layers
>>     even without reverse mapping and recovery
>>
>> 3. set __OVL_PATH_MERGE flag during lookup for non-samefs
>>     In non-samefs case, __OVL_PATH_MERGE may not always be
>>     deduced from numlower, so need to use my patch that add
>>     ovl_update_type() [1]
>
> So lets not confuse "merge" with "copied up".  The two are similar for
> directories, but not the same, and "merge" doesn't make sense for
> regular files.
>

It make sense if you consider that the overlay inode of non-dir is a merge
of the upper attributes and data and the lower ino (and maybe st_dev).

But it is a hack nonetheless, so I'll think up a new flag name for
COPIED_UP. suggestions are welcome.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2017-04-21 16:39 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-16 23:59 [RFC][PATCH 00/13] overlayfs stable inodes Amir Goldstein
2017-04-16 23:59 ` [RFC][PATCH 01/13] ovl: check if all layers are on the same fs Amir Goldstein
2017-04-16 23:59 ` [RFC][PATCH 02/13] ovl: redirect dir by file handle on copy up Amir Goldstein
2017-04-17 13:33   ` Rock Lee
2017-04-17 14:03     ` Amir Goldstein
2017-04-17 19:49   ` Vivek Goyal
2017-04-17 21:14     ` Amir Goldstein
2017-04-19 15:16   ` Miklos Szeredi
2017-04-19 15:27     ` Amir Goldstein
2017-04-19 15:33       ` Miklos Szeredi
2017-04-19 15:43         ` Amir Goldstein
2017-04-20  8:55       ` Amir Goldstein
2017-04-21 15:02         ` Miklos Szeredi
2017-04-21 15:29           ` Amir Goldstein
2017-04-16 23:59 ` [RFC][PATCH 03/13] ovl: lookup redirect by file handle Amir Goldstein
2017-04-18 13:05   ` Vivek Goyal
2017-04-18 14:05     ` Amir Goldstein
2017-04-18 18:32       ` Vivek Goyal
2017-04-18 18:57         ` Amir Goldstein
2017-04-19 15:21           ` Miklos Szeredi
2017-04-19 15:35             ` Amir Goldstein
2017-04-16 23:59 ` [RFC][PATCH 04/13] ovl: store file handle of stable inode Amir Goldstein
2017-04-16 23:59 ` [RFC][PATCH 05/13] ovl: lookup stable inode by file handle Amir Goldstein
2017-04-16 23:59 ` [RFC][PATCH 06/13] ovl: move inode helpers to inode.c Amir Goldstein
2017-04-16 23:59 ` [RFC][PATCH 07/13] ovl: create helpers for initializing hashed inode Amir Goldstein
2017-04-16 23:59 ` [RFC][PATCH 08/13] ovl: allow hashing non upper inodes Amir Goldstein
2017-04-16 23:59 ` [RFC][PATCH 09/13] ovl: inherit overlay inode ino/generation from real inode Amir Goldstein
2017-04-16 23:59 ` [RFC][PATCH 10/13] ovl: hash overlay inodes by stable inode Amir Goldstein
2017-04-16 23:59 ` [RFC][PATCH 11/13] ovl: fix du --one-file-system on overlay mount Amir Goldstein
2017-04-16 23:59 ` [RFC][PATCH 12/13] ovl: constant ino across copy up Amir Goldstein
2017-04-16 23:59 ` [RFC][PATCH 13/13] ovl: try to hardlink upper on copy up of lower hardlinks Amir Goldstein
2017-04-18 18:37 ` [RFC][PATCH 00/13] overlayfs stable inodes Amir Goldstein
2017-04-19  9:16   ` Miklos Szeredi
2017-04-19 10:37     ` Amir Goldstein
2017-04-19 13:52       ` Miklos Szeredi
2017-04-19 14:46         ` Amir Goldstein
2017-04-19 15:01           ` Miklos Szeredi
2017-04-19 15:17             ` Amir Goldstein
2017-04-19 22:58               ` Darrick J. Wong
2017-04-19 23:15                 ` Andreas Dilger
2017-04-20  5:43                   ` Amir Goldstein
2017-04-20  8:45                   ` Miklos Szeredi
2017-04-20  8:47                     ` Miklos Szeredi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.