All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/17] Avoid breaking lower hardlinks on copy up
@ 2017-06-02 14:04 Amir Goldstein
  2017-06-02 14:04 ` [PATCH 01/17] vfs: add helper wait_on_inode_inuse() Amir Goldstein
                   ` (16 more replies)
  0 siblings, 17 replies; 19+ messages in thread
From: Amir Goldstein @ 2017-06-02 14:04 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Miklos,

This is the curernt pile. Since the pile is quite big, touches
many parts of the code and raises some user API questions, I wanted
to send it out early for review, so there are still some known issues.

I have improved hardlink copy up tests including concurrent copy
up of lower hardlinks on my xfstests dev branch [1].

This work does NOT assume samefs.

Known issues:
- Missing code on mount to verify/cleanup stale index entries
- Missing documentation
- Better handling of existing broken hardlink with origin (*)
- Non-upper alias content is inconsistent with upper aliases (**)

(*) Currently upper entries are hashed by origin inode if it is
    known AND inodes index feature is enabled. This is not enough.
    Need to also check that the origin inode in question is indexed.

(**) This is for the same reason that ro/rw fd content is inconsistent.
     The 2 hardlink copy up xfstests fail because of that.
     Need to lookup the index on lookup of non-upper and store it in
     dentry for open RDONLY. Something like my rocopyup POC [2].
     Probably not for v4.13 though...

[1] https://github.com/amir73il/xfstests/commits/overlayfs-devel
[2] https://github.com/amir73il/linux/commits/ovl-rocopyup

Amir Goldstein (17):
  vfs: add helper wait_on_inode_inuse()
  ovl: generalize ovl_create_workdir()
  ovl: introduce the inodes index dir feature
  ovl: verify index dir matches upper dir
  ovl: create helper ovl_lookup_index()
  ovl: move inode helpers to inode.c
  ovl: create helpers for initializing hashed inode
  ovl: use ovl_inode_init() for initializing new inode
  ovl: allow hashing non upper inodes
  ovl: allow hashing inodes by arbitrary key
  ovl: hash overlay non-dir inodes by copy up origin inode
  ovl: defer upper dir lock to tempfile link
  ovl: factor out ovl_copy_up_inode() helper
  ovl: generalize ovl_copy_up_locked() using actors
  ovl: generalize ovl_copy_up_one() using actors
  ovl: implement index dir copy up method
  ovl: handle race of concurrent lower hardlinks copy up

 fs/inode.c               |  24 ++
 fs/overlayfs/Kconfig     |  20 ++
 fs/overlayfs/copy_up.c   | 630 ++++++++++++++++++++++++++++++++++++++---------
 fs/overlayfs/dir.c       |   2 +-
 fs/overlayfs/inode.c     |  66 ++++-
 fs/overlayfs/namei.c     |  72 +++++-
 fs/overlayfs/overlayfs.h |  43 +++-
 fs/overlayfs/ovl_entry.h |   3 +
 fs/overlayfs/super.c     | 154 +++++++++---
 fs/overlayfs/util.c      |  26 +-
 include/linux/fs.h       |   1 +
 11 files changed, 848 insertions(+), 193 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 01/17] vfs: add helper wait_on_inode_inuse()
  2017-06-02 14:04 [PATCH 00/17] Avoid breaking lower hardlinks on copy up Amir Goldstein
@ 2017-06-02 14:04 ` Amir Goldstein
  2017-06-02 14:04 ` [PATCH 02/17] ovl: generalize ovl_create_workdir() Amir Goldstein
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Amir Goldstein @ 2017-06-02 14:04 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Adds a helper to wait until an inode's INUSE bit is cleared.

This is going to be used by overlayfs to sychronize concurrent copy up
of lower hardlinks.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/inode.c         | 24 ++++++++++++++++++++++++
 include/linux/fs.h |  1 +
 2 files changed, 25 insertions(+)

diff --git a/fs/inode.c b/fs/inode.c
index 216efeecdbdc..546cd503148a 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -2167,6 +2167,30 @@ void inode_inuse_unlock(struct inode *inode)
 	WARN_ON(inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE));
 	WARN_ON(!(inode->i_state & I_INUSE));
 	inode->i_state &= ~I_INUSE;
+	smp_mb();
+	wake_up_bit(&inode->i_state, __I_INUSE);
 	spin_unlock(&inode->i_lock);
 }
 EXPORT_SYMBOL(inode_inuse_unlock);
+
+/**
+ * wait_on_inode_inuse - wait for release of exclusive 'inuse' lock
+ * @inode:	inode inuse to wait on
+ *
+ * Can be used in combination with parent i_mutex, to protect access to a
+ * newly created inode, until that inode has been properly initialized by
+ * the user that grabbed the 'inuse' exclusive lock after creating the inode.
+ *
+ * Caller must hold a reference to inode to prevent waiting on an inode that
+ * is not 'inuse' and is already being freed.
+ *
+ * Return 0 if the 'inuse' bit is clear or has been cleared while waiting.
+ */
+int wait_on_inode_inuse(struct inode *inode, unsigned mode)
+{
+	if (WARN_ON(!atomic_read(&inode->i_count)))
+		return -EINVAL;
+	might_sleep();
+	return wait_on_bit(&inode->i_state, __I_INUSE, mode);
+}
+EXPORT_SYMBOL(wait_on_inode_inuse);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2e29ff868e90..e064612b45ef 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3268,6 +3268,7 @@ extern bool path_noexec(const struct path *path);
 extern void inode_nohighmem(struct inode *inode);
 extern bool inode_inuse_trylock(struct inode *inode);
 extern void inode_inuse_unlock(struct inode *inode);
+extern int wait_on_inode_inuse(struct inode *inode, unsigned mode);
 
 static inline bool inode_inuse(struct inode *inode)
 {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 02/17] ovl: generalize ovl_create_workdir()
  2017-06-02 14:04 [PATCH 00/17] Avoid breaking lower hardlinks on copy up Amir Goldstein
  2017-06-02 14:04 ` [PATCH 01/17] vfs: add helper wait_on_inode_inuse() Amir Goldstein
@ 2017-06-02 14:04 ` Amir Goldstein
  2017-06-02 14:04 ` [PATCH 03/17] ovl: introduce the inodes index dir feature Amir Goldstein
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Amir Goldstein @ 2017-06-02 14:04 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Pass in the subdir name to create and specify if subdir is persistent
or if it should be cleaned up on every mount.

Move fallback to readonly mount on failure to create dir and print of error
message into the helper.

This function is going to be used for creating the persistent 'index' dir
under workbasedir.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/super.c | 45 +++++++++++++++++++++++++++++----------------
 1 file changed, 29 insertions(+), 16 deletions(-)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 3d7b5c9bc042..3cdefd401713 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -450,22 +450,25 @@ static int ovl_verify_set_origin(struct dentry *dir, struct vfsmount *mnt,
 
 #define OVL_WORKDIR_NAME "work"
 
-static struct dentry *ovl_workdir_create(struct vfsmount *mnt,
-					 struct dentry *dentry)
+static struct dentry *ovl_workdir_create(struct super_block *sb,
+					 struct ovl_fs *ufs,
+					 struct dentry *dentry,
+					 const char *name, bool persist)
 {
-	struct inode *dir = dentry->d_inode;
-	struct dentry *work;
+	struct vfsmount *mnt = ufs->upper_mnt;
+	struct inode *dir = NULL;
+	struct dentry *work = NULL;
 	int err;
 	bool retried = false;
 
 	err = mnt_want_write(mnt);
 	if (err)
-		return ERR_PTR(err);
+		goto out_err;
 
+	dir = d_inode(dentry);
 	inode_lock_nested(dir, I_MUTEX_PARENT);
 retry:
-	work = lookup_one_len(OVL_WORKDIR_NAME, dentry,
-			      strlen(OVL_WORKDIR_NAME));
+	work = lookup_one_len(name, dentry, strlen(name));
 
 	if (!IS_ERR(work)) {
 		struct iattr attr = {
@@ -478,6 +481,9 @@ static struct dentry *ovl_workdir_create(struct vfsmount *mnt,
 			if (retried)
 				goto out_dput;
 
+			if (persist)
+				goto out_unlock;
+
 			retried = true;
 			ovl_workdir_cleanup(dir, mnt, work, 0);
 			dput(work);
@@ -517,16 +523,25 @@ static struct dentry *ovl_workdir_create(struct vfsmount *mnt,
 		inode_unlock(work->d_inode);
 		if (err)
 			goto out_dput;
+	} else {
+		err = PTR_ERR(work);
+		goto out_err;
 	}
+
 out_unlock:
-	inode_unlock(dir);
+	if (dir)
+		inode_unlock(dir);
 	mnt_drop_write(mnt);
 
 	return work;
 
 out_dput:
 	dput(work);
-	work = ERR_PTR(err);
+out_err:
+	pr_warn("overlayfs: failed to create directory %s/%s (errno: %i); mounting read-only\n",
+		ufs->config.workdir, name, -err);
+	sb->s_flags |= MS_RDONLY;
+	work = NULL;
 	goto out_unlock;
 }
 
@@ -940,14 +955,11 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 
 		sb->s_time_gran = ufs->upper_mnt->mnt_sb->s_time_gran;
 
-		ufs->workdir = ovl_workdir_create(ufs->upper_mnt, workpath.dentry);
+		ufs->workdir = ovl_workdir_create(sb, ufs, workpath.dentry,
+						  OVL_WORKDIR_NAME, false);
 		err = PTR_ERR(ufs->workdir);
-		if (IS_ERR(ufs->workdir)) {
-			pr_warn("overlayfs: failed to create directory %s/%s (errno: %i); mounting read-only\n",
-				ufs->config.workdir, OVL_WORKDIR_NAME, -err);
-			sb->s_flags |= MS_RDONLY;
-			ufs->workdir = NULL;
-		}
+		if (IS_ERR(ufs->workdir))
+			goto out_put_upper_mnt;
 
 		/*
 		 * Upper should support d_type, else whiteouts are visible.
@@ -1112,6 +1124,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	kfree(ufs->lower_mnt);
 out_put_workdir:
 	dput(ufs->workdir);
+out_put_upper_mnt:
 	mntput(ufs->upper_mnt);
 out_put_lowerpath:
 	for (i = 0; i < numlower; i++)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 03/17] ovl: introduce the inodes index dir feature
  2017-06-02 14:04 [PATCH 00/17] Avoid breaking lower hardlinks on copy up Amir Goldstein
  2017-06-02 14:04 ` [PATCH 01/17] vfs: add helper wait_on_inode_inuse() Amir Goldstein
  2017-06-02 14:04 ` [PATCH 02/17] ovl: generalize ovl_create_workdir() Amir Goldstein
@ 2017-06-02 14:04 ` Amir Goldstein
  2017-06-02 14:04 ` [PATCH 04/17] ovl: verify index dir matches upper dir Amir Goldstein
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Amir Goldstein @ 2017-06-02 14:04 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Create the index dir on mount. The index dir will contain hardlinks to
upper inodes, named after the hex representation of their origin lower
inodes.

The index dir is going to be used to prevent breaking lower hardlinks
on copy up.

Because the feature is not fully backward compat, enabling the feature
is opt-in by config/module/mount option.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/Kconfig     | 20 +++++++++++++++
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/ovl_entry.h |  3 +++
 fs/overlayfs/super.c     | 67 ++++++++++++++++++++++++++++++++++++++----------
 fs/overlayfs/util.c      |  7 +++++
 5 files changed, 85 insertions(+), 13 deletions(-)

diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
index c0c9683934b7..b7241f273c36 100644
--- a/fs/overlayfs/Kconfig
+++ b/fs/overlayfs/Kconfig
@@ -23,3 +23,23 @@ config OVERLAY_FS_REDIRECT_DIR
 	  Note, that redirects are not backward compatible.  That is, mounting
 	  an overlay which has redirects on a kernel that doesn't support this
 	  feature will have unexpected results.
+
+config OVERLAY_FS_INDEX
+	bool "Overlayfs: turn on inodes index feature by default"
+	depends on OVERLAY_FS
+	help
+	  If this config option is enabled then overlay filesystems will use
+	  the inodes index dir to map lower inodes to upper inodes by default.
+	  In this case it is still possible to turn off index globally with the
+	  "index=off" module option or on a filesystem instance basis with the
+	  "index=off" mount option.
+
+	  The inodes index feature prevents breaking of lower hardlinks on copy
+	  up and enables NFS export.
+
+	  Note, that the inodes index feature is read-only backward compatible.
+	  That is, mounting an overlay which has an index dir on a kernel that
+	  doesn't support this feature read-only, will not have any negative
+	  outcomes.  However, mounting the same overlay with an old kernel
+	  read-write and then mounting it again with a new kernel, will have
+	  unexpected results.
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index bf7e1d95e640..45f1cd605f4d 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -204,6 +204,7 @@ void ovl_drop_write(struct dentry *dentry);
 struct dentry *ovl_workdir(struct dentry *dentry);
 const struct cred *ovl_override_creds(struct super_block *sb);
 struct super_block *ovl_same_sb(struct super_block *sb);
+struct dentry *ovl_indexdir(struct super_block *sb);
 unsigned int ovl_verify_dir(struct super_block *sb);
 struct ovl_entry *ovl_alloc_entry(unsigned int numlower);
 bool ovl_dentry_remote(struct dentry *dentry);
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 298670fccbb6..f1b1f69e42f0 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -14,6 +14,7 @@ struct ovl_config {
 	char *workdir;
 	bool default_permissions;
 	bool redirect_dir;
+	bool index;
 	unsigned int verify_dir;
 };
 
@@ -26,6 +27,8 @@ struct ovl_fs {
 	struct dentry *workbasedir;
 	/* workdir is the 'work' directory under workbasedir */
 	struct dentry *workdir;
+	/* index directory listing overlay inodes by origin file handle */
+	struct dentry *indexdir;
 	long namelen;
 	/* pathnames of lower and upper dirs, for show_options */
 	struct ovl_config config;
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 3cdefd401713..76d5a8cfa86a 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -34,6 +34,11 @@ module_param_named(redirect_dir, ovl_redirect_dir_def, bool, 0644);
 MODULE_PARM_DESC(ovl_redirect_dir_def,
 		 "Default to on or off for the redirect_dir feature");
 
+static bool ovl_index_def = IS_ENABLED(CONFIG_OVERLAY_FS_INDEX);
+module_param_named(index, ovl_index_def, bool, 0644);
+MODULE_PARM_DESC(ovl_index_def,
+		 "Default to on or off for the inodes index feature");
+
 static void ovl_dentry_release(struct dentry *dentry)
 {
 	struct ovl_entry *oe = dentry->d_fsdata;
@@ -182,6 +187,7 @@ static void ovl_put_super(struct super_block *sb)
 	struct ovl_fs *ufs = sb->s_fs_info;
 	unsigned i;
 
+	dput(ufs->indexdir);
 	dput(ufs->workdir);
 	ovl_dir_unlock(ufs->workbasedir);
 	dput(ufs->workbasedir);
@@ -265,6 +271,9 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
 	if (ufs->config.redirect_dir != ovl_redirect_dir_def)
 		seq_printf(m, ",redirect_dir=%s",
 			   ufs->config.redirect_dir ? "on" : "off");
+	if (ufs->config.index != ovl_index_def)
+		seq_printf(m, ",index=%s",
+			   ufs->config.index ? "on" : "off");
 	if (ufs->config.verify_dir) {
 		if (ufs->config.verify_dir == OVL_VERIFY_LOWER)
 			seq_puts(m, ",verify_lower");
@@ -300,6 +309,8 @@ enum {
 	OPT_DEFAULT_PERMISSIONS,
 	OPT_REDIRECT_DIR_ON,
 	OPT_REDIRECT_DIR_OFF,
+	OPT_INDEX_ON,
+	OPT_INDEX_OFF,
 	OPT_VERIFY_LOWER,
 	OPT_VERIFY_DIR,
 	OPT_ERR,
@@ -312,6 +323,8 @@ static const match_table_t ovl_tokens = {
 	{OPT_DEFAULT_PERMISSIONS,	"default_permissions"},
 	{OPT_REDIRECT_DIR_ON,		"redirect_dir=on"},
 	{OPT_REDIRECT_DIR_OFF,		"redirect_dir=off"},
+	{OPT_INDEX_ON,			"index=on"},
+	{OPT_INDEX_OFF,			"index=off"},
 	{OPT_VERIFY_LOWER,		"verify_lower"},
 	{OPT_VERIFY_DIR,		"verify_dir=%u"},
 	{OPT_ERR,			NULL}
@@ -386,6 +399,14 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
 			config->redirect_dir = false;
 			break;
 
+		case OPT_INDEX_ON:
+			config->index = true;
+			break;
+
+		case OPT_INDEX_OFF:
+			config->index = false;
+			break;
+
 		case OPT_VERIFY_LOWER:
 			config->verify_dir = OVL_VERIFY_LOWER;
 			break;
@@ -449,6 +470,7 @@ static int ovl_verify_set_origin(struct dentry *dir, struct vfsmount *mnt,
 }
 
 #define OVL_WORKDIR_NAME "work"
+#define OVL_INDEXDIR_NAME "index"
 
 static struct dentry *ovl_workdir_create(struct super_block *sb,
 					 struct ovl_fs *ufs,
@@ -840,6 +862,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 
 	init_waitqueue_head(&ufs->copyup_wq);
 	ufs->config.redirect_dir = ovl_redirect_dir_def;
+	ufs->config.index = ovl_index_def;
 	err = ovl_parse_opt((char *) data, &ufs->config);
 	if (err)
 		goto out_free_config;
@@ -1002,6 +1025,12 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 			} else {
 				vfs_removexattr(ufs->workdir, OVL_XATTR_OPAQUE);
 			}
+
+			/* Check if upper/work fs supports file handles */
+			if (!ovl_can_decode_fh(ufs->workdir->d_sb)) {
+				ufs->config.index = false;
+				pr_warn("overlayfs: upper fs does not support NFS export.\n");
+			}
 		}
 	}
 
@@ -1033,34 +1062,44 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 			ufs->same_sb = NULL;
 
 		/*
-		 * The verify_lower feature is used to verify that lower dir
-		 * found by path matches the stored copy up origin file handle.
-		 * It requires that all layers support NFS export.
+		 * The inodes index and verify_lower features need to encode
+		 * and decode real inode file handles. They require that all
+		 * layers support NFS export.
 		 */
-		if (ufs->config.verify_dir) {
+		if (!ovl_can_decode_fh(mnt->mnt_sb)) {
+			ufs->config.index = false;
+			pr_warn("overlayfs: lower fs does not support NFS export.\n");
 			err = -EOPNOTSUPP;
-			if (!ovl_can_decode_fh(mnt->mnt_sb)) {
+			if (ufs->config.verify_dir) {
 				pr_err("overlayfs: option \"verify_lower\" not supported by lower fs.\n");
 				goto out_put_lower_mnt;
 			}
+		} else if (i == 0 && OVL_VERIFY_ROOT(ufs->config.verify_dir)) {
 			/* Verify lower root matches origin stored in upper */
-			if (i == 0 && OVL_VERIFY_ROOT(ufs->config.verify_dir)) {
-				err = ovl_verify_set_origin(upperpath.dentry,
-							    mnt, mnt->mnt_root,
-							    "lower root");
-				if (err)
-					goto out_put_lower_mnt;
-			}
+			err = ovl_verify_set_origin(upperpath.dentry, mnt,
+						    mnt->mnt_root,
+						    "lower root");
+			if (err)
+				goto out_put_lower_mnt;
 		}
 
 	}
 
+
 	/* If the upper fs is nonexistent, we mark overlayfs r/o too */
 	if (!ufs->upper_mnt)
 		sb->s_flags |= MS_RDONLY;
 	else if (ufs->upper_mnt->mnt_sb != ufs->same_sb)
 		ufs->same_sb = NULL;
 
+	if (!(sb->s_flags & MS_RDONLY) && ufs->config.index) {
+		ufs->indexdir = ovl_workdir_create(sb, ufs, workpath.dentry,
+						   OVL_INDEXDIR_NAME, true);
+		err = PTR_ERR(ufs->indexdir);
+		if (IS_ERR(ufs->indexdir))
+			goto out_put_lower_mnt;
+	}
+
 	if (remote)
 		sb->s_d_op = &ovl_reval_dentry_operations;
 	else
@@ -1068,7 +1107,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 
 	ufs->creator_cred = cred = prepare_creds();
 	if (!cred)
-		goto out_put_lower_mnt;
+		goto out_put_indexdir;
 
 	/* Never override disk quota limits or use reserved space */
 	cap_lower(cred->cap_effective, CAP_SYS_RESOURCE);
@@ -1118,6 +1157,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	kfree(oe);
 out_put_cred:
 	put_cred(ufs->creator_cred);
+out_put_indexdir:
+	dput(ufs->indexdir);
 out_put_lower_mnt:
 	for (i = 0; i < ufs->numlower; i++)
 		mntput(ufs->lower_mnt[i]);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 535665243fe8..53c2fdb10e36 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -47,6 +47,13 @@ struct super_block *ovl_same_sb(struct super_block *sb)
 	return ofs->same_sb;
 }
 
+struct dentry *ovl_indexdir(struct super_block *sb)
+{
+	struct ovl_fs *ofs = sb->s_fs_info;
+
+	return ofs->indexdir;
+}
+
 unsigned int ovl_verify_dir(struct super_block *sb)
 {
 	struct ovl_fs *ofs = sb->s_fs_info;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 04/17] ovl: verify index dir matches upper dir
  2017-06-02 14:04 [PATCH 00/17] Avoid breaking lower hardlinks on copy up Amir Goldstein
                   ` (2 preceding siblings ...)
  2017-06-02 14:04 ` [PATCH 03/17] ovl: introduce the inodes index dir feature Amir Goldstein
@ 2017-06-02 14:04 ` Amir Goldstein
  2017-06-02 14:04 ` [PATCH 05/17] ovl: create helper ovl_lookup_index() Amir Goldstein
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Amir Goldstein @ 2017-06-02 14:04 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

An index dir contains persistent hardlinks to files in upper dir.
Therefore, we must never mount an existing index dir with a differnt
upper dir.

Store the upper root dir file handle in index dir inode when index
dir is created and verify the file handle before using an existing
index dir on mount.

When failing to verify upper dir file handle, cleanup existing index
dir and create a new empty one. If the 'verify_lower' mount option was
specified by user, leave the mismatch index dir intact and mount
readonly.

Add an 'is_upper' flag to the overlay file handle encoding and set it
when encoding the upper root file handle. This is not critical for index
dir verification, but it is good practice towards a standard overlayfs
file handle format.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c   | 12 ++++++++++--
 fs/overlayfs/namei.c     |  6 +++---
 fs/overlayfs/overlayfs.h |  6 ++++--
 fs/overlayfs/super.c     | 48 ++++++++++++++++++++++++++++++++++++++++++------
 4 files changed, 59 insertions(+), 13 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 047b2c3fdf6a..95568ec4f1d0 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -239,7 +239,7 @@ bool ovl_can_decode_fh(struct super_block *sb)
 		uuid_be_cmp(*(uuid_be *) &sb->s_uuid, NULL_UUID_BE));
 }
 
-struct ovl_fh *ovl_encode_fh(struct dentry *lower)
+struct ovl_fh *ovl_encode_fh(struct dentry *lower, bool is_upper)
 {
 	struct ovl_fh *fh;
 	int fh_type, fh_len, dwords;
@@ -278,6 +278,14 @@ struct ovl_fh *ovl_encode_fh(struct dentry *lower)
 	fh->magic = OVL_FH_MAGIC;
 	fh->type = fh_type;
 	fh->flags = OVL_FH_FLAG_CPU_ENDIAN;
+	/*
+	 * When we will want to decode an overlay dentry from this handle
+	 * and all layers are on the same fs, if we get a disconncted real
+	 * dentry when we decode fid, the only way to tell if we should assign
+	 * it to upperdentry or to lowerstack is by checking this flag.
+	 */
+	if (is_upper)
+		fh->flags |= OVL_FH_FLAG_PATH_UPPER;
 	fh->len = fh_len;
 	fh->uuid = *uuid;
 	memcpy(fh->fid, buf, buflen);
@@ -299,7 +307,7 @@ static int ovl_set_origin(struct dentry *dentry, struct dentry *lower,
 	 * up and a pure upper inode.
 	 */
 	if (ovl_can_decode_fh(lower->d_sb)) {
-		fh = ovl_encode_fh(lower);
+		fh = ovl_encode_fh(lower, false);
 		if (IS_ERR(fh))
 			return PTR_ERR(fh);
 	}
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 4a37f2fc3bbe..f5b49533c0e3 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -327,7 +327,7 @@ static int ovl_check_origin(struct dentry *dentry, struct dentry *upperdentry,
  * Return 0 on match, -ESTALE on mismatch, < 0 on error.
  */
 int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
-		      struct dentry *origin)
+		      struct dentry *origin, bool is_upper)
 {
 	struct inode *inode = NULL;
 	struct ovl_fh *fh = NULL;
@@ -343,7 +343,7 @@ int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
 		goto fail;
 	}
 
-	fh = ovl_encode_fh(origin);
+	fh = ovl_encode_fh(origin, is_upper);
 	if (IS_ERR(fh)) {
 		err = PTR_ERR(fh);
 		fh = NULL;
@@ -482,7 +482,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		if (this && upperdentry && !ctr &&
 		    OVL_VERIFY_MERGE(ovl_verify_dir(dentry->d_sb))) {
 			err = ovl_verify_origin(upperdentry, lowerpath.mnt,
-						this);
+						this, false);
 			if (err && err != -ENODATA) {
 				dput(this);
 				break;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 45f1cd605f4d..31920a649a1c 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -51,6 +51,8 @@ enum ovl_verify_dir {
 /* CPU byte order required for fid decoding:  */
 #define OVL_FH_FLAG_BIG_ENDIAN	(1 << 0)
 #define OVL_FH_FLAG_ANY_ENDIAN	(1 << 1)
+/* Is the real inode encoded in fid an upper inode? */
+#define OVL_FH_FLAG_PATH_UPPER	(1 << 2)
 
 #define OVL_FH_FLAG_ALL (OVL_FH_FLAG_BIG_ENDIAN | OVL_FH_FLAG_ANY_ENDIAN)
 
@@ -249,7 +251,7 @@ static inline bool ovl_is_impuredir(struct dentry *dentry)
 
 /* namei.c */
 int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
-		      struct dentry *origin);
+		      struct dentry *origin, bool is_upper);
 int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp);
 struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags);
 bool ovl_lower_positive(struct dentry *dentry);
@@ -309,4 +311,4 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags);
 int ovl_copy_xattr(struct dentry *old, struct dentry *new);
 int ovl_set_attr(struct dentry *upper, struct kstat *stat);
 bool ovl_can_decode_fh(struct super_block *sb);
-struct ovl_fh *ovl_encode_fh(struct dentry *lower);
+struct ovl_fh *ovl_encode_fh(struct dentry *lower, bool is_upper);
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 76d5a8cfa86a..7e3976c34aab 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -439,19 +439,20 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
  * If dir has no stored file handle, encode and store origin file handle.
  */
 static int ovl_verify_set_origin(struct dentry *dir, struct vfsmount *mnt,
-				 struct dentry *origin, const char *name)
+				 struct dentry *origin, const char *name,
+				 bool is_upper)
 {
 	const struct ovl_fh *fh = NULL;
 	int err;
 
-	err = ovl_verify_origin(dir, mnt, origin);
+	err = ovl_verify_origin(dir, mnt, origin, is_upper);
 	if (!err)
 		return 0;
 
 	if (err != -ENODATA)
 		goto fail;
 
-	fh = ovl_encode_fh(origin);
+	fh = ovl_encode_fh(origin, is_upper);
 	err = PTR_ERR(fh);
 	if (IS_ERR(fh))
 		goto fail;
@@ -479,6 +480,7 @@ static struct dentry *ovl_workdir_create(struct super_block *sb,
 {
 	struct vfsmount *mnt = ufs->upper_mnt;
 	struct inode *dir = NULL;
+	struct dentry *upperdir = mnt->mnt_root;
 	struct dentry *work = NULL;
 	int err;
 	bool retried = false;
@@ -503,8 +505,29 @@ static struct dentry *ovl_workdir_create(struct super_block *sb,
 			if (retried)
 				goto out_dput;
 
-			if (persist)
-				goto out_unlock;
+			/*
+			 * Persistent work dir has a stored file handle of upper
+			 * root dir. If we verify the upper root handle matches
+			 * upper root dir, we can use the persistent work dir.
+			 * By default, failure to verify upper root file handle
+			 * will result in re-creating the persistent work dir.
+			 * With the verify_lower mount option, persistent work
+			 * dir will not be cleaned and mounted will fail.
+			 */
+			if (persist) {
+				err = ovl_verify_set_origin(work, mnt, upperdir,
+							    "upper root", true);
+				if (!err)
+					goto out_unlock;
+
+				/* With -o verify_lower, verify must succeed */
+				if (OVL_VERIFY_ROOT(ufs->config.verify_dir))
+					goto out_dput;
+
+				/* Blow away stale persistent work dir */
+				pr_warn("overlayfs: discarding existing directory %s/%s\n",
+					ufs->config.workdir, name);
+			}
 
 			retried = true;
 			ovl_workdir_cleanup(dir, mnt, work, 0);
@@ -518,6 +541,19 @@ static struct dentry *ovl_workdir_create(struct super_block *sb,
 		if (err)
 			goto out_dput;
 
+		if (persist) {
+			/*
+			 * Persistent work dir is associated with an upper dir
+			 * by storing the upper dir root file handle in xattr.
+			 * We use that file handle to verify that the persistent
+			 * work dir is never re-used with another upper dir.
+			 */
+			err = ovl_verify_set_origin(work, mnt, upperdir,
+						    "upper root", true);
+			if (err)
+				goto out_dput;
+		}
+
 		/*
 		 * Try to remove POSIX ACL xattrs from workdir.  We are good if:
 		 *
@@ -1078,7 +1114,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 			/* Verify lower root matches origin stored in upper */
 			err = ovl_verify_set_origin(upperpath.dentry, mnt,
 						    mnt->mnt_root,
-						    "lower root");
+						    "lower root", false);
 			if (err)
 				goto out_put_lower_mnt;
 		}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 05/17] ovl: create helper ovl_lookup_index()
  2017-06-02 14:04 [PATCH 00/17] Avoid breaking lower hardlinks on copy up Amir Goldstein
                   ` (3 preceding siblings ...)
  2017-06-02 14:04 ` [PATCH 04/17] ovl: verify index dir matches upper dir Amir Goldstein
@ 2017-06-02 14:04 ` Amir Goldstein
  2017-06-02 14:04 ` [PATCH 06/17] ovl: move inode helpers to inode.c Amir Goldstein
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Amir Goldstein @ 2017-06-02 14:04 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

The index dir should contain an entry for each upper inode with a known
lower origin inode. The index entry name is the hex representation of the
lower inode file handle.

Given a lower dentry, the helper encodes the lower file handle and looks
up the index dentry in the index dir.

This is going to be used to prevent breaking hardlinks on copy up.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/namei.c     | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 fs/overlayfs/overlayfs.h |  1 +
 2 files changed, 49 insertions(+)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index f5b49533c0e3..b50918b283e7 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -565,6 +565,54 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	return ERR_PTR(err);
 }
 
+/*
+ * Given a lower dentry, find the dentry for the upper inode in the index dir.
+ * If the index dentry in negative, then either no lower alias has been copied
+ * up yet, or aliases have been copied up in older kernels and are not indexed.
+ */
+struct dentry *ovl_lookup_index(struct super_block *sb, struct dentry *lower)
+{
+	struct ovl_fs *ofs = sb->s_fs_info;
+	struct ovl_fh *fh;
+	struct dentry *temp;
+	char *s, *name = NULL;
+	long namelen;
+	int err;
+
+	if (!ofs->indexdir)
+		return ERR_PTR(-ENOENT);
+
+	fh = ovl_encode_fh(lower, false);
+	if (IS_ERR(fh))
+		return (struct dentry *) fh;
+
+	err = -ENAMETOOLONG;
+	namelen = fh->len * 2;
+	if (namelen > ofs->namelen)
+		goto fail;
+
+	err = -ENOMEM;
+	name = kzalloc(namelen + 1, GFP_TEMPORARY);
+	if (!name)
+		goto fail;
+
+	s  = bin2hex(name, fh, fh->len);
+	namelen = s - name;
+
+	temp = lookup_one_len_unlocked(name, ofs->indexdir, namelen);
+
+out:
+	kfree(fh);
+	kfree(name);
+	return temp;
+
+fail:
+	pr_warn_ratelimited("overlayfs: failed lookup index (key=%*s, err=%i)\n",
+			    (int)namelen, name, err);
+	temp = ERR_PTR(err);
+	goto out;
+}
+
 bool ovl_lower_positive(struct dentry *dentry)
 {
 	struct ovl_entry *oe = dentry->d_fsdata;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 31920a649a1c..7ddb3a26a3da 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -254,6 +254,7 @@ int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
 		      struct dentry *origin, bool is_upper);
 int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp);
 struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags);
+struct dentry *ovl_lookup_index(struct super_block *sb, struct dentry *lower);
 bool ovl_lower_positive(struct dentry *dentry);
 
 /* readdir.c */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 06/17] ovl: move inode helpers to inode.c
  2017-06-02 14:04 [PATCH 00/17] Avoid breaking lower hardlinks on copy up Amir Goldstein
                   ` (4 preceding siblings ...)
  2017-06-02 14:04 ` [PATCH 05/17] ovl: create helper ovl_lookup_index() Amir Goldstein
@ 2017-06-02 14:04 ` Amir Goldstein
  2017-06-02 14:04 ` [PATCH 07/17] ovl: create helpers for initializing hashed inode Amir Goldstein
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Amir Goldstein @ 2017-06-02 14:04 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c     | 16 ++++++++++++++++
 fs/overlayfs/overlayfs.h |  7 ++++---
 fs/overlayfs/util.c      | 16 ----------------
 3 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 5f285c179bbd..753d4d4c0ad5 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -457,6 +457,22 @@ struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev)
 	return inode;
 }
 
+void ovl_inode_init(struct inode *inode, struct inode *realinode, bool is_upper)
+{
+	WRITE_ONCE(inode->i_private, (unsigned long) realinode |
+		   (is_upper ? OVL_ISUPPER_MASK : 0));
+}
+
+void ovl_inode_update(struct inode *inode, struct inode *upperinode)
+{
+	WARN_ON(!upperinode);
+	WARN_ON(!inode_unhashed(inode));
+	WRITE_ONCE(inode->i_private,
+		   (unsigned long) upperinode | OVL_ISUPPER_MASK);
+	if (!S_ISDIR(upperinode->i_mode))
+		__insert_inode_hash(inode, (unsigned long) upperinode);
+}
+
 static int ovl_inode_test(struct inode *inode, void *data)
 {
 	return ovl_inode_real(inode, NULL) == data;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 7ddb3a26a3da..37e3b3902e13 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -228,9 +228,6 @@ bool ovl_redirect_dir(struct super_block *sb);
 const char *ovl_dentry_get_redirect(struct dentry *dentry);
 void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
 void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry);
-void ovl_inode_init(struct inode *inode, struct inode *realinode,
-		    bool is_upper);
-void ovl_inode_update(struct inode *inode, struct inode *upperinode);
 void ovl_dentry_version_inc(struct dentry *dentry);
 u64 ovl_dentry_version_get(struct dentry *dentry);
 bool ovl_is_whiteout(struct dentry *dentry);
@@ -282,7 +279,11 @@ int ovl_update_time(struct inode *inode, struct timespec *ts, int flags);
 bool ovl_is_private_xattr(const char *name);
 
 struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev);
+void ovl_inode_init(struct inode *inode, struct inode *realinode,
+		    bool is_upper);
+void ovl_inode_update(struct inode *inode, struct inode *upperinode);
 struct inode *ovl_get_inode(struct super_block *sb, struct inode *realinode);
+
 static inline void ovl_copyattr(struct inode *from, struct inode *to)
 {
 	to->i_uid = from->i_uid;
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 53c2fdb10e36..fab76415c8eb 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -244,22 +244,6 @@ void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry)
 	oe->__upperdentry = upperdentry;
 }
 
-void ovl_inode_init(struct inode *inode, struct inode *realinode, bool is_upper)
-{
-	WRITE_ONCE(inode->i_private, (unsigned long) realinode |
-		   (is_upper ? OVL_ISUPPER_MASK : 0));
-}
-
-void ovl_inode_update(struct inode *inode, struct inode *upperinode)
-{
-	WARN_ON(!upperinode);
-	WARN_ON(!inode_unhashed(inode));
-	WRITE_ONCE(inode->i_private,
-		   (unsigned long) upperinode | OVL_ISUPPER_MASK);
-	if (!S_ISDIR(upperinode->i_mode))
-		__insert_inode_hash(inode, (unsigned long) upperinode);
-}
-
 void ovl_dentry_version_inc(struct dentry *dentry)
 {
 	struct ovl_entry *oe = dentry->d_fsdata;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 07/17] ovl: create helpers for initializing hashed inode
  2017-06-02 14:04 [PATCH 00/17] Avoid breaking lower hardlinks on copy up Amir Goldstein
                   ` (5 preceding siblings ...)
  2017-06-02 14:04 ` [PATCH 06/17] ovl: move inode helpers to inode.c Amir Goldstein
@ 2017-06-02 14:04 ` Amir Goldstein
  2017-06-02 14:04 ` [PATCH 08/17] ovl: use ovl_inode_init() for initializing new inode Amir Goldstein
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Amir Goldstein @ 2017-06-02 14:04 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

In preparation for hashing all overlay inodes, factor out helpers
ovl_inode_data(), ovl_inode_data_is_upper() and ovl_inode_data_real()
to set and test overlay inode i_private data content.

Those helpers are going to be used for hashing overlay inodes.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c     |  8 +++-----
 fs/overlayfs/overlayfs.h | 25 +++++++++++++++++++++----
 2 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 753d4d4c0ad5..cfcefeb88a32 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -459,16 +459,14 @@ struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev)
 
 void ovl_inode_init(struct inode *inode, struct inode *realinode, bool is_upper)
 {
-	WRITE_ONCE(inode->i_private, (unsigned long) realinode |
-		   (is_upper ? OVL_ISUPPER_MASK : 0));
+	WRITE_ONCE(inode->i_private, ovl_inode_data(realinode, is_upper));
 }
 
 void ovl_inode_update(struct inode *inode, struct inode *upperinode)
 {
 	WARN_ON(!upperinode);
 	WARN_ON(!inode_unhashed(inode));
-	WRITE_ONCE(inode->i_private,
-		   (unsigned long) upperinode | OVL_ISUPPER_MASK);
+	WRITE_ONCE(inode->i_private, ovl_inode_data(upperinode, true));
 	if (!S_ISDIR(upperinode->i_mode))
 		__insert_inode_hash(inode, (unsigned long) upperinode);
 }
@@ -480,7 +478,7 @@ static int ovl_inode_test(struct inode *inode, void *data)
 
 static int ovl_inode_set(struct inode *inode, void *data)
 {
-	inode->i_private = (void *) (((unsigned long) data) | OVL_ISUPPER_MASK);
+	inode->i_private = ovl_inode_data(data, true);
 	return 0;
 }
 
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 37e3b3902e13..21f062611360 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -75,7 +75,6 @@ struct ovl_fh {
 	u8 fid[0];	/* file identifier */
 } __packed;
 
-#define OVL_ISUPPER_MASK 1UL
 
 static inline int ovl_do_rmdir(struct inode *dir, struct dentry *dentry)
 {
@@ -190,14 +189,32 @@ static inline struct dentry *ovl_do_tmpfile(struct dentry *dentry, umode_t mode)
 	return ret;
 }
 
+#define OVL_INODE_ISUPPER_MASK 1UL
+
+static inline void *ovl_inode_data(struct inode *realinode, bool is_upper)
+{
+	return (void *)((unsigned long) realinode |
+			(is_upper ? OVL_INODE_ISUPPER_MASK : 0));
+}
+
+static inline bool ovl_inode_data_is_upper(void *data)
+{
+	return ((unsigned long)data & OVL_INODE_ISUPPER_MASK);
+}
+
+static inline struct inode *ovl_inode_data_real(void *data)
+{
+	return (struct inode *) ((unsigned long)data & ~OVL_INODE_ISUPPER_MASK);
+}
+
 static inline struct inode *ovl_inode_real(struct inode *inode, bool *is_upper)
 {
-	unsigned long x = (unsigned long) READ_ONCE(inode->i_private);
+	void *data = READ_ONCE(inode->i_private);
 
 	if (is_upper)
-		*is_upper = x & OVL_ISUPPER_MASK;
+		*is_upper = ovl_inode_data_is_upper(data);
 
-	return (struct inode *) (x & ~OVL_ISUPPER_MASK);
+	return ovl_inode_data_real(data);
 }
 
 /* util.c */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 08/17] ovl: use ovl_inode_init() for initializing new inode
  2017-06-02 14:04 [PATCH 00/17] Avoid breaking lower hardlinks on copy up Amir Goldstein
                   ` (6 preceding siblings ...)
  2017-06-02 14:04 ` [PATCH 07/17] ovl: create helpers for initializing hashed inode Amir Goldstein
@ 2017-06-02 14:04 ` Amir Goldstein
  2017-06-02 14:04 ` [PATCH 09/17] ovl: allow hashing non upper inodes Amir Goldstein
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Amir Goldstein @ 2017-06-02 14:04 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

In preparation for hashing all overlay inodes, modify the helper
ovl_inode_init() to hash a new non-dir upper overlay inode.

Use this helper for initializing new upper inode in ovl_instantiate()
instead of ovl_inode_update(), because the implementations for new inode
case and copy up case are going to diverge.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/dir.c   |  2 +-
 fs/overlayfs/inode.c | 12 ++++++++++--
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index a63a71656e9b..4f0096fee276 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -156,7 +156,7 @@ static void ovl_instantiate(struct dentry *dentry, struct inode *inode,
 	ovl_dentry_version_inc(dentry->d_parent);
 	ovl_dentry_update(dentry, newdentry);
 	if (!hardlink) {
-		ovl_inode_update(inode, d_inode(newdentry));
+		ovl_inode_init(inode, d_inode(newdentry), true);
 		ovl_copyattr(newdentry->d_inode, inode);
 	} else {
 		WARN_ON(ovl_inode_real(inode, NULL) != d_inode(newdentry));
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index cfcefeb88a32..136c19aff5e0 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -457,18 +457,26 @@ struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev)
 	return inode;
 }
 
+static void ovl_insert_inode_hash(struct inode *inode, struct inode *realinode)
+{
+	WARN_ON(!inode_unhashed(inode));
+	__insert_inode_hash(inode, (unsigned long) realinode);
+}
+
 void ovl_inode_init(struct inode *inode, struct inode *realinode, bool is_upper)
 {
+
 	WRITE_ONCE(inode->i_private, ovl_inode_data(realinode, is_upper));
+	if (is_upper && !S_ISDIR(realinode->i_mode))
+		ovl_insert_inode_hash(inode, realinode);
 }
 
 void ovl_inode_update(struct inode *inode, struct inode *upperinode)
 {
 	WARN_ON(!upperinode);
-	WARN_ON(!inode_unhashed(inode));
 	WRITE_ONCE(inode->i_private, ovl_inode_data(upperinode, true));
 	if (!S_ISDIR(upperinode->i_mode))
-		__insert_inode_hash(inode, (unsigned long) upperinode);
+		ovl_insert_inode_hash(inode, upperinode);
 }
 
 static int ovl_inode_test(struct inode *inode, void *data)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 09/17] ovl: allow hashing non upper inodes
  2017-06-02 14:04 [PATCH 00/17] Avoid breaking lower hardlinks on copy up Amir Goldstein
                   ` (7 preceding siblings ...)
  2017-06-02 14:04 ` [PATCH 08/17] ovl: use ovl_inode_init() for initializing new inode Amir Goldstein
@ 2017-06-02 14:04 ` Amir Goldstein
  2017-06-02 14:04 ` [PATCH 10/17] ovl: allow hashing inodes by arbitrary key Amir Goldstein
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Amir Goldstein @ 2017-06-02 14:04 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

In preparation for hashing all overlay inodes, pass is_upper to
ovl_get_inode() and the iget5_locked() callbacks, instead of assuming
that the hashed inode is upper.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c     | 11 ++++++-----
 fs/overlayfs/namei.c     |  2 +-
 fs/overlayfs/overlayfs.h |  3 ++-
 3 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 136c19aff5e0..19df2961f1c4 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -481,22 +481,23 @@ void ovl_inode_update(struct inode *inode, struct inode *upperinode)
 
 static int ovl_inode_test(struct inode *inode, void *data)
 {
-	return ovl_inode_real(inode, NULL) == data;
+	return READ_ONCE(inode->i_private) == data;
 }
 
 static int ovl_inode_set(struct inode *inode, void *data)
 {
-	inode->i_private = ovl_inode_data(data, true);
+	inode->i_private = data;
 	return 0;
 }
 
-struct inode *ovl_get_inode(struct super_block *sb, struct inode *realinode)
-
+struct inode *ovl_get_inode(struct super_block *sb, struct inode *realinode,
+			    bool is_upper)
 {
 	struct inode *inode;
 
 	inode = iget5_locked(sb, (unsigned long) realinode,
-			     ovl_inode_test, ovl_inode_set, realinode);
+			     ovl_inode_test, ovl_inode_set,
+			     ovl_inode_data(realinode, is_upper));
 	if (inode && inode->i_state & I_NEW) {
 		ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev);
 		set_nlink(inode, realinode->i_nlink);
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index b50918b283e7..7bc570f63259 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -525,7 +525,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 
 		err = -ENOMEM;
 		if (upperdentry && !d_is_dir(upperdentry)) {
-			inode = ovl_get_inode(dentry->d_sb, realinode);
+			inode = ovl_get_inode(dentry->d_sb, realinode, true);
 		} else {
 			inode = ovl_new_inode(dentry->d_sb, realinode->i_mode,
 					      realinode->i_rdev);
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 21f062611360..b500dee7de69 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -299,7 +299,8 @@ struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev);
 void ovl_inode_init(struct inode *inode, struct inode *realinode,
 		    bool is_upper);
 void ovl_inode_update(struct inode *inode, struct inode *upperinode);
-struct inode *ovl_get_inode(struct super_block *sb, struct inode *realinode);
+struct inode *ovl_get_inode(struct super_block *sb, struct inode *realinode,
+			    bool is_upper);
 
 static inline void ovl_copyattr(struct inode *from, struct inode *to)
 {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 10/17] ovl: allow hashing inodes by arbitrary key
  2017-06-02 14:04 [PATCH 00/17] Avoid breaking lower hardlinks on copy up Amir Goldstein
                   ` (8 preceding siblings ...)
  2017-06-02 14:04 ` [PATCH 09/17] ovl: allow hashing non upper inodes Amir Goldstein
@ 2017-06-02 14:04 ` Amir Goldstein
  2017-06-02 14:04 ` [PATCH 11/17] ovl: hash overlay non-dir inodes by copy up origin inode Amir Goldstein
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Amir Goldstein @ 2017-06-02 14:04 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

In preparation for hashing all overlay inodes, pass realinode to
ovl_get_inode() as explicit hashval argument, instead of implicitly
assuming that overlay inodes are hashed by the realinode address.

realinode is also passed to ovl_get_inode() as an independent argument,
that is used as the context to the iget5_locked() test/set callbacks.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c     | 7 +++----
 fs/overlayfs/namei.c     | 4 +++-
 fs/overlayfs/overlayfs.h | 4 ++--
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 19df2961f1c4..1e89885d5304 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -490,13 +490,12 @@ static int ovl_inode_set(struct inode *inode, void *data)
 	return 0;
 }
 
-struct inode *ovl_get_inode(struct super_block *sb, struct inode *realinode,
-			    bool is_upper)
+struct inode *ovl_get_inode(struct super_block *sb, unsigned long hashval,
+			    struct inode *realinode, bool is_upper)
 {
 	struct inode *inode;
 
-	inode = iget5_locked(sb, (unsigned long) realinode,
-			     ovl_inode_test, ovl_inode_set,
+	inode = iget5_locked(sb, hashval, ovl_inode_test, ovl_inode_set,
 			     ovl_inode_data(realinode, is_upper));
 	if (inode && inode->i_state & I_NEW) {
 		ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev);
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 7bc570f63259..615fb23fcba5 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -525,7 +525,9 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 
 		err = -ENOMEM;
 		if (upperdentry && !d_is_dir(upperdentry)) {
-			inode = ovl_get_inode(dentry->d_sb, realinode, true);
+			inode = ovl_get_inode(dentry->d_sb,
+					      (unsigned long) realinode,
+					      realinode, true);
 		} else {
 			inode = ovl_new_inode(dentry->d_sb, realinode->i_mode,
 					      realinode->i_rdev);
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index b500dee7de69..db3c7641efe6 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -299,8 +299,8 @@ struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev);
 void ovl_inode_init(struct inode *inode, struct inode *realinode,
 		    bool is_upper);
 void ovl_inode_update(struct inode *inode, struct inode *upperinode);
-struct inode *ovl_get_inode(struct super_block *sb, struct inode *realinode,
-			    bool is_upper);
+struct inode *ovl_get_inode(struct super_block *sb, unsigned long hashval,
+			    struct inode *realinode, bool is_upper);
 
 static inline void ovl_copyattr(struct inode *from, struct inode *to)
 {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 11/17] ovl: hash overlay non-dir inodes by copy up origin inode
  2017-06-02 14:04 [PATCH 00/17] Avoid breaking lower hardlinks on copy up Amir Goldstein
                   ` (9 preceding siblings ...)
  2017-06-02 14:04 ` [PATCH 10/17] ovl: allow hashing inodes by arbitrary key Amir Goldstein
@ 2017-06-02 14:04 ` Amir Goldstein
  2017-06-05 12:42   ` Amir Goldstein
  2017-06-02 14:04 ` [PATCH 12/17] ovl: defer upper dir lock to tempfile link Amir Goldstein
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 19+ messages in thread
From: Amir Goldstein @ 2017-06-02 14:04 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Non-dir pure upper overlay inodes are hashed by the address of the
upper real inode.  When inodes index feature is enabled, hash all non-dir
inodes by the address of the copy up origin inode if we know it.

This is going to be used to prevent breaking lower hardlinks on copy up.

Because overlay dentries of lower hardlink aliases have the same overlay
inode, copy up of those lower aliases can result is conflict when setting
the real upper inode of a broken hardlink. Copy up of an alias that lost
in this conflict will return -EEXIST and drop the overlay dentry.

This conflict is going to be handled more gracefully by following patches.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c   |  6 +++++-
 fs/overlayfs/inode.c     | 40 ++++++++++++++++++++++++++++++++++++----
 fs/overlayfs/namei.c     | 20 ++++++++++++++++----
 fs/overlayfs/overlayfs.h |  2 +-
 4 files changed, 58 insertions(+), 10 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 95568ec4f1d0..0fff3b7cb3e2 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -424,7 +424,11 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 
 	newdentry = dget(tmpfile ? upper : temp);
 	ovl_dentry_update(dentry, newdentry);
-	ovl_inode_update(d_inode(dentry), d_inode(newdentry));
+	err = ovl_inode_update(d_inode(dentry), d_inode(newdentry));
+	if (err) {
+		/* Broken hardlink - drop cache and return error */
+		d_drop(dentry);
+	}
 
 	/* Restore timestamps on parent (best effort) */
 	ovl_set_timestamps(upperdir, pstat);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 1e89885d5304..95746b228a0c 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -471,17 +471,49 @@ void ovl_inode_init(struct inode *inode, struct inode *realinode, bool is_upper)
 		ovl_insert_inode_hash(inode, realinode);
 }
 
-void ovl_inode_update(struct inode *inode, struct inode *upperinode)
+/*
+ * When inodes index is enabled, we hash all non-dir inodes by the address
+ * of the lower origin inode. We need to take care on concurrent copy up of
+ * different lower hardlinks, that only one alias can set the upper real inode.
+ * Copy up of an alias that lost the ovl_inode_update() race will get -EEXIST
+ * and needs to d_drop() the overlay dentry of that alias, so the next
+ * ovl_lookup() will initialize a new overlay inode for the broken hardlink.
+ */
+int ovl_inode_update(struct inode *inode, struct inode *upperinode)
 {
+	bool is_upper;
+	struct inode *realinode;
+
 	WARN_ON(!upperinode);
-	WRITE_ONCE(inode->i_private, ovl_inode_data(upperinode, true));
-	if (!S_ISDIR(upperinode->i_mode))
+	spin_lock(&inode->i_lock);
+	realinode = ovl_inode_real(inode, &is_upper);
+	if (!is_upper)
+		inode->i_private = ovl_inode_data(upperinode, true);
+	spin_unlock(&inode->i_lock);
+	if (is_upper && realinode != upperinode)
+		return -EEXIST;
+	/* When inodes index is enabled, inode is hashed before copy up */
+	if (!S_ISDIR(upperinode->i_mode) && !ovl_indexdir(inode->i_sb))
 		ovl_insert_inode_hash(inode, upperinode);
+	return 0;
 }
 
 static int ovl_inode_test(struct inode *inode, void *data)
 {
-	return READ_ONCE(inode->i_private) == data;
+	void *this = READ_ONCE(inode->i_private);
+
+	if (this == data)
+		return true;
+
+	/*
+	 * Overlay inodes hash key is the address of the lower origin inode.
+	 * Return same overlay inode when looking up by lower real inode, but
+	 * an existing overlay inode, that is hashed by the same lower origin
+	 * inode, has already been updated on copy up to a real upper inode.
+	 */
+	return ovl_indexdir(inode->i_sb) &&
+		ovl_inode_data_is_upper(this) &&
+		!ovl_inode_data_is_upper(data);
 }
 
 static int ovl_inode_set(struct inode *inode, void *data)
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 615fb23fcba5..b6f9870ba13a 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -519,15 +519,27 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	if (upperdentry || ctr) {
 		struct dentry *realdentry;
 		struct inode *realinode;
+		unsigned long hashval = 0;
 
 		realdentry = upperdentry ? upperdentry : stack[0].dentry;
 		realinode = d_inode(realdentry);
+		/*
+		 * When inodes index is enabled, we hash all non-dir inodes
+		 * by the address of the lower origin inode.
+		 * When inodes index is disabled, or if entry is pure upper, we
+		 * hash upper non-dir inodes by the addess of the upper inode.
+		 * Regardless of the hash key we use, we always pass the real
+		 * inode (upper most) to iget5_locked() test/set callbacks.
+		 */
+		if (ofs->indexdir && !d.is_dir && ctr)
+			hashval = (unsigned long) d_inode(stack[0].dentry);
+		else if (upperdentry && !d_is_dir(upperdentry))
+			hashval = (unsigned long) realinode;
 
 		err = -ENOMEM;
-		if (upperdentry && !d_is_dir(upperdentry)) {
-			inode = ovl_get_inode(dentry->d_sb,
-					      (unsigned long) realinode,
-					      realinode, true);
+		if (hashval) {
+			inode = ovl_get_inode(dentry->d_sb, hashval,
+					      realinode, !!upperdentry);
 		} else {
 			inode = ovl_new_inode(dentry->d_sb, realinode->i_mode,
 					      realinode->i_rdev);
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index db3c7641efe6..33d596337044 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -298,7 +298,7 @@ bool ovl_is_private_xattr(const char *name);
 struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev);
 void ovl_inode_init(struct inode *inode, struct inode *realinode,
 		    bool is_upper);
-void ovl_inode_update(struct inode *inode, struct inode *upperinode);
+int ovl_inode_update(struct inode *inode, struct inode *upperinode);
 struct inode *ovl_get_inode(struct super_block *sb, unsigned long hashval,
 			    struct inode *realinode, bool is_upper);
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 12/17] ovl: defer upper dir lock to tempfile link
  2017-06-02 14:04 [PATCH 00/17] Avoid breaking lower hardlinks on copy up Amir Goldstein
                   ` (10 preceding siblings ...)
  2017-06-02 14:04 ` [PATCH 11/17] ovl: hash overlay non-dir inodes by copy up origin inode Amir Goldstein
@ 2017-06-02 14:04 ` Amir Goldstein
  2017-06-02 14:04 ` [PATCH 13/17] ovl: factor out ovl_copy_up_inode() helper Amir Goldstein
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Amir Goldstein @ 2017-06-02 14:04 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

On copy up of regular file using an O_TMPFILE, lock upper dir only
before linking the tempfile in place.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c | 35 ++++++++++++++++++-----------------
 fs/overlayfs/util.c    |  3 ++-
 2 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 0fff3b7cb3e2..1fd73e4fa380 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -342,8 +342,13 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 		.link = link
 	};
 
-	upper = lookup_one_len(dentry->d_name.name, upperdir,
-			       dentry->d_name.len);
+	if (tmpfile) {
+		upper = lookup_one_len_unlocked(dentry->d_name.name, upperdir,
+						dentry->d_name.len);
+	} else {
+		upper = lookup_one_len(dentry->d_name.name, upperdir,
+				       dentry->d_name.len);
+	}
 	err = PTR_ERR(upper);
 	if (IS_ERR(upper))
 		goto out;
@@ -383,16 +388,7 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 		BUG_ON(upperpath.dentry != NULL);
 		upperpath.dentry = temp;
 
-		if (tmpfile) {
-			inode_unlock(udir);
-			err = ovl_copy_up_data(lowerpath, &upperpath,
-					       stat->size);
-			inode_lock_nested(udir, I_MUTEX_PARENT);
-		} else {
-			err = ovl_copy_up_data(lowerpath, &upperpath,
-					       stat->size);
-		}
-
+		err = ovl_copy_up_data(lowerpath, &upperpath, stat->size);
 		if (err)
 			goto out_cleanup;
 	}
@@ -415,10 +411,16 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 	if (err)
 		goto out_cleanup;
 
-	if (tmpfile)
+	if (tmpfile) {
+		inode_lock_nested(udir, I_MUTEX_PARENT);
 		err = ovl_do_link(temp, udir, upper, true);
-	else
+		if (!err)
+			ovl_set_timestamps(upperdir, pstat);
+		pstat = NULL;
+		inode_unlock(udir);
+	} else {
 		err = ovl_do_rename(wdir, temp, udir, upper, 0);
+	}
 	if (err)
 		goto out_cleanup;
 
@@ -431,7 +433,8 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 	}
 
 	/* Restore timestamps on parent (best effort) */
-	ovl_set_timestamps(upperdir, pstat);
+	if (pstat)
+		ovl_set_timestamps(upperdir, pstat);
 out2:
 	dput(temp);
 out1:
@@ -502,10 +505,8 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 			goto out_done;
 		}
 
-		inode_lock_nested(upperdir->d_inode, I_MUTEX_PARENT);
 		err = ovl_copy_up_locked(workdir, upperdir, dentry, lowerpath,
 					 stat, link, &pstat, true);
-		inode_unlock(upperdir->d_inode);
 		ovl_copy_up_end(dentry);
 		goto out_done;
 	}
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index fab76415c8eb..750c32827f60 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -234,7 +234,8 @@ void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry)
 {
 	struct ovl_entry *oe = dentry->d_fsdata;
 
-	WARN_ON(!inode_is_locked(upperdentry->d_parent->d_inode));
+	WARN_ON(!inode_is_locked(upperdentry->d_parent->d_inode) &&
+		!oe->copying);
 	WARN_ON(oe->__upperdentry);
 	/*
 	 * Make sure upperdentry is consistent before making it visible to
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 13/17] ovl: factor out ovl_copy_up_inode() helper
  2017-06-02 14:04 [PATCH 00/17] Avoid breaking lower hardlinks on copy up Amir Goldstein
                   ` (11 preceding siblings ...)
  2017-06-02 14:04 ` [PATCH 12/17] ovl: defer upper dir lock to tempfile link Amir Goldstein
@ 2017-06-02 14:04 ` Amir Goldstein
  2017-06-02 14:04 ` [PATCH 14/17] ovl: generalize ovl_copy_up_locked() using actors Amir Goldstein
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Amir Goldstein @ 2017-06-02 14:04 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Factor out helper for copying lower inode data and metadata to temp
upper inode, that is common to copy up using O_TMPFILE and workdir.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c | 64 +++++++++++++++++++++++++++++---------------------
 1 file changed, 37 insertions(+), 27 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 1fd73e4fa380..44985d9b3c8d 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -322,6 +322,42 @@ static int ovl_set_origin(struct dentry *dentry, struct dentry *lower,
 	return err;
 }
 
+static int ovl_copy_up_inode(struct dentry *dentry, struct dentry *temp,
+			     struct path *lowerpath, struct kstat *stat)
+{
+	int err;
+
+	if (S_ISREG(stat->mode)) {
+		struct path upperpath;
+
+		ovl_path_upper(dentry, &upperpath);
+		BUG_ON(upperpath.dentry != NULL);
+		upperpath.dentry = temp;
+
+		err = ovl_copy_up_data(lowerpath, &upperpath, stat->size);
+		if (err)
+			return err;
+	}
+
+	err = ovl_copy_xattr(lowerpath->dentry, temp);
+	if (err)
+		return err;
+
+	inode_lock(temp->d_inode);
+	err = ovl_set_attr(temp, stat);
+	inode_unlock(temp->d_inode);
+	if (err)
+		return err;
+
+	/*
+	 * Store identifier of lower inode in upper inode xattr to
+	 * allow lookup of the copy up origin inode.
+	 */
+	err = ovl_set_origin(dentry, lowerpath->dentry, temp);
+
+	return err;
+}
+
 static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 			      struct dentry *dentry, struct path *lowerpath,
 			      struct kstat *stat, const char *link,
@@ -381,33 +417,7 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 	if (err)
 		goto out2;
 
-	if (S_ISREG(stat->mode)) {
-		struct path upperpath;
-
-		ovl_path_upper(dentry, &upperpath);
-		BUG_ON(upperpath.dentry != NULL);
-		upperpath.dentry = temp;
-
-		err = ovl_copy_up_data(lowerpath, &upperpath, stat->size);
-		if (err)
-			goto out_cleanup;
-	}
-
-	err = ovl_copy_xattr(lowerpath->dentry, temp);
-	if (err)
-		goto out_cleanup;
-
-	inode_lock(temp->d_inode);
-	err = ovl_set_attr(temp, stat);
-	inode_unlock(temp->d_inode);
-	if (err)
-		goto out_cleanup;
-
-	/*
-	 * Store identifier of lower inode in upper inode xattr to
-	 * allow lookup of the copy up origin inode.
-	 */
-	err = ovl_set_origin(dentry, lowerpath->dentry, temp);
+	err = ovl_copy_up_inode(dentry, temp, lowerpath, stat);
 	if (err)
 		goto out_cleanup;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 14/17] ovl: generalize ovl_copy_up_locked() using actors
  2017-06-02 14:04 [PATCH 00/17] Avoid breaking lower hardlinks on copy up Amir Goldstein
                   ` (12 preceding siblings ...)
  2017-06-02 14:04 ` [PATCH 13/17] ovl: factor out ovl_copy_up_inode() helper Amir Goldstein
@ 2017-06-02 14:04 ` Amir Goldstein
  2017-06-02 14:04 ` [PATCH 15/17] ovl: generalize ovl_copy_up_one() " Amir Goldstein
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Amir Goldstein @ 2017-06-02 14:04 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

ovl_copy_up_locked() is implementing two different copy up methods in one
function with many conditional statements.

Split the function into actors: aquire,prepare,commit,cancel,release.
Implement the actors prepare,commit,cancel for the 'workdir' and
'tmpfile' copy up methods.

The aquire,release code for the two methods is still open coded in
ovl_copy_up_one(). Next patch will implement the aquire,release actors.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c | 280 +++++++++++++++++++++++++++++++++++--------------
 1 file changed, 202 insertions(+), 78 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 44985d9b3c8d..d3b922bcd983 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -358,104 +358,227 @@ static int ovl_copy_up_inode(struct dentry *dentry, struct dentry *temp,
 	return err;
 }
 
-static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
-			      struct dentry *dentry, struct path *lowerpath,
-			      struct kstat *stat, const char *link,
-			      struct kstat *pstat, bool tmpfile)
+/*
+ * Context and operations for copying up a single lower file.
+ */
+struct ovl_copy_up_ctx {
+	struct dentry *dentry;
+	struct path *lowerpath;
+	struct kstat *stat;
+	struct kstat pstat;
+	const char *link;
+	struct dentry *upperdir;
+	struct dentry *tempdir;
+	struct dentry *upper;
+	struct dentry *temp;
+};
+
+struct ovl_copy_up_ops {
+	int (*aquire)(struct ovl_copy_up_ctx *);
+	int (*prepare)(struct ovl_copy_up_ctx *);
+	int (*commit)(struct ovl_copy_up_ctx *);
+	int (*cancel)(struct ovl_copy_up_ctx *);
+	int (*release)(struct ovl_copy_up_ctx *);
+};
+
+/*
+ * Copy up operations using workdir.
+ * Upper file is created in workdir, copied and moved to upperdir.
+ */
+static int ovl_copy_up_workdir_aquire(struct ovl_copy_up_ctx *ctx)
+{
+	return 0;
+}
+
+static int ovl_copy_up_workdir_prepare(struct ovl_copy_up_ctx *ctx)
 {
-	struct inode *wdir = workdir->d_inode;
-	struct inode *udir = upperdir->d_inode;
-	struct dentry *newdentry = NULL;
 	struct dentry *upper = NULL;
 	struct dentry *temp = NULL;
 	int err;
-	const struct cred *old_creds = NULL;
-	struct cred *new_creds = NULL;
 	struct cattr cattr = {
 		/* Can't properly set mode on creation because of the umask */
-		.mode = stat->mode & S_IFMT,
-		.rdev = stat->rdev,
-		.link = link
+		.mode = ctx->stat->mode & S_IFMT,
+		.rdev = ctx->stat->rdev,
+		.link = ctx->link,
 	};
 
-	if (tmpfile) {
-		upper = lookup_one_len_unlocked(dentry->d_name.name, upperdir,
-						dentry->d_name.len);
-	} else {
-		upper = lookup_one_len(dentry->d_name.name, upperdir,
-				       dentry->d_name.len);
-	}
+
+	upper = lookup_one_len(ctx->dentry->d_name.name, ctx->upperdir,
+			       ctx->dentry->d_name.len);
 	err = PTR_ERR(upper);
 	if (IS_ERR(upper))
 		goto out;
 
+	temp = ovl_lookup_temp(ctx->tempdir);
+	if (IS_ERR(temp)) {
+		err = PTR_ERR(temp);
+		goto out_dput_upper;
+	}
+
+	err = ovl_create_real(d_inode(ctx->tempdir), temp, &cattr, NULL, true);
+	if (err)
+		goto out_dput_temp;
+
+	ctx->upper = upper;
+	ctx->temp = temp;
+	return 0;
+
+out_dput_temp:
+	dput(temp);
+out_dput_upper:
+	dput(upper);
+out:
+	return err;
+}
+
+static int ovl_copy_up_workdir_commit(struct ovl_copy_up_ctx *ctx)
+{
+	int err;
+
+	err = ovl_do_rename(d_inode(ctx->tempdir), ctx->temp,
+			    d_inode(ctx->upperdir), ctx->upper, 0);
+	if (err)
+		return err;
+
+	/* After rename, ctx->temp is the upper entry we will use */
+	swap(ctx->temp, ctx->upper);
+
+	/* Restore timestamps on parent (best effort) */
+	ovl_set_timestamps(ctx->upperdir, &ctx->pstat);
+
+	return err;
+}
+
+static int ovl_copy_up_workdir_cancel(struct ovl_copy_up_ctx *ctx)
+{
+	ovl_cleanup(d_inode(ctx->tempdir), ctx->temp);
+	return 0;
+}
+
+static int ovl_copy_up_workdir_release(struct ovl_copy_up_ctx *ctx)
+{
+	return 0;
+}
+
+static const struct ovl_copy_up_ops ovl_copy_up_workdir_ops = {
+	.aquire = ovl_copy_up_workdir_aquire,
+	.prepare = ovl_copy_up_workdir_prepare,
+	.commit = ovl_copy_up_workdir_commit,
+	.cancel = ovl_copy_up_workdir_cancel,
+	.release = ovl_copy_up_workdir_release,
+};
+
+/*
+ * Copy up operations using O_TMPFILE.
+ * Upper file is created unlinked, copied and linked to upperdir.
+ */
+static int ovl_copy_up_tmpfile_aquire(struct ovl_copy_up_ctx *ctx)
+{
+	return 0;
+}
+
+static int ovl_copy_up_tmpfile_prepare(struct ovl_copy_up_ctx *ctx)
+{
+	struct dentry *upper;
+	struct dentry *temp;
+
+	upper = lookup_one_len_unlocked(ctx->dentry->d_name.name, ctx->upperdir,
+					ctx->dentry->d_name.len);
+	if (IS_ERR(upper))
+		return PTR_ERR(upper);
+
+	temp = ovl_do_tmpfile(ctx->upperdir, ctx->stat->mode);
+	if (IS_ERR(temp)) {
+		dput(upper);
+		return PTR_ERR(temp);
+	}
+
+	ctx->upper = upper;
+	ctx->temp = temp;
+	return 0;
+}
+
+static int ovl_copy_up_tmpfile_commit(struct ovl_copy_up_ctx *ctx)
+{
+	int err;
+
+	inode_lock_nested(d_inode(ctx->upperdir), I_MUTEX_PARENT);
+	/* link the sucker ;) */
+	err = ovl_do_link(ctx->temp, d_inode(ctx->upperdir), ctx->upper, true);
+	/* Restore timestamps on parent (best effort) */
+	if (!err)
+		ovl_set_timestamps(ctx->upperdir, &ctx->pstat);
+	inode_unlock(d_inode(ctx->upperdir));
+
+	return err;
+}
+
+static int ovl_copy_up_tmpfile_cancel(struct ovl_copy_up_ctx *ctx)
+{
+	return 0;
+}
+
+static int ovl_copy_up_tmpfile_release(struct ovl_copy_up_ctx *ctx)
+{
+	return 0;
+}
+
+static const struct ovl_copy_up_ops ovl_copy_up_tmpfile_ops = {
+	.aquire = ovl_copy_up_tmpfile_aquire,
+	.prepare = ovl_copy_up_tmpfile_prepare,
+	.commit = ovl_copy_up_tmpfile_commit,
+	.cancel = ovl_copy_up_tmpfile_cancel,
+	.release = ovl_copy_up_tmpfile_release,
+};
+
+static int ovl_copy_up_locked(struct ovl_copy_up_ctx *ctx,
+			      const struct ovl_copy_up_ops *ops)
+{
+	struct dentry *dentry = ctx->dentry;
+	const struct cred *old_creds = NULL;
+	struct cred *new_creds = NULL;
+	int err;
+
 	err = security_inode_copy_up(dentry, &new_creds);
 	if (err < 0)
-		goto out1;
+		return err;
 
 	if (new_creds)
 		old_creds = override_creds(new_creds);
 
-	if (tmpfile)
-		temp = ovl_do_tmpfile(upperdir, stat->mode);
-	else
-		temp = ovl_lookup_temp(workdir);
-	err = 0;
-	if (IS_ERR(temp)) {
-		err = PTR_ERR(temp);
-		temp = NULL;
-	}
-
-	if (!err && !tmpfile)
-		err = ovl_create_real(wdir, temp, &cattr, NULL, true);
+	ctx->upper = ctx->temp = NULL;
+	err = ops->prepare(ctx);
 
 	if (new_creds) {
 		revert_creds(old_creds);
 		put_cred(new_creds);
 	}
-
 	if (err)
-		goto out2;
+		goto out;
 
-	err = ovl_copy_up_inode(dentry, temp, lowerpath, stat);
+	err = ovl_copy_up_inode(dentry, ctx->temp, ctx->lowerpath, ctx->stat);
 	if (err)
-		goto out_cleanup;
+		goto out_cancel;
 
-	if (tmpfile) {
-		inode_lock_nested(udir, I_MUTEX_PARENT);
-		err = ovl_do_link(temp, udir, upper, true);
-		if (!err)
-			ovl_set_timestamps(upperdir, pstat);
-		pstat = NULL;
-		inode_unlock(udir);
-	} else {
-		err = ovl_do_rename(wdir, temp, udir, upper, 0);
-	}
+	err = ops->commit(ctx);
 	if (err)
-		goto out_cleanup;
+		goto out_cancel;
 
-	newdentry = dget(tmpfile ? upper : temp);
-	ovl_dentry_update(dentry, newdentry);
-	err = ovl_inode_update(d_inode(dentry), d_inode(newdentry));
+	ovl_dentry_update(dentry, dget(ctx->upper));
+	err = ovl_inode_update(d_inode(dentry), d_inode(ctx->upper));
 	if (err) {
 		/* Broken hardlink - drop cache and return error */
 		d_drop(dentry);
 	}
 
-	/* Restore timestamps on parent (best effort) */
-	if (pstat)
-		ovl_set_timestamps(upperdir, pstat);
-out2:
-	dput(temp);
-out1:
-	dput(upper);
 out:
+	dput(ctx->temp);
+	dput(ctx->upper);
 	return err;
 
-out_cleanup:
-	if (!tmpfile)
-		ovl_cleanup(wdir, temp);
-	goto out2;
+out_cancel:
+	ops->cancel(ctx);
+	goto out;
 }
 
 /*
@@ -471,37 +594,40 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 			   struct path *lowerpath, struct kstat *stat)
 {
 	DEFINE_DELAYED_CALL(done);
-	struct dentry *workdir = ovl_workdir(dentry);
 	int err;
-	struct kstat pstat;
 	struct path parentpath;
 	struct dentry *lowerdentry = lowerpath->dentry;
-	struct dentry *upperdir;
-	const char *link = NULL;
 	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
+	struct ovl_copy_up_ctx ctx = {
+		.dentry = dentry,
+		.lowerpath = lowerpath,
+		.stat = stat,
+		.link = NULL,
+		.tempdir = ovl_workdir(dentry),
+	};
 
-	if (WARN_ON(!workdir))
+	if (WARN_ON(!ctx.tempdir))
 		return -EROFS;
 
 	ovl_do_check_copy_up(lowerdentry);
 
 	ovl_path_upper(parent, &parentpath);
-	upperdir = parentpath.dentry;
+	ctx.upperdir = parentpath.dentry;
 
 	/* Mark parent "impure" because it may now contain non-pure upper */
-	err = ovl_set_impure(parent, upperdir);
+	err = ovl_set_impure(parent, ctx.upperdir);
 	if (err)
 		return err;
 
-	err = vfs_getattr(&parentpath, &pstat,
+	err = vfs_getattr(&parentpath, &ctx.pstat,
 			  STATX_ATIME | STATX_MTIME, AT_STATX_SYNC_AS_STAT);
 	if (err)
 		return err;
 
 	if (S_ISLNK(stat->mode)) {
-		link = vfs_get_link(lowerdentry, &done);
-		if (IS_ERR(link))
-			return PTR_ERR(link);
+		ctx.link = vfs_get_link(lowerdentry, &done);
+		if (IS_ERR(ctx.link))
+			return PTR_ERR(ctx.link);
 	}
 
 	/* Should we copyup with O_TMPFILE or with workdir? */
@@ -515,14 +641,13 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 			goto out_done;
 		}
 
-		err = ovl_copy_up_locked(workdir, upperdir, dentry, lowerpath,
-					 stat, link, &pstat, true);
+		err = ovl_copy_up_locked(&ctx, &ovl_copy_up_tmpfile_ops);
 		ovl_copy_up_end(dentry);
 		goto out_done;
 	}
 
 	err = -EIO;
-	if (lock_rename(workdir, upperdir) != NULL) {
+	if (lock_rename(ctx.tempdir, ctx.upperdir) != NULL) {
 		pr_err("overlayfs: failed to lock workdir+upperdir\n");
 		goto out_unlock;
 	}
@@ -532,10 +657,9 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 		goto out_unlock;
 	}
 
-	err = ovl_copy_up_locked(workdir, upperdir, dentry, lowerpath,
-				 stat, link, &pstat, false);
+	err = ovl_copy_up_locked(&ctx, &ovl_copy_up_workdir_ops);
 out_unlock:
-	unlock_rename(workdir, upperdir);
+	unlock_rename(ctx.tempdir, ctx.upperdir);
 out_done:
 	do_delayed_call(&done);
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 15/17] ovl: generalize ovl_copy_up_one() using actors
  2017-06-02 14:04 [PATCH 00/17] Avoid breaking lower hardlinks on copy up Amir Goldstein
                   ` (13 preceding siblings ...)
  2017-06-02 14:04 ` [PATCH 14/17] ovl: generalize ovl_copy_up_locked() using actors Amir Goldstein
@ 2017-06-02 14:04 ` Amir Goldstein
  2017-06-02 14:04 ` [PATCH 16/17] ovl: implement index dir copy up method Amir Goldstein
  2017-06-02 14:04 ` [PATCH 17/17] ovl: handle race of concurrent lower hardlinks copy up Amir Goldstein
  16 siblings, 0 replies; 19+ messages in thread
From: Amir Goldstein @ 2017-06-02 14:04 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

ovl_copy_up_one() is implementing two different locking methods
for regular files and non-regular files.

Implement the actors aquire,release for the 'workdir' and 'tmpfile'
copy up methods and use them to generalize locking in ovl_copy_up_one().

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c | 59 ++++++++++++++++++++++++++++----------------------
 1 file changed, 33 insertions(+), 26 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index d3b922bcd983..5031dd7fef89 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -387,7 +387,23 @@ struct ovl_copy_up_ops {
  */
 static int ovl_copy_up_workdir_aquire(struct ovl_copy_up_ctx *ctx)
 {
+	int err = -EIO;
+
+	if (lock_rename(ctx->tempdir, ctx->upperdir) != NULL) {
+		pr_err("overlayfs: failed to lock workdir+upperdir\n");
+		goto out_unlock;
+	}
+	if (ovl_dentry_upper(ctx->dentry)) {
+		/* Raced with another copy-up? */
+		err = 1;
+		goto out_unlock;
+	}
+
 	return 0;
+
+out_unlock:
+	unlock_rename(ctx->tempdir, ctx->upperdir);
+	return err;
 }
 
 static int ovl_copy_up_workdir_prepare(struct ovl_copy_up_ctx *ctx)
@@ -457,6 +473,7 @@ static int ovl_copy_up_workdir_cancel(struct ovl_copy_up_ctx *ctx)
 
 static int ovl_copy_up_workdir_release(struct ovl_copy_up_ctx *ctx)
 {
+	unlock_rename(ctx->tempdir, ctx->upperdir);
 	return 0;
 }
 
@@ -474,7 +491,7 @@ static const struct ovl_copy_up_ops ovl_copy_up_workdir_ops = {
  */
 static int ovl_copy_up_tmpfile_aquire(struct ovl_copy_up_ctx *ctx)
 {
-	return 0;
+	return ovl_copy_up_start(ctx->dentry);
 }
 
 static int ovl_copy_up_tmpfile_prepare(struct ovl_copy_up_ctx *ctx)
@@ -520,6 +537,7 @@ static int ovl_copy_up_tmpfile_cancel(struct ovl_copy_up_ctx *ctx)
 
 static int ovl_copy_up_tmpfile_release(struct ovl_copy_up_ctx *ctx)
 {
+	ovl_copy_up_end(ctx->dentry);
 	return 0;
 }
 
@@ -605,6 +623,7 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 		.link = NULL,
 		.tempdir = ovl_workdir(dentry),
 	};
+	const struct ovl_copy_up_ops *ops;
 
 	if (WARN_ON(!ctx.tempdir))
 		return -EROFS;
@@ -631,35 +650,23 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 	}
 
 	/* Should we copyup with O_TMPFILE or with workdir? */
-	if (S_ISREG(stat->mode) && ofs->tmpfile) {
-		err = ovl_copy_up_start(dentry);
-		/* err < 0: interrupted, err > 0: raced with another copy-up */
-		if (unlikely(err)) {
-			pr_debug("ovl_copy_up_start(%pd2) = %i\n", dentry, err);
-			if (err > 0)
-				err = 0;
-			goto out_done;
-		}
-
-		err = ovl_copy_up_locked(&ctx, &ovl_copy_up_tmpfile_ops);
-		ovl_copy_up_end(dentry);
+	if (S_ISREG(stat->mode) && ofs->tmpfile)
+		ops = &ovl_copy_up_tmpfile_ops;
+	else
+		ops = &ovl_copy_up_workdir_ops;
+
+	err = ops->aquire(&ctx);
+	/* err < 0: interrupted, err > 0: raced with another copy-up */
+	if (unlikely(err)) {
+		pr_debug("%s(%pd2): aquire = %i\n", __func__, dentry, err);
+		if (err > 0)
+			err = 0;
 		goto out_done;
 	}
 
-	err = -EIO;
-	if (lock_rename(ctx.tempdir, ctx.upperdir) != NULL) {
-		pr_err("overlayfs: failed to lock workdir+upperdir\n");
-		goto out_unlock;
-	}
-	if (ovl_dentry_upper(dentry)) {
-		/* Raced with another copy-up?  Nothing to do, then... */
-		err = 0;
-		goto out_unlock;
-	}
+	err = ovl_copy_up_locked(&ctx, ops);
+	ops->release(&ctx);
 
-	err = ovl_copy_up_locked(&ctx, &ovl_copy_up_workdir_ops);
-out_unlock:
-	unlock_rename(ctx.tempdir, ctx.upperdir);
 out_done:
 	do_delayed_call(&done);
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 16/17] ovl: implement index dir copy up method
  2017-06-02 14:04 [PATCH 00/17] Avoid breaking lower hardlinks on copy up Amir Goldstein
                   ` (14 preceding siblings ...)
  2017-06-02 14:04 ` [PATCH 15/17] ovl: generalize ovl_copy_up_one() " Amir Goldstein
@ 2017-06-02 14:04 ` Amir Goldstein
  2017-06-02 14:04 ` [PATCH 17/17] ovl: handle race of concurrent lower hardlinks copy up Amir Goldstein
  16 siblings, 0 replies; 19+ messages in thread
From: Amir Goldstein @ 2017-06-02 14:04 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Implement a copy up method using index dir to prevent breaking lower
hardlinks on copy up.

This method requires that the inodes index dir feature was enabled and
that all underlying fs support NFS export.

On the first lower hardlink copy up, upper file is created in index dir,
named after the hex representation of the lower origin inode file handle.
On the second lower hardlink copy up, upper file is found in index dir,
by the same lower handle key.
On either case, the upper indexed inode is then linked to the copy up
upper path.

The index entry remains linked for future lower hardlink copy up and for
lower to upper inode map, that is needed for exporting overlayfs to NFS.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c | 175 ++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 167 insertions(+), 8 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 5031dd7fef89..59ea4c365b7a 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -351,7 +351,8 @@ static int ovl_copy_up_inode(struct dentry *dentry, struct dentry *temp,
 
 	/*
 	 * Store identifier of lower inode in upper inode xattr to
-	 * allow lookup of the copy up origin inode.
+	 * allow lookup of the copy up origin inode. We do this last
+	 * to bless the inode, in case it was created in the index dir.
 	 */
 	err = ovl_set_origin(dentry, lowerpath->dentry, temp);
 
@@ -549,6 +550,156 @@ static const struct ovl_copy_up_ops ovl_copy_up_tmpfile_ops = {
 	.release = ovl_copy_up_tmpfile_release,
 };
 
+/*
+ * Copy up operations using index dir.
+ * Upper file is created in index dir, copied and linked to upperdir.
+ * The index entry remains to be used for mapping lower inode to upper inode.
+ */
+static int ovl_copy_up_indexdir_aquire(struct ovl_copy_up_ctx *ctx)
+{
+	/* TODO: handle race of concurrent lower hardlinks copy up */
+	return ovl_copy_up_start(ctx->dentry);
+}
+
+static int ovl_copy_up_indexdir_prepare(struct ovl_copy_up_ctx *ctx)
+{
+	struct dentry *upper;
+	struct dentry *temp;
+	struct inode *inode;
+	int err;
+	struct cattr cattr = {
+		/* Can't properly set mode on creation because of the umask */
+		.mode = ctx->stat->mode & S_IFMT,
+		.rdev = ctx->stat->rdev,
+		.link = ctx->link,
+	};
+
+	upper = lookup_one_len_unlocked(ctx->dentry->d_name.name, ctx->upperdir,
+					ctx->dentry->d_name.len);
+	if (IS_ERR(upper))
+		return PTR_ERR(upper);
+
+	temp = ovl_lookup_index(ctx->dentry->d_sb, ctx->lowerpath->dentry);
+	if (IS_ERR(temp)) {
+		err = PTR_ERR(temp);
+		goto out_dput_upper;
+	}
+
+	inode = d_inode(temp);
+	if (inode) {
+		/* Another lower hardlink already copied-up? */
+		err = -EEXIST;
+		if ((inode->i_mode & S_IFMT) != cattr.mode)
+			goto out_dput_temp;
+
+		/*
+		 * Verify that found index is a copy up of lower inode.
+		 * If index inode doesn't point back to lower inode via
+		 * origin file handle, then this is either an in-progress
+		 * copy up or leftover from index dir before copying layers.
+		 * In both cases, we cannot use this index and must fail the
+		 * copy up. The in-progress case will return -EEXISTS and the
+		 * leftover case will return -ESTALE.
+		 */
+		err = ovl_verify_origin(temp, ctx->lowerpath->mnt,
+					ctx->lowerpath->dentry, false);
+		if (err) {
+			if (err == -ENODATA)
+				err = -EEXIST;
+			goto out_dput_temp;
+		}
+
+		if (inode->i_nlink < 2) {
+			/*
+			 * An orphan index inode can be created by copying up
+			 * a lower hardlink alias and then unlinking it. From
+			 * overlayfs perspective, this inode may still live if
+			 * there are more lower hardlinks and it should contain
+			 * the data of the upper inode that was unlinked. So if
+			 * an orphan inode is found in the index dir and we
+			 * should reuse it on copy up of another lower alias.
+			 */
+			pr_warn_ratelimited("overlayfs: linking to orphan upper (%pd2, ino=%lu)\n",
+					    temp, inode->i_ino);
+		}
+
+		/* Link to existing upper without copying lower */
+		err = 1;
+		goto out;
+	}
+
+	inode_lock_nested(d_inode(ctx->tempdir), I_MUTEX_PARENT);
+	err = ovl_create_real(d_inode(ctx->tempdir), temp, &cattr, NULL, true);
+	inode_unlock(d_inode(ctx->tempdir));
+	if (err)
+		goto out_dput_temp;
+
+out:
+	ctx->upper = upper;
+	ctx->temp = temp;
+	return err;
+
+out_dput_temp:
+	pr_warn_ratelimited("overlayfs: failed to create index entry (%pd2, ino=%lu, err=%i)\n",
+			    temp, inode ? inode->i_ino : 0, err);
+	pr_warn_ratelimited("overlayfs: try clearing index dir or mounting without 'verify_lower' mount option.\n");
+	dput(temp);
+out_dput_upper:
+	dput(upper);
+	return err;
+}
+
+static int ovl_copy_up_indexdir_commit(struct ovl_copy_up_ctx *ctx)
+{
+	int err;
+
+	inode_lock_nested(d_inode(ctx->upperdir), I_MUTEX_PARENT);
+	/* link the sucker ;) */
+	err = ovl_do_link(ctx->temp, d_inode(ctx->upperdir), ctx->upper, true);
+	/* Restore timestamps on parent (best effort) */
+	if (!err)
+		ovl_set_timestamps(ctx->upperdir, &ctx->pstat);
+	inode_unlock(d_inode(ctx->upperdir));
+
+	return err;
+}
+
+static int ovl_copy_up_indexdir_cancel(struct ovl_copy_up_ctx *ctx)
+{
+	struct inode *inode = d_inode(ctx->temp);
+
+	if (WARN_ON(!inode))
+		return 0;
+
+	inode_lock_nested(d_inode(ctx->tempdir), I_MUTEX_PARENT);
+
+	/*
+	 * We must not cleanup an already hardlinked index.
+	 */
+	if (inode->i_nlink != 1)
+		goto out_unlock;
+
+	ovl_cleanup(d_inode(ctx->tempdir), ctx->temp);
+
+out_unlock:
+	inode_unlock(d_inode(ctx->tempdir));
+	return 0;
+}
+
+static int ovl_copy_up_indexdir_release(struct ovl_copy_up_ctx *ctx)
+{
+	ovl_copy_up_end(ctx->dentry);
+	return 0;
+}
+
+static const struct ovl_copy_up_ops ovl_copy_up_indexdir_ops = {
+	.aquire = ovl_copy_up_indexdir_aquire,
+	.prepare = ovl_copy_up_indexdir_prepare,
+	.commit = ovl_copy_up_indexdir_commit,
+	.cancel = ovl_copy_up_indexdir_cancel,
+	.release = ovl_copy_up_indexdir_release,
+};
+
 static int ovl_copy_up_locked(struct ovl_copy_up_ctx *ctx,
 			      const struct ovl_copy_up_ops *ops)
 {
@@ -571,12 +722,16 @@ static int ovl_copy_up_locked(struct ovl_copy_up_ctx *ctx,
 		revert_creds(old_creds);
 		put_cred(new_creds);
 	}
-	if (err)
+	if (err < 0)
 		goto out;
 
-	err = ovl_copy_up_inode(dentry, ctx->temp, ctx->lowerpath, ctx->stat);
-	if (err)
-		goto out_cancel;
+	/* err == 1 means we found an existing hardlinked upper inode */
+	if (!err) {
+		err = ovl_copy_up_inode(dentry, ctx->temp, ctx->lowerpath,
+					ctx->stat);
+		if (err)
+			goto out_cancel;
+	}
 
 	err = ops->commit(ctx);
 	if (err)
@@ -616,12 +771,14 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 	struct path parentpath;
 	struct dentry *lowerdentry = lowerpath->dentry;
 	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
+	bool indexed = ovl_indexdir(dentry->d_sb) && !S_ISDIR(stat->mode);
 	struct ovl_copy_up_ctx ctx = {
 		.dentry = dentry,
 		.lowerpath = lowerpath,
 		.stat = stat,
 		.link = NULL,
-		.tempdir = ovl_workdir(dentry),
+		.tempdir = indexed ? ovl_indexdir(dentry->d_sb) :
+				     ovl_workdir(dentry),
 	};
 	const struct ovl_copy_up_ops *ops;
 
@@ -649,8 +806,10 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 			return PTR_ERR(ctx.link);
 	}
 
-	/* Should we copyup with O_TMPFILE or with workdir? */
-	if (S_ISREG(stat->mode) && ofs->tmpfile)
+	/* Should we copyup with O_TMPFILE with indexdir or with workdir? */
+	if (indexed)
+		ops = &ovl_copy_up_indexdir_ops;
+	else if (S_ISREG(stat->mode) && ofs->tmpfile)
 		ops = &ovl_copy_up_tmpfile_ops;
 	else
 		ops = &ovl_copy_up_workdir_ops;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 17/17] ovl: handle race of concurrent lower hardlinks copy up
  2017-06-02 14:04 [PATCH 00/17] Avoid breaking lower hardlinks on copy up Amir Goldstein
                   ` (15 preceding siblings ...)
  2017-06-02 14:04 ` [PATCH 16/17] ovl: implement index dir copy up method Amir Goldstein
@ 2017-06-02 14:04 ` Amir Goldstein
  16 siblings, 0 replies; 19+ messages in thread
From: Amir Goldstein @ 2017-06-02 14:04 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Make sure that task B does not link to an index inode created by
task A, before task A completed copying the inode data.
Task A marks the index inode 'inuse' and task B waits for 'inuse' flag
to be cleared.  Take care of the race between task A canceling the
copy up and unlinking the index inode and task B trying to link to it.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c | 93 ++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 83 insertions(+), 10 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 59ea4c365b7a..69a581d3d609 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -372,6 +372,7 @@ struct ovl_copy_up_ctx {
 	struct dentry *tempdir;
 	struct dentry *upper;
 	struct dentry *temp;
+	bool created;
 };
 
 struct ovl_copy_up_ops {
@@ -557,16 +558,30 @@ static const struct ovl_copy_up_ops ovl_copy_up_tmpfile_ops = {
  */
 static int ovl_copy_up_indexdir_aquire(struct ovl_copy_up_ctx *ctx)
 {
-	/* TODO: handle race of concurrent lower hardlinks copy up */
 	return ovl_copy_up_start(ctx->dentry);
 }
 
+/*
+ * Set ctx->temp to a positive dentry with the index inode.
+ *
+ * Return 0 if entry was created by us, in which case, we also got
+ * inode_inuse lock and we will release it on copy_up_commit after
+ * copying the inode data.
+ *
+ * Return 1 if we found an index inode, in which case, we do not need
+ * to copy the inode data.
+ *
+ * May return -EEXISTS/-ENOENT/-EBUSY on various races of concurrent
+ * lower hardlinks copy up on the same lower inode.
+ */
 static int ovl_copy_up_indexdir_prepare(struct ovl_copy_up_ctx *ctx)
 {
+	struct inode *dir = d_inode(ctx->tempdir);
 	struct dentry *upper;
 	struct dentry *temp;
 	struct inode *inode;
 	int err;
+	bool retried = false;
 	struct cattr cattr = {
 		/* Can't properly set mode on creation because of the umask */
 		.mode = ctx->stat->mode & S_IFMT,
@@ -579,6 +594,7 @@ static int ovl_copy_up_indexdir_prepare(struct ovl_copy_up_ctx *ctx)
 	if (IS_ERR(upper))
 		return PTR_ERR(upper);
 
+retry:
 	temp = ovl_lookup_index(ctx->dentry->d_sb, ctx->lowerpath->dentry);
 	if (IS_ERR(temp)) {
 		err = PTR_ERR(temp);
@@ -593,13 +609,31 @@ static int ovl_copy_up_indexdir_prepare(struct ovl_copy_up_ctx *ctx)
 			goto out_dput_temp;
 
 		/*
+		 * We need to be carefull not to link with a copy in-progress
+		 * index inode. Testing 'inuse' bit with indexdir i_mutex held
+		 * protect us from the time of creating the index entry below
+		 * to the time of ovl_copy_up_indexdir_commit. We also need to
+		 * test i_nlink with indexdir i_mutex held to make sure that
+		 * index has not been unlinked by ovl_copy_up_indexdir_cancel,
+		 * right after clearing 'inuse' bit.
+		 */
+		inode_lock_nested(dir, I_MUTEX_PARENT);
+		if (inode->i_nlink < 1)
+			err = -ENOENT;
+		else
+			err = wait_on_inode_inuse(inode, TASK_KILLABLE);
+		inode_unlock(dir);
+		if (err)
+			goto out_dput_temp;
+
+		/*
 		 * Verify that found index is a copy up of lower inode.
 		 * If index inode doesn't point back to lower inode via
-		 * origin file handle, then this is either an in-progress
-		 * copy up or leftover from index dir before copying layers.
+		 * origin file handle, then this is either a leftover from
+		 * failed copy up or an index dir entry before copying layers.
 		 * In both cases, we cannot use this index and must fail the
-		 * copy up. The in-progress case will return -EEXISTS and the
-		 * leftover case will return -ESTALE.
+		 * copy up. The failed copy up case will return -EEXISTS and
+		 * the copying layers case will return -ESTALE.
 		 */
 		err = ovl_verify_origin(temp, ctx->lowerpath->mnt,
 					ctx->lowerpath->dentry, false);
@@ -628,12 +662,32 @@ static int ovl_copy_up_indexdir_prepare(struct ovl_copy_up_ctx *ctx)
 		goto out;
 	}
 
-	inode_lock_nested(d_inode(ctx->tempdir), I_MUTEX_PARENT);
-	err = ovl_create_real(d_inode(ctx->tempdir), temp, &cattr, NULL, true);
-	inode_unlock(d_inode(ctx->tempdir));
+	/* Raced on creating index and not found on retry? */
+	err = -ENOENT;
+	if (retried)
+		goto out_dput_temp;
+
+	inode_lock_nested(dir, I_MUTEX_PARENT);
+	err = ovl_create_real(dir, temp, &cattr, NULL, true);
+	/*
+	 * Mark index inode 'inuse' before copying inode data to avoid racing
+	 * with copy up of another lower hardlink of the same lower inode.
+	 */
+	if (!err && !inode_inuse_trylock(temp->d_inode))
+		err = -EBUSY;
+	inode_unlock(dir);
+
+	/* Raced with copy up of another lower hardlink of the same inode? */
+	if (err == -EEXIST) {
+		retried = true;
+		dput(temp);
+		goto retry;
+	}
+
 	if (err)
 		goto out_dput_temp;
 
+	ctx->created = true;
 out:
 	ctx->upper = upper;
 	ctx->temp = temp;
@@ -661,6 +715,10 @@ static int ovl_copy_up_indexdir_commit(struct ovl_copy_up_ctx *ctx)
 		ovl_set_timestamps(ctx->upperdir, &ctx->pstat);
 	inode_unlock(d_inode(ctx->upperdir));
 
+	/* Allow other lower hardlinks to link with our creation */
+	if (ctx->created)
+		inode_inuse_unlock(d_inode(ctx->temp));
+
 	return err;
 }
 
@@ -671,13 +729,28 @@ static int ovl_copy_up_indexdir_cancel(struct ovl_copy_up_ctx *ctx)
 	if (WARN_ON(!inode))
 		return 0;
 
+	/* Cleanup prepared index entry only if we created it */
+	if (!ctx->created)
+		return 0;
+
 	inode_lock_nested(d_inode(ctx->tempdir), I_MUTEX_PARENT);
 
+	if (WARN_ON(!inode->i_nlink))
+		goto out_unlock;
+
+	inode_inuse_unlock(inode);
+
 	/*
-	 * We must not cleanup an already hardlinked index.
+	 * We must clear 'inuse' flag before unlink, but we need to take care
+	 * because this could allow another copy up to race with this cleanup,
+	 * find the unborn index inode that looks like an orphan and link it,
+	 * before we unlink the unborn index entry. So before linking an index
+	 * inode we must take indexdir i_mutex, wait for 'inuse' flag to be
+	 * cleared and test that inode i_nlink > 0.
 	 */
 	if (inode->i_nlink != 1)
-		goto out_unlock;
+		pr_warn("overlayfs: cleanup of linked upper (%pd2, ino=%lu, nlink=%u)\n",
+			ctx->temp, inode->i_ino, inode->i_nlink);
 
 	ovl_cleanup(d_inode(ctx->tempdir), ctx->temp);
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 11/17] ovl: hash overlay non-dir inodes by copy up origin inode
  2017-06-02 14:04 ` [PATCH 11/17] ovl: hash overlay non-dir inodes by copy up origin inode Amir Goldstein
@ 2017-06-05 12:42   ` Amir Goldstein
  0 siblings, 0 replies; 19+ messages in thread
From: Amir Goldstein @ 2017-06-05 12:42 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: overlayfs

On Fri, Jun 2, 2017 at 5:04 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> Non-dir pure upper overlay inodes are hashed by the address of the
> upper real inode.  When inodes index feature is enabled, hash all non-dir
> inodes by the address of the copy up origin inode if we know it.
>
> This is going to be used to prevent breaking lower hardlinks on copy up.
>
> Because overlay dentries of lower hardlink aliases have the same overlay
> inode, copy up of those lower aliases can result is conflict when setting
> the real upper inode of a broken hardlink. Copy up of an alias that lost
> in this conflict will return -EEXIST and drop the overlay dentry.
>
> This conflict is going to be handled more gracefully by following patches.
>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
...
>  static int ovl_inode_test(struct inode *inode, void *data)
>  {
> -       return READ_ONCE(inode->i_private) == data;
> +       void *this = READ_ONCE(inode->i_private);
> +
> +       if (this == data)
> +               return true;
> +
> +       /*
> +        * Overlay inodes hash key is the address of the lower origin inode.
> +        * Return same overlay inode when looking up by lower real inode, but
> +        * an existing overlay inode, that is hashed by the same lower origin
> +        * inode, has already been updated on copy up to a real upper inode.
> +        */
> +       return ovl_indexdir(inode->i_sb) &&
> +               ovl_inode_data_is_upper(this) &&
> +               !ovl_inode_data_is_upper(data);
>  }
>

Miklos,

FYI, the test above is completely bogus.
I used it for testing, because for small numbers of inodes hash collision
probability is low.

In v2 of the inode hash patches, I stored the origin inode in
i_data.private_data. I hope you will approve of this, if not, maybe
it's time to allocate an ovl_inode. I may have other interesting things
to store there (e.g. copyup and coverup nlink count).

Amir.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2017-06-05 12:42 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-02 14:04 [PATCH 00/17] Avoid breaking lower hardlinks on copy up Amir Goldstein
2017-06-02 14:04 ` [PATCH 01/17] vfs: add helper wait_on_inode_inuse() Amir Goldstein
2017-06-02 14:04 ` [PATCH 02/17] ovl: generalize ovl_create_workdir() Amir Goldstein
2017-06-02 14:04 ` [PATCH 03/17] ovl: introduce the inodes index dir feature Amir Goldstein
2017-06-02 14:04 ` [PATCH 04/17] ovl: verify index dir matches upper dir Amir Goldstein
2017-06-02 14:04 ` [PATCH 05/17] ovl: create helper ovl_lookup_index() Amir Goldstein
2017-06-02 14:04 ` [PATCH 06/17] ovl: move inode helpers to inode.c Amir Goldstein
2017-06-02 14:04 ` [PATCH 07/17] ovl: create helpers for initializing hashed inode Amir Goldstein
2017-06-02 14:04 ` [PATCH 08/17] ovl: use ovl_inode_init() for initializing new inode Amir Goldstein
2017-06-02 14:04 ` [PATCH 09/17] ovl: allow hashing non upper inodes Amir Goldstein
2017-06-02 14:04 ` [PATCH 10/17] ovl: allow hashing inodes by arbitrary key Amir Goldstein
2017-06-02 14:04 ` [PATCH 11/17] ovl: hash overlay non-dir inodes by copy up origin inode Amir Goldstein
2017-06-05 12:42   ` Amir Goldstein
2017-06-02 14:04 ` [PATCH 12/17] ovl: defer upper dir lock to tempfile link Amir Goldstein
2017-06-02 14:04 ` [PATCH 13/17] ovl: factor out ovl_copy_up_inode() helper Amir Goldstein
2017-06-02 14:04 ` [PATCH 14/17] ovl: generalize ovl_copy_up_locked() using actors Amir Goldstein
2017-06-02 14:04 ` [PATCH 15/17] ovl: generalize ovl_copy_up_one() " Amir Goldstein
2017-06-02 14:04 ` [PATCH 16/17] ovl: implement index dir copy up method Amir Goldstein
2017-06-02 14:04 ` [PATCH 17/17] ovl: handle race of concurrent lower hardlinks copy up Amir Goldstein

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.