All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/9] Overlayfs create object related fixes
@ 2018-05-18  8:29 Amir Goldstein
  2018-05-18  8:29 ` [PATCH v4 1/9] ovl: remove WARN_ON() real inode attributes mismatch Amir Goldstein
                   ` (8 more replies)
  0 siblings, 9 replies; 24+ messages in thread
From: Amir Goldstein @ 2018-05-18  8:29 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, Vivek Goyal, linux-unionfs

Miklos,

This is v4 of create object fixes, which addresses your comments on v3
and implement Vivek's suggestion to use ovl_get_inode() from create code.
Please note the change of semantics in ovl_instantiate() w.r.t the
reference on @inode argument - it now takes its own reference only if it
uses @inode to instantiate newdentry.

The series is based and tested on top of a merge of overlayfs-rorw
and Al's fixes branch.  It passed xfstests generic/quick and overlay/auto
including overlay/stress and all unionmount configurations.

Thanks,
Amir.

Changes since v3:
- Back to Al's version of ovl_create_real()
- Remvoe helper ovl_create_real_dir()
- Add Vivek's ovl_get_inode() cleanup patch
- Use new iget5_prealloc()
- Remove cc: stable because this is not a critical fix

Changes since v2:
- Don't use d_instantiate_new() helper
- Explain inode hash race (Vivek)
- Relax WARN_ON() to pass overlay/019 test
- Create hepler ovl_create_real_dir()
- pr_warn() when real mkdir succeeds and lookup fails

Changes since v1:
- Rebase on top of Al's fixes branch
- Use d_instantiate_new() helper
- Add re-factoring + Al's vfs_mkdir() safely patch

Amir Goldstein (7):
  ovl: remove WARN_ON() real inode attributes mismatch
  ovl: strip debug argument from ovl_do_ helpers
  ovl: struct cattr cleanups
  ovl: create helper ovl_create_temp()
  ovl: make ovl_create_real() cope with vfs_mkdir() safely
  vfs: export alloc_inode() and destroy_inode()
  ovl: use iget5_prealloc() to hash a newly created inode

Miklos Szeredi (1):
  vfs: factor out iget5_prealloc()

Vivek Goyal (1):
  ovl: Pass argument to ovl_get_inode() in a structure

 fs/inode.c               | 133 +++++++++++++++++-------------
 fs/overlayfs/copy_up.c   |  34 +++-----
 fs/overlayfs/dir.c       | 209 +++++++++++++++++++++++++++++------------------
 fs/overlayfs/export.c    |   8 +-
 fs/overlayfs/inode.c     |  28 +++++--
 fs/overlayfs/namei.c     |  10 ++-
 fs/overlayfs/overlayfs.h |  64 ++++++++-------
 fs/overlayfs/super.c     |   9 +-
 include/linux/fs.h       |  14 +++-
 9 files changed, 299 insertions(+), 210 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v4 1/9] ovl: remove WARN_ON() real inode attributes mismatch
  2018-05-18  8:29 [PATCH v4 0/9] Overlayfs create object related fixes Amir Goldstein
@ 2018-05-18  8:29 ` Amir Goldstein
  2018-05-18  8:29 ` [PATCH v4 2/9] ovl: strip debug argument from ovl_do_ helpers Amir Goldstein
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Amir Goldstein @ 2018-05-18  8:29 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, Vivek Goyal, linux-unionfs

Overlayfs should cope with online changes to underlying layer
without crashing the kernel, which is what xfstest overlay/019
checks.

This test may sometimes trigger WARN_ON() in ovl_create_or_link()
when linking an overlay inode that has been changed on underlying
layer.

Remove those WARN_ON() to prevent the stress test from failing.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/dir.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 47dc980e8b33..33387c4887b9 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -510,13 +510,6 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
 	}
 out_revert_creds:
 	revert_creds(old_cred);
-	if (!err) {
-		struct inode *realinode = d_inode(ovl_dentry_upper(dentry));
-
-		WARN_ON(inode->i_mode != realinode->i_mode);
-		WARN_ON(!uid_eq(inode->i_uid, realinode->i_uid));
-		WARN_ON(!gid_eq(inode->i_gid, realinode->i_gid));
-	}
 	return err;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 2/9] ovl: strip debug argument from ovl_do_ helpers
  2018-05-18  8:29 [PATCH v4 0/9] Overlayfs create object related fixes Amir Goldstein
  2018-05-18  8:29 ` [PATCH v4 1/9] ovl: remove WARN_ON() real inode attributes mismatch Amir Goldstein
@ 2018-05-18  8:29 ` Amir Goldstein
  2018-05-18  8:29 ` [PATCH v4 3/9] ovl: struct cattr cleanups Amir Goldstein
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Amir Goldstein @ 2018-05-18  8:29 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, Vivek Goyal, linux-unionfs

It did not prove to be useful

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c   |  9 ++++-----
 fs/overlayfs/dir.c       | 20 ++++++++++----------
 fs/overlayfs/overlayfs.h | 40 ++++++++++++++++------------------------
 fs/overlayfs/super.c     |  2 +-
 4 files changed, 31 insertions(+), 40 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 8bede0742619..a8273dec0fb8 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -369,7 +369,7 @@ static int ovl_create_index(struct dentry *dentry, struct dentry *origin,
 	if (IS_ERR(temp))
 		goto temp_err;
 
-	err = ovl_do_mkdir(dir, temp, S_IFDIR, true);
+	err = ovl_do_mkdir(dir, temp, S_IFDIR);
 	if (err)
 		goto out;
 
@@ -439,8 +439,7 @@ static int ovl_link_up(struct ovl_copy_up_ctx *c)
 			       c->dentry->d_name.len);
 	err = PTR_ERR(upper);
 	if (!IS_ERR(upper)) {
-		err = ovl_do_link(ovl_dentry_upper(c->dentry), udir, upper,
-				  true);
+		err = ovl_do_link(ovl_dentry_upper(c->dentry), udir, upper);
 		dput(upper);
 
 		if (!err) {
@@ -470,7 +469,7 @@ static int ovl_install_temp(struct ovl_copy_up_ctx *c, struct dentry *temp,
 		return PTR_ERR(upper);
 
 	if (c->tmpfile)
-		err = ovl_do_link(temp, udir, upper, true);
+		err = ovl_do_link(temp, udir, upper);
 	else
 		err = ovl_do_rename(d_inode(c->workdir), temp, udir, upper, 0);
 
@@ -511,7 +510,7 @@ static int ovl_get_tmpfile(struct ovl_copy_up_ctx *c, struct dentry **tempp)
 			goto temp_err;
 
 		err = ovl_create_real(d_inode(c->workdir), temp, &cattr,
-				      NULL, true);
+				      NULL);
 		if (err) {
 			dput(temp);
 			goto out;
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 33387c4887b9..2760a15cffe9 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -115,7 +115,7 @@ int ovl_cleanup_and_whiteout(struct dentry *workdir, struct inode *dir,
 }
 
 int ovl_create_real(struct inode *dir, struct dentry *newdentry,
-		    struct cattr *attr, struct dentry *hardlink, bool debug)
+		    struct cattr *attr, struct dentry *hardlink)
 {
 	int err;
 
@@ -123,27 +123,27 @@ int ovl_create_real(struct inode *dir, struct dentry *newdentry,
 		return -ESTALE;
 
 	if (hardlink) {
-		err = ovl_do_link(hardlink, dir, newdentry, debug);
+		err = ovl_do_link(hardlink, dir, newdentry);
 	} else {
 		switch (attr->mode & S_IFMT) {
 		case S_IFREG:
-			err = ovl_do_create(dir, newdentry, attr->mode, debug);
+			err = ovl_do_create(dir, newdentry, attr->mode);
 			break;
 
 		case S_IFDIR:
-			err = ovl_do_mkdir(dir, newdentry, attr->mode, debug);
+			err = ovl_do_mkdir(dir, newdentry, attr->mode);
 			break;
 
 		case S_IFCHR:
 		case S_IFBLK:
 		case S_IFIFO:
 		case S_IFSOCK:
-			err = ovl_do_mknod(dir, newdentry,
-					   attr->mode, attr->rdev, debug);
+			err = ovl_do_mknod(dir, newdentry, attr->mode,
+					   attr->rdev);
 			break;
 
 		case S_IFLNK:
-			err = ovl_do_symlink(dir, newdentry, attr->link, debug);
+			err = ovl_do_symlink(dir, newdentry, attr->link);
 			break;
 
 		default:
@@ -229,7 +229,7 @@ static int ovl_create_upper(struct dentry *dentry, struct inode *inode,
 	err = PTR_ERR(newdentry);
 	if (IS_ERR(newdentry))
 		goto out_unlock;
-	err = ovl_create_real(udir, newdentry, attr, hardlink, false);
+	err = ovl_create_real(udir, newdentry, attr, hardlink);
 	if (err)
 		goto out_dput;
 
@@ -286,7 +286,7 @@ static struct dentry *ovl_clear_empty(struct dentry *dentry,
 		goto out_unlock;
 
 	err = ovl_create_real(wdir, opaquedir,
-			      &(struct cattr){.mode = stat.mode}, NULL, true);
+			      &(struct cattr){.mode = stat.mode}, NULL);
 	if (err)
 		goto out_dput;
 
@@ -391,7 +391,7 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
 	if (IS_ERR(upper))
 		goto out_dput;
 
-	err = ovl_create_real(wdir, newdentry, cattr, hardlink, true);
+	err = ovl_create_real(wdir, newdentry, cattr, hardlink);
 	if (err)
 		goto out_dput2;
 
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index caaa47cea2aa..9f3f551ba19e 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -87,6 +87,7 @@ struct ovl_fh {
 static inline int ovl_do_rmdir(struct inode *dir, struct dentry *dentry)
 {
 	int err = vfs_rmdir(dir, dentry);
+
 	pr_debug("rmdir(%pd2) = %i\n", dentry, err);
 	return err;
 }
@@ -94,56 +95,50 @@ static inline int ovl_do_rmdir(struct inode *dir, struct dentry *dentry)
 static inline int ovl_do_unlink(struct inode *dir, struct dentry *dentry)
 {
 	int err = vfs_unlink(dir, dentry, NULL);
+
 	pr_debug("unlink(%pd2) = %i\n", dentry, err);
 	return err;
 }
 
 static inline int ovl_do_link(struct dentry *old_dentry, struct inode *dir,
-			      struct dentry *new_dentry, bool debug)
+			      struct dentry *new_dentry)
 {
 	int err = vfs_link(old_dentry, dir, new_dentry, NULL);
-	if (debug) {
-		pr_debug("link(%pd2, %pd2) = %i\n",
-			 old_dentry, new_dentry, err);
-	}
+
+	pr_debug("link(%pd2, %pd2) = %i\n", old_dentry, new_dentry, err);
 	return err;
 }
 
 static inline int ovl_do_create(struct inode *dir, struct dentry *dentry,
-			     umode_t mode, bool debug)
+				umode_t mode)
 {
 	int err = vfs_create(dir, dentry, mode, true);
-	if (debug)
-		pr_debug("create(%pd2, 0%o) = %i\n", dentry, mode, err);
+
+	pr_debug("create(%pd2, 0%o) = %i\n", dentry, mode, err);
 	return err;
 }
 
 static inline int ovl_do_mkdir(struct inode *dir, struct dentry *dentry,
-			       umode_t mode, bool debug)
+			       umode_t mode)
 {
 	int err = vfs_mkdir(dir, dentry, mode);
-	if (debug)
-		pr_debug("mkdir(%pd2, 0%o) = %i\n", dentry, mode, err);
+	pr_debug("mkdir(%pd2, 0%o) = %i\n", dentry, mode, err);
 	return err;
 }
 
 static inline int ovl_do_mknod(struct inode *dir, struct dentry *dentry,
-			       umode_t mode, dev_t dev, bool debug)
+			       umode_t mode, dev_t dev)
 {
 	int err = vfs_mknod(dir, dentry, mode, dev);
-	if (debug) {
-		pr_debug("mknod(%pd2, 0%o, 0%o) = %i\n",
-			 dentry, mode, dev, err);
-	}
+	pr_debug("mknod(%pd2, 0%o, 0%o) = %i\n", dentry, mode, dev, err);
 	return err;
 }
 
 static inline int ovl_do_symlink(struct inode *dir, struct dentry *dentry,
-				 const char *oldname, bool debug)
+				 const char *oldname)
 {
 	int err = vfs_symlink(dir, dentry, oldname);
-	if (debug)
-		pr_debug("symlink(\"%s\", %pd2) = %i\n", oldname, dentry, err);
+	pr_debug("symlink(\"%s\", %pd2) = %i\n", oldname, dentry, err);
 	return err;
 }
 
@@ -169,11 +164,8 @@ static inline int ovl_do_rename(struct inode *olddir, struct dentry *olddentry,
 {
 	int err;
 
-	pr_debug("rename(%pd2, %pd2, 0x%x)\n",
-		 olddentry, newdentry, flags);
-
+	pr_debug("rename(%pd2, %pd2, 0x%x)\n", olddentry, newdentry, flags);
 	err = vfs_rename(olddir, olddentry, newdir, newdentry, NULL, flags);
-
 	if (err) {
 		pr_debug("...rename(%pd2, %pd2, ...) = %i\n",
 			 olddentry, newdentry, err);
@@ -378,7 +370,7 @@ struct cattr {
 };
 int ovl_create_real(struct inode *dir, struct dentry *newdentry,
 		    struct cattr *attr,
-		    struct dentry *hardlink, bool debug);
+		    struct dentry *hardlink);
 int ovl_cleanup(struct inode *dir, struct dentry *dentry);
 
 /* file.c */
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 492d534058ae..28375009f9ee 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -605,7 +605,7 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs,
 
 		err = ovl_create_real(dir, work,
 				      &(struct cattr){.mode = S_IFDIR | 0},
-				      NULL, true);
+				      NULL);
 		if (err)
 			goto out_dput;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 3/9] ovl: struct cattr cleanups
  2018-05-18  8:29 [PATCH v4 0/9] Overlayfs create object related fixes Amir Goldstein
  2018-05-18  8:29 ` [PATCH v4 1/9] ovl: remove WARN_ON() real inode attributes mismatch Amir Goldstein
  2018-05-18  8:29 ` [PATCH v4 2/9] ovl: strip debug argument from ovl_do_ helpers Amir Goldstein
@ 2018-05-18  8:29 ` Amir Goldstein
  2018-05-18  8:29 ` [PATCH v4 4/9] ovl: create helper ovl_create_temp() Amir Goldstein
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Amir Goldstein @ 2018-05-18  8:29 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, Vivek Goyal, linux-unionfs

- Rename to ovl_cattr
- Fold ovl_create_real() hardlink argument into struct ovl_cattr
- Create macro OVL_CATTR() to initialize struct ovl_cattr from mode

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c   |  5 ++---
 fs/overlayfs/dir.c       | 45 +++++++++++++++++++++------------------------
 fs/overlayfs/overlayfs.h |  9 ++++++---
 fs/overlayfs/super.c     |  4 +---
 4 files changed, 30 insertions(+), 33 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index a8273dec0fb8..5f4c78b1bbeb 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -486,7 +486,7 @@ static int ovl_get_tmpfile(struct ovl_copy_up_ctx *c, struct dentry **tempp)
 	struct dentry *temp;
 	const struct cred *old_creds = NULL;
 	struct cred *new_creds = NULL;
-	struct cattr cattr = {
+	struct ovl_cattr cattr = {
 		/* Can't properly set mode on creation because of the umask */
 		.mode = c->stat.mode & S_IFMT,
 		.rdev = c->stat.rdev,
@@ -509,8 +509,7 @@ static int ovl_get_tmpfile(struct ovl_copy_up_ctx *c, struct dentry **tempp)
 		if (IS_ERR(temp))
 			goto temp_err;
 
-		err = ovl_create_real(d_inode(c->workdir), temp, &cattr,
-				      NULL);
+		err = ovl_create_real(d_inode(c->workdir), temp, &cattr);
 		if (err) {
 			dput(temp);
 			goto out;
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 2760a15cffe9..4b172faa3d29 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -115,15 +115,15 @@ int ovl_cleanup_and_whiteout(struct dentry *workdir, struct inode *dir,
 }
 
 int ovl_create_real(struct inode *dir, struct dentry *newdentry,
-		    struct cattr *attr, struct dentry *hardlink)
+		    struct ovl_cattr *attr)
 {
 	int err;
 
 	if (newdentry->d_inode)
 		return -ESTALE;
 
-	if (hardlink) {
-		err = ovl_do_link(hardlink, dir, newdentry);
+	if (attr->hardlink) {
+		err = ovl_do_link(attr->hardlink, dir, newdentry);
 	} else {
 		switch (attr->mode & S_IFMT) {
 		case S_IFREG:
@@ -213,14 +213,14 @@ static bool ovl_type_origin(struct dentry *dentry)
 }
 
 static int ovl_create_upper(struct dentry *dentry, struct inode *inode,
-			    struct cattr *attr, struct dentry *hardlink)
+			    struct ovl_cattr *attr)
 {
 	struct dentry *upperdir = ovl_dentry_upper(dentry->d_parent);
 	struct inode *udir = upperdir->d_inode;
 	struct dentry *newdentry;
 	int err;
 
-	if (!hardlink && !IS_POSIXACL(udir))
+	if (!attr->hardlink && !IS_POSIXACL(udir))
 		attr->mode &= ~current_umask();
 
 	inode_lock_nested(udir, I_MUTEX_PARENT);
@@ -229,7 +229,7 @@ static int ovl_create_upper(struct dentry *dentry, struct inode *inode,
 	err = PTR_ERR(newdentry);
 	if (IS_ERR(newdentry))
 		goto out_unlock;
-	err = ovl_create_real(udir, newdentry, attr, hardlink);
+	err = ovl_create_real(udir, newdentry, attr);
 	if (err)
 		goto out_dput;
 
@@ -238,7 +238,7 @@ static int ovl_create_upper(struct dentry *dentry, struct inode *inode,
 		ovl_set_opaque(dentry, newdentry);
 	}
 
-	ovl_instantiate(dentry, inode, newdentry, !!hardlink);
+	ovl_instantiate(dentry, inode, newdentry, !!attr->hardlink);
 	newdentry = NULL;
 out_dput:
 	dput(newdentry);
@@ -285,8 +285,7 @@ static struct dentry *ovl_clear_empty(struct dentry *dentry,
 	if (IS_ERR(opaquedir))
 		goto out_unlock;
 
-	err = ovl_create_real(wdir, opaquedir,
-			      &(struct cattr){.mode = stat.mode}, NULL);
+	err = ovl_create_real(wdir, opaquedir, OVL_CATTR(stat.mode));
 	if (err)
 		goto out_dput;
 
@@ -354,8 +353,7 @@ static int ovl_set_upper_acl(struct dentry *upperdentry, const char *name,
 }
 
 static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
-				    struct cattr *cattr,
-				    struct dentry *hardlink)
+				    struct ovl_cattr *cattr)
 {
 	struct dentry *workdir = ovl_workdir(dentry);
 	struct inode *wdir = workdir->d_inode;
@@ -365,6 +363,7 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
 	struct dentry *newdentry;
 	int err;
 	struct posix_acl *acl, *default_acl;
+	bool hardlink = !!cattr->hardlink;
 
 	if (WARN_ON(!workdir))
 		return -EROFS;
@@ -391,7 +390,7 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
 	if (IS_ERR(upper))
 		goto out_dput;
 
-	err = ovl_create_real(wdir, newdentry, cattr, hardlink);
+	err = ovl_create_real(wdir, newdentry, cattr);
 	if (err)
 		goto out_dput2;
 
@@ -439,7 +438,7 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
 		if (err)
 			goto out_cleanup;
 	}
-	ovl_instantiate(dentry, inode, newdentry, !!hardlink);
+	ovl_instantiate(dentry, inode, newdentry, hardlink);
 	newdentry = NULL;
 out_dput2:
 	dput(upper);
@@ -460,8 +459,7 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
 }
 
 static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
-			      struct cattr *attr, struct dentry *hardlink,
-			      bool origin)
+			      struct ovl_cattr *attr, bool origin)
 {
 	int err;
 	const struct cred *old_cred;
@@ -489,7 +487,7 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
 	if (override_cred) {
 		override_cred->fsuid = inode->i_uid;
 		override_cred->fsgid = inode->i_gid;
-		if (!hardlink) {
+		if (!attr->hardlink) {
 			err = security_dentry_create_files_as(dentry,
 					attr->mode, &dentry->d_name, old_cred,
 					override_cred);
@@ -502,11 +500,9 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
 		put_cred(override_cred);
 
 		if (!ovl_dentry_is_whiteout(dentry))
-			err = ovl_create_upper(dentry, inode, attr,
-						hardlink);
+			err = ovl_create_upper(dentry, inode, attr);
 		else
-			err = ovl_create_over_whiteout(dentry, inode, attr,
-							hardlink);
+			err = ovl_create_over_whiteout(dentry, inode, attr);
 	}
 out_revert_creds:
 	revert_creds(old_cred);
@@ -518,7 +514,7 @@ static int ovl_create_object(struct dentry *dentry, int mode, dev_t rdev,
 {
 	int err;
 	struct inode *inode;
-	struct cattr attr = {
+	struct ovl_cattr attr = {
 		.rdev = rdev,
 		.link = link,
 	};
@@ -535,7 +531,7 @@ static int ovl_create_object(struct dentry *dentry, int mode, dev_t rdev,
 	inode_init_owner(inode, dentry->d_parent->d_inode, mode);
 	attr.mode = inode->i_mode;
 
-	err = ovl_create_or_link(dentry, inode, &attr, NULL, false);
+	err = ovl_create_or_link(dentry, inode, &attr, false);
 	if (err)
 		iput(inode);
 
@@ -594,8 +590,9 @@ static int ovl_link(struct dentry *old, struct inode *newdir,
 	inode = d_inode(old);
 	ihold(inode);
 
-	err = ovl_create_or_link(new, inode, NULL, ovl_dentry_upper(old),
-				 ovl_type_origin(old));
+	err = ovl_create_or_link(new, inode,
+			&(struct ovl_cattr) {.hardlink = ovl_dentry_upper(old)},
+			ovl_type_origin(old));
 	if (err)
 		iput(inode);
 
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 9f3f551ba19e..fafaf9b255c5 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -363,14 +363,17 @@ extern const struct inode_operations ovl_dir_inode_operations;
 struct dentry *ovl_lookup_temp(struct dentry *workdir);
 int ovl_cleanup_and_whiteout(struct dentry *workdir, struct inode *dir,
 			     struct dentry *dentry);
-struct cattr {
+struct ovl_cattr {
 	dev_t rdev;
 	umode_t mode;
 	const char *link;
+	struct dentry *hardlink;
 };
+
+#define OVL_CATTR(m) (&(struct ovl_cattr) { .mode = (m) })
+
 int ovl_create_real(struct inode *dir, struct dentry *newdentry,
-		    struct cattr *attr,
-		    struct dentry *hardlink);
+		    struct ovl_cattr *attr);
 int ovl_cleanup(struct inode *dir, struct dentry *dentry);
 
 /* file.c */
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 28375009f9ee..ace95319bf1d 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -603,9 +603,7 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs,
 			goto retry;
 		}
 
-		err = ovl_create_real(dir, work,
-				      &(struct cattr){.mode = S_IFDIR | 0},
-				      NULL);
+		err = ovl_create_real(dir, work, OVL_CATTR(attr.ia_mode));
 		if (err)
 			goto out_dput;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 4/9] ovl: create helper ovl_create_temp()
  2018-05-18  8:29 [PATCH v4 0/9] Overlayfs create object related fixes Amir Goldstein
                   ` (2 preceding siblings ...)
  2018-05-18  8:29 ` [PATCH v4 3/9] ovl: struct cattr cleanups Amir Goldstein
@ 2018-05-18  8:29 ` Amir Goldstein
  2018-05-18  8:29 ` [PATCH v4 5/9] ovl: make ovl_create_real() cope with vfs_mkdir() safely Amir Goldstein
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Amir Goldstein @ 2018-05-18  8:29 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, Vivek Goyal, linux-unionfs

Al Viro suggested to simplify callers of ovl_create_real() by
returning the created dentry (or ERR_PTR) from ovl_create_real().
This prep patch makes Al's patch change less callers.

Also used ovl_create_temp() in ovl_create_index() instead of calling
ovl_do_mkdir() directly, so now all callers of ovl_do_mkdir() are routed
through ovl_create_real(), which paves the way for Al's fix for non-hashed
result from vfs_mkdir().

Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c   | 26 +++++++-------------------
 fs/overlayfs/dir.c       | 44 +++++++++++++++++++++++++++-----------------
 fs/overlayfs/overlayfs.h |  3 ++-
 3 files changed, 36 insertions(+), 37 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 5f4c78b1bbeb..f3024060be42 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -365,14 +365,10 @@ static int ovl_create_index(struct dentry *dentry, struct dentry *origin,
 	if (err)
 		return err;
 
-	temp = ovl_lookup_temp(indexdir);
+	temp = ovl_create_temp(indexdir, OVL_CATTR(S_IFDIR | 0));
 	if (IS_ERR(temp))
 		goto temp_err;
 
-	err = ovl_do_mkdir(dir, temp, S_IFDIR);
-	if (err)
-		goto out;
-
 	err = ovl_set_upper_fh(upper, temp);
 	if (err)
 		goto out_cleanup;
@@ -500,21 +496,13 @@ static int ovl_get_tmpfile(struct ovl_copy_up_ctx *c, struct dentry **tempp)
 	if (new_creds)
 		old_creds = override_creds(new_creds);
 
-	if (c->tmpfile) {
+	if (c->tmpfile)
 		temp = ovl_do_tmpfile(c->workdir, c->stat.mode);
-		if (IS_ERR(temp))
-			goto temp_err;
-	} else {
-		temp = ovl_lookup_temp(c->workdir);
-		if (IS_ERR(temp))
-			goto temp_err;
-
-		err = ovl_create_real(d_inode(c->workdir), temp, &cattr);
-		if (err) {
-			dput(temp);
-			goto out;
-		}
-	}
+	else
+		temp = ovl_create_temp(c->workdir, &cattr);
+	if (IS_ERR(temp))
+		goto temp_err;
+
 	err = 0;
 	*tempp = temp;
 out:
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 4b172faa3d29..41a8940964e3 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -160,6 +160,25 @@ int ovl_create_real(struct inode *dir, struct dentry *newdentry,
 	return err;
 }
 
+struct dentry *ovl_create_temp(struct dentry *workdir, struct ovl_cattr *attr)
+{
+	struct inode *wdir = workdir->d_inode;
+	struct dentry *temp;
+	int err;
+
+	temp = ovl_lookup_temp(workdir);
+	if (IS_ERR(temp))
+		return temp;
+
+	err = ovl_create_real(wdir, temp, attr);
+	if (err) {
+		dput(temp);
+		return ERR_PTR(err);
+	}
+
+	return temp;
+}
+
 static int ovl_set_opaque_xerr(struct dentry *dentry, struct dentry *upper,
 			       int xerr)
 {
@@ -280,15 +299,11 @@ static struct dentry *ovl_clear_empty(struct dentry *dentry,
 	if (upper->d_parent->d_inode != udir)
 		goto out_unlock;
 
-	opaquedir = ovl_lookup_temp(workdir);
+	opaquedir = ovl_create_temp(workdir, OVL_CATTR(stat.mode));
 	err = PTR_ERR(opaquedir);
 	if (IS_ERR(opaquedir))
 		goto out_unlock;
 
-	err = ovl_create_real(wdir, opaquedir, OVL_CATTR(stat.mode));
-	if (err)
-		goto out_dput;
-
 	err = ovl_copy_xattr(upper, opaquedir);
 	if (err)
 		goto out_cleanup;
@@ -318,7 +333,6 @@ static struct dentry *ovl_clear_empty(struct dentry *dentry,
 
 out_cleanup:
 	ovl_cleanup(wdir, opaquedir);
-out_dput:
 	dput(opaquedir);
 out_unlock:
 	unlock_rename(workdir, upperdir);
@@ -379,20 +393,16 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
 	if (err)
 		goto out;
 
-	newdentry = ovl_lookup_temp(workdir);
-	err = PTR_ERR(newdentry);
-	if (IS_ERR(newdentry))
-		goto out_unlock;
-
 	upper = lookup_one_len(dentry->d_name.name, upperdir,
 			       dentry->d_name.len);
 	err = PTR_ERR(upper);
 	if (IS_ERR(upper))
-		goto out_dput;
+		goto out_unlock;
 
-	err = ovl_create_real(wdir, newdentry, cattr);
-	if (err)
-		goto out_dput2;
+	newdentry = ovl_create_temp(workdir, cattr);
+	err = PTR_ERR(newdentry);
+	if (IS_ERR(newdentry))
+		goto out_dput;
 
 	/*
 	 * mode could have been mutilated due to umask (e.g. sgid directory)
@@ -441,9 +451,9 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
 	ovl_instantiate(dentry, inode, newdentry, hardlink);
 	newdentry = NULL;
 out_dput2:
-	dput(upper);
-out_dput:
 	dput(newdentry);
+out_dput:
+	dput(upper);
 out_unlock:
 	unlock_rename(workdir, upperdir);
 out:
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index fafaf9b255c5..e72eec6d440f 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -360,6 +360,7 @@ static inline void ovl_copyflags(struct inode *from, struct inode *to)
 
 /* dir.c */
 extern const struct inode_operations ovl_dir_inode_operations;
+int ovl_cleanup(struct inode *dir, struct dentry *dentry);
 struct dentry *ovl_lookup_temp(struct dentry *workdir);
 int ovl_cleanup_and_whiteout(struct dentry *workdir, struct inode *dir,
 			     struct dentry *dentry);
@@ -374,7 +375,7 @@ struct ovl_cattr {
 
 int ovl_create_real(struct inode *dir, struct dentry *newdentry,
 		    struct ovl_cattr *attr);
-int ovl_cleanup(struct inode *dir, struct dentry *dentry);
+struct dentry *ovl_create_temp(struct dentry *workdir, struct ovl_cattr *attr);
 
 /* file.c */
 extern const struct file_operations ovl_file_operations;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 5/9] ovl: make ovl_create_real() cope with vfs_mkdir() safely
  2018-05-18  8:29 [PATCH v4 0/9] Overlayfs create object related fixes Amir Goldstein
                   ` (3 preceding siblings ...)
  2018-05-18  8:29 ` [PATCH v4 4/9] ovl: create helper ovl_create_temp() Amir Goldstein
@ 2018-05-18  8:29 ` Amir Goldstein
  2018-05-18  8:29 ` [PATCH v4 6/9] vfs: factor out iget5_prealloc() Amir Goldstein
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Amir Goldstein @ 2018-05-18  8:29 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, Vivek Goyal, linux-unionfs

vfs_mkdir() may succeed and leave the dentry passed to it unhashed and
negative.  ovl_create_real() is the last caller breaking when that
happens.

Make ovl_create_real() return the dentry to be used or ERR_PTR(-E...)
in case of error; usually that'll be the dentry passed to it, but
it might be a different one - result of lookup in case of vfs_mkdir()
leaving looking the inode, etc. up to the next lookup to come.

In that case (as well as in case of an error), original dentry is
dropped.  That simplifies the callers.  Moreover, passing ERR_PTR()
as dentry leads to immediate return of the same ERR_PTR().  That
simplifies the callers even more.

[amir: split re-factoring of ovl_create_temp() to prep patch
       add comment about unhashed dir after mkdir
       add pr_warn() if mkdir succeeds and lookup fails]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/dir.c       | 70 +++++++++++++++++++++++++++++-------------------
 fs/overlayfs/overlayfs.h |  4 +--
 fs/overlayfs/super.c     |  7 ++---
 3 files changed, 48 insertions(+), 33 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 41a8940964e3..6f7400be3fc8 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -114,13 +114,18 @@ int ovl_cleanup_and_whiteout(struct dentry *workdir, struct inode *dir,
 	goto out;
 }
 
-int ovl_create_real(struct inode *dir, struct dentry *newdentry,
-		    struct ovl_cattr *attr)
+struct dentry *ovl_create_real(struct inode *dir, struct dentry *newdentry,
+			       struct ovl_cattr *attr)
 {
 	int err;
 
-	if (newdentry->d_inode)
-		return -ESTALE;
+	if (IS_ERR(newdentry))
+		return newdentry;
+
+	if (newdentry->d_inode) {
+		dput(newdentry);
+		return ERR_PTR(-ESTALE);
+	}
 
 	if (attr->hardlink) {
 		err = ovl_do_link(attr->hardlink, dir, newdentry);
@@ -132,6 +137,26 @@ int ovl_create_real(struct inode *dir, struct dentry *newdentry,
 
 		case S_IFDIR:
 			err = ovl_do_mkdir(dir, newdentry, attr->mode);
+			/*
+			 * vfs_mkdir() may succeed and leave the dentry passed
+			 * to it unhashed and negative. If that happens, try to
+			 * lookup a new hashed and positive dentry.
+			 */
+			if (!err && unlikely(d_unhashed(newdentry))) {
+				struct dentry *d;
+
+				d = lookup_one_len(newdentry->d_name.name,
+						   newdentry->d_parent,
+						   newdentry->d_name.len);
+				dput(newdentry);
+				if (IS_ERR(d)) {
+					err = PTR_ERR(d);
+					pr_warn("overlayfs: failed lookup after mkdir (%pd2, err=%i).\n",
+						newdentry, err);
+					return d;
+				}
+				newdentry = d;
+			}
 			break;
 
 		case S_IFCHR:
@@ -157,26 +182,17 @@ int ovl_create_real(struct inode *dir, struct dentry *newdentry,
 		 */
 		err = -ENOENT;
 	}
-	return err;
-}
-
-struct dentry *ovl_create_temp(struct dentry *workdir, struct ovl_cattr *attr)
-{
-	struct inode *wdir = workdir->d_inode;
-	struct dentry *temp;
-	int err;
-
-	temp = ovl_lookup_temp(workdir);
-	if (IS_ERR(temp))
-		return temp;
-
-	err = ovl_create_real(wdir, temp, attr);
 	if (err) {
-		dput(temp);
+		dput(newdentry);
 		return ERR_PTR(err);
 	}
+	return newdentry;
+}
 
-	return temp;
+struct dentry *ovl_create_temp(struct dentry *workdir, struct ovl_cattr *attr)
+{
+	return ovl_create_real(d_inode(workdir), ovl_lookup_temp(workdir),
+			       attr);
 }
 
 static int ovl_set_opaque_xerr(struct dentry *dentry, struct dentry *upper,
@@ -243,14 +259,14 @@ static int ovl_create_upper(struct dentry *dentry, struct inode *inode,
 		attr->mode &= ~current_umask();
 
 	inode_lock_nested(udir, I_MUTEX_PARENT);
-	newdentry = lookup_one_len(dentry->d_name.name, upperdir,
-				   dentry->d_name.len);
+	newdentry = ovl_create_real(udir,
+				    lookup_one_len(dentry->d_name.name,
+						   upperdir,
+						   dentry->d_name.len),
+				    attr);
 	err = PTR_ERR(newdentry);
 	if (IS_ERR(newdentry))
 		goto out_unlock;
-	err = ovl_create_real(udir, newdentry, attr);
-	if (err)
-		goto out_dput;
 
 	if (ovl_type_merge(dentry->d_parent) && d_is_dir(newdentry)) {
 		/* Setting opaque here is just an optimization, allow to fail */
@@ -258,9 +274,7 @@ static int ovl_create_upper(struct dentry *dentry, struct inode *inode,
 	}
 
 	ovl_instantiate(dentry, inode, newdentry, !!attr->hardlink);
-	newdentry = NULL;
-out_dput:
-	dput(newdentry);
+	err = 0;
 out_unlock:
 	inode_unlock(udir);
 	return err;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index e72eec6d440f..4eb04da240b4 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -373,8 +373,8 @@ struct ovl_cattr {
 
 #define OVL_CATTR(m) (&(struct ovl_cattr) { .mode = (m) })
 
-int ovl_create_real(struct inode *dir, struct dentry *newdentry,
-		    struct ovl_cattr *attr);
+struct dentry *ovl_create_real(struct inode *dir, struct dentry *newdentry,
+			       struct ovl_cattr *attr);
 struct dentry *ovl_create_temp(struct dentry *workdir, struct ovl_cattr *attr);
 
 /* file.c */
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index ace95319bf1d..cd5c82f105d6 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -603,9 +603,10 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs,
 			goto retry;
 		}
 
-		err = ovl_create_real(dir, work, OVL_CATTR(attr.ia_mode));
-		if (err)
-			goto out_dput;
+		work = ovl_create_real(dir, work, OVL_CATTR(attr.ia_mode));
+		err = PTR_ERR(work);
+		if (IS_ERR(work))
+			goto out_err;
 
 		/*
 		 * Try to remove POSIX ACL xattrs from workdir.  We are good if:
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 6/9] vfs: factor out iget5_prealloc()
  2018-05-18  8:29 [PATCH v4 0/9] Overlayfs create object related fixes Amir Goldstein
                   ` (4 preceding siblings ...)
  2018-05-18  8:29 ` [PATCH v4 5/9] ovl: make ovl_create_real() cope with vfs_mkdir() safely Amir Goldstein
@ 2018-05-18  8:29 ` Amir Goldstein
  2018-05-18  8:29 ` [PATCH v4 7/9] vfs: export alloc_inode() and destroy_inode() Amir Goldstein
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Amir Goldstein @ 2018-05-18  8:29 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, Vivek Goyal, linux-unionfs

From: Miklos Szeredi <miklos@szeredi.hu>

Split a helper out of iget5_locked() that takes a preallocated
inode.  This is needed by overlayfs, but also makes iget5_locked()
more readable.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/inode.c         | 127 ++++++++++++++++++++++++++++++-----------------------
 include/linux/fs.h |  12 ++++-
 2 files changed, 82 insertions(+), 57 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 9818c0f48cfa..6bb3950ae896 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1003,6 +1003,71 @@ void unlock_two_nondirectories(struct inode *inode1, struct inode *inode2)
 EXPORT_SYMBOL(unlock_two_nondirectories);
 
 /**
+ * iget5_prealloc - obtain an inode from a mounted file system
+ * @inode:	pre-allocated inode to use for insert to cache
+ * @hashval:	hash value (usually inode number) to get
+ * @test:	callback used for comparisons between inodes
+ * @set:	callback used to initialize a new struct inode
+ * @data:	opaque data pointer to pass to @test and @set
+ *
+ * Search for the inode specified by @hashval and @data in the inode cache,
+ * and if present it is return it with an increased reference count. This is
+ * a variant of iget5_locked() for callers that don't want to fail on memory
+ * allocation of inode.
+ *
+ * If the inode is not in cache, insert the pre-allocated inode to cache and
+ * return it locked, hashed, and with the I_NEW flag set. The file system gets
+ * to fill it in before unlocking it via unlock_new_inode().
+ *
+ * Note both @test and @set are called with the inode_hash_lock held, so can't
+ * sleep.
+ */
+struct inode *iget5_prealloc(struct inode *inode, unsigned long hashval,
+			     int (*test)(struct inode *, void *),
+			     int (*set)(struct inode *, void *), void *data)
+{
+	struct hlist_head *head = inode_hashtable + hash(inode->i_sb, hashval);
+	struct inode *old;
+
+again:
+	spin_lock(&inode_hash_lock);
+	old = find_inode(inode->i_sb, head, test, data);
+	if (unlikely(old)) {
+		/*
+		 * Uhhuh, somebody else created the same inode under us.
+		 * Use the old inode instead of the preallocated one.
+		 */
+		spin_unlock(&inode_hash_lock);
+		wait_on_inode(old);
+		if (unlikely(inode_unhashed(old))) {
+			iput(old);
+			goto again;
+		}
+		return old;
+	}
+
+	if (unlikely(set(inode, data))) {
+		inode = NULL;
+		goto unlock;
+	}
+
+	/*
+	 * Return the locked inode with I_NEW set, the
+	 * caller is responsible for filling in the contents
+	 */
+	spin_lock(&inode->i_lock);
+	inode->i_state = I_NEW;
+	hlist_add_head(&inode->i_hash, head);
+	spin_unlock(&inode->i_lock);
+	inode_sb_list_add(inode);
+unlock:
+	spin_unlock(&inode_hash_lock);
+
+	return inode;
+}
+EXPORT_SYMBOL(iget5_prealloc);
+
+/**
  * iget5_locked - obtain an inode from a mounted file system
  * @sb:		super block of file system
  * @hashval:	hash value (usually inode number) to get
@@ -1026,66 +1091,18 @@ struct inode *iget5_locked(struct super_block *sb, unsigned long hashval,
 		int (*test)(struct inode *, void *),
 		int (*set)(struct inode *, void *), void *data)
 {
-	struct hlist_head *head = inode_hashtable + hash(sb, hashval);
-	struct inode *inode;
-again:
-	spin_lock(&inode_hash_lock);
-	inode = find_inode(sb, head, test, data);
-	spin_unlock(&inode_hash_lock);
+	struct inode *inode = ilookup5(sb, hashval, test, data);
 
-	if (inode) {
-		wait_on_inode(inode);
-		if (unlikely(inode_unhashed(inode))) {
-			iput(inode);
-			goto again;
-		}
-		return inode;
-	}
-
-	inode = alloc_inode(sb);
-	if (inode) {
-		struct inode *old;
-
-		spin_lock(&inode_hash_lock);
-		/* We released the lock, so.. */
-		old = find_inode(sb, head, test, data);
-		if (!old) {
-			if (set(inode, data))
-				goto set_failed;
-
-			spin_lock(&inode->i_lock);
-			inode->i_state = I_NEW;
-			hlist_add_head(&inode->i_hash, head);
-			spin_unlock(&inode->i_lock);
-			inode_sb_list_add(inode);
-			spin_unlock(&inode_hash_lock);
+	if (!inode) {
+		struct inode *new = alloc_inode(sb);
 
-			/* Return the locked inode with I_NEW set, the
-			 * caller is responsible for filling in the contents
-			 */
-			return inode;
-		}
-
-		/*
-		 * Uhhuh, somebody else created the same inode under
-		 * us. Use the old inode instead of the one we just
-		 * allocated.
-		 */
-		spin_unlock(&inode_hash_lock);
-		destroy_inode(inode);
-		inode = old;
-		wait_on_inode(inode);
-		if (unlikely(inode_unhashed(inode))) {
-			iput(inode);
-			goto again;
+		if (new) {
+			inode = iget5_prealloc(new, hashval, test, set, data);
+			if (inode != new)
+				destroy_inode(new);
 		}
 	}
 	return inode;
-
-set_failed:
-	spin_unlock(&inode_hash_lock);
-	destroy_inode(inode);
-	return NULL;
 }
 EXPORT_SYMBOL(iget5_locked);
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8b8d793b3774..3114c1fe5c83 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2880,9 +2880,17 @@ extern struct inode *ilookup5_nowait(struct super_block *sb,
 extern struct inode *ilookup5(struct super_block *sb, unsigned long hashval,
 		int (*test)(struct inode *, void *), void *data);
 extern struct inode *ilookup(struct super_block *sb, unsigned long ino);
+extern struct inode *iget5_prealloc(struct inode *inode, unsigned long hashval,
+		int (*test)(struct inode *, void *),
+		int (*set)(struct inode *, void *),
+		void *data);
+extern struct inode *iget5_locked(struct super_block *sb,
+		unsigned long hashval,
+		int (*test)(struct inode *, void *),
+		int (*set)(struct inode *, void *),
+		void *data);
+extern struct inode *iget_locked(struct super_block *sb, unsigned long ino);
 
-extern struct inode * iget5_locked(struct super_block *, unsigned long, int (*test)(struct inode *, void *), int (*set)(struct inode *, void *), void *);
-extern struct inode * iget_locked(struct super_block *, unsigned long);
 extern struct inode *find_inode_nowait(struct super_block *,
 				       unsigned long,
 				       int (*match)(struct inode *,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 7/9] vfs: export alloc_inode() and destroy_inode()
  2018-05-18  8:29 [PATCH v4 0/9] Overlayfs create object related fixes Amir Goldstein
                   ` (5 preceding siblings ...)
  2018-05-18  8:29 ` [PATCH v4 6/9] vfs: factor out iget5_prealloc() Amir Goldstein
@ 2018-05-18  8:29 ` Amir Goldstein
  2018-05-18 15:02   ` Al Viro
  2018-05-18  8:29 ` [PATCH v4 8/9] ovl: Pass argument to ovl_get_inode() in a structure Amir Goldstein
  2018-05-18  8:29 ` [PATCH v4 9/9] ovl: use iget5_prealloc() to hash a newly created inode Amir Goldstein
  8 siblings, 1 reply; 24+ messages in thread
From: Amir Goldstein @ 2018-05-18  8:29 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, Vivek Goyal, linux-unionfs

They are needed by overlayfs.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/inode.c         | 6 ++++--
 include/linux/fs.h | 2 ++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 6bb3950ae896..5171807da7e8 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -201,7 +201,7 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
 }
 EXPORT_SYMBOL(inode_init_always);
 
-static struct inode *alloc_inode(struct super_block *sb)
+struct inode *alloc_inode(struct super_block *sb)
 {
 	struct inode *inode;
 
@@ -223,6 +223,7 @@ static struct inode *alloc_inode(struct super_block *sb)
 
 	return inode;
 }
+EXPORT_SYMBOL(alloc_inode);
 
 void free_inode_nonrcu(struct inode *inode)
 {
@@ -258,7 +259,7 @@ static void i_callback(struct rcu_head *head)
 	kmem_cache_free(inode_cachep, inode);
 }
 
-static void destroy_inode(struct inode *inode)
+void destroy_inode(struct inode *inode)
 {
 	BUG_ON(!list_empty(&inode->i_lru));
 	__destroy_inode(inode);
@@ -267,6 +268,7 @@ static void destroy_inode(struct inode *inode)
 	else
 		call_rcu(&inode->i_rcu, i_callback);
 }
+EXPORT_SYMBOL(destroy_inode);
 
 /**
  * drop_nlink - directly drop an inode's link count
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 3114c1fe5c83..7559f51a14d3 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2911,6 +2911,8 @@ extern void __iget(struct inode * inode);
 extern void iget_failed(struct inode *);
 extern void clear_inode(struct inode *);
 extern void __destroy_inode(struct inode *);
+extern void destroy_inode(struct inode *inode);
+extern struct inode *alloc_inode(struct super_block *sb);
 extern struct inode *new_inode_pseudo(struct super_block *sb);
 extern struct inode *new_inode(struct super_block *sb);
 extern void free_inode_nonrcu(struct inode *inode);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 8/9] ovl: Pass argument to ovl_get_inode() in a structure
  2018-05-18  8:29 [PATCH v4 0/9] Overlayfs create object related fixes Amir Goldstein
                   ` (6 preceding siblings ...)
  2018-05-18  8:29 ` [PATCH v4 7/9] vfs: export alloc_inode() and destroy_inode() Amir Goldstein
@ 2018-05-18  8:29 ` Amir Goldstein
  2018-05-18  8:29 ` [PATCH v4 9/9] ovl: use iget5_prealloc() to hash a newly created inode Amir Goldstein
  8 siblings, 0 replies; 24+ messages in thread
From: Amir Goldstein @ 2018-05-18  8:29 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, Vivek Goyal, linux-unionfs

From: Vivek Goyal <vgoyal@redhat.com>

ovl_get_inode() right now has 5 parameters. Soon this patch series will
add 2 more and suddenly argument list starts looking too long.

Hence pass arguments to ovl_get_inode() in a structure and it looks
little cleaner.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/export.c    |  8 +++++++-
 fs/overlayfs/inode.c     | 20 +++++++++++---------
 fs/overlayfs/namei.c     | 10 ++++++++--
 fs/overlayfs/overlayfs.h | 11 ++++++++---
 4 files changed, 34 insertions(+), 15 deletions(-)

diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
index 425a94672300..9941ece61a14 100644
--- a/fs/overlayfs/export.c
+++ b/fs/overlayfs/export.c
@@ -300,12 +300,18 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
 	struct dentry *dentry;
 	struct inode *inode;
 	struct ovl_entry *oe;
+	struct ovl_inode_params oip = {
+		.lowerpath = lowerpath,
+		.index = index,
+		.numlower = !!lower
+	};
 
 	/* We get overlay directory dentries with ovl_lookup_real() */
 	if (d_is_dir(upper ?: lower))
 		return ERR_PTR(-EIO);
 
-	inode = ovl_get_inode(sb, dget(upper), lowerpath, index, !!lower);
+	oip.upperdentry = dget(upper);
+	inode = ovl_get_inode(sb, &oip);
 	if (IS_ERR(inode)) {
 		dput(upper);
 		return ERR_CAST(inode);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 7abcf96e94fc..3aa2ea447436 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -792,15 +792,17 @@ static bool ovl_hash_bylower(struct super_block *sb, struct dentry *upper,
 	return true;
 }
 
-struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
-			    struct ovl_path *lowerpath, struct dentry *index,
-			    unsigned int numlower)
+struct inode *ovl_get_inode(struct super_block *sb,
+			    struct ovl_inode_params *oip)
 {
+	struct dentry *upperdentry = oip->upperdentry;
+	struct ovl_path *lowerpath = oip->lowerpath;
 	struct inode *realinode = upperdentry ? d_inode(upperdentry) : NULL;
 	struct inode *inode;
 	struct dentry *lowerdentry = lowerpath ? lowerpath->dentry : NULL;
-	bool bylower = ovl_hash_bylower(sb, upperdentry, lowerdentry, index);
-	int fsid = bylower ? lowerpath->layer->fsid : 0;
+	bool bylower = ovl_hash_bylower(sb, upperdentry, lowerdentry,
+					oip->index);
+	int fsid = bylower ? oip->lowerpath->layer->fsid : 0;
 	bool is_dir;
 	unsigned long ino = 0;
 
@@ -817,8 +819,8 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
 						      upperdentry);
 		unsigned int nlink = is_dir ? 1 : realinode->i_nlink;
 
-		inode = iget5_locked(sb, (unsigned long) key,
-				     ovl_inode_test, ovl_inode_set, key);
+		inode = iget5_locked(sb, (unsigned long) key, ovl_inode_test,
+				     ovl_inode_set, key);
 		if (!inode)
 			goto out_nomem;
 		if (!(inode->i_state & I_NEW)) {
@@ -854,12 +856,12 @@ struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
 	if (upperdentry && ovl_is_impuredir(upperdentry))
 		ovl_set_flag(OVL_IMPURE, inode);
 
-	if (index)
+	if (oip->index)
 		ovl_set_flag(OVL_INDEX, inode);
 
 	/* Check for non-merge dir that may have whiteouts */
 	if (is_dir) {
-		if (((upperdentry && lowerdentry) || numlower > 1) ||
+		if (((upperdentry && lowerdentry) || oip->numlower > 1) ||
 		    ovl_check_origin_xattr(upperdentry ?: lowerdentry)) {
 			ovl_set_flag(OVL_WHITEOUTS, inode);
 		}
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 2dba29eadde6..08801b45df00 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -1004,8 +1004,14 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		upperdentry = dget(index);
 
 	if (upperdentry || ctr) {
-		inode = ovl_get_inode(dentry->d_sb, upperdentry, stack, index,
-				      ctr);
+		struct ovl_inode_params oip = {
+			.upperdentry = upperdentry,
+			.lowerpath = stack,
+			.index = index,
+			.numlower = ctr,
+		};
+
+		inode = ovl_get_inode(dentry->d_sb, &oip);
 		err = PTR_ERR(inode);
 		if (IS_ERR(inode))
 			goto out_free_oe;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 4eb04da240b4..dfb25b9b26a8 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -334,12 +334,17 @@ int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags);
 int ovl_update_time(struct inode *inode, struct timespec *ts, int flags);
 bool ovl_is_private_xattr(const char *name);
 
+struct ovl_inode_params {
+	struct dentry *upperdentry;
+	struct ovl_path *lowerpath;
+	struct dentry *index;
+	unsigned int numlower;
+};
 struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev);
 struct inode *ovl_lookup_inode(struct super_block *sb, struct dentry *real,
 			       bool is_upper);
-struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry,
-			    struct ovl_path *lowerpath, struct dentry *index,
-			    unsigned int numlower);
+struct inode *ovl_get_inode(struct super_block *sb,
+			    struct ovl_inode_params *oip);
 static inline void ovl_copyattr(struct inode *from, struct inode *to)
 {
 	to->i_uid = from->i_uid;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4 9/9] ovl: use iget5_prealloc() to hash a newly created inode
  2018-05-18  8:29 [PATCH v4 0/9] Overlayfs create object related fixes Amir Goldstein
                   ` (7 preceding siblings ...)
  2018-05-18  8:29 ` [PATCH v4 8/9] ovl: Pass argument to ovl_get_inode() in a structure Amir Goldstein
@ 2018-05-18  8:29 ` Amir Goldstein
  2018-05-18 10:14   ` Miklos Szeredi
  2018-05-18 15:03   ` Al Viro
  8 siblings, 2 replies; 24+ messages in thread
From: Amir Goldstein @ 2018-05-18  8:29 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, Vivek Goyal, linux-unionfs

Currently, there is a small window where ovl_obtain_alias() can
race with ovl_instantiate() and create two different overlay inodes
with the same underlying real non-dir non-hardlink inode.

The race requires an adversary to guess the file handle of the
yet to be created upper inode and decode the guessed file handle
after ovl_creat_real(), but before ovl_instantiate().
This race does not affect overlay directory inodes, because those
are decoded via ovl_lookup_real() and not with ovl_obtain_alias().

This patch fixes the race, by using iget5_prealloc() to add a newly
created inode to cache.

If the newly created inode apears to already exist in cache (hashed
by the same real upper inode), we instantiate the dentry with the old
inode and drop the new inode, instead of silently not hashing the new
inode.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/dir.c       | 81 +++++++++++++++++++++++++++++++++++-------------
 fs/overlayfs/inode.c     | 12 +++++--
 fs/overlayfs/overlayfs.h |  1 +
 3 files changed, 70 insertions(+), 24 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 6f7400be3fc8..833528d3e351 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -217,24 +217,57 @@ static int ovl_set_opaque(struct dentry *dentry, struct dentry *upperdentry)
 	return ovl_set_opaque_xerr(dentry, upperdentry, -EIO);
 }
 
-/* Common operations required to be done after creation of file on upper */
-static void ovl_instantiate(struct dentry *dentry, struct inode *inode,
-			    struct dentry *newdentry, bool hardlink)
+/*
+ * Common operations required to be done after creation of file on upper.
+ * If @hardlink is false, then @inode is a pre-allocated inode, not yet on
+ * overlay's inode_sb_list.  If @newdentry is instantiated with @inode it
+ * takes a reference on @inode.
+ */
+static int ovl_instantiate(struct dentry *dentry, struct inode *inode,
+			   struct dentry *newdentry, bool hardlink)
 {
+	struct ovl_inode_params oip = {
+		.upperdentry = newdentry,
+		.newinode = inode,
+	};
+
 	ovl_dir_modified(dentry->d_parent, false);
-	ovl_copyattr(d_inode(newdentry), inode);
 	ovl_dentry_set_upper_alias(dentry);
 	if (!hardlink) {
-		ovl_inode_update(inode, newdentry);
+		/*
+		 * ovl_obtain_alias() can be called after ovl_create_real()
+		 * and before we get here, so we may get an inode from cache
+		 * with the same real upperdentry that is not the inode we
+		 * pre-allocated.  In this case we will use the cached inode
+		 * to instantiate the new dentry.
+		 *
+		 * XXX: if we ever use ovl_obtain_alias() to decode directory
+		 * file handles, need to use ovl_get_inode_locked() and
+		 * d_instantiate_new() here to prevent from creating two
+		 * hashed directory inode aliases.
+		 */
+		inode = ovl_get_inode(dentry->d_sb, &oip);
+		if (IS_ERR(inode))
+			return PTR_ERR(inode);
 	} else {
 		WARN_ON(ovl_inode_real(inode) != d_inode(newdentry));
 		dput(newdentry);
 		inc_nlink(inode);
 	}
+
+	ihold(inode);
 	d_instantiate(dentry, inode);
+	if (inode != oip.newinode) {
+		pr_warn_ratelimited("overlayfs: newly created inode found in cache (%pd2)\n",
+				    dentry);
+		iput(inode);
+	}
+
 	/* Force lookup of new upper hardlink to find its lower */
 	if (hardlink)
 		d_drop(dentry);
+
+	return 0;
 }
 
 static bool ovl_type_merge(struct dentry *dentry)
@@ -273,11 +306,17 @@ static int ovl_create_upper(struct dentry *dentry, struct inode *inode,
 		ovl_set_opaque(dentry, newdentry);
 	}
 
-	ovl_instantiate(dentry, inode, newdentry, !!attr->hardlink);
-	err = 0;
+	err = ovl_instantiate(dentry, inode, newdentry, !!attr->hardlink);
+	if (err)
+		goto out_cleanup;
 out_unlock:
 	inode_unlock(udir);
 	return err;
+
+out_cleanup:
+	ovl_cleanup(udir, newdentry);
+	dput(newdentry);
+	goto out_unlock;
 }
 
 static struct dentry *ovl_clear_empty(struct dentry *dentry,
@@ -462,10 +501,9 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
 		if (err)
 			goto out_cleanup;
 	}
-	ovl_instantiate(dentry, inode, newdentry, hardlink);
-	newdentry = NULL;
-out_dput2:
-	dput(newdentry);
+	err = ovl_instantiate(dentry, inode, newdentry, hardlink);
+	if (err)
+		goto out_cleanup;
 out_dput:
 	dput(upper);
 out_unlock:
@@ -479,7 +517,8 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
 
 out_cleanup:
 	ovl_cleanup(wdir, newdentry);
-	goto out_dput2;
+	dput(newdentry);
+	goto out_dput;
 }
 
 static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
@@ -547,8 +586,9 @@ static int ovl_create_object(struct dentry *dentry, int mode, dev_t rdev,
 	if (err)
 		goto out;
 
+	/* Preallocate inode to be used by ovl_get_inode() */
 	err = -ENOMEM;
-	inode = ovl_new_inode(dentry->d_sb, mode, rdev);
+	inode = alloc_inode(dentry->d_sb);
 	if (!inode)
 		goto out_drop_write;
 
@@ -556,9 +596,12 @@ static int ovl_create_object(struct dentry *dentry, int mode, dev_t rdev,
 	attr.mode = inode->i_mode;
 
 	err = ovl_create_or_link(dentry, inode, &attr, false);
-	if (err)
-		iput(inode);
 
+	/* Did we ended up using the preallocated inode? */
+	if (inode == d_inode(dentry))
+		iput(inode);
+	else
+		destroy_inode(inode);
 out_drop_write:
 	ovl_drop_write(dentry);
 out:
@@ -597,7 +640,6 @@ static int ovl_link(struct dentry *old, struct inode *newdir,
 {
 	int err;
 	bool locked = false;
-	struct inode *inode;
 
 	err = ovl_want_write(old);
 	if (err)
@@ -611,14 +653,9 @@ static int ovl_link(struct dentry *old, struct inode *newdir,
 	if (err)
 		goto out_drop_write;
 
-	inode = d_inode(old);
-	ihold(inode);
-
-	err = ovl_create_or_link(new, inode,
+	err = ovl_create_or_link(new, d_inode(old),
 			&(struct ovl_cattr) {.hardlink = ovl_dentry_upper(old)},
 			ovl_type_origin(old));
-	if (err)
-		iput(inode);
 
 	ovl_nlink_end(old, locked);
 out_drop_write:
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 3aa2ea447436..5d017ea06ea1 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -792,6 +792,15 @@ static bool ovl_hash_bylower(struct super_block *sb, struct dentry *upper,
 	return true;
 }
 
+static struct inode *ovl_iget5(struct super_block *sb, struct inode *newinode,
+			       struct inode *key)
+{
+	return newinode ? iget5_prealloc(newinode, (unsigned long) key,
+					 ovl_inode_test, ovl_inode_set, key) :
+			  iget5_locked(sb, (unsigned long) key,
+				       ovl_inode_test, ovl_inode_set, key);
+}
+
 struct inode *ovl_get_inode(struct super_block *sb,
 			    struct ovl_inode_params *oip)
 {
@@ -819,8 +828,7 @@ struct inode *ovl_get_inode(struct super_block *sb,
 						      upperdentry);
 		unsigned int nlink = is_dir ? 1 : realinode->i_nlink;
 
-		inode = iget5_locked(sb, (unsigned long) key, ovl_inode_test,
-				     ovl_inode_set, key);
+		inode = ovl_iget5(sb, oip->newinode, key);
 		if (!inode)
 			goto out_nomem;
 		if (!(inode->i_state & I_NEW)) {
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index dfb25b9b26a8..330dc0467f40 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -335,6 +335,7 @@ int ovl_update_time(struct inode *inode, struct timespec *ts, int flags);
 bool ovl_is_private_xattr(const char *name);
 
 struct ovl_inode_params {
+	struct inode *newinode;
 	struct dentry *upperdentry;
 	struct ovl_path *lowerpath;
 	struct dentry *index;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 9/9] ovl: use iget5_prealloc() to hash a newly created inode
  2018-05-18  8:29 ` [PATCH v4 9/9] ovl: use iget5_prealloc() to hash a newly created inode Amir Goldstein
@ 2018-05-18 10:14   ` Miklos Szeredi
  2018-05-18 14:08     ` Amir Goldstein
  2018-05-18 15:03   ` Al Viro
  1 sibling, 1 reply; 24+ messages in thread
From: Miklos Szeredi @ 2018-05-18 10:14 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Al Viro, Vivek Goyal, overlayfs

On Fri, May 18, 2018 at 10:29 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> Currently, there is a small window where ovl_obtain_alias() can
> race with ovl_instantiate() and create two different overlay inodes
> with the same underlying real non-dir non-hardlink inode.
>
> The race requires an adversary to guess the file handle of the
> yet to be created upper inode and decode the guessed file handle
> after ovl_creat_real(), but before ovl_instantiate().
> This race does not affect overlay directory inodes, because those
> are decoded via ovl_lookup_real() and not with ovl_obtain_alias().
>
> This patch fixes the race, by using iget5_prealloc() to add a newly
> created inode to cache.
>
> If the newly created inode apears to already exist in cache (hashed
> by the same real upper inode), we instantiate the dentry with the old
> inode and drop the new inode, instead of silently not hashing the new
> inode.
>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/overlayfs/dir.c       | 81 +++++++++++++++++++++++++++++++++++-------------
>  fs/overlayfs/inode.c     | 12 +++++--
>  fs/overlayfs/overlayfs.h |  1 +
>  3 files changed, 70 insertions(+), 24 deletions(-)
>
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index 6f7400be3fc8..833528d3e351 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -217,24 +217,57 @@ static int ovl_set_opaque(struct dentry *dentry, struct dentry *upperdentry)
>         return ovl_set_opaque_xerr(dentry, upperdentry, -EIO);
>  }
>
> -/* Common operations required to be done after creation of file on upper */
> -static void ovl_instantiate(struct dentry *dentry, struct inode *inode,
> -                           struct dentry *newdentry, bool hardlink)
> +/*
> + * Common operations required to be done after creation of file on upper.
> + * If @hardlink is false, then @inode is a pre-allocated inode, not yet on
> + * overlay's inode_sb_list.  If @newdentry is instantiated with @inode it
> + * takes a reference on @inode.
> + */
> +static int ovl_instantiate(struct dentry *dentry, struct inode *inode,
> +                          struct dentry *newdentry, bool hardlink)
>  {
> +       struct ovl_inode_params oip = {
> +               .upperdentry = newdentry,
> +               .newinode = inode,
> +       };
> +
>         ovl_dir_modified(dentry->d_parent, false);
> -       ovl_copyattr(d_inode(newdentry), inode);
>         ovl_dentry_set_upper_alias(dentry);
>         if (!hardlink) {
> -               ovl_inode_update(inode, newdentry);
> +               /*
> +                * ovl_obtain_alias() can be called after ovl_create_real()
> +                * and before we get here, so we may get an inode from cache
> +                * with the same real upperdentry that is not the inode we
> +                * pre-allocated.  In this case we will use the cached inode
> +                * to instantiate the new dentry.
> +                *
> +                * XXX: if we ever use ovl_obtain_alias() to decode directory
> +                * file handles, need to use ovl_get_inode_locked() and
> +                * d_instantiate_new() here to prevent from creating two
> +                * hashed directory inode aliases.
> +                */
> +               inode = ovl_get_inode(dentry->d_sb, &oip);
> +               if (IS_ERR(inode))
> +                       return PTR_ERR(inode);
>         } else {
>                 WARN_ON(ovl_inode_real(inode) != d_inode(newdentry));
>                 dput(newdentry);
>                 inc_nlink(inode);
>         }
> +
> +       ihold(inode);

I don't get it.   We can get rid of this ihold...


>         d_instantiate(dentry, inode);
> +       if (inode != oip.newinode) {
> +               pr_warn_ratelimited("overlayfs: newly created inode found in cache (%pd2)\n",
> +                                   dentry);
> +               iput(inode);

... and this iput ...

> +       }
> +
>         /* Force lookup of new upper hardlink to find its lower */
>         if (hardlink)
>                 d_drop(dentry);
> +
> +       return 0;
>  }
>
>  static bool ovl_type_merge(struct dentry *dentry)
> @@ -273,11 +306,17 @@ static int ovl_create_upper(struct dentry *dentry, struct inode *inode,
>                 ovl_set_opaque(dentry, newdentry);
>         }
>
> -       ovl_instantiate(dentry, inode, newdentry, !!attr->hardlink);
> -       err = 0;
> +       err = ovl_instantiate(dentry, inode, newdentry, !!attr->hardlink);
> +       if (err)
> +               goto out_cleanup;
>  out_unlock:
>         inode_unlock(udir);
>         return err;
> +
> +out_cleanup:
> +       ovl_cleanup(udir, newdentry);
> +       dput(newdentry);
> +       goto out_unlock;
>  }
>
>  static struct dentry *ovl_clear_empty(struct dentry *dentry,
> @@ -462,10 +501,9 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
>                 if (err)
>                         goto out_cleanup;
>         }
> -       ovl_instantiate(dentry, inode, newdentry, hardlink);
> -       newdentry = NULL;
> -out_dput2:
> -       dput(newdentry);
> +       err = ovl_instantiate(dentry, inode, newdentry, hardlink);
> +       if (err)
> +               goto out_cleanup;
>  out_dput:
>         dput(upper);
>  out_unlock:
> @@ -479,7 +517,8 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
>
>  out_cleanup:
>         ovl_cleanup(wdir, newdentry);
> -       goto out_dput2;
> +       dput(newdentry);
> +       goto out_dput;
>  }
>
>  static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
> @@ -547,8 +586,9 @@ static int ovl_create_object(struct dentry *dentry, int mode, dev_t rdev,
>         if (err)
>                 goto out;
>
> +       /* Preallocate inode to be used by ovl_get_inode() */
>         err = -ENOMEM;
> -       inode = ovl_new_inode(dentry->d_sb, mode, rdev);
> +       inode = alloc_inode(dentry->d_sb);
>         if (!inode)
>                 goto out_drop_write;
>
> @@ -556,9 +596,12 @@ static int ovl_create_object(struct dentry *dentry, int mode, dev_t rdev,
>         attr.mode = inode->i_mode;
>
>         err = ovl_create_or_link(dentry, inode, &attr, false);
> -       if (err)
> -               iput(inode);
>
> +       /* Did we ended up using the preallocated inode? */
> +       if (inode == d_inode(dentry))
> +               iput(inode);

... and this iput ...

> +       else
> +               destroy_inode(inode);
>  out_drop_write:
>         ovl_drop_write(dentry);
>  out:
> @@ -597,7 +640,6 @@ static int ovl_link(struct dentry *old, struct inode *newdir,
>  {
>         int err;
>         bool locked = false;
> -       struct inode *inode;
>
>         err = ovl_want_write(old);
>         if (err)
> @@ -611,14 +653,9 @@ static int ovl_link(struct dentry *old, struct inode *newdir,
>         if (err)
>                 goto out_drop_write;
>
> -       inode = d_inode(old);
> -       ihold(inode);

... and leave this ihold ...
> -
> -       err = ovl_create_or_link(new, inode,
> +       err = ovl_create_or_link(new, d_inode(old),
>                         &(struct ovl_cattr) {.hardlink = ovl_dentry_upper(old)},
>                         ovl_type_origin(old));
> -       if (err)
> -               iput(inode);

... and this iput.

>
>         ovl_nlink_end(old, locked);
>  out_drop_write:
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 3aa2ea447436..5d017ea06ea1 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -792,6 +792,15 @@ static bool ovl_hash_bylower(struct super_block *sb, struct dentry *upper,
>         return true;
>  }
>
> +static struct inode *ovl_iget5(struct super_block *sb, struct inode *newinode,
> +                              struct inode *key)
> +{
> +       return newinode ? iget5_prealloc(newinode, (unsigned long) key,
> +                                        ovl_inode_test, ovl_inode_set, key) :
> +                         iget5_locked(sb, (unsigned long) key,
> +                                      ovl_inode_test, ovl_inode_set, key);
> +}
> +
>  struct inode *ovl_get_inode(struct super_block *sb,
>                             struct ovl_inode_params *oip)
>  {
> @@ -819,8 +828,7 @@ struct inode *ovl_get_inode(struct super_block *sb,
>                                                       upperdentry);
>                 unsigned int nlink = is_dir ? 1 : realinode->i_nlink;
>
> -               inode = iget5_locked(sb, (unsigned long) key, ovl_inode_test,
> -                                    ovl_inode_set, key);
> +               inode = ovl_iget5(sb, oip->newinode, key);
>                 if (!inode)
>                         goto out_nomem;
>                 if (!(inode->i_state & I_NEW)) {
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index dfb25b9b26a8..330dc0467f40 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -335,6 +335,7 @@ int ovl_update_time(struct inode *inode, struct timespec *ts, int flags);
>  bool ovl_is_private_xattr(const char *name);
>
>  struct ovl_inode_params {
> +       struct inode *newinode;
>         struct dentry *upperdentry;
>         struct ovl_path *lowerpath;
>         struct dentry *index;
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 9/9] ovl: use iget5_prealloc() to hash a newly created inode
  2018-05-18 10:14   ` Miklos Szeredi
@ 2018-05-18 14:08     ` Amir Goldstein
  2018-05-18 14:24       ` Miklos Szeredi
  0 siblings, 1 reply; 24+ messages in thread
From: Amir Goldstein @ 2018-05-18 14:08 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, Vivek Goyal, overlayfs

On Fri, May 18, 2018 at 1:14 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Fri, May 18, 2018 at 10:29 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>> Currently, there is a small window where ovl_obtain_alias() can
>> race with ovl_instantiate() and create two different overlay inodes
>> with the same underlying real non-dir non-hardlink inode.
>>
>> The race requires an adversary to guess the file handle of the
>> yet to be created upper inode and decode the guessed file handle
>> after ovl_creat_real(), but before ovl_instantiate().
>> This race does not affect overlay directory inodes, because those
>> are decoded via ovl_lookup_real() and not with ovl_obtain_alias().
>>
>> This patch fixes the race, by using iget5_prealloc() to add a newly
>> created inode to cache.
>>
>> If the newly created inode apears to already exist in cache (hashed
>> by the same real upper inode), we instantiate the dentry with the old
>> inode and drop the new inode, instead of silently not hashing the new
>> inode.
>>
>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>> ---
>>  fs/overlayfs/dir.c       | 81 +++++++++++++++++++++++++++++++++++-------------
>>  fs/overlayfs/inode.c     | 12 +++++--
>>  fs/overlayfs/overlayfs.h |  1 +
>>  3 files changed, 70 insertions(+), 24 deletions(-)
>>
>> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
>> index 6f7400be3fc8..833528d3e351 100644
>> --- a/fs/overlayfs/dir.c
>> +++ b/fs/overlayfs/dir.c
>> @@ -217,24 +217,57 @@ static int ovl_set_opaque(struct dentry *dentry, struct dentry *upperdentry)
>>         return ovl_set_opaque_xerr(dentry, upperdentry, -EIO);
>>  }
>>
>> -/* Common operations required to be done after creation of file on upper */
>> -static void ovl_instantiate(struct dentry *dentry, struct inode *inode,
>> -                           struct dentry *newdentry, bool hardlink)
>> +/*
>> + * Common operations required to be done after creation of file on upper.
>> + * If @hardlink is false, then @inode is a pre-allocated inode, not yet on
>> + * overlay's inode_sb_list.  If @newdentry is instantiated with @inode it
>> + * takes a reference on @inode.
>> + */
>> +static int ovl_instantiate(struct dentry *dentry, struct inode *inode,
>> +                          struct dentry *newdentry, bool hardlink)
>>  {
>> +       struct ovl_inode_params oip = {
>> +               .upperdentry = newdentry,
>> +               .newinode = inode,
>> +       };
>> +
>>         ovl_dir_modified(dentry->d_parent, false);
>> -       ovl_copyattr(d_inode(newdentry), inode);
>>         ovl_dentry_set_upper_alias(dentry);
>>         if (!hardlink) {
>> -               ovl_inode_update(inode, newdentry);
>> +               /*
>> +                * ovl_obtain_alias() can be called after ovl_create_real()
>> +                * and before we get here, so we may get an inode from cache
>> +                * with the same real upperdentry that is not the inode we
>> +                * pre-allocated.  In this case we will use the cached inode
>> +                * to instantiate the new dentry.
>> +                *
>> +                * XXX: if we ever use ovl_obtain_alias() to decode directory
>> +                * file handles, need to use ovl_get_inode_locked() and
>> +                * d_instantiate_new() here to prevent from creating two
>> +                * hashed directory inode aliases.
>> +                */
>> +               inode = ovl_get_inode(dentry->d_sb, &oip);
>> +               if (IS_ERR(inode))
>> +                       return PTR_ERR(inode);
>>         } else {
>>                 WARN_ON(ovl_inode_real(inode) != d_inode(newdentry));
>>                 dput(newdentry);
>>                 inc_nlink(inode);
>>         }
>> +
>> +       ihold(inode);
>
> I don't get it.   We can get rid of this ihold...
>
>
>>         d_instantiate(dentry, inode);
>> +       if (inode != oip.newinode) {
>> +               pr_warn_ratelimited("overlayfs: newly created inode found in cache (%pd2)\n",
>> +                                   dentry);
>> +               iput(inode);
>
> ... and this iput ...
>
>> +       }
>> +
>>         /* Force lookup of new upper hardlink to find its lower */
>>         if (hardlink)
>>                 d_drop(dentry);
>> +
>> +       return 0;
>>  }
>>
>>  static bool ovl_type_merge(struct dentry *dentry)
>> @@ -273,11 +306,17 @@ static int ovl_create_upper(struct dentry *dentry, struct inode *inode,
>>                 ovl_set_opaque(dentry, newdentry);
>>         }
>>
>> -       ovl_instantiate(dentry, inode, newdentry, !!attr->hardlink);
>> -       err = 0;
>> +       err = ovl_instantiate(dentry, inode, newdentry, !!attr->hardlink);
>> +       if (err)
>> +               goto out_cleanup;
>>  out_unlock:
>>         inode_unlock(udir);
>>         return err;
>> +
>> +out_cleanup:
>> +       ovl_cleanup(udir, newdentry);
>> +       dput(newdentry);
>> +       goto out_unlock;
>>  }
>>
>>  static struct dentry *ovl_clear_empty(struct dentry *dentry,
>> @@ -462,10 +501,9 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
>>                 if (err)
>>                         goto out_cleanup;
>>         }
>> -       ovl_instantiate(dentry, inode, newdentry, hardlink);
>> -       newdentry = NULL;
>> -out_dput2:
>> -       dput(newdentry);
>> +       err = ovl_instantiate(dentry, inode, newdentry, hardlink);
>> +       if (err)
>> +               goto out_cleanup;
>>  out_dput:
>>         dput(upper);
>>  out_unlock:
>> @@ -479,7 +517,8 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
>>
>>  out_cleanup:
>>         ovl_cleanup(wdir, newdentry);
>> -       goto out_dput2;
>> +       dput(newdentry);
>> +       goto out_dput;
>>  }
>>
>>  static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
>> @@ -547,8 +586,9 @@ static int ovl_create_object(struct dentry *dentry, int mode, dev_t rdev,
>>         if (err)
>>                 goto out;
>>
>> +       /* Preallocate inode to be used by ovl_get_inode() */
>>         err = -ENOMEM;
>> -       inode = ovl_new_inode(dentry->d_sb, mode, rdev);
>> +       inode = alloc_inode(dentry->d_sb);
>>         if (!inode)
>>                 goto out_drop_write;
>>
>> @@ -556,9 +596,12 @@ static int ovl_create_object(struct dentry *dentry, int mode, dev_t rdev,
>>         attr.mode = inode->i_mode;
>>
>>         err = ovl_create_or_link(dentry, inode, &attr, false);
>> -       if (err)
>> -               iput(inode);
>>
>> +       /* Did we ended up using the preallocated inode? */
>> +       if (inode == d_inode(dentry))
>> +               iput(inode);
>
> ... and this iput ...
>

-       if (err)
-               iput(inode);
+       /* Did we end up using the preallocated inode? */
+       if (err || inode != d_inode(dentry))
+               destroy_inode(inode);

Pushed to ovl-fixes.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 9/9] ovl: use iget5_prealloc() to hash a newly created inode
  2018-05-18 14:08     ` Amir Goldstein
@ 2018-05-18 14:24       ` Miklos Szeredi
  0 siblings, 0 replies; 24+ messages in thread
From: Miklos Szeredi @ 2018-05-18 14:24 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Al Viro, Vivek Goyal, overlayfs

On Fri, May 18, 2018 at 4:08 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Fri, May 18, 2018 at 1:14 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Fri, May 18, 2018 at 10:29 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>>> Currently, there is a small window where ovl_obtain_alias() can
>>> race with ovl_instantiate() and create two different overlay inodes
>>> with the same underlying real non-dir non-hardlink inode.
>>>
>>> The race requires an adversary to guess the file handle of the
>>> yet to be created upper inode and decode the guessed file handle
>>> after ovl_creat_real(), but before ovl_instantiate().
>>> This race does not affect overlay directory inodes, because those
>>> are decoded via ovl_lookup_real() and not with ovl_obtain_alias().
>>>
>>> This patch fixes the race, by using iget5_prealloc() to add a newly
>>> created inode to cache.
>>>
>>> If the newly created inode apears to already exist in cache (hashed
>>> by the same real upper inode), we instantiate the dentry with the old
>>> inode and drop the new inode, instead of silently not hashing the new
>>> inode.
>>>
>>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>>> ---
>>>  fs/overlayfs/dir.c       | 81 +++++++++++++++++++++++++++++++++++-------------
>>>  fs/overlayfs/inode.c     | 12 +++++--
>>>  fs/overlayfs/overlayfs.h |  1 +
>>>  3 files changed, 70 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
>>> index 6f7400be3fc8..833528d3e351 100644
>>> --- a/fs/overlayfs/dir.c
>>> +++ b/fs/overlayfs/dir.c
>>> @@ -217,24 +217,57 @@ static int ovl_set_opaque(struct dentry *dentry, struct dentry *upperdentry)
>>>         return ovl_set_opaque_xerr(dentry, upperdentry, -EIO);
>>>  }
>>>
>>> -/* Common operations required to be done after creation of file on upper */
>>> -static void ovl_instantiate(struct dentry *dentry, struct inode *inode,
>>> -                           struct dentry *newdentry, bool hardlink)
>>> +/*
>>> + * Common operations required to be done after creation of file on upper.
>>> + * If @hardlink is false, then @inode is a pre-allocated inode, not yet on
>>> + * overlay's inode_sb_list.  If @newdentry is instantiated with @inode it
>>> + * takes a reference on @inode.
>>> + */
>>> +static int ovl_instantiate(struct dentry *dentry, struct inode *inode,
>>> +                          struct dentry *newdentry, bool hardlink)
>>>  {
>>> +       struct ovl_inode_params oip = {
>>> +               .upperdentry = newdentry,
>>> +               .newinode = inode,
>>> +       };
>>> +
>>>         ovl_dir_modified(dentry->d_parent, false);
>>> -       ovl_copyattr(d_inode(newdentry), inode);
>>>         ovl_dentry_set_upper_alias(dentry);
>>>         if (!hardlink) {
>>> -               ovl_inode_update(inode, newdentry);
>>> +               /*
>>> +                * ovl_obtain_alias() can be called after ovl_create_real()
>>> +                * and before we get here, so we may get an inode from cache
>>> +                * with the same real upperdentry that is not the inode we
>>> +                * pre-allocated.  In this case we will use the cached inode
>>> +                * to instantiate the new dentry.
>>> +                *
>>> +                * XXX: if we ever use ovl_obtain_alias() to decode directory
>>> +                * file handles, need to use ovl_get_inode_locked() and
>>> +                * d_instantiate_new() here to prevent from creating two
>>> +                * hashed directory inode aliases.
>>> +                */
>>> +               inode = ovl_get_inode(dentry->d_sb, &oip);
>>> +               if (IS_ERR(inode))
>>> +                       return PTR_ERR(inode);
>>>         } else {
>>>                 WARN_ON(ovl_inode_real(inode) != d_inode(newdentry));
>>>                 dput(newdentry);
>>>                 inc_nlink(inode);
>>>         }
>>> +
>>> +       ihold(inode);
>>
>> I don't get it.   We can get rid of this ihold...
>>
>>
>>>         d_instantiate(dentry, inode);
>>> +       if (inode != oip.newinode) {
>>> +               pr_warn_ratelimited("overlayfs: newly created inode found in cache (%pd2)\n",
>>> +                                   dentry);
>>> +               iput(inode);
>>
>> ... and this iput ...
>>
>>> +       }
>>> +
>>>         /* Force lookup of new upper hardlink to find its lower */
>>>         if (hardlink)
>>>                 d_drop(dentry);
>>> +
>>> +       return 0;
>>>  }
>>>
>>>  static bool ovl_type_merge(struct dentry *dentry)
>>> @@ -273,11 +306,17 @@ static int ovl_create_upper(struct dentry *dentry, struct inode *inode,
>>>                 ovl_set_opaque(dentry, newdentry);
>>>         }
>>>
>>> -       ovl_instantiate(dentry, inode, newdentry, !!attr->hardlink);
>>> -       err = 0;
>>> +       err = ovl_instantiate(dentry, inode, newdentry, !!attr->hardlink);
>>> +       if (err)
>>> +               goto out_cleanup;
>>>  out_unlock:
>>>         inode_unlock(udir);
>>>         return err;
>>> +
>>> +out_cleanup:
>>> +       ovl_cleanup(udir, newdentry);
>>> +       dput(newdentry);
>>> +       goto out_unlock;
>>>  }
>>>
>>>  static struct dentry *ovl_clear_empty(struct dentry *dentry,
>>> @@ -462,10 +501,9 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
>>>                 if (err)
>>>                         goto out_cleanup;
>>>         }
>>> -       ovl_instantiate(dentry, inode, newdentry, hardlink);
>>> -       newdentry = NULL;
>>> -out_dput2:
>>> -       dput(newdentry);
>>> +       err = ovl_instantiate(dentry, inode, newdentry, hardlink);
>>> +       if (err)
>>> +               goto out_cleanup;
>>>  out_dput:
>>>         dput(upper);
>>>  out_unlock:
>>> @@ -479,7 +517,8 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
>>>
>>>  out_cleanup:
>>>         ovl_cleanup(wdir, newdentry);
>>> -       goto out_dput2;
>>> +       dput(newdentry);
>>> +       goto out_dput;
>>>  }
>>>
>>>  static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
>>> @@ -547,8 +586,9 @@ static int ovl_create_object(struct dentry *dentry, int mode, dev_t rdev,
>>>         if (err)
>>>                 goto out;
>>>
>>> +       /* Preallocate inode to be used by ovl_get_inode() */
>>>         err = -ENOMEM;
>>> -       inode = ovl_new_inode(dentry->d_sb, mode, rdev);
>>> +       inode = alloc_inode(dentry->d_sb);
>>>         if (!inode)
>>>                 goto out_drop_write;
>>>
>>> @@ -556,9 +596,12 @@ static int ovl_create_object(struct dentry *dentry, int mode, dev_t rdev,
>>>         attr.mode = inode->i_mode;
>>>
>>>         err = ovl_create_or_link(dentry, inode, &attr, false);
>>> -       if (err)
>>> -               iput(inode);
>>>
>>> +       /* Did we ended up using the preallocated inode? */
>>> +       if (inode == d_inode(dentry))
>>> +               iput(inode);
>>
>> ... and this iput ...
>>
>
> -       if (err)
> -               iput(inode);
> +       /* Did we end up using the preallocated inode? */
> +       if (err || inode != d_inode(dentry))
> +               destroy_inode(inode);

Minor nit: we don't even need the "err ||" since in the error case
d_inode(dentry) will be NULL.   I'll fix that up.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 7/9] vfs: export alloc_inode() and destroy_inode()
  2018-05-18  8:29 ` [PATCH v4 7/9] vfs: export alloc_inode() and destroy_inode() Amir Goldstein
@ 2018-05-18 15:02   ` Al Viro
  0 siblings, 0 replies; 24+ messages in thread
From: Al Viro @ 2018-05-18 15:02 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, Vivek Goyal, linux-unionfs

On Fri, May 18, 2018 at 11:29:35AM +0300, Amir Goldstein wrote:
> They are needed by overlayfs.

NAK, and kindly abstain from patently false declarative statements
in commit messages.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 9/9] ovl: use iget5_prealloc() to hash a newly created inode
  2018-05-18  8:29 ` [PATCH v4 9/9] ovl: use iget5_prealloc() to hash a newly created inode Amir Goldstein
  2018-05-18 10:14   ` Miklos Szeredi
@ 2018-05-18 15:03   ` Al Viro
  2018-05-18 15:11     ` Miklos Szeredi
  1 sibling, 1 reply; 24+ messages in thread
From: Al Viro @ 2018-05-18 15:03 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, Vivek Goyal, linux-unionfs

On Fri, May 18, 2018 at 11:29:37AM +0300, Amir Goldstein wrote:
> Currently, there is a small window where ovl_obtain_alias() can
> race with ovl_instantiate() and create two different overlay inodes
> with the same underlying real non-dir non-hardlink inode.
> 
> The race requires an adversary to guess the file handle of the
> yet to be created upper inode and decode the guessed file handle
> after ovl_creat_real(), but before ovl_instantiate().
> This race does not affect overlay directory inodes, because those
> are decoded via ovl_lookup_real() and not with ovl_obtain_alias().
> 
> This patch fixes the race, by using iget5_prealloc() to add a newly
> created inode to cache.

Mind explaining what the hell is wrong with insert_inode_locked4()?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 9/9] ovl: use iget5_prealloc() to hash a newly created inode
  2018-05-18 15:03   ` Al Viro
@ 2018-05-18 15:11     ` Miklos Szeredi
  2018-05-18 15:14       ` Miklos Szeredi
                         ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Miklos Szeredi @ 2018-05-18 15:11 UTC (permalink / raw)
  To: Al Viro; +Cc: Amir Goldstein, Vivek Goyal, overlayfs

On Fri, May 18, 2018 at 5:03 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Fri, May 18, 2018 at 11:29:37AM +0300, Amir Goldstein wrote:
>> Currently, there is a small window where ovl_obtain_alias() can
>> race with ovl_instantiate() and create two different overlay inodes
>> with the same underlying real non-dir non-hardlink inode.
>>
>> The race requires an adversary to guess the file handle of the
>> yet to be created upper inode and decode the guessed file handle
>> after ovl_creat_real(), but before ovl_instantiate().
>> This race does not affect overlay directory inodes, because those
>> are decoded via ovl_lookup_real() and not with ovl_obtain_alias().
>>
>> This patch fixes the race, by using iget5_prealloc() to add a newly
>> created inode to cache.
>
> Mind explaining what the hell is wrong with insert_inode_locked4()?

That it doesn't return the old inode if found.

Btw, these functions are eerily similar, and it'd be nice to reduce
the number of them.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 9/9] ovl: use iget5_prealloc() to hash a newly created inode
  2018-05-18 15:11     ` Miklos Szeredi
@ 2018-05-18 15:14       ` Miklos Szeredi
  2018-05-18 15:36       ` Amir Goldstein
  2018-05-18 15:40       ` Al Viro
  2 siblings, 0 replies; 24+ messages in thread
From: Miklos Szeredi @ 2018-05-18 15:14 UTC (permalink / raw)
  To: Al Viro; +Cc: Amir Goldstein, Vivek Goyal, overlayfs

On Fri, May 18, 2018 at 5:11 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Fri, May 18, 2018 at 5:03 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>> On Fri, May 18, 2018 at 11:29:37AM +0300, Amir Goldstein wrote:
>>> Currently, there is a small window where ovl_obtain_alias() can
>>> race with ovl_instantiate() and create two different overlay inodes
>>> with the same underlying real non-dir non-hardlink inode.
>>>
>>> The race requires an adversary to guess the file handle of the
>>> yet to be created upper inode and decode the guessed file handle
>>> after ovl_creat_real(), but before ovl_instantiate().
>>> This race does not affect overlay directory inodes, because those
>>> are decoded via ovl_lookup_real() and not with ovl_obtain_alias().
>>>
>>> This patch fixes the race, by using iget5_prealloc() to add a newly
>>> created inode to cache.
>>
>> Mind explaining what the hell is wrong with insert_inode_locked4()?
>
> That it doesn't return the old inode if found.
>
> Btw, these functions are eerily similar, and it'd be nice to reduce
> the number of them.

Does inode_sb_list_add() need inode_hash_lock?  I don't see why it
would, it has it's own locking...

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 9/9] ovl: use iget5_prealloc() to hash a newly created inode
  2018-05-18 15:11     ` Miklos Szeredi
  2018-05-18 15:14       ` Miklos Szeredi
@ 2018-05-18 15:36       ` Amir Goldstein
  2018-05-18 15:57         ` Miklos Szeredi
  2018-05-18 15:40       ` Al Viro
  2 siblings, 1 reply; 24+ messages in thread
From: Amir Goldstein @ 2018-05-18 15:36 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, Vivek Goyal, overlayfs

On Fri, May 18, 2018 at 6:11 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Fri, May 18, 2018 at 5:03 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>> On Fri, May 18, 2018 at 11:29:37AM +0300, Amir Goldstein wrote:
>>> Currently, there is a small window where ovl_obtain_alias() can
>>> race with ovl_instantiate() and create two different overlay inodes
>>> with the same underlying real non-dir non-hardlink inode.
>>>
>>> The race requires an adversary to guess the file handle of the
>>> yet to be created upper inode and decode the guessed file handle
>>> after ovl_creat_real(), but before ovl_instantiate().
>>> This race does not affect overlay directory inodes, because those
>>> are decoded via ovl_lookup_real() and not with ovl_obtain_alias().
>>>
>>> This patch fixes the race, by using iget5_prealloc() to add a newly
>>> created inode to cache.
>>
>> Mind explaining what the hell is wrong with insert_inode_locked4()?
>
> That it doesn't return the old inode if found.
>

FYI, I have set a side a version I was working on before iget5_prealloc()
that uses insert_inode_locked5 (runner up for ugliest function name):

+int insert_inode_locked4(struct inode *inode, unsigned long hashval,
+               int (*test)(struct inode *, void *), void *data)
+{
+       struct inode *old = insert_inode_locked5(inode, hashval, test, data);

+       if (old) {
+               iput(old);
+               return -EBUSY;
+       }
+
+       return 0;
+}
+EXPORT_SYMBOL(insert_inode_locked4);

Let me know if you want me to resurrect it.

Thanks,
Amir.



> Btw, these functions are eerily similar, and it'd be nice to reduce
> the number of them.
>
> Thanks,
> Miklos

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 9/9] ovl: use iget5_prealloc() to hash a newly created inode
  2018-05-18 15:11     ` Miklos Szeredi
  2018-05-18 15:14       ` Miklos Szeredi
  2018-05-18 15:36       ` Amir Goldstein
@ 2018-05-18 15:40       ` Al Viro
  2018-05-18 16:01         ` Miklos Szeredi
  2 siblings, 1 reply; 24+ messages in thread
From: Al Viro @ 2018-05-18 15:40 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Amir Goldstein, Vivek Goyal, overlayfs

On Fri, May 18, 2018 at 05:11:20PM +0200, Miklos Szeredi wrote:
> On Fri, May 18, 2018 at 5:03 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> > On Fri, May 18, 2018 at 11:29:37AM +0300, Amir Goldstein wrote:
> >> Currently, there is a small window where ovl_obtain_alias() can
> >> race with ovl_instantiate() and create two different overlay inodes
> >> with the same underlying real non-dir non-hardlink inode.
> >>
> >> The race requires an adversary to guess the file handle of the
> >> yet to be created upper inode and decode the guessed file handle
> >> after ovl_creat_real(), but before ovl_instantiate().
> >> This race does not affect overlay directory inodes, because those
> >> are decoded via ovl_lookup_real() and not with ovl_obtain_alias().
> >>
> >> This patch fixes the race, by using iget5_prealloc() to add a newly
> >> created inode to cache.
> >
> > Mind explaining what the hell is wrong with insert_inode_locked4()?
> 
> That it doesn't return the old inode if found.
> 
> Btw, these functions are eerily similar, and it'd be nice to reduce
> the number of them.

IMO you are using the wrong model; just create the underlying object,
then do what lookup would for overlayfs inode creation, then
d_splice_alias() + dput() of return value (if non-IS_ERR).

Local filesystems are not a good match for what you have there; something
like NFS fits better.  FWIW, NFS patch I've got here is

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 73f8b43d988c..8aff171a680a 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1625,6 +1625,7 @@ int nfs_instantiate(struct dentry *dentry, struct nfs_fh *fhandle,
 	struct dentry *parent = dget_parent(dentry);
 	struct inode *dir = d_inode(parent);
 	struct inode *inode;
+	struct dentry *d;
 	int error = -EACCES;
 
 	d_drop(dentry);
@@ -1645,10 +1646,12 @@ int nfs_instantiate(struct dentry *dentry, struct nfs_fh *fhandle,
 			goto out_error;
 	}
 	inode = nfs_fhget(dentry->d_sb, fhandle, fattr, label);
-	error = PTR_ERR(inode);
-	if (IS_ERR(inode))
+	d = d_splice_alias(inode, dentry);
+	if (IS_ERR(d)) {
+		error = PTR_ERR(d);
 		goto out_error;
-	d_add(dentry, inode);
+	}
+	dput(d);
 out:
 	dput(parent);
 	return 0;

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 9/9] ovl: use iget5_prealloc() to hash a newly created inode
  2018-05-18 15:36       ` Amir Goldstein
@ 2018-05-18 15:57         ` Miklos Szeredi
  2018-05-18 16:53           ` Amir Goldstein
  0 siblings, 1 reply; 24+ messages in thread
From: Miklos Szeredi @ 2018-05-18 15:57 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Al Viro, Vivek Goyal, overlayfs

On Fri, May 18, 2018 at 5:36 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Fri, May 18, 2018 at 6:11 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Fri, May 18, 2018 at 5:03 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>>> On Fri, May 18, 2018 at 11:29:37AM +0300, Amir Goldstein wrote:
>>>> Currently, there is a small window where ovl_obtain_alias() can
>>>> race with ovl_instantiate() and create two different overlay inodes
>>>> with the same underlying real non-dir non-hardlink inode.
>>>>
>>>> The race requires an adversary to guess the file handle of the
>>>> yet to be created upper inode and decode the guessed file handle
>>>> after ovl_creat_real(), but before ovl_instantiate().
>>>> This race does not affect overlay directory inodes, because those
>>>> are decoded via ovl_lookup_real() and not with ovl_obtain_alias().
>>>>
>>>> This patch fixes the race, by using iget5_prealloc() to add a newly
>>>> created inode to cache.
>>>
>>> Mind explaining what the hell is wrong with insert_inode_locked4()?
>>
>> That it doesn't return the old inode if found.
>>
>
> FYI, I have set a side a version I was working on before iget5_prealloc()
> that uses insert_inode_locked5 (runner up for ugliest function name):
>
> +int insert_inode_locked4(struct inode *inode, unsigned long hashval,
> +               int (*test)(struct inode *, void *), void *data)
> +{
> +       struct inode *old = insert_inode_locked5(inode, hashval, test, data);
>
> +       if (old) {
> +               iput(old);
> +               return -EBUSY;
> +       }
> +
> +       return 0;
> +}
> +EXPORT_SYMBOL(insert_inode_locked4);

Can do exact same thing with iget5_prealloc(), just need to move
inode_sb_list_add() out to iget5_locked() (meaning, overlayfs can
continue to use new_inode()/iput() instead of having to do
alloc/destroy_inode()).

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 9/9] ovl: use iget5_prealloc() to hash a newly created inode
  2018-05-18 15:40       ` Al Viro
@ 2018-05-18 16:01         ` Miklos Szeredi
  2018-05-22 15:21           ` Miklos Szeredi
  0 siblings, 1 reply; 24+ messages in thread
From: Miklos Szeredi @ 2018-05-18 16:01 UTC (permalink / raw)
  To: Al Viro; +Cc: Amir Goldstein, Vivek Goyal, overlayfs

On Fri, May 18, 2018 at 5:40 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Fri, May 18, 2018 at 05:11:20PM +0200, Miklos Szeredi wrote:
>> On Fri, May 18, 2018 at 5:03 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>> > On Fri, May 18, 2018 at 11:29:37AM +0300, Amir Goldstein wrote:
>> >> Currently, there is a small window where ovl_obtain_alias() can
>> >> race with ovl_instantiate() and create two different overlay inodes
>> >> with the same underlying real non-dir non-hardlink inode.
>> >>
>> >> The race requires an adversary to guess the file handle of the
>> >> yet to be created upper inode and decode the guessed file handle
>> >> after ovl_creat_real(), but before ovl_instantiate().
>> >> This race does not affect overlay directory inodes, because those
>> >> are decoded via ovl_lookup_real() and not with ovl_obtain_alias().
>> >>
>> >> This patch fixes the race, by using iget5_prealloc() to add a newly
>> >> created inode to cache.
>> >
>> > Mind explaining what the hell is wrong with insert_inode_locked4()?
>>
>> That it doesn't return the old inode if found.
>>
>> Btw, these functions are eerily similar, and it'd be nice to reduce
>> the number of them.
>
> IMO you are using the wrong model; just create the underlying object,
> then do what lookup would for overlayfs inode creation, then
> d_splice_alias() + dput() of return value (if non-IS_ERR).

That's fine.  Issue is with inode allocation.  I'd really like to
avoid dealing with failure after a successful create.  Doing the
iget5_locked() can fail with -ENOMEM, altough it's arguably a rare
occurrence.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 9/9] ovl: use iget5_prealloc() to hash a newly created inode
  2018-05-18 15:57         ` Miklos Szeredi
@ 2018-05-18 16:53           ` Amir Goldstein
  0 siblings, 0 replies; 24+ messages in thread
From: Amir Goldstein @ 2018-05-18 16:53 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, Vivek Goyal, overlayfs

On Fri, May 18, 2018 at 6:57 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Fri, May 18, 2018 at 5:36 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>> On Fri, May 18, 2018 at 6:11 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>> On Fri, May 18, 2018 at 5:03 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>>>> On Fri, May 18, 2018 at 11:29:37AM +0300, Amir Goldstein wrote:
>>>>> Currently, there is a small window where ovl_obtain_alias() can
>>>>> race with ovl_instantiate() and create two different overlay inodes
>>>>> with the same underlying real non-dir non-hardlink inode.
>>>>>
>>>>> The race requires an adversary to guess the file handle of the
>>>>> yet to be created upper inode and decode the guessed file handle
>>>>> after ovl_creat_real(), but before ovl_instantiate().
>>>>> This race does not affect overlay directory inodes, because those
>>>>> are decoded via ovl_lookup_real() and not with ovl_obtain_alias().
>>>>>
>>>>> This patch fixes the race, by using iget5_prealloc() to add a newly
>>>>> created inode to cache.
>>>>
>>>> Mind explaining what the hell is wrong with insert_inode_locked4()?
>>>
>>> That it doesn't return the old inode if found.
>>>
>>
>> FYI, I have set a side a version I was working on before iget5_prealloc()
>> that uses insert_inode_locked5 (runner up for ugliest function name):
>>
>> +int insert_inode_locked4(struct inode *inode, unsigned long hashval,
>> +               int (*test)(struct inode *, void *), void *data)
>> +{
>> +       struct inode *old = insert_inode_locked5(inode, hashval, test, data);
>>
>> +       if (old) {
>> +               iput(old);
>> +               return -EBUSY;
>> +       }
>> +
>> +       return 0;
>> +}
>> +EXPORT_SYMBOL(insert_inode_locked4);
>
> Can do exact same thing with iget5_prealloc(), just need to move
> inode_sb_list_add() out to iget5_locked() (meaning, overlayfs can
> continue to use new_inode()/iput() instead of having to do
> alloc/destroy_inode()).
>

Yeh, using alloc/destroy_inode() isn't pretty.
I'll let you untangle iget5_prealloc() and inode_sb_list.

Let me know if you have something to test.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 9/9] ovl: use iget5_prealloc() to hash a newly created inode
  2018-05-18 16:01         ` Miklos Szeredi
@ 2018-05-22 15:21           ` Miklos Szeredi
  2018-05-23  8:11             ` Amir Goldstein
  0 siblings, 1 reply; 24+ messages in thread
From: Miklos Szeredi @ 2018-05-22 15:21 UTC (permalink / raw)
  To: Al Viro; +Cc: Amir Goldstein, Vivek Goyal, overlayfs

On Fri, May 18, 2018 at 06:01:22PM +0200, Miklos Szeredi wrote:
> On Fri, May 18, 2018 at 5:40 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> > On Fri, May 18, 2018 at 05:11:20PM +0200, Miklos Szeredi wrote:
> >> On Fri, May 18, 2018 at 5:03 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:

> >> > Mind explaining what the hell is wrong with insert_inode_locked4()?
> >>
> >> That it doesn't return the old inode if found.
> >>
> >> Btw, these functions are eerily similar, and it'd be nice to reduce
> >> the number of them.

How about this one?

Same thing could be done with iget_locked()/insert_inode_locked(), but I also
wonder how much optimization do the "fast" versions get for lack of a get()
function...  Shouldn't we use the generic ones for the all cases?

Thanks,
Miklos
---

From: Miklos Szeredi <miklos@szeredi.hu>
Date: Thu, 17 May 2018 10:53:05 +0200
Subject: vfs: factor out inode_insert5()

Split out common helper for race free insertion of an already allocated
inode into the cache.  Use this from iget5_locked() and
insert_inode_locked4().  Make iget5_locked() use new_inode()/iput() instead
of alloc_inode()/destroy_inode() directly.

Also export to modules for use by filesystems which want to preallocate an
inode before file/directory creation.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/inode.c         |  164 ++++++++++++++++++++++++-----------------------------
 include/linux/fs.h |    4 +
 2 files changed, 79 insertions(+), 89 deletions(-)

--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1003,6 +1003,70 @@ void unlock_two_nondirectories(struct in
 EXPORT_SYMBOL(unlock_two_nondirectories);
 
 /**
+ * inode_insert5 - obtain an inode from a mounted file system
+ * @inode:	pre-allocated inode to use for insert to cache
+ * @hashval:	hash value (usually inode number) to get
+ * @test:	callback used for comparisons between inodes
+ * @set:	callback used to initialize a new struct inode
+ * @data:	opaque data pointer to pass to @test and @set
+ *
+ * Search for the inode specified by @hashval and @data in the inode cache,
+ * and if present it is return it with an increased reference count. This is
+ * a variant of iget5_locked() for callers that don't want to fail on memory
+ * allocation of inode.
+ *
+ * If the inode is not in cache, insert the pre-allocated inode to cache and
+ * return it locked, hashed, and with the I_NEW flag set. The file system gets
+ * to fill it in before unlocking it via unlock_new_inode().
+ *
+ * Note both @test and @set are called with the inode_hash_lock held, so can't
+ * sleep.
+ */
+struct inode *inode_insert5(struct inode *inode, unsigned long hashval,
+			    int (*test)(struct inode *, void *),
+			    int (*set)(struct inode *, void *), void *data)
+{
+	struct hlist_head *head = inode_hashtable + hash(inode->i_sb, hashval);
+	struct inode *old;
+
+again:
+	spin_lock(&inode_hash_lock);
+	old = find_inode(inode->i_sb, head, test, data);
+	if (unlikely(old)) {
+		/*
+		 * Uhhuh, somebody else created the same inode under us.
+		 * Use the old inode instead of the preallocated one.
+		 */
+		spin_unlock(&inode_hash_lock);
+		wait_on_inode(old);
+		if (unlikely(inode_unhashed(old))) {
+			iput(old);
+			goto again;
+		}
+		return old;
+	}
+
+	if (set && unlikely(set(inode, data))) {
+		inode = NULL;
+		goto unlock;
+	}
+
+	/*
+	 * Return the locked inode with I_NEW set, the
+	 * caller is responsible for filling in the contents
+	 */
+	spin_lock(&inode->i_lock);
+	inode->i_state |= I_NEW;
+	hlist_add_head(&inode->i_hash, head);
+	spin_unlock(&inode->i_lock);
+unlock:
+	spin_unlock(&inode_hash_lock);
+
+	return inode;
+}
+EXPORT_SYMBOL(inode_insert5);
+
+/**
  * iget5_locked - obtain an inode from a mounted file system
  * @sb:		super block of file system
  * @hashval:	hash value (usually inode number) to get
@@ -1026,66 +1090,18 @@ struct inode *iget5_locked(struct super_
 		int (*test)(struct inode *, void *),
 		int (*set)(struct inode *, void *), void *data)
 {
-	struct hlist_head *head = inode_hashtable + hash(sb, hashval);
-	struct inode *inode;
-again:
-	spin_lock(&inode_hash_lock);
-	inode = find_inode(sb, head, test, data);
-	spin_unlock(&inode_hash_lock);
-
-	if (inode) {
-		wait_on_inode(inode);
-		if (unlikely(inode_unhashed(inode))) {
-			iput(inode);
-			goto again;
-		}
-		return inode;
-	}
+	struct inode *inode = ilookup5(sb, hashval, test, data);
 
-	inode = alloc_inode(sb);
-	if (inode) {
-		struct inode *old;
+	if (!inode) {
+		struct inode *new = new_inode(sb);
 
-		spin_lock(&inode_hash_lock);
-		/* We released the lock, so.. */
-		old = find_inode(sb, head, test, data);
-		if (!old) {
-			if (set(inode, data))
-				goto set_failed;
-
-			spin_lock(&inode->i_lock);
-			inode->i_state = I_NEW;
-			hlist_add_head(&inode->i_hash, head);
-			spin_unlock(&inode->i_lock);
-			inode_sb_list_add(inode);
-			spin_unlock(&inode_hash_lock);
-
-			/* Return the locked inode with I_NEW set, the
-			 * caller is responsible for filling in the contents
-			 */
-			return inode;
-		}
-
-		/*
-		 * Uhhuh, somebody else created the same inode under
-		 * us. Use the old inode instead of the one we just
-		 * allocated.
-		 */
-		spin_unlock(&inode_hash_lock);
-		destroy_inode(inode);
-		inode = old;
-		wait_on_inode(inode);
-		if (unlikely(inode_unhashed(inode))) {
-			iput(inode);
-			goto again;
+		if (new) {
+			inode = inode_insert5(new, hashval, test, set, data);
+			if (unlikely(inode != new))
+				iput(new);
 		}
 	}
 	return inode;
-
-set_failed:
-	spin_unlock(&inode_hash_lock);
-	destroy_inode(inode);
-	return NULL;
 }
 EXPORT_SYMBOL(iget5_locked);
 
@@ -1426,43 +1442,13 @@ EXPORT_SYMBOL(insert_inode_locked);
 int insert_inode_locked4(struct inode *inode, unsigned long hashval,
 		int (*test)(struct inode *, void *), void *data)
 {
-	struct super_block *sb = inode->i_sb;
-	struct hlist_head *head = inode_hashtable + hash(sb, hashval);
+	struct inode *old = inode_insert5(inode, hashval, test, NULL, data);
 
-	while (1) {
-		struct inode *old = NULL;
-
-		spin_lock(&inode_hash_lock);
-		hlist_for_each_entry(old, head, i_hash) {
-			if (old->i_sb != sb)
-				continue;
-			if (!test(old, data))
-				continue;
-			spin_lock(&old->i_lock);
-			if (old->i_state & (I_FREEING|I_WILL_FREE)) {
-				spin_unlock(&old->i_lock);
-				continue;
-			}
-			break;
-		}
-		if (likely(!old)) {
-			spin_lock(&inode->i_lock);
-			inode->i_state |= I_NEW;
-			hlist_add_head(&inode->i_hash, head);
-			spin_unlock(&inode->i_lock);
-			spin_unlock(&inode_hash_lock);
-			return 0;
-		}
-		__iget(old);
-		spin_unlock(&old->i_lock);
-		spin_unlock(&inode_hash_lock);
-		wait_on_inode(old);
-		if (unlikely(!inode_unhashed(old))) {
-			iput(old);
-			return -EBUSY;
-		}
+	if (old != inode) {
 		iput(old);
+		return -EBUSY;
 	}
+	return 0;
 }
 EXPORT_SYMBOL(insert_inode_locked4);
 
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2879,6 +2879,10 @@ extern struct inode *ilookup5(struct sup
 		int (*test)(struct inode *, void *), void *data);
 extern struct inode *ilookup(struct super_block *sb, unsigned long ino);
 
+extern struct inode *inode_insert5(struct inode *inode, unsigned long hashval,
+		int (*test)(struct inode *, void *),
+		int (*set)(struct inode *, void *),
+		void *data);
 extern struct inode * iget5_locked(struct super_block *, unsigned long, int (*test)(struct inode *, void *), int (*set)(struct inode *, void *), void *);
 extern struct inode * iget_locked(struct super_block *, unsigned long);
 extern struct inode *find_inode_nowait(struct super_block *,

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 9/9] ovl: use iget5_prealloc() to hash a newly created inode
  2018-05-22 15:21           ` Miklos Szeredi
@ 2018-05-23  8:11             ` Amir Goldstein
  0 siblings, 0 replies; 24+ messages in thread
From: Amir Goldstein @ 2018-05-23  8:11 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, Vivek Goyal, overlayfs

On Tue, May 22, 2018 at 6:21 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Fri, May 18, 2018 at 06:01:22PM +0200, Miklos Szeredi wrote:
>> On Fri, May 18, 2018 at 5:40 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>> > On Fri, May 18, 2018 at 05:11:20PM +0200, Miklos Szeredi wrote:
>> >> On Fri, May 18, 2018 at 5:03 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
>> >> > Mind explaining what the hell is wrong with insert_inode_locked4()?
>> >>
>> >> That it doesn't return the old inode if found.
>> >>
>> >> Btw, these functions are eerily similar, and it'd be nice to reduce
>> >> the number of them.
>
> How about this one?
>

Pushed tested branch ovl-fixed.
For sanity check of iget5_locked() also ran xfstests -g quick on btrfs.

Thanks,
Amir.


> ---
>
> From: Miklos Szeredi <miklos@szeredi.hu>
> Date: Thu, 17 May 2018 10:53:05 +0200
> Subject: vfs: factor out inode_insert5()
>
> Split out common helper for race free insertion of an already allocated
> inode into the cache.  Use this from iget5_locked() and
> insert_inode_locked4().  Make iget5_locked() use new_inode()/iput() instead
> of alloc_inode()/destroy_inode() directly.
>
> Also export to modules for use by filesystems which want to preallocate an
> inode before file/directory creation.
>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> ---
>  fs/inode.c         |  164 ++++++++++++++++++++++++-----------------------------
>  include/linux/fs.h |    4 +
>  2 files changed, 79 insertions(+), 89 deletions(-)
>
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -1003,6 +1003,70 @@ void unlock_two_nondirectories(struct in
>  EXPORT_SYMBOL(unlock_two_nondirectories);
>
>  /**
> + * inode_insert5 - obtain an inode from a mounted file system
> + * @inode:     pre-allocated inode to use for insert to cache
> + * @hashval:   hash value (usually inode number) to get
> + * @test:      callback used for comparisons between inodes
> + * @set:       callback used to initialize a new struct inode
> + * @data:      opaque data pointer to pass to @test and @set
> + *
> + * Search for the inode specified by @hashval and @data in the inode cache,
> + * and if present it is return it with an increased reference count. This is
> + * a variant of iget5_locked() for callers that don't want to fail on memory
> + * allocation of inode.
> + *
> + * If the inode is not in cache, insert the pre-allocated inode to cache and
> + * return it locked, hashed, and with the I_NEW flag set. The file system gets
> + * to fill it in before unlocking it via unlock_new_inode().
> + *
> + * Note both @test and @set are called with the inode_hash_lock held, so can't
> + * sleep.
> + */
> +struct inode *inode_insert5(struct inode *inode, unsigned long hashval,
> +                           int (*test)(struct inode *, void *),
> +                           int (*set)(struct inode *, void *), void *data)
> +{
> +       struct hlist_head *head = inode_hashtable + hash(inode->i_sb, hashval);
> +       struct inode *old;
> +
> +again:
> +       spin_lock(&inode_hash_lock);
> +       old = find_inode(inode->i_sb, head, test, data);
> +       if (unlikely(old)) {
> +               /*
> +                * Uhhuh, somebody else created the same inode under us.
> +                * Use the old inode instead of the preallocated one.
> +                */
> +               spin_unlock(&inode_hash_lock);
> +               wait_on_inode(old);
> +               if (unlikely(inode_unhashed(old))) {
> +                       iput(old);
> +                       goto again;
> +               }
> +               return old;
> +       }
> +
> +       if (set && unlikely(set(inode, data))) {
> +               inode = NULL;
> +               goto unlock;
> +       }
> +
> +       /*
> +        * Return the locked inode with I_NEW set, the
> +        * caller is responsible for filling in the contents
> +        */
> +       spin_lock(&inode->i_lock);
> +       inode->i_state |= I_NEW;
> +       hlist_add_head(&inode->i_hash, head);
> +       spin_unlock(&inode->i_lock);
> +unlock:
> +       spin_unlock(&inode_hash_lock);
> +
> +       return inode;
> +}
> +EXPORT_SYMBOL(inode_insert5);
> +
> +/**
>   * iget5_locked - obtain an inode from a mounted file system
>   * @sb:                super block of file system
>   * @hashval:   hash value (usually inode number) to get
> @@ -1026,66 +1090,18 @@ struct inode *iget5_locked(struct super_
>                 int (*test)(struct inode *, void *),
>                 int (*set)(struct inode *, void *), void *data)
>  {
> -       struct hlist_head *head = inode_hashtable + hash(sb, hashval);
> -       struct inode *inode;
> -again:
> -       spin_lock(&inode_hash_lock);
> -       inode = find_inode(sb, head, test, data);
> -       spin_unlock(&inode_hash_lock);
> -
> -       if (inode) {
> -               wait_on_inode(inode);
> -               if (unlikely(inode_unhashed(inode))) {
> -                       iput(inode);
> -                       goto again;
> -               }
> -               return inode;
> -       }
> +       struct inode *inode = ilookup5(sb, hashval, test, data);
>
> -       inode = alloc_inode(sb);
> -       if (inode) {
> -               struct inode *old;
> +       if (!inode) {
> +               struct inode *new = new_inode(sb);
>
> -               spin_lock(&inode_hash_lock);
> -               /* We released the lock, so.. */
> -               old = find_inode(sb, head, test, data);
> -               if (!old) {
> -                       if (set(inode, data))
> -                               goto set_failed;
> -
> -                       spin_lock(&inode->i_lock);
> -                       inode->i_state = I_NEW;
> -                       hlist_add_head(&inode->i_hash, head);
> -                       spin_unlock(&inode->i_lock);
> -                       inode_sb_list_add(inode);
> -                       spin_unlock(&inode_hash_lock);
> -
> -                       /* Return the locked inode with I_NEW set, the
> -                        * caller is responsible for filling in the contents
> -                        */
> -                       return inode;
> -               }
> -
> -               /*
> -                * Uhhuh, somebody else created the same inode under
> -                * us. Use the old inode instead of the one we just
> -                * allocated.
> -                */
> -               spin_unlock(&inode_hash_lock);
> -               destroy_inode(inode);
> -               inode = old;
> -               wait_on_inode(inode);
> -               if (unlikely(inode_unhashed(inode))) {
> -                       iput(inode);
> -                       goto again;
> +               if (new) {
> +                       inode = inode_insert5(new, hashval, test, set, data);
> +                       if (unlikely(inode != new))
> +                               iput(new);
>                 }
>         }
>         return inode;
> -
> -set_failed:
> -       spin_unlock(&inode_hash_lock);
> -       destroy_inode(inode);
> -       return NULL;
>  }
>  EXPORT_SYMBOL(iget5_locked);
>
> @@ -1426,43 +1442,13 @@ EXPORT_SYMBOL(insert_inode_locked);
>  int insert_inode_locked4(struct inode *inode, unsigned long hashval,
>                 int (*test)(struct inode *, void *), void *data)
>  {
> -       struct super_block *sb = inode->i_sb;
> -       struct hlist_head *head = inode_hashtable + hash(sb, hashval);
> +       struct inode *old = inode_insert5(inode, hashval, test, NULL, data);
>
> -       while (1) {
> -               struct inode *old = NULL;
> -
> -               spin_lock(&inode_hash_lock);
> -               hlist_for_each_entry(old, head, i_hash) {
> -                       if (old->i_sb != sb)
> -                               continue;
> -                       if (!test(old, data))
> -                               continue;
> -                       spin_lock(&old->i_lock);
> -                       if (old->i_state & (I_FREEING|I_WILL_FREE)) {
> -                               spin_unlock(&old->i_lock);
> -                               continue;
> -                       }
> -                       break;
> -               }
> -               if (likely(!old)) {
> -                       spin_lock(&inode->i_lock);
> -                       inode->i_state |= I_NEW;
> -                       hlist_add_head(&inode->i_hash, head);
> -                       spin_unlock(&inode->i_lock);
> -                       spin_unlock(&inode_hash_lock);
> -                       return 0;
> -               }
> -               __iget(old);
> -               spin_unlock(&old->i_lock);
> -               spin_unlock(&inode_hash_lock);
> -               wait_on_inode(old);
> -               if (unlikely(!inode_unhashed(old))) {
> -                       iput(old);
> -                       return -EBUSY;
> -               }
> +       if (old != inode) {
>                 iput(old);
> +               return -EBUSY;
>         }
> +       return 0;
>  }
>  EXPORT_SYMBOL(insert_inode_locked4);
>
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2879,6 +2879,10 @@ extern struct inode *ilookup5(struct sup
>                 int (*test)(struct inode *, void *), void *data);
>  extern struct inode *ilookup(struct super_block *sb, unsigned long ino);
>
> +extern struct inode *inode_insert5(struct inode *inode, unsigned long hashval,
> +               int (*test)(struct inode *, void *),
> +               int (*set)(struct inode *, void *),
> +               void *data);
>  extern struct inode * iget5_locked(struct super_block *, unsigned long, int (*test)(struct inode *, void *), int (*set)(struct inode *, void *), void *);
>  extern struct inode * iget_locked(struct super_block *, unsigned long);
>  extern struct inode *find_inode_nowait(struct super_block *,

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2018-05-23  8:11 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-18  8:29 [PATCH v4 0/9] Overlayfs create object related fixes Amir Goldstein
2018-05-18  8:29 ` [PATCH v4 1/9] ovl: remove WARN_ON() real inode attributes mismatch Amir Goldstein
2018-05-18  8:29 ` [PATCH v4 2/9] ovl: strip debug argument from ovl_do_ helpers Amir Goldstein
2018-05-18  8:29 ` [PATCH v4 3/9] ovl: struct cattr cleanups Amir Goldstein
2018-05-18  8:29 ` [PATCH v4 4/9] ovl: create helper ovl_create_temp() Amir Goldstein
2018-05-18  8:29 ` [PATCH v4 5/9] ovl: make ovl_create_real() cope with vfs_mkdir() safely Amir Goldstein
2018-05-18  8:29 ` [PATCH v4 6/9] vfs: factor out iget5_prealloc() Amir Goldstein
2018-05-18  8:29 ` [PATCH v4 7/9] vfs: export alloc_inode() and destroy_inode() Amir Goldstein
2018-05-18 15:02   ` Al Viro
2018-05-18  8:29 ` [PATCH v4 8/9] ovl: Pass argument to ovl_get_inode() in a structure Amir Goldstein
2018-05-18  8:29 ` [PATCH v4 9/9] ovl: use iget5_prealloc() to hash a newly created inode Amir Goldstein
2018-05-18 10:14   ` Miklos Szeredi
2018-05-18 14:08     ` Amir Goldstein
2018-05-18 14:24       ` Miklos Szeredi
2018-05-18 15:03   ` Al Viro
2018-05-18 15:11     ` Miklos Szeredi
2018-05-18 15:14       ` Miklos Szeredi
2018-05-18 15:36       ` Amir Goldstein
2018-05-18 15:57         ` Miklos Szeredi
2018-05-18 16:53           ` Amir Goldstein
2018-05-18 15:40       ` Al Viro
2018-05-18 16:01         ` Miklos Szeredi
2018-05-22 15:21           ` Miklos Szeredi
2018-05-23  8:11             ` Amir Goldstein

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.