All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/12] btrfs: inode creation cleanups and fixes
@ 2022-03-03 23:18 Omar Sandoval
  2022-03-03 23:18 ` [PATCH 01/12] btrfs: reserve correct number of items for unlink and rmdir Omar Sandoval
                   ` (12 more replies)
  0 siblings, 13 replies; 18+ messages in thread
From: Omar Sandoval @ 2022-03-03 23:18 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

This series contains several cleanups and fixes for our inode creation
codepaths. The main motivation for this is preparation for fscrypt
support (in particular, setting up the fscrypt context at inode creation
time). But, it also reduces a lot of code duplication and fixes some
minor bugs, so it's worth getting in ahead of time.

Patch 12 is the main change, which unifies and simplifies all of our
inode creation codepaths. Patches 1-11 are small cleanups that I was
able to peel off of the big patch.

Thanks!

Omar Sandoval (12):
  btrfs: reserve correct number of items for unlink and rmdir
  btrfs: reserve correct number of items for rename
  btrfs: get rid of btrfs_add_nondir()
  btrfs: remove unnecessary btrfs_i_size_write(0) calls
  btrfs: remove unnecessary inode_set_bytes(0) call
  btrfs: remove unnecessary set_nlink() in btrfs_create_subvol_root()
  btrfs: remove unused mnt_userns parameter from __btrfs_set_acl
  btrfs: remove redundant name and name_len parameters to create_subvol
  btrfs: don't pass parent objectid to btrfs_new_inode() explicitly
  btrfs: move btrfs_get_free_objectid() call into btrfs_new_inode()
  btrfs: set inode flags earlier in btrfs_new_inode()
  btrfs: rework inode creation to fix several issues

 fs/btrfs/acl.c   |  39 +-
 fs/btrfs/ctree.h |  39 +-
 fs/btrfs/inode.c | 944 +++++++++++++++++++++++------------------------
 fs/btrfs/ioctl.c | 175 ++++-----
 fs/btrfs/props.c |  40 +-
 fs/btrfs/props.h |   4 -
 6 files changed, 581 insertions(+), 660 deletions(-)

-- 
2.35.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 01/12] btrfs: reserve correct number of items for unlink and rmdir
  2022-03-03 23:18 [PATCH 00/12] btrfs: inode creation cleanups and fixes Omar Sandoval
@ 2022-03-03 23:18 ` Omar Sandoval
  2022-03-03 23:18 ` [PATCH 02/12] btrfs: reserve correct number of items for rename Omar Sandoval
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 18+ messages in thread
From: Omar Sandoval @ 2022-03-03 23:18 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

__btrfs_unlink_inode() calls btrfs_update_inode() on the parent
directory in order to update its size and sequence number. Make sure we
account for it.

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/inode.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 2e7143ff5523..2fb8aa36a9ac 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4212,8 +4212,9 @@ static struct btrfs_trans_handle *__unlink_start_trans(struct inode *dir)
 	 * 1 for the dir index
 	 * 1 for the inode ref
 	 * 1 for the inode
+	 * 1 for the parent inode
 	 */
-	return btrfs_start_transaction_fallback_global_rsv(root, 5);
+	return btrfs_start_transaction_fallback_global_rsv(root, 6);
 }
 
 static int btrfs_unlink(struct inode *dir, struct dentry *dentry)
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 02/12] btrfs: reserve correct number of items for rename
  2022-03-03 23:18 [PATCH 00/12] btrfs: inode creation cleanups and fixes Omar Sandoval
  2022-03-03 23:18 ` [PATCH 01/12] btrfs: reserve correct number of items for unlink and rmdir Omar Sandoval
@ 2022-03-03 23:18 ` Omar Sandoval
  2022-03-03 23:18 ` [PATCH 03/12] btrfs: get rid of btrfs_add_nondir() Omar Sandoval
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 18+ messages in thread
From: Omar Sandoval @ 2022-03-03 23:18 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

btrfs_rename() and btrfs_rename_exchange() don't account for enough
items. Replace the incorrect explanations with a specific breakdown of
the number of items and account them accurately.

Note that this glosses over RENAME_WHITEOUT because the next commit is
going to rework that, too.

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/inode.c | 86 +++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 67 insertions(+), 19 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 2fb8aa36a9ac..be51630160f5 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9058,6 +9058,7 @@ static int btrfs_rename_exchange(struct inode *old_dir,
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(old_dir->i_sb);
 	struct btrfs_trans_handle *trans;
+	unsigned int trans_num_items;
 	struct btrfs_root *root = BTRFS_I(old_dir)->root;
 	struct btrfs_root *dest = BTRFS_I(new_dir)->root;
 	struct inode *new_inode = new_dentry->d_inode;
@@ -9089,14 +9090,37 @@ static int btrfs_rename_exchange(struct inode *old_dir,
 		down_read(&fs_info->subvol_sem);
 
 	/*
-	 * We want to reserve the absolute worst case amount of items.  So if
-	 * both inodes are subvols and we need to unlink them then that would
-	 * require 4 item modifications, but if they are both normal inodes it
-	 * would require 5 item modifications, so we'll assume their normal
-	 * inodes.  So 5 * 2 is 10, plus 2 for the new links, so 12 total items
-	 * should cover the worst case number of items we'll modify.
+	 * For each inode:
+	 * 1 to remove old dir item
+	 * 1 to remove old dir index
+	 * 1 to add new dir item
+	 * 1 to add new dir index
+	 * 1 to update parent inode
+	 *
+	 * If the parents are the same, we only need to account for one
 	 */
-	trans = btrfs_start_transaction(root, 12);
+	trans_num_items = old_dir == new_dir ? 9 : 10;
+	if (old_ino == BTRFS_FIRST_FREE_OBJECTID) {
+		/*
+		 * 1 to remove old root ref
+		 * 1 to remove old root backref
+		 * 1 to add new root ref
+		 * 1 to add new root backref
+		 */
+		trans_num_items += 4;
+	} else {
+		/*
+		 * 1 to update inode item
+		 * 1 to remove old inode ref
+		 * 1 to add new inode ref
+		 */
+		trans_num_items += 3;
+	}
+	if (new_ino == BTRFS_FIRST_FREE_OBJECTID)
+		trans_num_items += 4;
+	else
+		trans_num_items += 3;
+	trans = btrfs_start_transaction(root, trans_num_items);
 	if (IS_ERR(trans)) {
 		ret = PTR_ERR(trans);
 		goto out_notrans;
@@ -9375,21 +9399,45 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
 	if (new_inode && S_ISREG(old_inode->i_mode) && new_inode->i_size)
 		filemap_flush(old_inode->i_mapping);
 
-	/* close the racy window with snapshot create/destroy ioctl */
-	if (old_ino == BTRFS_FIRST_FREE_OBJECTID)
+	if (old_ino == BTRFS_FIRST_FREE_OBJECTID) {
+		/* close the racy window with snapshot create/destroy ioctl */
 		down_read(&fs_info->subvol_sem);
+		/*
+		 * 1 to remove old root ref
+		 * 1 to remove old root backref
+		 * 1 to add new root ref
+		 * 1 to add new root backref
+		 */
+		trans_num_items = 4;
+	} else {
+		/*
+		 * 1 to update inode
+		 * 1 to remove old inode ref
+		 * 1 to add new inode ref
+		 */
+		trans_num_items = 3;
+	}
 	/*
-	 * We want to reserve the absolute worst case amount of items.  So if
-	 * both inodes are subvols and we need to unlink them then that would
-	 * require 4 item modifications, but if they are both normal inodes it
-	 * would require 5 item modifications, so we'll assume they are normal
-	 * inodes.  So 5 * 2 is 10, plus 1 for the new link, so 11 total items
-	 * should cover the worst case number of items we'll modify.
-	 * If our rename has the whiteout flag, we need more 5 units for the
-	 * new inode (1 inode item, 1 inode ref, 2 dir items and 1 xattr item
-	 * when selinux is enabled).
+	 * 1 to remove old dir item
+	 * 1 to remove old dir index
+	 * 1 to update old parent inode
+	 * 1 to add new dir item
+	 * 1 to add new dir index
+	 * 1 to update new parent inode (if it's not the same as the old parent)
 	 */
-	trans_num_items = 11;
+	trans_num_items += 6;
+	if (new_dir != old_dir)
+		trans_num_items++;
+	if (new_inode) {
+		/*
+		 * 1 to update inode
+		 * 1 to remove inode ref
+		 * 1 to remove dir item
+		 * 1 to remove dir index
+		 * 1 to possibly add orphan item
+		 */
+		trans_num_items += 5;
+	}
 	if (flags & RENAME_WHITEOUT)
 		trans_num_items += 5;
 	trans = btrfs_start_transaction(root, trans_num_items);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 03/12] btrfs: get rid of btrfs_add_nondir()
  2022-03-03 23:18 [PATCH 00/12] btrfs: inode creation cleanups and fixes Omar Sandoval
  2022-03-03 23:18 ` [PATCH 01/12] btrfs: reserve correct number of items for unlink and rmdir Omar Sandoval
  2022-03-03 23:18 ` [PATCH 02/12] btrfs: reserve correct number of items for rename Omar Sandoval
@ 2022-03-03 23:18 ` Omar Sandoval
  2022-03-03 23:18 ` [PATCH 04/12] btrfs: remove unnecessary btrfs_i_size_write(0) calls Omar Sandoval
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 18+ messages in thread
From: Omar Sandoval @ 2022-03-03 23:18 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

This is a trivial wrapper around btrfs_add_link(). The only thing it
does other than moving arguments around is translating a > 0 return
value to -EEXIST. As far as I can tell, btrfs_add_link() won't return >
0 (and if it did, the existing callsites in, e.g., btrfs_mkdir() would
be broken). The check itself dates back to commit 2c90e5d65842 ("Btrfs:
still corruption hunting"), so it's probably left over from debugging.
Let's just get rid of btrfs_add_nondir().

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/inode.c | 33 +++++++++++----------------------
 1 file changed, 11 insertions(+), 22 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index be51630160f5..9c838bdd51cc 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6355,18 +6355,6 @@ int btrfs_add_link(struct btrfs_trans_handle *trans,
 	return ret;
 }
 
-static int btrfs_add_nondir(struct btrfs_trans_handle *trans,
-			    struct btrfs_inode *dir, struct dentry *dentry,
-			    struct btrfs_inode *inode, int backref, u64 index)
-{
-	int err = btrfs_add_link(trans, dir, inode,
-				 dentry->d_name.name, dentry->d_name.len,
-				 backref, index);
-	if (err > 0)
-		err = -EEXIST;
-	return err;
-}
-
 static int btrfs_mknod(struct user_namespace *mnt_userns, struct inode *dir,
 		       struct dentry *dentry, umode_t mode, dev_t rdev)
 {
@@ -6413,8 +6401,8 @@ static int btrfs_mknod(struct user_namespace *mnt_userns, struct inode *dir,
 	if (err)
 		goto out_unlock;
 
-	err = btrfs_add_nondir(trans, BTRFS_I(dir), dentry, BTRFS_I(inode),
-			0, index);
+	err = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
+			     dentry->d_name.name, dentry->d_name.len, 0, index);
 	if (err)
 		goto out_unlock;
 
@@ -6481,8 +6469,8 @@ static int btrfs_create(struct user_namespace *mnt_userns, struct inode *dir,
 	if (err)
 		goto out_unlock;
 
-	err = btrfs_add_nondir(trans, BTRFS_I(dir), dentry, BTRFS_I(inode),
-			0, index);
+	err = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
+			     dentry->d_name.name, dentry->d_name.len, 0, index);
 	if (err)
 		goto out_unlock;
 
@@ -6541,8 +6529,8 @@ static int btrfs_link(struct dentry *old_dentry, struct inode *dir,
 	ihold(inode);
 	set_bit(BTRFS_INODE_COPY_EVERYTHING, &BTRFS_I(inode)->runtime_flags);
 
-	err = btrfs_add_nondir(trans, BTRFS_I(dir), dentry, BTRFS_I(inode),
-			1, index);
+	err = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
+			     dentry->d_name.name, dentry->d_name.len, 1, index);
 
 	if (err) {
 		drop_inode = 1;
@@ -9324,8 +9312,8 @@ static int btrfs_whiteout_for_rename(struct btrfs_trans_handle *trans,
 	if (ret)
 		goto out;
 
-	ret = btrfs_add_nondir(trans, BTRFS_I(dir), dentry,
-				BTRFS_I(inode), 0, index);
+	ret = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
+			     dentry->d_name.name, dentry->d_name.len, 0, index);
 	if (ret)
 		goto out;
 
@@ -9863,8 +9851,9 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
 	 * elsewhere above.
 	 */
 	if (!err)
-		err = btrfs_add_nondir(trans, BTRFS_I(dir), dentry,
-				BTRFS_I(inode), 0, index);
+		err = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
+				     dentry->d_name.name, dentry->d_name.len, 0,
+				     index);
 	if (err)
 		goto out_unlock;
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 04/12] btrfs: remove unnecessary btrfs_i_size_write(0) calls
  2022-03-03 23:18 [PATCH 00/12] btrfs: inode creation cleanups and fixes Omar Sandoval
                   ` (2 preceding siblings ...)
  2022-03-03 23:18 ` [PATCH 03/12] btrfs: get rid of btrfs_add_nondir() Omar Sandoval
@ 2022-03-03 23:18 ` Omar Sandoval
  2022-03-03 23:18 ` [PATCH 05/12] btrfs: remove unnecessary inode_set_bytes(0) call Omar Sandoval
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 18+ messages in thread
From: Omar Sandoval @ 2022-03-03 23:18 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

btrfs_new_inode() always returns an inode with i_size and disk_i_size
set to 0 (via inode_init_always() and btrfs_alloc_inode(),
respectively). Remove the unnecessary calls to btrfs_i_size_write() in
btrfs_mkdir() and btrfs_create_subvol_root().

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/inode.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 9c838bdd51cc..244e8d6ed5e4 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6606,7 +6606,6 @@ static int btrfs_mkdir(struct user_namespace *mnt_userns, struct inode *dir,
 	if (err)
 		goto out_fail;
 
-	btrfs_i_size_write(BTRFS_I(inode), 0);
 	err = btrfs_update_inode(trans, root, BTRFS_I(inode));
 	if (err)
 		goto out_fail;
@@ -8787,7 +8786,6 @@ int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
 	inode->i_fop = &btrfs_dir_file_operations;
 
 	set_nlink(inode, 1);
-	btrfs_i_size_write(BTRFS_I(inode), 0);
 	unlock_new_inode(inode);
 
 	err = btrfs_subvol_inherit_props(trans, new_root, parent_root);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 05/12] btrfs: remove unnecessary inode_set_bytes(0) call
  2022-03-03 23:18 [PATCH 00/12] btrfs: inode creation cleanups and fixes Omar Sandoval
                   ` (3 preceding siblings ...)
  2022-03-03 23:18 ` [PATCH 04/12] btrfs: remove unnecessary btrfs_i_size_write(0) calls Omar Sandoval
@ 2022-03-03 23:18 ` Omar Sandoval
  2022-03-03 23:18 ` [PATCH 06/12] btrfs: remove unnecessary set_nlink() in btrfs_create_subvol_root() Omar Sandoval
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 18+ messages in thread
From: Omar Sandoval @ 2022-03-03 23:18 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

new_inode() always returns an inode with i_blocks and i_bytes set to 0
(via inode_init_always()). Remove the unnecessary call to
inode_set_bytes() in btrfs_new_inode().

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/inode.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 244e8d6ed5e4..b7d54b0b2fb5 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6207,7 +6207,6 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 		goto fail_unlock;
 
 	inode_init_owner(mnt_userns, inode, dir, mode);
-	inode_set_bytes(inode, 0);
 
 	inode->i_mtime = current_time(inode);
 	inode->i_atime = inode->i_mtime;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 06/12] btrfs: remove unnecessary set_nlink() in btrfs_create_subvol_root()
  2022-03-03 23:18 [PATCH 00/12] btrfs: inode creation cleanups and fixes Omar Sandoval
                   ` (4 preceding siblings ...)
  2022-03-03 23:18 ` [PATCH 05/12] btrfs: remove unnecessary inode_set_bytes(0) call Omar Sandoval
@ 2022-03-03 23:18 ` Omar Sandoval
  2022-03-03 23:18 ` [PATCH 07/12] btrfs: remove unused mnt_userns parameter from __btrfs_set_acl Omar Sandoval
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 18+ messages in thread
From: Omar Sandoval @ 2022-03-03 23:18 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

btrfs_new_inode() already returns an inode with nlink set to 1 (via
inode_init_always()). Get rid of the unnecessary set.

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/inode.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index b7d54b0b2fb5..a9dabe9e5500 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8784,7 +8784,6 @@ int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
 	inode->i_op = &btrfs_dir_inode_operations;
 	inode->i_fop = &btrfs_dir_file_operations;
 
-	set_nlink(inode, 1);
 	unlock_new_inode(inode);
 
 	err = btrfs_subvol_inherit_props(trans, new_root, parent_root);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 07/12] btrfs: remove unused mnt_userns parameter from __btrfs_set_acl
  2022-03-03 23:18 [PATCH 00/12] btrfs: inode creation cleanups and fixes Omar Sandoval
                   ` (5 preceding siblings ...)
  2022-03-03 23:18 ` [PATCH 06/12] btrfs: remove unnecessary set_nlink() in btrfs_create_subvol_root() Omar Sandoval
@ 2022-03-03 23:18 ` Omar Sandoval
  2022-03-03 23:18 ` [PATCH 08/12] btrfs: remove redundant name and name_len parameters to create_subvol Omar Sandoval
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 18+ messages in thread
From: Omar Sandoval @ 2022-03-03 23:18 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

Commit 4a8b34afa9c9 ("btrfs: handle ACLs on idmapped mounts") added this
parameter but didn't use it. __btrfs_set_acl() is the low-level helper
that writes an ACL to disk. The higher-level btrfs_set_acl() is the one
that translates the ACL based on the user namespace.

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/acl.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/acl.c b/fs/btrfs/acl.c
index 0a0d0eccee4e..a6909ec9bc38 100644
--- a/fs/btrfs/acl.c
+++ b/fs/btrfs/acl.c
@@ -56,7 +56,6 @@ struct posix_acl *btrfs_get_acl(struct inode *inode, int type, bool rcu)
 }
 
 static int __btrfs_set_acl(struct btrfs_trans_handle *trans,
-			   struct user_namespace *mnt_userns,
 			   struct inode *inode, struct posix_acl *acl, int type)
 {
 	int ret, size = 0;
@@ -123,7 +122,7 @@ int btrfs_set_acl(struct user_namespace *mnt_userns, struct inode *inode,
 		if (ret)
 			return ret;
 	}
-	ret = __btrfs_set_acl(NULL, mnt_userns, inode, acl, type);
+	ret = __btrfs_set_acl(NULL, inode, acl, type);
 	if (ret)
 		inode->i_mode = old_mode;
 	return ret;
@@ -144,14 +143,14 @@ int btrfs_init_acl(struct btrfs_trans_handle *trans,
 		return ret;
 
 	if (default_acl) {
-		ret = __btrfs_set_acl(trans, &init_user_ns, inode, default_acl,
+		ret = __btrfs_set_acl(trans, inode, default_acl,
 				      ACL_TYPE_DEFAULT);
 		posix_acl_release(default_acl);
 	}
 
 	if (acl) {
 		if (!ret)
-			ret = __btrfs_set_acl(trans, &init_user_ns, inode, acl,
+			ret = __btrfs_set_acl(trans, inode, acl,
 					      ACL_TYPE_ACCESS);
 		posix_acl_release(acl);
 	}
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 08/12] btrfs: remove redundant name and name_len parameters to create_subvol
  2022-03-03 23:18 [PATCH 00/12] btrfs: inode creation cleanups and fixes Omar Sandoval
                   ` (6 preceding siblings ...)
  2022-03-03 23:18 ` [PATCH 07/12] btrfs: remove unused mnt_userns parameter from __btrfs_set_acl Omar Sandoval
@ 2022-03-03 23:18 ` Omar Sandoval
  2022-03-03 23:18 ` [PATCH 09/12] btrfs: don't pass parent objectid to btrfs_new_inode() explicitly Omar Sandoval
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 18+ messages in thread
From: Omar Sandoval @ 2022-03-03 23:18 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

The passed dentry already contains the name.

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/ioctl.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 238cee5b5254..3cea5530ad83 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -546,9 +546,10 @@ int __pure btrfs_is_empty_uuid(u8 *uuid)
 
 static noinline int create_subvol(struct user_namespace *mnt_userns,
 				  struct inode *dir, struct dentry *dentry,
-				  const char *name, int namelen,
 				  struct btrfs_qgroup_inherit *inherit)
 {
+	const char *name = dentry->d_name.name;
+	int namelen = dentry->d_name.len;
 	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
 	struct btrfs_trans_handle *trans;
 	struct btrfs_key key;
@@ -983,7 +984,7 @@ static noinline int btrfs_mksubvol(const struct path *parent,
 	if (snap_src)
 		error = create_snapshot(snap_src, dir, dentry, readonly, inherit);
 	else
-		error = create_subvol(mnt_userns, dir, dentry, name, namelen, inherit);
+		error = create_subvol(mnt_userns, dir, dentry, inherit);
 
 	if (!error)
 		fsnotify_mkdir(dir, dentry);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 09/12] btrfs: don't pass parent objectid to btrfs_new_inode() explicitly
  2022-03-03 23:18 [PATCH 00/12] btrfs: inode creation cleanups and fixes Omar Sandoval
                   ` (7 preceding siblings ...)
  2022-03-03 23:18 ` [PATCH 08/12] btrfs: remove redundant name and name_len parameters to create_subvol Omar Sandoval
@ 2022-03-03 23:18 ` Omar Sandoval
  2022-03-03 23:19 ` [PATCH 10/12] btrfs: move btrfs_get_free_objectid() call into btrfs_new_inode() Omar Sandoval
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 18+ messages in thread
From: Omar Sandoval @ 2022-03-03 23:18 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

For everything other than a subvolume root inode, we get the parent
objectid from the parent directory. For the subvolume root inode, the
parent objectid is the same as the inode's objectid. We can find this
within btrfs_new_inode() instead of passing it.

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/inode.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index a9dabe9e5500..9c845c67421a 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6095,8 +6095,7 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 				     struct user_namespace *mnt_userns,
 				     struct inode *dir,
 				     const char *name, int name_len,
-				     u64 ref_objectid, u64 objectid,
-				     umode_t mode, u64 *index)
+				     u64 objectid, umode_t mode, u64 *index)
 {
 	struct btrfs_fs_info *fs_info = root->fs_info;
 	struct inode *inode;
@@ -6182,7 +6181,10 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 		 */
 		key[1].objectid = objectid;
 		key[1].type = BTRFS_INODE_REF_KEY;
-		key[1].offset = ref_objectid;
+		if (dir)
+			key[1].offset = btrfs_ino(BTRFS_I(dir));
+		else
+			key[1].offset = objectid;
 
 		sizes[1] = name_len + sizeof(*ref);
 	}
@@ -6380,7 +6382,7 @@ static int btrfs_mknod(struct user_namespace *mnt_userns, struct inode *dir,
 
 	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
 			dentry->d_name.name, dentry->d_name.len,
-			btrfs_ino(BTRFS_I(dir)), objectid, mode, &index);
+			objectid, mode, &index);
 	if (IS_ERR(inode)) {
 		err = PTR_ERR(inode);
 		inode = NULL;
@@ -6444,7 +6446,7 @@ static int btrfs_create(struct user_namespace *mnt_userns, struct inode *dir,
 
 	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
 			dentry->d_name.name, dentry->d_name.len,
-			btrfs_ino(BTRFS_I(dir)), objectid, mode, &index);
+			objectid, mode, &index);
 	if (IS_ERR(inode)) {
 		err = PTR_ERR(inode);
 		inode = NULL;
@@ -6589,7 +6591,7 @@ static int btrfs_mkdir(struct user_namespace *mnt_userns, struct inode *dir,
 
 	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
 			dentry->d_name.name, dentry->d_name.len,
-			btrfs_ino(BTRFS_I(dir)), objectid,
+			objectid,
 			S_IFDIR | mode, &index);
 	if (IS_ERR(inode)) {
 		err = PTR_ERR(inode);
@@ -8776,7 +8778,7 @@ int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
 		return err;
 
 	inode = btrfs_new_inode(trans, new_root, mnt_userns, NULL, "..", 2,
-				ino, ino,
+				ino,
 				S_IFDIR | (~current_umask() & S_IRWXUGO),
 				&index);
 	if (IS_ERR(inode))
@@ -9289,7 +9291,6 @@ static int btrfs_whiteout_for_rename(struct btrfs_trans_handle *trans,
 	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
 				dentry->d_name.name,
 				dentry->d_name.len,
-				btrfs_ino(BTRFS_I(dir)),
 				objectid,
 				S_IFCHR | WHITEOUT_MODE,
 				&index);
@@ -9783,7 +9784,7 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
 
 	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
 				dentry->d_name.name, dentry->d_name.len,
-				btrfs_ino(BTRFS_I(dir)), objectid,
+				objectid,
 				S_IFLNK | S_IRWXUGO, &index);
 	if (IS_ERR(inode)) {
 		err = PTR_ERR(inode);
@@ -10134,7 +10135,7 @@ static int btrfs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
 		goto out;
 
 	inode = btrfs_new_inode(trans, root, mnt_userns, dir, NULL, 0,
-			btrfs_ino(BTRFS_I(dir)), objectid, mode, &index);
+			objectid, mode, &index);
 	if (IS_ERR(inode)) {
 		ret = PTR_ERR(inode);
 		inode = NULL;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 10/12] btrfs: move btrfs_get_free_objectid() call into btrfs_new_inode()
  2022-03-03 23:18 [PATCH 00/12] btrfs: inode creation cleanups and fixes Omar Sandoval
                   ` (8 preceding siblings ...)
  2022-03-03 23:18 ` [PATCH 09/12] btrfs: don't pass parent objectid to btrfs_new_inode() explicitly Omar Sandoval
@ 2022-03-03 23:19 ` Omar Sandoval
  2022-03-03 23:19 ` [PATCH 11/12] btrfs: set inode flags earlier in btrfs_new_inode() Omar Sandoval
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 18+ messages in thread
From: Omar Sandoval @ 2022-03-03 23:19 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

Every call of btrfs_new_inode() is immediately preceded by a call to
btrfs_get_free_objectid(). Since getting an inode number is part of
creating a new inode, this is better off being moved into
btrfs_new_inode(). While we're here, get rid of the comment about
reclaiming inode numbers, since we only did that when using the ino
cache, which was removed by commit 5297199a8bca ("btrfs: remove inode
number cache feature").

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/inode.c | 58 +++++++++---------------------------------------
 1 file changed, 11 insertions(+), 47 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 9c845c67421a..289bb5176e09 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6095,13 +6095,14 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 				     struct user_namespace *mnt_userns,
 				     struct inode *dir,
 				     const char *name, int name_len,
-				     u64 objectid, umode_t mode, u64 *index)
+				     umode_t mode, u64 *index)
 {
 	struct btrfs_fs_info *fs_info = root->fs_info;
 	struct inode *inode;
 	struct btrfs_inode_item *inode_item;
 	struct btrfs_key *location;
 	struct btrfs_path *path;
+	u64 objectid;
 	struct btrfs_inode_ref *ref;
 	struct btrfs_key key[2];
 	u32 sizes[2];
@@ -6129,10 +6130,12 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 	if (!name)
 		set_nlink(inode, 0);
 
-	/*
-	 * we have to initialize this early, so we can reclaim the inode
-	 * number if we fail afterwards in this function.
-	 */
+	ret = btrfs_get_free_objectid(root, &objectid);
+	if (ret) {
+		btrfs_free_path(path);
+		iput(inode);
+		return ERR_PTR(ret);
+	}
 	inode->i_ino = objectid;
 
 	if (dir && name) {
@@ -6364,7 +6367,6 @@ static int btrfs_mknod(struct user_namespace *mnt_userns, struct inode *dir,
 	struct btrfs_root *root = BTRFS_I(dir)->root;
 	struct inode *inode = NULL;
 	int err;
-	u64 objectid;
 	u64 index = 0;
 
 	/*
@@ -6376,13 +6378,9 @@ static int btrfs_mknod(struct user_namespace *mnt_userns, struct inode *dir,
 	if (IS_ERR(trans))
 		return PTR_ERR(trans);
 
-	err = btrfs_get_free_objectid(root, &objectid);
-	if (err)
-		goto out_unlock;
-
 	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
 			dentry->d_name.name, dentry->d_name.len,
-			objectid, mode, &index);
+			mode, &index);
 	if (IS_ERR(inode)) {
 		err = PTR_ERR(inode);
 		inode = NULL;
@@ -6428,7 +6426,6 @@ static int btrfs_create(struct user_namespace *mnt_userns, struct inode *dir,
 	struct btrfs_root *root = BTRFS_I(dir)->root;
 	struct inode *inode = NULL;
 	int err;
-	u64 objectid;
 	u64 index = 0;
 
 	/*
@@ -6440,13 +6437,9 @@ static int btrfs_create(struct user_namespace *mnt_userns, struct inode *dir,
 	if (IS_ERR(trans))
 		return PTR_ERR(trans);
 
-	err = btrfs_get_free_objectid(root, &objectid);
-	if (err)
-		goto out_unlock;
-
 	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
 			dentry->d_name.name, dentry->d_name.len,
-			objectid, mode, &index);
+			mode, &index);
 	if (IS_ERR(inode)) {
 		err = PTR_ERR(inode);
 		inode = NULL;
@@ -6573,7 +6566,6 @@ static int btrfs_mkdir(struct user_namespace *mnt_userns, struct inode *dir,
 	struct btrfs_trans_handle *trans;
 	struct btrfs_root *root = BTRFS_I(dir)->root;
 	int err = 0;
-	u64 objectid = 0;
 	u64 index = 0;
 
 	/*
@@ -6585,13 +6577,8 @@ static int btrfs_mkdir(struct user_namespace *mnt_userns, struct inode *dir,
 	if (IS_ERR(trans))
 		return PTR_ERR(trans);
 
-	err = btrfs_get_free_objectid(root, &objectid);
-	if (err)
-		goto out_fail;
-
 	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
 			dentry->d_name.name, dentry->d_name.len,
-			objectid,
 			S_IFDIR | mode, &index);
 	if (IS_ERR(inode)) {
 		err = PTR_ERR(inode);
@@ -8771,14 +8758,8 @@ int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
 	struct inode *inode;
 	int err;
 	u64 index = 0;
-	u64 ino;
-
-	err = btrfs_get_free_objectid(new_root, &ino);
-	if (err < 0)
-		return err;
 
 	inode = btrfs_new_inode(trans, new_root, mnt_userns, NULL, "..", 2,
-				ino,
 				S_IFDIR | (~current_umask() & S_IRWXUGO),
 				&index);
 	if (IS_ERR(inode))
@@ -9281,17 +9262,11 @@ static int btrfs_whiteout_for_rename(struct btrfs_trans_handle *trans,
 {
 	int ret;
 	struct inode *inode;
-	u64 objectid;
 	u64 index;
 
-	ret = btrfs_get_free_objectid(root, &objectid);
-	if (ret)
-		return ret;
-
 	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
 				dentry->d_name.name,
 				dentry->d_name.len,
-				objectid,
 				S_IFCHR | WHITEOUT_MODE,
 				&index);
 
@@ -9755,7 +9730,6 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
 	struct btrfs_key key;
 	struct inode *inode = NULL;
 	int err;
-	u64 objectid;
 	u64 index = 0;
 	int name_len;
 	int datasize;
@@ -9778,13 +9752,8 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
 	if (IS_ERR(trans))
 		return PTR_ERR(trans);
 
-	err = btrfs_get_free_objectid(root, &objectid);
-	if (err)
-		goto out_unlock;
-
 	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
 				dentry->d_name.name, dentry->d_name.len,
-				objectid,
 				S_IFLNK | S_IRWXUGO, &index);
 	if (IS_ERR(inode)) {
 		err = PTR_ERR(inode);
@@ -10119,7 +10088,6 @@ static int btrfs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
 	struct btrfs_trans_handle *trans;
 	struct btrfs_root *root = BTRFS_I(dir)->root;
 	struct inode *inode = NULL;
-	u64 objectid;
 	u64 index;
 	int ret = 0;
 
@@ -10130,12 +10098,8 @@ static int btrfs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
 	if (IS_ERR(trans))
 		return PTR_ERR(trans);
 
-	ret = btrfs_get_free_objectid(root, &objectid);
-	if (ret)
-		goto out;
-
 	inode = btrfs_new_inode(trans, root, mnt_userns, dir, NULL, 0,
-			objectid, mode, &index);
+			mode, &index);
 	if (IS_ERR(inode)) {
 		ret = PTR_ERR(inode);
 		inode = NULL;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 11/12] btrfs: set inode flags earlier in btrfs_new_inode()
  2022-03-03 23:18 [PATCH 00/12] btrfs: inode creation cleanups and fixes Omar Sandoval
                   ` (9 preceding siblings ...)
  2022-03-03 23:19 ` [PATCH 10/12] btrfs: move btrfs_get_free_objectid() call into btrfs_new_inode() Omar Sandoval
@ 2022-03-03 23:19 ` Omar Sandoval
  2022-03-03 23:19 ` [PATCH 12/12] btrfs: rework inode creation to fix several issues Omar Sandoval
  2022-03-07 14:41 ` [PATCH 00/12] btrfs: inode creation cleanups and fixes Sweet Tea Dorminy
  12 siblings, 0 replies; 18+ messages in thread
From: Omar Sandoval @ 2022-03-03 23:19 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

btrfs_new_inode() inherits the inode flags from the parent directory and
the mount options _after_ we fill the inode item. This works because all
of the callers of btrfs_new_inode() make further changes to the inode
and then call btrfs_update_inode(). It'd be better to fully initialize
the inode once to avoid the extra update, so as a first step, set the
inode flags _before_ filling the inode item.

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/inode.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 289bb5176e09..c47bdada2440 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6161,6 +6161,16 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 	BTRFS_I(inode)->generation = trans->transid;
 	inode->i_generation = BTRFS_I(inode)->generation;
 
+	btrfs_inherit_iflags(inode, dir);
+
+	if (S_ISREG(mode)) {
+		if (btrfs_test_opt(fs_info, NODATASUM))
+			BTRFS_I(inode)->flags |= BTRFS_INODE_NODATASUM;
+		if (btrfs_test_opt(fs_info, NODATACOW))
+			BTRFS_I(inode)->flags |= BTRFS_INODE_NODATACOW |
+				BTRFS_INODE_NODATASUM;
+	}
+
 	/*
 	 * We could have gotten an inode number from somebody who was fsynced
 	 * and then removed in this same transaction, so let's just set full
@@ -6236,16 +6246,6 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 	btrfs_mark_buffer_dirty(path->nodes[0]);
 	btrfs_free_path(path);
 
-	btrfs_inherit_iflags(inode, dir);
-
-	if (S_ISREG(mode)) {
-		if (btrfs_test_opt(fs_info, NODATASUM))
-			BTRFS_I(inode)->flags |= BTRFS_INODE_NODATASUM;
-		if (btrfs_test_opt(fs_info, NODATACOW))
-			BTRFS_I(inode)->flags |= BTRFS_INODE_NODATACOW |
-				BTRFS_INODE_NODATASUM;
-	}
-
 	inode_tree_add(inode);
 
 	trace_btrfs_inode_new(inode);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 12/12] btrfs: rework inode creation to fix several issues
  2022-03-03 23:18 [PATCH 00/12] btrfs: inode creation cleanups and fixes Omar Sandoval
                   ` (10 preceding siblings ...)
  2022-03-03 23:19 ` [PATCH 11/12] btrfs: set inode flags earlier in btrfs_new_inode() Omar Sandoval
@ 2022-03-03 23:19 ` Omar Sandoval
  2022-03-07 12:28   ` David Sterba
  2022-03-07 13:25   ` Sweet Tea Dorminy
  2022-03-07 14:41 ` [PATCH 00/12] btrfs: inode creation cleanups and fixes Sweet Tea Dorminy
  12 siblings, 2 replies; 18+ messages in thread
From: Omar Sandoval @ 2022-03-03 23:19 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

Inode creation currently has several issues:

* Parts of the creation code path are duplicated between every operation
  that creates an inode: create, mknod, mkdir, RENAME_WHITEOUT, symlink,
  O_TMPFILE, and subvolume creation. This makes it hard to fix bugs or
  add features to inode creation (in particular, preparatory work for
  fscrypt).
* Subvolume creation in particular duplicates code in a way that
  diverges from the usual inode creation behavior when it comes to
  inherting flags, properties, permissions, and ACLs.
* We insert an inode item first and then modify the inode and update the
  item again. This is a silly inefficiency.
* Creation does not account for the compression property, ACLs, or the
  parent inode item when starting the transaction.
* We can leak an inode on disk in some error cases. For example, in
  btrfs_create(), if btrfs_new_inode() succeeds, then we have inserted
  an inode item and its inode ref. However, if something after that
  fails (e.g., btrfs_init_inode_security()), then we end the transaction
  and then decrement the link count on the inode. If the transaction is
  committed and the system crashes before the failed inode is deleted,
  then we leak that inode on disk.

I tried to fix these issues in their own separate patches, but they're
all interconnected and much easier to fix all in one go. The solution is
to unify all of the inode creation code paths into a single code path
with two steps:

1. btrfs_new_inode_prepare(), which does the correct accounting and
   security preparations (currently POSIX ACLs, but this is where
   fscrypt context setup will go).
2. btrfs_create_new_inode(), which does the actual B-tree modifications
   for the inode and its name.

This is a straightforward fix for the code duplication issue, the
unnecessary inode update issue, and the accounting issue. It explicitly
does _not_ change the existing non-standard subvolume behavior around
flags, permissions, ACLs, etc., but it makes those differences more
clear in the code and documents them so that we can discuss whether they
should be changed. Finally, it fixes the inode leak issue by aborting
the transaction when we can't recover more gracefully. This shouldn't be
an issue now that we're accounting space correctly.

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/acl.c   |  36 +--
 fs/btrfs/ctree.h |  39 ++-
 fs/btrfs/inode.c | 801 +++++++++++++++++++++++------------------------
 fs/btrfs/ioctl.c | 174 +++++-----
 fs/btrfs/props.c |  40 +--
 fs/btrfs/props.h |   4 -
 6 files changed, 508 insertions(+), 586 deletions(-)

diff --git a/fs/btrfs/acl.c b/fs/btrfs/acl.c
index a6909ec9bc38..548d6a5477b4 100644
--- a/fs/btrfs/acl.c
+++ b/fs/btrfs/acl.c
@@ -55,8 +55,8 @@ struct posix_acl *btrfs_get_acl(struct inode *inode, int type, bool rcu)
 	return acl;
 }
 
-static int __btrfs_set_acl(struct btrfs_trans_handle *trans,
-			   struct inode *inode, struct posix_acl *acl, int type)
+int __btrfs_set_acl(struct btrfs_trans_handle *trans, struct inode *inode,
+		    struct posix_acl *acl, int type)
 {
 	int ret, size = 0;
 	const char *name;
@@ -127,35 +127,3 @@ int btrfs_set_acl(struct user_namespace *mnt_userns, struct inode *inode,
 		inode->i_mode = old_mode;
 	return ret;
 }
-
-int btrfs_init_acl(struct btrfs_trans_handle *trans,
-		   struct inode *inode, struct inode *dir)
-{
-	struct posix_acl *default_acl, *acl;
-	int ret = 0;
-
-	/* this happens with subvols */
-	if (!dir)
-		return 0;
-
-	ret = posix_acl_create(dir, &inode->i_mode, &default_acl, &acl);
-	if (ret)
-		return ret;
-
-	if (default_acl) {
-		ret = __btrfs_set_acl(trans, inode, default_acl,
-				      ACL_TYPE_DEFAULT);
-		posix_acl_release(default_acl);
-	}
-
-	if (acl) {
-		if (!ret)
-			ret = __btrfs_set_acl(trans, inode, acl,
-					      ACL_TYPE_ACCESS);
-		posix_acl_release(acl);
-	}
-
-	if (!default_acl && !acl)
-		cache_no_acl(inode);
-	return ret;
-}
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 4db17bd05a21..6fa63dfac573 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3254,10 +3254,30 @@ int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, long nr,
 int btrfs_set_extent_delalloc(struct btrfs_inode *inode, u64 start, u64 end,
 			      unsigned int extra_bits,
 			      struct extent_state **cached_state);
-int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
-			     struct btrfs_root *new_root,
-			     struct btrfs_root *parent_root,
-			     struct user_namespace *mnt_userns);
+
+struct btrfs_new_inode_args {
+	/* Input */
+	struct inode *dir;
+	struct dentry *dentry;
+	struct inode *inode;
+	bool orphan;
+	bool subvol;
+
+	/*
+	 * Output from btrfs_new_inode_prepare(), input to
+	 * btrfs_create_new_inode().
+	 */
+	struct posix_acl *default_acl;
+	struct posix_acl *acl;
+};
+int btrfs_new_inode_prepare(struct btrfs_new_inode_args *args,
+			    unsigned int *trans_num_items);
+int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
+			   struct btrfs_new_inode_args *args);
+void btrfs_new_inode_args_free(struct btrfs_new_inode_args *args);
+struct inode *btrfs_new_subvol_inode(struct user_namespace *mnt_userns,
+				     struct inode *dir);
+
  void btrfs_set_delalloc_extent(struct inode *inode, struct extent_state *state,
 			       unsigned *bits);
 void btrfs_clear_delalloc_extent(struct inode *inode,
@@ -3815,15 +3835,16 @@ static inline int __btrfs_fs_compat_ro(struct btrfs_fs_info *fs_info, u64 flag)
 struct posix_acl *btrfs_get_acl(struct inode *inode, int type, bool rcu);
 int btrfs_set_acl(struct user_namespace *mnt_userns, struct inode *inode,
 		  struct posix_acl *acl, int type);
-int btrfs_init_acl(struct btrfs_trans_handle *trans,
-		   struct inode *inode, struct inode *dir);
+int __btrfs_set_acl(struct btrfs_trans_handle *trans, struct inode *inode,
+		    struct posix_acl *acl, int type);
 #else
 #define btrfs_get_acl NULL
 #define btrfs_set_acl NULL
-static inline int btrfs_init_acl(struct btrfs_trans_handle *trans,
-				 struct inode *inode, struct inode *dir)
+static inline int __btrfs_set_acl(struct btrfs_trans_handle *trans,
+				  struct inode *inode, struct posix_acl *acl,
+				  int type)
 {
-	return 0;
+	return -EOPNOTSUPP;
 }
 #endif
 
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index c47bdada2440..2f17e0598664 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -223,14 +223,26 @@ static int btrfs_dirty_inode(struct inode *inode);
 
 static int btrfs_init_inode_security(struct btrfs_trans_handle *trans,
 				     struct inode *inode,  struct inode *dir,
-				     const struct qstr *qstr)
+				     const struct qstr *qstr,
+				     struct posix_acl *default_acl,
+				     struct posix_acl *acl)
 {
 	int err;
 
-	err = btrfs_init_acl(trans, inode, dir);
-	if (!err)
-		err = btrfs_xattr_security_init(trans, inode, dir, qstr);
-	return err;
+	if (default_acl) {
+		err = __btrfs_set_acl(trans, inode, default_acl,
+				      ACL_TYPE_DEFAULT);
+		if (err)
+			return err;
+	}
+	if (acl) {
+		err = __btrfs_set_acl(trans, inode, acl, ACL_TYPE_ACCESS);
+		if (err)
+			return err;
+	}
+	if (!default_acl && !acl)
+		cache_no_acl(inode);
+	return btrfs_xattr_security_init(trans, inode, dir, qstr);
 }
 
 /*
@@ -6059,6 +6071,49 @@ static int btrfs_insert_inode_locked(struct inode *inode)
 		   btrfs_find_actor, &args);
 }
 
+int btrfs_new_inode_prepare(struct btrfs_new_inode_args *args,
+			    unsigned int *trans_num_items)
+{
+	struct inode *dir = args->dir;
+	struct inode *inode = args->inode;
+	int ret;
+
+	ret = posix_acl_create(dir, &inode->i_mode, &args->default_acl,
+			       &args->acl);
+	if (ret)
+		return ret;
+
+	*trans_num_items = 1; /* 1 to add inode item */
+	if (BTRFS_I(dir)->prop_compress)
+		(*trans_num_items)++; /* 1 to add compression property */
+	if (args->default_acl)
+		(*trans_num_items)++; /* 1 to add default ACL xattr */
+	if (args->acl)
+		(*trans_num_items)++; /* 1 to add access ACL xattr */
+#ifdef CONFIG_SECURITY
+	if (dir->i_security)
+		(*trans_num_items)++; /* 1 to add LSM xattr */
+#endif
+	if (args->orphan) {
+		(*trans_num_items)++; /* 1 to add orphan item */
+	} else {
+		/*
+		 * 1 to add inode ref
+		 * 1 to add dir item
+		 * 1 to add dir index
+		 * 1 to update parent inode item
+		 */
+		*trans_num_items += 4;
+	}
+	return 0;
+}
+
+void btrfs_new_inode_args_free(struct btrfs_new_inode_args *args)
+{
+	posix_acl_release(args->acl);
+	posix_acl_release(args->default_acl);
+}
+
 /*
  * Inherit flags from the parent inode.
  *
@@ -6068,9 +6123,6 @@ static void btrfs_inherit_iflags(struct inode *inode, struct inode *dir)
 {
 	unsigned int flags;
 
-	if (!dir)
-		return;
-
 	flags = BTRFS_I(dir)->flags;
 
 	if (flags & BTRFS_INODE_NOCOMPRESS) {
@@ -6090,65 +6142,52 @@ static void btrfs_inherit_iflags(struct inode *inode, struct inode *dir)
 	btrfs_sync_inode_flags_to_i_flags(inode);
 }
 
-static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
-				     struct btrfs_root *root,
-				     struct user_namespace *mnt_userns,
-				     struct inode *dir,
-				     const char *name, int name_len,
-				     umode_t mode, u64 *index)
+int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
+			   struct btrfs_new_inode_args *args)
 {
-	struct btrfs_fs_info *fs_info = root->fs_info;
-	struct inode *inode;
-	struct btrfs_inode_item *inode_item;
-	struct btrfs_key *location;
+	struct inode *dir = args->dir;
+	struct inode *inode = args->inode;
+	const char *name = args->orphan ? NULL : args->dentry->d_name.name;
+	int name_len = args->orphan ? 0 : args->dentry->d_name.len;
+	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
 	struct btrfs_path *path;
 	u64 objectid;
-	struct btrfs_inode_ref *ref;
+	struct btrfs_key *location;
+	struct btrfs_root *root;
 	struct btrfs_key key[2];
 	u32 sizes[2];
 	struct btrfs_item_batch batch;
+	struct btrfs_inode_item *inode_item;
+	struct btrfs_inode_ref *ref;
 	unsigned long ptr;
-	unsigned int nofs_flag;
 	int ret;
 
 	path = btrfs_alloc_path();
 	if (!path)
-		return ERR_PTR(-ENOMEM);
+		return -ENOMEM;
 
-	nofs_flag = memalloc_nofs_save();
-	inode = new_inode(fs_info->sb);
-	memalloc_nofs_restore(nofs_flag);
-	if (!inode) {
-		btrfs_free_path(path);
-		return ERR_PTR(-ENOMEM);
-	}
-
-	/*
-	 * O_TMPFILE, set link count to 0, so that after this point,
-	 * we fill in an inode item with the correct link count.
-	 */
-	if (!name)
-		set_nlink(inode, 0);
+	if (!args->subvol)
+		BTRFS_I(inode)->root = btrfs_grab_root(BTRFS_I(dir)->root);
+	root = BTRFS_I(inode)->root;
 
 	ret = btrfs_get_free_objectid(root, &objectid);
-	if (ret) {
-		btrfs_free_path(path);
-		iput(inode);
-		return ERR_PTR(ret);
-	}
+	if (ret)
+		goto out;
 	inode->i_ino = objectid;
 
-	if (dir && name) {
+	if (args->orphan) {
+		/*
+		 * O_TMPFILE, set link count to 0, so that after this point, we
+		 * fill in an inode item with the correct link count.
+		 */
+		set_nlink(inode, 0);
+	} else {
 		trace_btrfs_inode_request(dir);
 
-		ret = btrfs_set_inode_index(BTRFS_I(dir), index);
-		if (ret) {
-			btrfs_free_path(path);
-			iput(inode);
-			return ERR_PTR(ret);
-		}
-	} else if (dir) {
-		*index = 0;
+		ret = btrfs_set_inode_index(BTRFS_I(dir),
+					    &BTRFS_I(inode)->dir_index);
+		if (ret)
+			goto out;
 	}
 	/*
 	 * index_cnt is ignored for everything but a dir,
@@ -6156,14 +6195,18 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 	 * number
 	 */
 	BTRFS_I(inode)->index_cnt = 2;
-	BTRFS_I(inode)->dir_index = *index;
-	BTRFS_I(inode)->root = btrfs_grab_root(root);
 	BTRFS_I(inode)->generation = trans->transid;
 	inode->i_generation = BTRFS_I(inode)->generation;
 
-	btrfs_inherit_iflags(inode, dir);
+	/*
+	 * Subvolumes don't inherit flags from their parent directory.
+	 * Originally this was probably by accident, but we probably can't
+	 * change it now.
+	 */
+	if (!args->subvol)
+		btrfs_inherit_iflags(inode, dir);
 
-	if (S_ISREG(mode)) {
+	if (S_ISREG(inode->i_mode)) {
 		if (btrfs_test_opt(fs_info, NODATASUM))
 			BTRFS_I(inode)->flags |= BTRFS_INODE_NODATASUM;
 		if (btrfs_test_opt(fs_info, NODATACOW))
@@ -6171,6 +6214,57 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 				BTRFS_INODE_NODATASUM;
 	}
 
+	location = &BTRFS_I(inode)->location;
+	location->objectid = objectid;
+	location->offset = 0;
+	location->type = BTRFS_INODE_ITEM_KEY;
+
+	ret = btrfs_insert_inode_locked(inode);
+	if (ret < 0) {
+		if (!args->orphan)
+			BTRFS_I(dir)->index_cnt--;
+		goto out;
+	}
+
+	if (args->subvol) {
+		struct inode *parent;
+
+		/*
+		 * Subvolumes inherit properties from their parent subvolume,
+		 * not the directory they were created in.
+		 */
+		parent = btrfs_iget(fs_info->sb, BTRFS_FIRST_FREE_OBJECTID,
+				    BTRFS_I(dir)->root);
+		if (IS_ERR(parent)) {
+			ret = PTR_ERR(parent);
+		} else {
+			ret = btrfs_inode_inherit_props(trans, inode, parent);
+			iput(parent);
+		}
+	} else {
+		ret = btrfs_inode_inherit_props(trans, inode, dir);
+	}
+	if (ret) {
+		btrfs_err(fs_info,
+			  "error inheriting props for ino %llu (root %llu): %d",
+			  btrfs_ino(BTRFS_I(inode)), root->root_key.objectid,
+			  ret);
+	}
+
+	/*
+	 * Subvolumes don't inherit ACLs or get passed to the LSM. This is
+	 * probably a bug.
+	 */
+	if (!args->subvol) {
+		ret = btrfs_init_inode_security(trans, inode, dir,
+						&args->dentry->d_name,
+						args->default_acl, args->acl);
+		if (ret) {
+			btrfs_abort_transaction(trans, ret);
+			goto discard;
+		}
+	}
+
 	/*
 	 * We could have gotten an inode number from somebody who was fsynced
 	 * and then removed in this same transaction, so let's just set full
@@ -6185,7 +6279,7 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 
 	sizes[0] = sizeof(struct btrfs_inode_item);
 
-	if (name) {
+	if (!args->orphan) {
 		/*
 		 * Start new inodes with an inode_ref. This is slightly more
 		 * efficient for small numbers of hard links since they will
@@ -6194,57 +6288,61 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 		 */
 		key[1].objectid = objectid;
 		key[1].type = BTRFS_INODE_REF_KEY;
-		if (dir)
-			key[1].offset = btrfs_ino(BTRFS_I(dir));
-		else
+		if (args->subvol) {
 			key[1].offset = objectid;
-
-		sizes[1] = name_len + sizeof(*ref);
-	}
-
-	location = &BTRFS_I(inode)->location;
-	location->objectid = objectid;
-	location->offset = 0;
-	location->type = BTRFS_INODE_ITEM_KEY;
-
-	ret = btrfs_insert_inode_locked(inode);
-	if (ret < 0) {
-		iput(inode);
-		goto fail;
+			sizes[1] = 2 + sizeof(*ref);
+		} else {
+			key[1].offset = btrfs_ino(BTRFS_I(dir));
+			sizes[1] = name_len + sizeof(*ref);
+		}
 	}
 
 	batch.keys = &key[0];
 	batch.data_sizes = &sizes[0];
-	batch.total_data_size = sizes[0] + (name ? sizes[1] : 0);
-	batch.nr = name ? 2 : 1;
+	batch.total_data_size = sizes[0] + (args->orphan ? 0 : sizes[1]);
+	batch.nr = args->orphan ? 1 : 2;
 	ret = btrfs_insert_empty_items(trans, root, path, &batch);
-	if (ret != 0)
-		goto fail_unlock;
-
-	inode_init_owner(mnt_userns, inode, dir, mode);
+	if (ret != 0) {
+		btrfs_abort_transaction(trans, ret);
+		goto discard;
+	}
 
 	inode->i_mtime = current_time(inode);
 	inode->i_atime = inode->i_mtime;
 	inode->i_ctime = inode->i_mtime;
 	BTRFS_I(inode)->i_otime = inode->i_mtime;
 
+	/*
+	 * We're going to fill the inode item now, so at this point the inode
+	 * must be fully initialized.
+	 */
+
 	inode_item = btrfs_item_ptr(path->nodes[0], path->slots[0],
 				  struct btrfs_inode_item);
 	memzero_extent_buffer(path->nodes[0], (unsigned long)inode_item,
 			     sizeof(*inode_item));
 	fill_inode_item(trans, path->nodes[0], inode_item, inode);
 
-	if (name) {
+	if (!args->orphan) {
 		ref = btrfs_item_ptr(path->nodes[0], path->slots[0] + 1,
 				     struct btrfs_inode_ref);
-		btrfs_set_inode_ref_name_len(path->nodes[0], ref, name_len);
-		btrfs_set_inode_ref_index(path->nodes[0], ref, *index);
 		ptr = (unsigned long)(ref + 1);
-		write_extent_buffer(path->nodes[0], name, ptr, name_len);
+		if (args->subvol) {
+			btrfs_set_inode_ref_name_len(path->nodes[0], ref, 2);
+			btrfs_set_inode_ref_index(path->nodes[0], ref, 0);
+			write_extent_buffer(path->nodes[0], "..", ptr, 2);
+		} else {
+			btrfs_set_inode_ref_name_len(path->nodes[0], ref,
+						     name_len);
+			btrfs_set_inode_ref_index(path->nodes[0], ref,
+						  BTRFS_I(inode)->dir_index);
+			write_extent_buffer(path->nodes[0], name, ptr,
+					    name_len);
+		}
 	}
 
 	btrfs_mark_buffer_dirty(path->nodes[0]);
-	btrfs_free_path(path);
+	btrfs_release_path(path);
 
 	inode_tree_add(inode);
 
@@ -6253,21 +6351,26 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 
 	btrfs_update_root_times(trans, root);
 
-	ret = btrfs_inode_inherit_props(trans, inode, dir);
-	if (ret)
-		btrfs_err(fs_info,
-			  "error inheriting props for ino %llu (root %llu): %d",
-			btrfs_ino(BTRFS_I(inode)), root->root_key.objectid, ret);
+	if (args->orphan) {
+		ret = btrfs_orphan_add(trans, BTRFS_I(inode));
+	} else {
+		ret = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode), name,
+				     name_len, 0, BTRFS_I(inode)->dir_index);
+	}
+	if (ret) {
+		btrfs_abort_transaction(trans, ret);
+		goto discard;
+	}
 
-	return inode;
+	ret = 0;
+	goto out;
 
-fail_unlock:
+discard:
+	ihold(inode);
 	discard_new_inode(inode);
-fail:
-	if (dir && name)
-		BTRFS_I(dir)->index_cnt--;
+out:
 	btrfs_free_path(path);
-	return ERR_PTR(ret);
+	return ret;
 }
 
 /*
@@ -6359,125 +6462,72 @@ int btrfs_add_link(struct btrfs_trans_handle *trans,
 	return ret;
 }
 
+static int btrfs_create_common(struct inode *dir, struct dentry *dentry,
+			       struct inode *inode)
+{
+
+	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
+	struct btrfs_root *root = BTRFS_I(dir)->root;
+	struct btrfs_new_inode_args new_inode_args = {
+		.dir = dir,
+		.dentry = dentry,
+		.inode = inode,
+	};
+	unsigned int trans_num_items;
+	struct btrfs_trans_handle *trans;
+	int ret;
+
+	ret = btrfs_new_inode_prepare(&new_inode_args, &trans_num_items);
+	if (ret)
+		goto out_inode;
+
+	trans = btrfs_start_transaction(root, trans_num_items);
+	if (IS_ERR(trans)) {
+		ret = PTR_ERR(trans);
+		goto out_new_inode_args;
+	}
+
+	ret = btrfs_create_new_inode(trans, &new_inode_args);
+	if (!ret)
+		d_instantiate_new(dentry, inode);
+
+	btrfs_end_transaction(trans);
+	btrfs_btree_balance_dirty(fs_info);
+out_new_inode_args:
+	btrfs_new_inode_args_free(&new_inode_args);
+out_inode:
+	if (ret)
+		iput(inode);
+	return ret;
+}
+
 static int btrfs_mknod(struct user_namespace *mnt_userns, struct inode *dir,
 		       struct dentry *dentry, umode_t mode, dev_t rdev)
 {
-	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
-	struct btrfs_trans_handle *trans;
-	struct btrfs_root *root = BTRFS_I(dir)->root;
-	struct inode *inode = NULL;
-	int err;
-	u64 index = 0;
+	struct inode *inode;
 
-	/*
-	 * 2 for inode item and ref
-	 * 2 for dir items
-	 * 1 for xattr if selinux is on
-	 */
-	trans = btrfs_start_transaction(root, 5);
-	if (IS_ERR(trans))
-		return PTR_ERR(trans);
-
-	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
-			dentry->d_name.name, dentry->d_name.len,
-			mode, &index);
-	if (IS_ERR(inode)) {
-		err = PTR_ERR(inode);
-		inode = NULL;
-		goto out_unlock;
-	}
-
-	/*
-	* If the active LSM wants to access the inode during
-	* d_instantiate it needs these. Smack checks to see
-	* if the filesystem supports xattrs by looking at the
-	* ops vector.
-	*/
+	inode = new_inode(dir->i_sb);
+	if (!inode)
+		return -ENOMEM;
+	inode_init_owner(mnt_userns, inode, dir, mode);
 	inode->i_op = &btrfs_special_inode_operations;
 	init_special_inode(inode, inode->i_mode, rdev);
-
-	err = btrfs_init_inode_security(trans, inode, dir, &dentry->d_name);
-	if (err)
-		goto out_unlock;
-
-	err = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
-			     dentry->d_name.name, dentry->d_name.len, 0, index);
-	if (err)
-		goto out_unlock;
-
-	btrfs_update_inode(trans, root, BTRFS_I(inode));
-	d_instantiate_new(dentry, inode);
-
-out_unlock:
-	btrfs_end_transaction(trans);
-	btrfs_btree_balance_dirty(fs_info);
-	if (err && inode) {
-		inode_dec_link_count(inode);
-		discard_new_inode(inode);
-	}
-	return err;
+	return btrfs_create_common(dir, dentry, inode);
 }
 
 static int btrfs_create(struct user_namespace *mnt_userns, struct inode *dir,
 			struct dentry *dentry, umode_t mode, bool excl)
 {
-	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
-	struct btrfs_trans_handle *trans;
-	struct btrfs_root *root = BTRFS_I(dir)->root;
-	struct inode *inode = NULL;
-	int err;
-	u64 index = 0;
+	struct inode *inode;
 
-	/*
-	 * 2 for inode item and ref
-	 * 2 for dir items
-	 * 1 for xattr if selinux is on
-	 */
-	trans = btrfs_start_transaction(root, 5);
-	if (IS_ERR(trans))
-		return PTR_ERR(trans);
-
-	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
-			dentry->d_name.name, dentry->d_name.len,
-			mode, &index);
-	if (IS_ERR(inode)) {
-		err = PTR_ERR(inode);
-		inode = NULL;
-		goto out_unlock;
-	}
-	/*
-	* If the active LSM wants to access the inode during
-	* d_instantiate it needs these. Smack checks to see
-	* if the filesystem supports xattrs by looking at the
-	* ops vector.
-	*/
+	inode = new_inode(dir->i_sb);
+	if (!inode)
+		return -ENOMEM;
+	inode_init_owner(mnt_userns, inode, dir, mode);
 	inode->i_fop = &btrfs_file_operations;
 	inode->i_op = &btrfs_file_inode_operations;
 	inode->i_mapping->a_ops = &btrfs_aops;
-
-	err = btrfs_init_inode_security(trans, inode, dir, &dentry->d_name);
-	if (err)
-		goto out_unlock;
-
-	err = btrfs_update_inode(trans, root, BTRFS_I(inode));
-	if (err)
-		goto out_unlock;
-
-	err = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
-			     dentry->d_name.name, dentry->d_name.len, 0, index);
-	if (err)
-		goto out_unlock;
-
-	d_instantiate_new(dentry, inode);
-
-out_unlock:
-	btrfs_end_transaction(trans);
-	if (err && inode) {
-		inode_dec_link_count(inode);
-		discard_new_inode(inode);
-	}
-	btrfs_btree_balance_dirty(fs_info);
-	return err;
+	return btrfs_create_common(dir, dentry, inode);
 }
 
 static int btrfs_link(struct dentry *old_dentry, struct inode *dir,
@@ -6561,59 +6611,15 @@ static int btrfs_link(struct dentry *old_dentry, struct inode *dir,
 static int btrfs_mkdir(struct user_namespace *mnt_userns, struct inode *dir,
 		       struct dentry *dentry, umode_t mode)
 {
-	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
-	struct inode *inode = NULL;
-	struct btrfs_trans_handle *trans;
-	struct btrfs_root *root = BTRFS_I(dir)->root;
-	int err = 0;
-	u64 index = 0;
+	struct inode *inode;
 
-	/*
-	 * 2 items for inode and ref
-	 * 2 items for dir items
-	 * 1 for xattr if selinux is on
-	 */
-	trans = btrfs_start_transaction(root, 5);
-	if (IS_ERR(trans))
-		return PTR_ERR(trans);
-
-	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
-			dentry->d_name.name, dentry->d_name.len,
-			S_IFDIR | mode, &index);
-	if (IS_ERR(inode)) {
-		err = PTR_ERR(inode);
-		inode = NULL;
-		goto out_fail;
-	}
-
-	/* these must be set before we unlock the inode */
+	inode = new_inode(dir->i_sb);
+	if (!inode)
+		return -ENOMEM;
+	inode_init_owner(mnt_userns, inode, dir, S_IFDIR | mode);
 	inode->i_op = &btrfs_dir_inode_operations;
 	inode->i_fop = &btrfs_dir_file_operations;
-
-	err = btrfs_init_inode_security(trans, inode, dir, &dentry->d_name);
-	if (err)
-		goto out_fail;
-
-	err = btrfs_update_inode(trans, root, BTRFS_I(inode));
-	if (err)
-		goto out_fail;
-
-	err = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
-			dentry->d_name.name,
-			dentry->d_name.len, 0, index);
-	if (err)
-		goto out_fail;
-
-	d_instantiate_new(dentry, inode);
-
-out_fail:
-	btrfs_end_transaction(trans);
-	if (err && inode) {
-		inode_dec_link_count(inode);
-		discard_new_inode(inode);
-	}
-	btrfs_btree_balance_dirty(fs_info);
-	return err;
+	return btrfs_create_common(dir, dentry, inode);
 }
 
 static noinline int uncompress_inline(struct btrfs_path *path,
@@ -8747,38 +8753,23 @@ static int btrfs_truncate(struct inode *inode, bool skip_writeback)
 	return ret;
 }
 
-/*
- * create a new subvolume directory/inode (helper for the ioctl).
- */
-int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
-			     struct btrfs_root *new_root,
-			     struct btrfs_root *parent_root,
-			     struct user_namespace *mnt_userns)
+struct inode *btrfs_new_subvol_inode(struct user_namespace *mnt_userns,
+				     struct inode *dir)
 {
 	struct inode *inode;
-	int err;
-	u64 index = 0;
 
-	inode = btrfs_new_inode(trans, new_root, mnt_userns, NULL, "..", 2,
-				S_IFDIR | (~current_umask() & S_IRWXUGO),
-				&index);
-	if (IS_ERR(inode))
-		return PTR_ERR(inode);
+	inode = new_inode(dir->i_sb);
+	if (!inode)
+		return NULL;
+	/*
+	 * Subvolumes don't inherit the sgid bit or the parent's gid if the
+	 * parent's sgid bit is set. This is probably a bug.
+	 */
+	inode_init_owner(mnt_userns, inode, NULL,
+			 S_IFDIR | (~current_umask() & S_IRWXUGO));
 	inode->i_op = &btrfs_dir_inode_operations;
 	inode->i_fop = &btrfs_dir_file_operations;
-
-	unlock_new_inode(inode);
-
-	err = btrfs_subvol_inherit_props(trans, new_root, parent_root);
-	if (err)
-		btrfs_err(new_root->fs_info,
-			  "error inheriting subvolume %llu properties: %d",
-			  new_root->root_key.objectid, err);
-
-	err = btrfs_update_inode(trans, new_root, BTRFS_I(inode));
-
-	iput(inode);
-	return err;
+	return inode;
 }
 
 struct inode *btrfs_alloc_inode(struct super_block *sb)
@@ -9254,49 +9245,19 @@ static int btrfs_rename_exchange(struct inode *old_dir,
 	return ret;
 }
 
-static int btrfs_whiteout_for_rename(struct btrfs_trans_handle *trans,
-				     struct btrfs_root *root,
-				     struct user_namespace *mnt_userns,
-				     struct inode *dir,
-				     struct dentry *dentry)
+static struct inode *new_whiteout_inode(struct user_namespace *mnt_userns,
+					struct inode *dir)
 {
-	int ret;
 	struct inode *inode;
-	u64 index;
 
-	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
-				dentry->d_name.name,
-				dentry->d_name.len,
-				S_IFCHR | WHITEOUT_MODE,
-				&index);
-
-	if (IS_ERR(inode)) {
-		ret = PTR_ERR(inode);
-		return ret;
+	inode = new_inode(dir->i_sb);
+	if (inode) {
+		inode_init_owner(mnt_userns, inode, dir,
+				 S_IFCHR | WHITEOUT_MODE);
+		inode->i_op = &btrfs_special_inode_operations;
+		init_special_inode(inode, inode->i_mode, WHITEOUT_DEV);
 	}
-
-	inode->i_op = &btrfs_special_inode_operations;
-	init_special_inode(inode, inode->i_mode,
-		WHITEOUT_DEV);
-
-	ret = btrfs_init_inode_security(trans, inode, dir,
-				&dentry->d_name);
-	if (ret)
-		goto out;
-
-	ret = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
-			     dentry->d_name.name, dentry->d_name.len, 0, index);
-	if (ret)
-		goto out;
-
-	ret = btrfs_update_inode(trans, root, BTRFS_I(inode));
-out:
-	unlock_new_inode(inode);
-	if (ret)
-		inode_dec_link_count(inode);
-	iput(inode);
-
-	return ret;
+	return inode;
 }
 
 static int btrfs_rename(struct user_namespace *mnt_userns,
@@ -9305,6 +9266,10 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
 			unsigned int flags)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(old_dir->i_sb);
+	struct btrfs_new_inode_args whiteout_args = {
+		.dir = old_dir,
+		.dentry = old_dentry,
+	};
 	struct btrfs_trans_handle *trans;
 	unsigned int trans_num_items;
 	struct btrfs_root *root = BTRFS_I(old_dir)->root;
@@ -9359,6 +9324,18 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
 	if (new_inode && S_ISREG(old_inode->i_mode) && new_inode->i_size)
 		filemap_flush(old_inode->i_mapping);
 
+	if (flags & RENAME_WHITEOUT) {
+		whiteout_args.inode = new_whiteout_inode(mnt_userns, old_dir);
+		if (!whiteout_args.inode)
+			return -ENOMEM;
+		ret = btrfs_new_inode_prepare(&whiteout_args, &trans_num_items);
+		if (ret)
+			goto out_whiteout_inode;
+	} else {
+		/* 1 to update the old parent inode. */
+		trans_num_items = 1;
+	}
+
 	if (old_ino == BTRFS_FIRST_FREE_OBJECTID) {
 		/* close the racy window with snapshot create/destroy ioctl */
 		down_read(&fs_info->subvol_sem);
@@ -9368,24 +9345,25 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
 		 * 1 to add new root ref
 		 * 1 to add new root backref
 		 */
-		trans_num_items = 4;
+		trans_num_items += 4;
 	} else {
 		/*
 		 * 1 to update inode
 		 * 1 to remove old inode ref
 		 * 1 to add new inode ref
 		 */
-		trans_num_items = 3;
+		trans_num_items += 3;
 	}
 	/*
 	 * 1 to remove old dir item
 	 * 1 to remove old dir index
-	 * 1 to update old parent inode
 	 * 1 to add new dir item
 	 * 1 to add new dir index
-	 * 1 to update new parent inode (if it's not the same as the old parent)
 	 */
-	trans_num_items += 6;
+	trans_num_items += 4;
+	/*
+	 * 1 to update new parent inode if it's not the same as the old parent
+	 */
 	if (new_dir != old_dir)
 		trans_num_items++;
 	if (new_inode) {
@@ -9398,8 +9376,6 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
 		 */
 		trans_num_items += 5;
 	}
-	if (flags & RENAME_WHITEOUT)
-		trans_num_items += 5;
 	trans = btrfs_start_transaction(root, trans_num_items);
 	if (IS_ERR(trans)) {
 		ret = PTR_ERR(trans);
@@ -9495,12 +9471,14 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
 				   rename_ctx.index, new_dentry->d_parent);
 
 	if (flags & RENAME_WHITEOUT) {
-		ret = btrfs_whiteout_for_rename(trans, root, mnt_userns,
-						old_dir, old_dentry);
-
+		ret = btrfs_create_new_inode(trans, &whiteout_args);
 		if (ret) {
 			btrfs_abort_transaction(trans, ret);
 			goto out_fail;
+		} else {
+			unlock_new_inode(whiteout_args.inode);
+			iput(whiteout_args.inode);
+			whiteout_args.inode = NULL;
 		}
 	}
 out_fail:
@@ -9509,7 +9487,11 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
 out_notrans:
 	if (old_ino == BTRFS_FIRST_FREE_OBJECTID)
 		up_read(&fs_info->subvol_sem);
-
+	if (flags & RENAME_WHITEOUT)
+		btrfs_new_inode_args_free(&whiteout_args);
+out_whiteout_inode:
+	if (flags & RENAME_WHITEOUT)
+		iput(whiteout_args.inode);
 	return ret;
 }
 
@@ -9724,13 +9706,17 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
 			 struct dentry *dentry, const char *symname)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
+	struct inode *inode;
+	struct btrfs_new_inode_args new_inode_args = {
+		.dir = dir,
+		.dentry = dentry,
+	};
+	unsigned int trans_num_items;
 	struct btrfs_trans_handle *trans;
 	struct btrfs_root *root = BTRFS_I(dir)->root;
 	struct btrfs_path *path;
 	struct btrfs_key key;
-	struct inode *inode = NULL;
 	int err;
-	u64 index = 0;
 	int name_len;
 	int datasize;
 	unsigned long ptr;
@@ -9741,44 +9727,40 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
 	if (name_len > BTRFS_MAX_INLINE_DATA_SIZE(fs_info))
 		return -ENAMETOOLONG;
 
-	/*
-	 * 2 items for inode item and ref
-	 * 2 items for dir items
-	 * 1 item for updating parent inode item
-	 * 1 item for the inline extent item
-	 * 1 item for xattr if selinux is on
-	 */
-	trans = btrfs_start_transaction(root, 7);
-	if (IS_ERR(trans))
-		return PTR_ERR(trans);
+	inode = new_inode(dir->i_sb);
+	if (!inode)
+		return -ENOMEM;
+	inode_init_owner(mnt_userns, inode, dir, S_IFLNK | S_IRWXUGO);
+	inode->i_op = &btrfs_symlink_inode_operations;
+	inode_nohighmem(inode);
+	inode->i_mapping->a_ops = &btrfs_aops;
+	btrfs_i_size_write(BTRFS_I(inode), name_len);
+	inode_set_bytes(inode, name_len);
 
-	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
-				dentry->d_name.name, dentry->d_name.len,
-				S_IFLNK | S_IRWXUGO, &index);
-	if (IS_ERR(inode)) {
-		err = PTR_ERR(inode);
-		inode = NULL;
-		goto out_unlock;
+	new_inode_args.inode = inode;
+	err = btrfs_new_inode_prepare(&new_inode_args, &trans_num_items);
+	if (err)
+		goto out_inode;
+	/* 1 additional item for the inline extent */
+	trans_num_items++;
+
+	trans = btrfs_start_transaction(root, trans_num_items);
+	if (IS_ERR(trans)) {
+		err = PTR_ERR(trans);
+		goto out_new_inode_args;
 	}
 
-	/*
-	* If the active LSM wants to access the inode during
-	* d_instantiate it needs these. Smack checks to see
-	* if the filesystem supports xattrs by looking at the
-	* ops vector.
-	*/
-	inode->i_fop = &btrfs_file_operations;
-	inode->i_op = &btrfs_file_inode_operations;
-	inode->i_mapping->a_ops = &btrfs_aops;
-
-	err = btrfs_init_inode_security(trans, inode, dir, &dentry->d_name);
+	err = btrfs_create_new_inode(trans, &new_inode_args);
 	if (err)
-		goto out_unlock;
+		goto out;
 
 	path = btrfs_alloc_path();
 	if (!path) {
 		err = -ENOMEM;
-		goto out_unlock;
+		btrfs_abort_transaction(trans, err);
+		discard_new_inode(inode);
+		inode = NULL;
+		goto out;
 	}
 	key.objectid = btrfs_ino(BTRFS_I(inode));
 	key.offset = 0;
@@ -9787,8 +9769,11 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
 	err = btrfs_insert_empty_item(trans, root, path, &key,
 				      datasize);
 	if (err) {
+		btrfs_abort_transaction(trans, err);
 		btrfs_free_path(path);
-		goto out_unlock;
+		discard_new_inode(inode);
+		inode = NULL;
+		goto out;
 	}
 	leaf = path->nodes[0];
 	ei = btrfs_item_ptr(leaf, path->slots[0],
@@ -9806,32 +9791,16 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
 	btrfs_mark_buffer_dirty(leaf);
 	btrfs_free_path(path);
 
-	inode->i_op = &btrfs_symlink_inode_operations;
-	inode_nohighmem(inode);
-	inode_set_bytes(inode, name_len);
-	btrfs_i_size_write(BTRFS_I(inode), name_len);
-	err = btrfs_update_inode(trans, root, BTRFS_I(inode));
-	/*
-	 * Last step, add directory indexes for our symlink inode. This is the
-	 * last step to avoid extra cleanup of these indexes if an error happens
-	 * elsewhere above.
-	 */
-	if (!err)
-		err = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
-				     dentry->d_name.name, dentry->d_name.len, 0,
-				     index);
-	if (err)
-		goto out_unlock;
-
 	d_instantiate_new(dentry, inode);
-
-out_unlock:
+	err = 0;
+out:
 	btrfs_end_transaction(trans);
-	if (err && inode) {
-		inode_dec_link_count(inode);
-		discard_new_inode(inode);
-	}
 	btrfs_btree_balance_dirty(fs_info);
+out_new_inode_args:
+	btrfs_new_inode_args_free(&new_inode_args);
+out_inode:
+	if (err)
+		iput(inode);
 	return err;
 }
 
@@ -10085,59 +10054,61 @@ static int btrfs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
 			 struct dentry *dentry, umode_t mode)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
-	struct btrfs_trans_handle *trans;
 	struct btrfs_root *root = BTRFS_I(dir)->root;
-	struct inode *inode = NULL;
-	u64 index;
-	int ret = 0;
-
-	/*
-	 * 5 units required for adding orphan entry
-	 */
-	trans = btrfs_start_transaction(root, 5);
-	if (IS_ERR(trans))
-		return PTR_ERR(trans);
-
-	inode = btrfs_new_inode(trans, root, mnt_userns, dir, NULL, 0,
-			mode, &index);
-	if (IS_ERR(inode)) {
-		ret = PTR_ERR(inode);
-		inode = NULL;
-		goto out;
-	}
+	struct inode *inode;
+	struct btrfs_new_inode_args new_inode_args = {
+		.dir = dir,
+		.dentry = dentry,
+		.orphan = true,
+	};
+	unsigned int trans_num_items;
+	struct btrfs_trans_handle *trans;
+	int ret;
 
+	inode = new_inode(dir->i_sb);
+	if (!inode)
+		return -ENOMEM;
+	inode_init_owner(mnt_userns, inode, dir, mode);
 	inode->i_fop = &btrfs_file_operations;
 	inode->i_op = &btrfs_file_inode_operations;
 
 	inode->i_mapping->a_ops = &btrfs_aops;
 
-	ret = btrfs_init_inode_security(trans, inode, dir, NULL);
+	new_inode_args.inode = inode;
+	ret = btrfs_new_inode_prepare(&new_inode_args, &trans_num_items);
 	if (ret)
-		goto out;
+		goto out_inode;
 
-	ret = btrfs_update_inode(trans, root, BTRFS_I(inode));
-	if (ret)
-		goto out;
-	ret = btrfs_orphan_add(trans, BTRFS_I(inode));
-	if (ret)
-		goto out;
+	trans = btrfs_start_transaction(root, trans_num_items);
+	if (IS_ERR(trans)) {
+		ret = PTR_ERR(trans);
+		goto out_new_inode_args;
+	}
+
+	ret = btrfs_create_new_inode(trans, &new_inode_args);
 
 	/*
-	 * We set number of links to 0 in btrfs_new_inode(), and here we set
-	 * it to 1 because d_tmpfile() will issue a warning if the count is 0,
-	 * through:
+	 * We set number of links to 0 in btrfs_create_new_inode(), and here we
+	 * set it to 1 because d_tmpfile() will issue a warning if the count is
+	 * 0, through:
 	 *
 	 *    d_tmpfile() -> inode_dec_link_count() -> drop_nlink()
 	 */
 	set_nlink(inode, 1);
-	d_tmpfile(dentry, inode);
-	unlock_new_inode(inode);
-	mark_inode_dirty(inode);
-out:
+
+	if (!ret) {
+		d_tmpfile(dentry, inode);
+		unlock_new_inode(inode);
+		mark_inode_dirty(inode);
+	}
+
 	btrfs_end_transaction(trans);
-	if (ret && inode)
-		discard_new_inode(inode);
 	btrfs_btree_balance_dirty(fs_info);
+out_new_inode_args:
+	btrfs_new_inode_args_free(&new_inode_args);
+out_inode:
+	if (ret)
+		iput(inode);
 	return ret;
 }
 
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 3cea5530ad83..4d217cff7da5 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -544,13 +544,38 @@ int __pure btrfs_is_empty_uuid(u8 *uuid)
 	return 1;
 }
 
+/*
+ * Calculate the number of transaction items to reserve for creating a subvolume
+ * or snapshot, not including the inode, directory entries, or parent directory.
+ */
+static unsigned int create_subvol_num_items(struct btrfs_qgroup_inherit *inherit)
+{
+	/*
+	 * 1 to add root block
+	 * 1 to add root item
+	 * 1 to add root ref
+	 * 1 to add root backref
+	 * 1 to add UUID item
+	 * 1 to add qgroup info
+	 * 1 to add qgroup limit
+	 * (Ideally the last two would only be accounted if qgroups are enabled,
+	 * but that can change between now and the time we would insert them)
+	 */
+	unsigned int num_items = 7;
+
+	if (inherit) {
+		/* 2 to add qgroup relations for each inherited qgroup */
+		num_items += 2 * inherit->num_qgroups;
+	}
+	return num_items;
+}
+
 static noinline int create_subvol(struct user_namespace *mnt_userns,
 				  struct inode *dir, struct dentry *dentry,
 				  struct btrfs_qgroup_inherit *inherit)
 {
-	const char *name = dentry->d_name.name;
-	int namelen = dentry->d_name.len;
 	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
+	unsigned int trans_num_items;
 	struct btrfs_trans_handle *trans;
 	struct btrfs_key key;
 	struct btrfs_root_item *root_item;
@@ -560,11 +585,14 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
 	struct btrfs_root *new_root;
 	struct btrfs_block_rsv block_rsv;
 	struct timespec64 cur_time = current_time(dir);
-	struct inode *inode;
+	struct btrfs_new_inode_args new_inode_args = {
+		.dir = dir,
+		.dentry = dentry,
+		.subvol = true,
+	};
 	int ret;
-	dev_t anon_dev = 0;
+	dev_t anon_dev;
 	u64 objectid;
-	u64 index = 0;
 
 	root_item = kzalloc(sizeof(*root_item), GFP_KERNEL);
 	if (!root_item)
@@ -572,11 +600,7 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
 
 	ret = btrfs_get_free_objectid(fs_info->tree_root, &objectid);
 	if (ret)
-		goto fail_free;
-
-	ret = get_anon_bdev(&anon_dev);
-	if (ret < 0)
-		goto fail_free;
+		goto out_root_item;
 
 	/*
 	 * Don't create subvolume whose level is not zero. Or qgroup will be
@@ -584,36 +608,47 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
 	 */
 	if (btrfs_qgroup_level(objectid)) {
 		ret = -ENOSPC;
-		goto fail_free;
+		goto out_root_item;
 	}
 
-	btrfs_init_block_rsv(&block_rsv, BTRFS_BLOCK_RSV_TEMP);
-	/*
-	 * The same as the snapshot creation, please see the comment
-	 * of create_snapshot().
-	 */
-	ret = btrfs_subvolume_reserve_metadata(root, &block_rsv, 8, false);
+	ret = get_anon_bdev(&anon_dev);
+	if (ret < 0)
+		goto out_root_item;
+
+	new_inode_args.inode = btrfs_new_subvol_inode(mnt_userns, dir);
+	if (!new_inode_args.inode) {
+		ret = -ENOMEM;
+		goto out_anon_dev;
+	}
+	ret = btrfs_new_inode_prepare(&new_inode_args, &trans_num_items);
 	if (ret)
-		goto fail_free;
+		goto out_inode;
+	trans_num_items += create_subvol_num_items(inherit);
+
+	btrfs_init_block_rsv(&block_rsv, BTRFS_BLOCK_RSV_TEMP);
+	ret = btrfs_subvolume_reserve_metadata(root, &block_rsv,
+					       trans_num_items, false);
+	if (ret)
+		goto out_new_inode_args;
 
 	trans = btrfs_start_transaction(root, 0);
 	if (IS_ERR(trans)) {
 		ret = PTR_ERR(trans);
 		btrfs_subvolume_release_metadata(root, &block_rsv);
-		goto fail_free;
+		goto out_new_inode_args;
 	}
 	trans->block_rsv = &block_rsv;
 	trans->bytes_reserved = block_rsv.size;
 
 	ret = btrfs_qgroup_inherit(trans, 0, objectid, inherit);
 	if (ret)
-		goto fail;
+		goto out;
 
 	leaf = btrfs_alloc_tree_block(trans, root, 0, objectid, NULL, 0, 0, 0,
 				      BTRFS_NESTING_NORMAL);
 	if (IS_ERR(leaf)) {
 		ret = PTR_ERR(leaf);
-		goto fail;
+		goto out;
 	}
 
 	btrfs_mark_buffer_dirty(leaf);
@@ -668,75 +703,47 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
 		btrfs_tree_unlock(leaf);
 		btrfs_free_tree_block(trans, objectid, leaf, 0, 1);
 		free_extent_buffer(leaf);
-		goto fail;
+		goto out;
 	}
 
 	free_extent_buffer(leaf);
 	leaf = NULL;
 
-	key.offset = (u64)-1;
 	new_root = btrfs_get_new_fs_root(fs_info, objectid, anon_dev);
 	if (IS_ERR(new_root)) {
-		free_anon_bdev(anon_dev);
 		ret = PTR_ERR(new_root);
 		btrfs_abort_transaction(trans, ret);
-		goto fail;
+		goto out;
 	}
-	/* Freeing will be done in btrfs_put_root() of new_root */
+	/* anon_dev is owned by new_root now. */
 	anon_dev = 0;
+	BTRFS_I(new_inode_args.inode)->root = new_root;
+	/* ... and new_root is owned by new_inode.inode now. */
 
 	ret = btrfs_record_root_in_trans(trans, new_root);
 	if (ret) {
 		btrfs_put_root(new_root);
 		btrfs_abort_transaction(trans, ret);
-		goto fail;
-	}
-
-	ret = btrfs_create_subvol_root(trans, new_root, root, mnt_userns);
-	btrfs_put_root(new_root);
-	if (ret) {
-		/* We potentially lose an unused inode item here */
-		btrfs_abort_transaction(trans, ret);
-		goto fail;
-	}
-
-	/*
-	 * insert the directory item
-	 */
-	ret = btrfs_set_inode_index(BTRFS_I(dir), &index);
-	if (ret) {
-		btrfs_abort_transaction(trans, ret);
-		goto fail;
-	}
-
-	ret = btrfs_insert_dir_item(trans, name, namelen, BTRFS_I(dir), &key,
-				    BTRFS_FT_DIR, index);
-	if (ret) {
-		btrfs_abort_transaction(trans, ret);
-		goto fail;
-	}
-
-	btrfs_i_size_write(BTRFS_I(dir), dir->i_size + namelen * 2);
-	ret = btrfs_update_inode(trans, root, BTRFS_I(dir));
-	if (ret) {
-		btrfs_abort_transaction(trans, ret);
-		goto fail;
-	}
-
-	ret = btrfs_add_root_ref(trans, objectid, root->root_key.objectid,
-				 btrfs_ino(BTRFS_I(dir)), index, name, namelen);
-	if (ret) {
-		btrfs_abort_transaction(trans, ret);
-		goto fail;
+		goto out;
 	}
 
 	ret = btrfs_uuid_tree_add(trans, root_item->uuid,
 				  BTRFS_UUID_KEY_SUBVOL, objectid);
-	if (ret)
+	if (ret) {
 		btrfs_abort_transaction(trans, ret);
+		goto out;
+	}
 
-fail:
-	kfree(root_item);
+	ret = btrfs_create_new_inode(trans, &new_inode_args);
+	if (ret) {
+		btrfs_abort_transaction(trans, ret);
+		goto out;
+	}
+
+	d_instantiate_new(dentry, new_inode_args.inode);
+	new_inode_args.inode = NULL;
+
+out:
 	trans->block_rsv = NULL;
 	trans->bytes_reserved = 0;
 	btrfs_subvolume_release_metadata(root, &block_rsv);
@@ -745,18 +752,14 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
 		btrfs_end_transaction(trans);
 	else
 		ret = btrfs_commit_transaction(trans);
-
-	if (!ret) {
-		inode = btrfs_lookup_dentry(dir, dentry);
-		if (IS_ERR(inode))
-			return PTR_ERR(inode);
-		d_instantiate(dentry, inode);
-	}
-	return ret;
-
-fail_free:
+out_new_inode_args:
+	btrfs_new_inode_args_free(&new_inode_args);
+out_inode:
+	iput(new_inode_args.inode);
+out_anon_dev:
 	if (anon_dev)
 		free_anon_bdev(anon_dev);
+out_root_item:
 	kfree(root_item);
 	return ret;
 }
@@ -769,6 +772,7 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
 	struct inode *inode;
 	struct btrfs_pending_snapshot *pending_snapshot;
 	struct btrfs_trans_handle *trans;
+	int trans_num_items;
 	int ret;
 
 	/* We do not support snapshotting right now. */
@@ -805,16 +809,14 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
 	btrfs_init_block_rsv(&pending_snapshot->block_rsv,
 			     BTRFS_BLOCK_RSV_TEMP);
 	/*
-	 * 1 - parent dir inode
-	 * 2 - dir entries
-	 * 1 - root item
-	 * 2 - root ref/backref
-	 * 1 - root of snapshot
-	 * 1 - UUID item
+	 * 1 to add dir item
+	 * 1 to add dir index
+	 * 1 to update parent inode item
 	 */
+	trans_num_items = create_subvol_num_items(inherit) + 3;
 	ret = btrfs_subvolume_reserve_metadata(BTRFS_I(dir)->root,
-					&pending_snapshot->block_rsv, 8,
-					false);
+					       &pending_snapshot->block_rsv,
+					       trans_num_items, false);
 	if (ret)
 		goto free_pending;
 
diff --git a/fs/btrfs/props.c b/fs/btrfs/props.c
index 1a6d2d5b4b33..f5565c296898 100644
--- a/fs/btrfs/props.c
+++ b/fs/btrfs/props.c
@@ -334,9 +334,8 @@ static struct prop_handler prop_handlers[] = {
 	},
 };
 
-static int inherit_props(struct btrfs_trans_handle *trans,
-			 struct inode *inode,
-			 struct inode *parent)
+int btrfs_inode_inherit_props(struct btrfs_trans_handle *trans,
+			      struct inode *inode, struct inode *parent)
 {
 	struct btrfs_root *root = BTRFS_I(inode)->root;
 	struct btrfs_fs_info *fs_info = root->fs_info;
@@ -408,41 +407,6 @@ static int inherit_props(struct btrfs_trans_handle *trans,
 	return 0;
 }
 
-int btrfs_inode_inherit_props(struct btrfs_trans_handle *trans,
-			      struct inode *inode,
-			      struct inode *dir)
-{
-	if (!dir)
-		return 0;
-
-	return inherit_props(trans, inode, dir);
-}
-
-int btrfs_subvol_inherit_props(struct btrfs_trans_handle *trans,
-			       struct btrfs_root *root,
-			       struct btrfs_root *parent_root)
-{
-	struct super_block *sb = root->fs_info->sb;
-	struct inode *parent_inode, *child_inode;
-	int ret;
-
-	parent_inode = btrfs_iget(sb, BTRFS_FIRST_FREE_OBJECTID, parent_root);
-	if (IS_ERR(parent_inode))
-		return PTR_ERR(parent_inode);
-
-	child_inode = btrfs_iget(sb, BTRFS_FIRST_FREE_OBJECTID, root);
-	if (IS_ERR(child_inode)) {
-		iput(parent_inode);
-		return PTR_ERR(child_inode);
-	}
-
-	ret = inherit_props(trans, child_inode, parent_inode);
-	iput(child_inode);
-	iput(parent_inode);
-
-	return ret;
-}
-
 void __init btrfs_props_init(void)
 {
 	int i;
diff --git a/fs/btrfs/props.h b/fs/btrfs/props.h
index 40b2c65b518c..1dcd5daa3b22 100644
--- a/fs/btrfs/props.h
+++ b/fs/btrfs/props.h
@@ -21,8 +21,4 @@ int btrfs_inode_inherit_props(struct btrfs_trans_handle *trans,
 			      struct inode *inode,
 			      struct inode *dir);
 
-int btrfs_subvol_inherit_props(struct btrfs_trans_handle *trans,
-			       struct btrfs_root *root,
-			       struct btrfs_root *parent_root);
-
 #endif
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 12/12] btrfs: rework inode creation to fix several issues
  2022-03-03 23:19 ` [PATCH 12/12] btrfs: rework inode creation to fix several issues Omar Sandoval
@ 2022-03-07 12:28   ` David Sterba
  2022-03-10  0:18     ` Omar Sandoval
  2022-03-07 13:25   ` Sweet Tea Dorminy
  1 sibling, 1 reply; 18+ messages in thread
From: David Sterba @ 2022-03-07 12:28 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: linux-btrfs, kernel-team

On Thu, Mar 03, 2022 at 03:19:02PM -0800, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>

>  fs/btrfs/acl.c   |  36 +--
>  fs/btrfs/ctree.h |  39 ++-
>  fs/btrfs/inode.c | 801 +++++++++++++++++++++++------------------------
>  fs/btrfs/ioctl.c | 174 +++++-----
>  fs/btrfs/props.c |  40 +--
>  fs/btrfs/props.h |   4 -
>  6 files changed, 508 insertions(+), 586 deletions(-)

Can this be split into more patches? All fine from 1 to 11 and now this,
it's just too much code change and I don't want to take risk by yet
another rewrite.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 12/12] btrfs: rework inode creation to fix several issues
  2022-03-03 23:19 ` [PATCH 12/12] btrfs: rework inode creation to fix several issues Omar Sandoval
  2022-03-07 12:28   ` David Sterba
@ 2022-03-07 13:25   ` Sweet Tea Dorminy
  2022-03-10  0:20     ` Omar Sandoval
  1 sibling, 1 reply; 18+ messages in thread
From: Sweet Tea Dorminy @ 2022-03-07 13:25 UTC (permalink / raw)
  To: Omar Sandoval, linux-btrfs; +Cc: kernel-team

I like the changes overall. However, I was hoping:

a) would it be possible to just refactor to use 
btrfs_new_inode_prepare() first and then fixup the non-refactoring parts?

b) the naming is a bit confusing to me: I don't usually think of 
..free() as the inverse action of ...prepare(). Also, ...free() feels 
weird to be taking a stack pointer. Perhaps _{init,uninit}() or 
_{prepare,destroy}() might be a clearer set of names?

Thanks!


Sweet Tea

On 3/3/22 18:19, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
>
> Inode creation currently has several issues:
>
> * Parts of the creation code path are duplicated between every operation
>    that creates an inode: create, mknod, mkdir, RENAME_WHITEOUT, symlink,
>    O_TMPFILE, and subvolume creation. This makes it hard to fix bugs or
>    add features to inode creation (in particular, preparatory work for
>    fscrypt).
> * Subvolume creation in particular duplicates code in a way that
>    diverges from the usual inode creation behavior when it comes to
>    inherting flags, properties, permissions, and ACLs.
> * We insert an inode item first and then modify the inode and update the
>    item again. This is a silly inefficiency.
> * Creation does not account for the compression property, ACLs, or the
>    parent inode item when starting the transaction.
> * We can leak an inode on disk in some error cases. For example, in
>    btrfs_create(), if btrfs_new_inode() succeeds, then we have inserted
>    an inode item and its inode ref. However, if something after that
>    fails (e.g., btrfs_init_inode_security()), then we end the transaction
>    and then decrement the link count on the inode. If the transaction is
>    committed and the system crashes before the failed inode is deleted,
>    then we leak that inode on disk.
>
> I tried to fix these issues in their own separate patches, but they're
> all interconnected and much easier to fix all in one go. The solution is
> to unify all of the inode creation code paths into a single code path
> with two steps:
>
> 1. btrfs_new_inode_prepare(), which does the correct accounting and
>     security preparations (currently POSIX ACLs, but this is where
>     fscrypt context setup will go).
> 2. btrfs_create_new_inode(), which does the actual B-tree modifications
>     for the inode and its name.
>
> This is a straightforward fix for the code duplication issue, the
> unnecessary inode update issue, and the accounting issue. It explicitly
> does _not_ change the existing non-standard subvolume behavior around
> flags, permissions, ACLs, etc., but it makes those differences more
> clear in the code and documents them so that we can discuss whether they
> should be changed. Finally, it fixes the inode leak issue by aborting
> the transaction when we can't recover more gracefully. This shouldn't be
> an issue now that we're accounting space correctly.
>
> Signed-off-by: Omar Sandoval <osandov@fb.com>
> ---
>   fs/btrfs/acl.c   |  36 +--
>   fs/btrfs/ctree.h |  39 ++-
>   fs/btrfs/inode.c | 801 +++++++++++++++++++++++------------------------
>   fs/btrfs/ioctl.c | 174 +++++-----
>   fs/btrfs/props.c |  40 +--
>   fs/btrfs/props.h |   4 -
>   6 files changed, 508 insertions(+), 586 deletions(-)
>
> diff --git a/fs/btrfs/acl.c b/fs/btrfs/acl.c
> index a6909ec9bc38..548d6a5477b4 100644
> --- a/fs/btrfs/acl.c
> +++ b/fs/btrfs/acl.c
> @@ -55,8 +55,8 @@ struct posix_acl *btrfs_get_acl(struct inode *inode, int type, bool rcu)
>   	return acl;
>   }
>   
> -static int __btrfs_set_acl(struct btrfs_trans_handle *trans,
> -			   struct inode *inode, struct posix_acl *acl, int type)
> +int __btrfs_set_acl(struct btrfs_trans_handle *trans, struct inode *inode,
> +		    struct posix_acl *acl, int type)
>   {
>   	int ret, size = 0;
>   	const char *name;
> @@ -127,35 +127,3 @@ int btrfs_set_acl(struct user_namespace *mnt_userns, struct inode *inode,
>   		inode->i_mode = old_mode;
>   	return ret;
>   }
> -
> -int btrfs_init_acl(struct btrfs_trans_handle *trans,
> -		   struct inode *inode, struct inode *dir)
> -{
> -	struct posix_acl *default_acl, *acl;
> -	int ret = 0;
> -
> -	/* this happens with subvols */
> -	if (!dir)
> -		return 0;
> -
> -	ret = posix_acl_create(dir, &inode->i_mode, &default_acl, &acl);
> -	if (ret)
> -		return ret;
> -
> -	if (default_acl) {
> -		ret = __btrfs_set_acl(trans, inode, default_acl,
> -				      ACL_TYPE_DEFAULT);
> -		posix_acl_release(default_acl);
> -	}
> -
> -	if (acl) {
> -		if (!ret)
> -			ret = __btrfs_set_acl(trans, inode, acl,
> -					      ACL_TYPE_ACCESS);
> -		posix_acl_release(acl);
> -	}
> -
> -	if (!default_acl && !acl)
> -		cache_no_acl(inode);
> -	return ret;
> -}
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 4db17bd05a21..6fa63dfac573 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -3254,10 +3254,30 @@ int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, long nr,
>   int btrfs_set_extent_delalloc(struct btrfs_inode *inode, u64 start, u64 end,
>   			      unsigned int extra_bits,
>   			      struct extent_state **cached_state);
> -int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
> -			     struct btrfs_root *new_root,
> -			     struct btrfs_root *parent_root,
> -			     struct user_namespace *mnt_userns);
> +
> +struct btrfs_new_inode_args {
> +	/* Input */
> +	struct inode *dir;
> +	struct dentry *dentry;
> +	struct inode *inode;
> +	bool orphan;
> +	bool subvol;
> +
> +	/*
> +	 * Output from btrfs_new_inode_prepare(), input to
> +	 * btrfs_create_new_inode().
> +	 */
> +	struct posix_acl *default_acl;
> +	struct posix_acl *acl;
> +};
> +int btrfs_new_inode_prepare(struct btrfs_new_inode_args *args,
> +			    unsigned int *trans_num_items);
> +int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
> +			   struct btrfs_new_inode_args *args);
> +void btrfs_new_inode_args_free(struct btrfs_new_inode_args *args);
> +struct inode *btrfs_new_subvol_inode(struct user_namespace *mnt_userns,
> +				     struct inode *dir);
> +
>    void btrfs_set_delalloc_extent(struct inode *inode, struct extent_state *state,
>   			       unsigned *bits);
>   void btrfs_clear_delalloc_extent(struct inode *inode,
> @@ -3815,15 +3835,16 @@ static inline int __btrfs_fs_compat_ro(struct btrfs_fs_info *fs_info, u64 flag)
>   struct posix_acl *btrfs_get_acl(struct inode *inode, int type, bool rcu);
>   int btrfs_set_acl(struct user_namespace *mnt_userns, struct inode *inode,
>   		  struct posix_acl *acl, int type);
> -int btrfs_init_acl(struct btrfs_trans_handle *trans,
> -		   struct inode *inode, struct inode *dir);
> +int __btrfs_set_acl(struct btrfs_trans_handle *trans, struct inode *inode,
> +		    struct posix_acl *acl, int type);
>   #else
>   #define btrfs_get_acl NULL
>   #define btrfs_set_acl NULL
> -static inline int btrfs_init_acl(struct btrfs_trans_handle *trans,
> -				 struct inode *inode, struct inode *dir)
> +static inline int __btrfs_set_acl(struct btrfs_trans_handle *trans,
> +				  struct inode *inode, struct posix_acl *acl,
> +				  int type)
>   {
> -	return 0;
> +	return -EOPNOTSUPP;
>   }
>   #endif
>   
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index c47bdada2440..2f17e0598664 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -223,14 +223,26 @@ static int btrfs_dirty_inode(struct inode *inode);
>   
>   static int btrfs_init_inode_security(struct btrfs_trans_handle *trans,
>   				     struct inode *inode,  struct inode *dir,
> -				     const struct qstr *qstr)
> +				     const struct qstr *qstr,
> +				     struct posix_acl *default_acl,
> +				     struct posix_acl *acl)
>   {
>   	int err;
>   
> -	err = btrfs_init_acl(trans, inode, dir);
> -	if (!err)
> -		err = btrfs_xattr_security_init(trans, inode, dir, qstr);
> -	return err;
> +	if (default_acl) {
> +		err = __btrfs_set_acl(trans, inode, default_acl,
> +				      ACL_TYPE_DEFAULT);
> +		if (err)
> +			return err;
> +	}
> +	if (acl) {
> +		err = __btrfs_set_acl(trans, inode, acl, ACL_TYPE_ACCESS);
> +		if (err)
> +			return err;
> +	}
> +	if (!default_acl && !acl)
> +		cache_no_acl(inode);
> +	return btrfs_xattr_security_init(trans, inode, dir, qstr);
>   }
>   
>   /*
> @@ -6059,6 +6071,49 @@ static int btrfs_insert_inode_locked(struct inode *inode)
>   		   btrfs_find_actor, &args);
>   }
>   
> +int btrfs_new_inode_prepare(struct btrfs_new_inode_args *args,
> +			    unsigned int *trans_num_items)
> +{
> +	struct inode *dir = args->dir;
> +	struct inode *inode = args->inode;
> +	int ret;
> +
> +	ret = posix_acl_create(dir, &inode->i_mode, &args->default_acl,
> +			       &args->acl);
> +	if (ret)
> +		return ret;
> +
> +	*trans_num_items = 1; /* 1 to add inode item */
> +	if (BTRFS_I(dir)->prop_compress)
> +		(*trans_num_items)++; /* 1 to add compression property */
> +	if (args->default_acl)
> +		(*trans_num_items)++; /* 1 to add default ACL xattr */
> +	if (args->acl)
> +		(*trans_num_items)++; /* 1 to add access ACL xattr */
> +#ifdef CONFIG_SECURITY
> +	if (dir->i_security)
> +		(*trans_num_items)++; /* 1 to add LSM xattr */
> +#endif
> +	if (args->orphan) {
> +		(*trans_num_items)++; /* 1 to add orphan item */
> +	} else {
> +		/*
> +		 * 1 to add inode ref
> +		 * 1 to add dir item
> +		 * 1 to add dir index
> +		 * 1 to update parent inode item
> +		 */
> +		*trans_num_items += 4;
> +	}
> +	return 0;
> +}
> +
> +void btrfs_new_inode_args_free(struct btrfs_new_inode_args *args)
> +{
> +	posix_acl_release(args->acl);
> +	posix_acl_release(args->default_acl);
> +}
> +
>   /*
>    * Inherit flags from the parent inode.
>    *
> @@ -6068,9 +6123,6 @@ static void btrfs_inherit_iflags(struct inode *inode, struct inode *dir)
>   {
>   	unsigned int flags;
>   
> -	if (!dir)
> -		return;
> -
>   	flags = BTRFS_I(dir)->flags;
>   
>   	if (flags & BTRFS_INODE_NOCOMPRESS) {
> @@ -6090,65 +6142,52 @@ static void btrfs_inherit_iflags(struct inode *inode, struct inode *dir)
>   	btrfs_sync_inode_flags_to_i_flags(inode);
>   }
>   
> -static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
> -				     struct btrfs_root *root,
> -				     struct user_namespace *mnt_userns,
> -				     struct inode *dir,
> -				     const char *name, int name_len,
> -				     umode_t mode, u64 *index)
> +int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
> +			   struct btrfs_new_inode_args *args)
>   {
> -	struct btrfs_fs_info *fs_info = root->fs_info;
> -	struct inode *inode;
> -	struct btrfs_inode_item *inode_item;
> -	struct btrfs_key *location;
> +	struct inode *dir = args->dir;
> +	struct inode *inode = args->inode;
> +	const char *name = args->orphan ? NULL : args->dentry->d_name.name;
> +	int name_len = args->orphan ? 0 : args->dentry->d_name.len;
> +	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
>   	struct btrfs_path *path;
>   	u64 objectid;
> -	struct btrfs_inode_ref *ref;
> +	struct btrfs_key *location;
> +	struct btrfs_root *root;
>   	struct btrfs_key key[2];
>   	u32 sizes[2];
>   	struct btrfs_item_batch batch;
> +	struct btrfs_inode_item *inode_item;
> +	struct btrfs_inode_ref *ref;
>   	unsigned long ptr;
> -	unsigned int nofs_flag;
>   	int ret;
>   
>   	path = btrfs_alloc_path();
>   	if (!path)
> -		return ERR_PTR(-ENOMEM);
> +		return -ENOMEM;
>   
> -	nofs_flag = memalloc_nofs_save();
> -	inode = new_inode(fs_info->sb);
> -	memalloc_nofs_restore(nofs_flag);
> -	if (!inode) {
> -		btrfs_free_path(path);
> -		return ERR_PTR(-ENOMEM);
> -	}
> -
> -	/*
> -	 * O_TMPFILE, set link count to 0, so that after this point,
> -	 * we fill in an inode item with the correct link count.
> -	 */
> -	if (!name)
> -		set_nlink(inode, 0);
> +	if (!args->subvol)
> +		BTRFS_I(inode)->root = btrfs_grab_root(BTRFS_I(dir)->root);
> +	root = BTRFS_I(inode)->root;
>   
>   	ret = btrfs_get_free_objectid(root, &objectid);
> -	if (ret) {
> -		btrfs_free_path(path);
> -		iput(inode);
> -		return ERR_PTR(ret);
> -	}
> +	if (ret)
> +		goto out;
>   	inode->i_ino = objectid;
>   
> -	if (dir && name) {
> +	if (args->orphan) {
> +		/*
> +		 * O_TMPFILE, set link count to 0, so that after this point, we
> +		 * fill in an inode item with the correct link count.
> +		 */
> +		set_nlink(inode, 0);
> +	} else {
>   		trace_btrfs_inode_request(dir);
>   
> -		ret = btrfs_set_inode_index(BTRFS_I(dir), index);
> -		if (ret) {
> -			btrfs_free_path(path);
> -			iput(inode);
> -			return ERR_PTR(ret);
> -		}
> -	} else if (dir) {
> -		*index = 0;
> +		ret = btrfs_set_inode_index(BTRFS_I(dir),
> +					    &BTRFS_I(inode)->dir_index);
> +		if (ret)
> +			goto out;
>   	}
>   	/*
>   	 * index_cnt is ignored for everything but a dir,
> @@ -6156,14 +6195,18 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
>   	 * number
>   	 */
>   	BTRFS_I(inode)->index_cnt = 2;
> -	BTRFS_I(inode)->dir_index = *index;
> -	BTRFS_I(inode)->root = btrfs_grab_root(root);
>   	BTRFS_I(inode)->generation = trans->transid;
>   	inode->i_generation = BTRFS_I(inode)->generation;
>   
> -	btrfs_inherit_iflags(inode, dir);
> +	/*
> +	 * Subvolumes don't inherit flags from their parent directory.
> +	 * Originally this was probably by accident, but we probably can't
> +	 * change it now.
> +	 */
> +	if (!args->subvol)
> +		btrfs_inherit_iflags(inode, dir);
>   
> -	if (S_ISREG(mode)) {
> +	if (S_ISREG(inode->i_mode)) {
>   		if (btrfs_test_opt(fs_info, NODATASUM))
>   			BTRFS_I(inode)->flags |= BTRFS_INODE_NODATASUM;
>   		if (btrfs_test_opt(fs_info, NODATACOW))
> @@ -6171,6 +6214,57 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
>   				BTRFS_INODE_NODATASUM;
>   	}
>   
> +	location = &BTRFS_I(inode)->location;
> +	location->objectid = objectid;
> +	location->offset = 0;
> +	location->type = BTRFS_INODE_ITEM_KEY;
> +
> +	ret = btrfs_insert_inode_locked(inode);
> +	if (ret < 0) {
> +		if (!args->orphan)
> +			BTRFS_I(dir)->index_cnt--;
> +		goto out;
> +	}
> +
> +	if (args->subvol) {
> +		struct inode *parent;
> +
> +		/*
> +		 * Subvolumes inherit properties from their parent subvolume,
> +		 * not the directory they were created in.
> +		 */
> +		parent = btrfs_iget(fs_info->sb, BTRFS_FIRST_FREE_OBJECTID,
> +				    BTRFS_I(dir)->root);
> +		if (IS_ERR(parent)) {
> +			ret = PTR_ERR(parent);
> +		} else {
> +			ret = btrfs_inode_inherit_props(trans, inode, parent);
> +			iput(parent);
> +		}
> +	} else {
> +		ret = btrfs_inode_inherit_props(trans, inode, dir);
> +	}
> +	if (ret) {
> +		btrfs_err(fs_info,
> +			  "error inheriting props for ino %llu (root %llu): %d",
> +			  btrfs_ino(BTRFS_I(inode)), root->root_key.objectid,
> +			  ret);
> +	}
> +
> +	/*
> +	 * Subvolumes don't inherit ACLs or get passed to the LSM. This is
> +	 * probably a bug.
> +	 */
> +	if (!args->subvol) {
> +		ret = btrfs_init_inode_security(trans, inode, dir,
> +						&args->dentry->d_name,
> +						args->default_acl, args->acl);
> +		if (ret) {
> +			btrfs_abort_transaction(trans, ret);
> +			goto discard;
> +		}
> +	}
> +
>   	/*
>   	 * We could have gotten an inode number from somebody who was fsynced
>   	 * and then removed in this same transaction, so let's just set full
> @@ -6185,7 +6279,7 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
>   
>   	sizes[0] = sizeof(struct btrfs_inode_item);
>   
> -	if (name) {
> +	if (!args->orphan) {
>   		/*
>   		 * Start new inodes with an inode_ref. This is slightly more
>   		 * efficient for small numbers of hard links since they will
> @@ -6194,57 +6288,61 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
>   		 */
>   		key[1].objectid = objectid;
>   		key[1].type = BTRFS_INODE_REF_KEY;
> -		if (dir)
> -			key[1].offset = btrfs_ino(BTRFS_I(dir));
> -		else
> +		if (args->subvol) {
>   			key[1].offset = objectid;
> -
> -		sizes[1] = name_len + sizeof(*ref);
> -	}
> -
> -	location = &BTRFS_I(inode)->location;
> -	location->objectid = objectid;
> -	location->offset = 0;
> -	location->type = BTRFS_INODE_ITEM_KEY;
> -
> -	ret = btrfs_insert_inode_locked(inode);
> -	if (ret < 0) {
> -		iput(inode);
> -		goto fail;
> +			sizes[1] = 2 + sizeof(*ref);
> +		} else {
> +			key[1].offset = btrfs_ino(BTRFS_I(dir));
> +			sizes[1] = name_len + sizeof(*ref);
> +		}
>   	}
>   
>   	batch.keys = &key[0];
>   	batch.data_sizes = &sizes[0];
> -	batch.total_data_size = sizes[0] + (name ? sizes[1] : 0);
> -	batch.nr = name ? 2 : 1;
> +	batch.total_data_size = sizes[0] + (args->orphan ? 0 : sizes[1]);
> +	batch.nr = args->orphan ? 1 : 2;
>   	ret = btrfs_insert_empty_items(trans, root, path, &batch);
> -	if (ret != 0)
> -		goto fail_unlock;
> -
> -	inode_init_owner(mnt_userns, inode, dir, mode);
> +	if (ret != 0) {
> +		btrfs_abort_transaction(trans, ret);
> +		goto discard;
> +	}
>   
>   	inode->i_mtime = current_time(inode);
>   	inode->i_atime = inode->i_mtime;
>   	inode->i_ctime = inode->i_mtime;
>   	BTRFS_I(inode)->i_otime = inode->i_mtime;
>   
> +	/*
> +	 * We're going to fill the inode item now, so at this point the inode
> +	 * must be fully initialized.
> +	 */
> +
>   	inode_item = btrfs_item_ptr(path->nodes[0], path->slots[0],
>   				  struct btrfs_inode_item);
>   	memzero_extent_buffer(path->nodes[0], (unsigned long)inode_item,
>   			     sizeof(*inode_item));
>   	fill_inode_item(trans, path->nodes[0], inode_item, inode);
>   
> -	if (name) {
> +	if (!args->orphan) {
>   		ref = btrfs_item_ptr(path->nodes[0], path->slots[0] + 1,
>   				     struct btrfs_inode_ref);
> -		btrfs_set_inode_ref_name_len(path->nodes[0], ref, name_len);
> -		btrfs_set_inode_ref_index(path->nodes[0], ref, *index);
>   		ptr = (unsigned long)(ref + 1);
> -		write_extent_buffer(path->nodes[0], name, ptr, name_len);
> +		if (args->subvol) {
> +			btrfs_set_inode_ref_name_len(path->nodes[0], ref, 2);
> +			btrfs_set_inode_ref_index(path->nodes[0], ref, 0);
> +			write_extent_buffer(path->nodes[0], "..", ptr, 2);
> +		} else {
> +			btrfs_set_inode_ref_name_len(path->nodes[0], ref,
> +						     name_len);
> +			btrfs_set_inode_ref_index(path->nodes[0], ref,
> +						  BTRFS_I(inode)->dir_index);
> +			write_extent_buffer(path->nodes[0], name, ptr,
> +					    name_len);
> +		}
>   	}
>   
>   	btrfs_mark_buffer_dirty(path->nodes[0]);
> -	btrfs_free_path(path);
> +	btrfs_release_path(path);
>   
>   	inode_tree_add(inode);
>   
> @@ -6253,21 +6351,26 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
>   
>   	btrfs_update_root_times(trans, root);
>   
> -	ret = btrfs_inode_inherit_props(trans, inode, dir);
> -	if (ret)
> -		btrfs_err(fs_info,
> -			  "error inheriting props for ino %llu (root %llu): %d",
> -			btrfs_ino(BTRFS_I(inode)), root->root_key.objectid, ret);
> +	if (args->orphan) {
> +		ret = btrfs_orphan_add(trans, BTRFS_I(inode));
> +	} else {
> +		ret = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode), name,
> +				     name_len, 0, BTRFS_I(inode)->dir_index);
> +	}
> +	if (ret) {
> +		btrfs_abort_transaction(trans, ret);
> +		goto discard;
> +	}
>   
> -	return inode;
> +	ret = 0;
> +	goto out;
>   
> -fail_unlock:
> +discard:
> +	ihold(inode);
>   	discard_new_inode(inode);
> -fail:
> -	if (dir && name)
> -		BTRFS_I(dir)->index_cnt--;
> +out:
>   	btrfs_free_path(path);
> -	return ERR_PTR(ret);
> +	return ret;
>   }
>   
>   /*
> @@ -6359,125 +6462,72 @@ int btrfs_add_link(struct btrfs_trans_handle *trans,
>   	return ret;
>   }
>   
> +static int btrfs_create_common(struct inode *dir, struct dentry *dentry,
> +			       struct inode *inode)
> +{
> +
> +	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
> +	struct btrfs_root *root = BTRFS_I(dir)->root;
> +	struct btrfs_new_inode_args new_inode_args = {
> +		.dir = dir,
> +		.dentry = dentry,
> +		.inode = inode,
> +	};
> +	unsigned int trans_num_items;
> +	struct btrfs_trans_handle *trans;
> +	int ret;
> +
> +	ret = btrfs_new_inode_prepare(&new_inode_args, &trans_num_items);
> +	if (ret)
> +		goto out_inode;
> +
> +	trans = btrfs_start_transaction(root, trans_num_items);
> +	if (IS_ERR(trans)) {
> +		ret = PTR_ERR(trans);
> +		goto out_new_inode_args;
> +	}
> +
> +	ret = btrfs_create_new_inode(trans, &new_inode_args);
> +	if (!ret)
> +		d_instantiate_new(dentry, inode);
> +
> +	btrfs_end_transaction(trans);
> +	btrfs_btree_balance_dirty(fs_info);
> +out_new_inode_args:
> +	btrfs_new_inode_args_free(&new_inode_args);
> +out_inode:
> +	if (ret)
> +		iput(inode);
> +	return ret;
> +}
> +
>   static int btrfs_mknod(struct user_namespace *mnt_userns, struct inode *dir,
>   		       struct dentry *dentry, umode_t mode, dev_t rdev)
>   {
> -	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
> -	struct btrfs_trans_handle *trans;
> -	struct btrfs_root *root = BTRFS_I(dir)->root;
> -	struct inode *inode = NULL;
> -	int err;
> -	u64 index = 0;
> +	struct inode *inode;
>   
> -	/*
> -	 * 2 for inode item and ref
> -	 * 2 for dir items
> -	 * 1 for xattr if selinux is on
> -	 */
> -	trans = btrfs_start_transaction(root, 5);
> -	if (IS_ERR(trans))
> -		return PTR_ERR(trans);
> -
> -	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
> -			dentry->d_name.name, dentry->d_name.len,
> -			mode, &index);
> -	if (IS_ERR(inode)) {
> -		err = PTR_ERR(inode);
> -		inode = NULL;
> -		goto out_unlock;
> -	}
> -
> -	/*
> -	* If the active LSM wants to access the inode during
> -	* d_instantiate it needs these. Smack checks to see
> -	* if the filesystem supports xattrs by looking at the
> -	* ops vector.
> -	*/
> +	inode = new_inode(dir->i_sb);
> +	if (!inode)
> +		return -ENOMEM;
> +	inode_init_owner(mnt_userns, inode, dir, mode);
>   	inode->i_op = &btrfs_special_inode_operations;
>   	init_special_inode(inode, inode->i_mode, rdev);
> -
> -	err = btrfs_init_inode_security(trans, inode, dir, &dentry->d_name);
> -	if (err)
> -		goto out_unlock;
> -
> -	err = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
> -			     dentry->d_name.name, dentry->d_name.len, 0, index);
> -	if (err)
> -		goto out_unlock;
> -
> -	btrfs_update_inode(trans, root, BTRFS_I(inode));
> -	d_instantiate_new(dentry, inode);
> -
> -out_unlock:
> -	btrfs_end_transaction(trans);
> -	btrfs_btree_balance_dirty(fs_info);
> -	if (err && inode) {
> -		inode_dec_link_count(inode);
> -		discard_new_inode(inode);
> -	}
> -	return err;
> +	return btrfs_create_common(dir, dentry, inode);
>   }
>   
>   static int btrfs_create(struct user_namespace *mnt_userns, struct inode *dir,
>   			struct dentry *dentry, umode_t mode, bool excl)
>   {
> -	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
> -	struct btrfs_trans_handle *trans;
> -	struct btrfs_root *root = BTRFS_I(dir)->root;
> -	struct inode *inode = NULL;
> -	int err;
> -	u64 index = 0;
> +	struct inode *inode;
>   
> -	/*
> -	 * 2 for inode item and ref
> -	 * 2 for dir items
> -	 * 1 for xattr if selinux is on
> -	 */
> -	trans = btrfs_start_transaction(root, 5);
> -	if (IS_ERR(trans))
> -		return PTR_ERR(trans);
> -
> -	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
> -			dentry->d_name.name, dentry->d_name.len,
> -			mode, &index);
> -	if (IS_ERR(inode)) {
> -		err = PTR_ERR(inode);
> -		inode = NULL;
> -		goto out_unlock;
> -	}
> -	/*
> -	* If the active LSM wants to access the inode during
> -	* d_instantiate it needs these. Smack checks to see
> -	* if the filesystem supports xattrs by looking at the
> -	* ops vector.
> -	*/
> +	inode = new_inode(dir->i_sb);
> +	if (!inode)
> +		return -ENOMEM;
> +	inode_init_owner(mnt_userns, inode, dir, mode);
>   	inode->i_fop = &btrfs_file_operations;
>   	inode->i_op = &btrfs_file_inode_operations;
>   	inode->i_mapping->a_ops = &btrfs_aops;
> -
> -	err = btrfs_init_inode_security(trans, inode, dir, &dentry->d_name);
> -	if (err)
> -		goto out_unlock;
> -
> -	err = btrfs_update_inode(trans, root, BTRFS_I(inode));
> -	if (err)
> -		goto out_unlock;
> -
> -	err = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
> -			     dentry->d_name.name, dentry->d_name.len, 0, index);
> -	if (err)
> -		goto out_unlock;
> -
> -	d_instantiate_new(dentry, inode);
> -
> -out_unlock:
> -	btrfs_end_transaction(trans);
> -	if (err && inode) {
> -		inode_dec_link_count(inode);
> -		discard_new_inode(inode);
> -	}
> -	btrfs_btree_balance_dirty(fs_info);
> -	return err;
> +	return btrfs_create_common(dir, dentry, inode);
>   }
>   
>   static int btrfs_link(struct dentry *old_dentry, struct inode *dir,
> @@ -6561,59 +6611,15 @@ static int btrfs_link(struct dentry *old_dentry, struct inode *dir,
>   static int btrfs_mkdir(struct user_namespace *mnt_userns, struct inode *dir,
>   		       struct dentry *dentry, umode_t mode)
>   {
> -	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
> -	struct inode *inode = NULL;
> -	struct btrfs_trans_handle *trans;
> -	struct btrfs_root *root = BTRFS_I(dir)->root;
> -	int err = 0;
> -	u64 index = 0;
> +	struct inode *inode;
>   
> -	/*
> -	 * 2 items for inode and ref
> -	 * 2 items for dir items
> -	 * 1 for xattr if selinux is on
> -	 */
> -	trans = btrfs_start_transaction(root, 5);
> -	if (IS_ERR(trans))
> -		return PTR_ERR(trans);
> -
> -	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
> -			dentry->d_name.name, dentry->d_name.len,
> -			S_IFDIR | mode, &index);
> -	if (IS_ERR(inode)) {
> -		err = PTR_ERR(inode);
> -		inode = NULL;
> -		goto out_fail;
> -	}
> -
> -	/* these must be set before we unlock the inode */
> +	inode = new_inode(dir->i_sb);
> +	if (!inode)
> +		return -ENOMEM;
> +	inode_init_owner(mnt_userns, inode, dir, S_IFDIR | mode);
>   	inode->i_op = &btrfs_dir_inode_operations;
>   	inode->i_fop = &btrfs_dir_file_operations;
> -
> -	err = btrfs_init_inode_security(trans, inode, dir, &dentry->d_name);
> -	if (err)
> -		goto out_fail;
> -
> -	err = btrfs_update_inode(trans, root, BTRFS_I(inode));
> -	if (err)
> -		goto out_fail;
> -
> -	err = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
> -			dentry->d_name.name,
> -			dentry->d_name.len, 0, index);
> -	if (err)
> -		goto out_fail;
> -
> -	d_instantiate_new(dentry, inode);
> -
> -out_fail:
> -	btrfs_end_transaction(trans);
> -	if (err && inode) {
> -		inode_dec_link_count(inode);
> -		discard_new_inode(inode);
> -	}
> -	btrfs_btree_balance_dirty(fs_info);
> -	return err;
> +	return btrfs_create_common(dir, dentry, inode);
>   }
>   
>   static noinline int uncompress_inline(struct btrfs_path *path,
> @@ -8747,38 +8753,23 @@ static int btrfs_truncate(struct inode *inode, bool skip_writeback)
>   	return ret;
>   }
>   
> -/*
> - * create a new subvolume directory/inode (helper for the ioctl).
> - */
> -int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
> -			     struct btrfs_root *new_root,
> -			     struct btrfs_root *parent_root,
> -			     struct user_namespace *mnt_userns)
> +struct inode *btrfs_new_subvol_inode(struct user_namespace *mnt_userns,
> +				     struct inode *dir)
>   {
>   	struct inode *inode;
> -	int err;
> -	u64 index = 0;
>   
> -	inode = btrfs_new_inode(trans, new_root, mnt_userns, NULL, "..", 2,
> -				S_IFDIR | (~current_umask() & S_IRWXUGO),
> -				&index);
> -	if (IS_ERR(inode))
> -		return PTR_ERR(inode);
> +	inode = new_inode(dir->i_sb);
> +	if (!inode)
> +		return NULL;
> +	/*
> +	 * Subvolumes don't inherit the sgid bit or the parent's gid if the
> +	 * parent's sgid bit is set. This is probably a bug.
> +	 */
> +	inode_init_owner(mnt_userns, inode, NULL,
> +			 S_IFDIR | (~current_umask() & S_IRWXUGO));
>   	inode->i_op = &btrfs_dir_inode_operations;
>   	inode->i_fop = &btrfs_dir_file_operations;
> -
> -	unlock_new_inode(inode);
> -
> -	err = btrfs_subvol_inherit_props(trans, new_root, parent_root);
> -	if (err)
> -		btrfs_err(new_root->fs_info,
> -			  "error inheriting subvolume %llu properties: %d",
> -			  new_root->root_key.objectid, err);
> -
> -	err = btrfs_update_inode(trans, new_root, BTRFS_I(inode));
> -
> -	iput(inode);
> -	return err;
> +	return inode;
>   }
>   
>   struct inode *btrfs_alloc_inode(struct super_block *sb)
> @@ -9254,49 +9245,19 @@ static int btrfs_rename_exchange(struct inode *old_dir,
>   	return ret;
>   }
>   
> -static int btrfs_whiteout_for_rename(struct btrfs_trans_handle *trans,
> -				     struct btrfs_root *root,
> -				     struct user_namespace *mnt_userns,
> -				     struct inode *dir,
> -				     struct dentry *dentry)
> +static struct inode *new_whiteout_inode(struct user_namespace *mnt_userns,
> +					struct inode *dir)
>   {
> -	int ret;
>   	struct inode *inode;
> -	u64 index;
>   
> -	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
> -				dentry->d_name.name,
> -				dentry->d_name.len,
> -				S_IFCHR | WHITEOUT_MODE,
> -				&index);
> -
> -	if (IS_ERR(inode)) {
> -		ret = PTR_ERR(inode);
> -		return ret;
> +	inode = new_inode(dir->i_sb);
> +	if (inode) {
> +		inode_init_owner(mnt_userns, inode, dir,
> +				 S_IFCHR | WHITEOUT_MODE);
> +		inode->i_op = &btrfs_special_inode_operations;
> +		init_special_inode(inode, inode->i_mode, WHITEOUT_DEV);
>   	}
> -
> -	inode->i_op = &btrfs_special_inode_operations;
> -	init_special_inode(inode, inode->i_mode,
> -		WHITEOUT_DEV);
> -
> -	ret = btrfs_init_inode_security(trans, inode, dir,
> -				&dentry->d_name);
> -	if (ret)
> -		goto out;
> -
> -	ret = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
> -			     dentry->d_name.name, dentry->d_name.len, 0, index);
> -	if (ret)
> -		goto out;
> -
> -	ret = btrfs_update_inode(trans, root, BTRFS_I(inode));
> -out:
> -	unlock_new_inode(inode);
> -	if (ret)
> -		inode_dec_link_count(inode);
> -	iput(inode);
> -
> -	return ret;
> +	return inode;
>   }
>   
>   static int btrfs_rename(struct user_namespace *mnt_userns,
> @@ -9305,6 +9266,10 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
>   			unsigned int flags)
>   {
>   	struct btrfs_fs_info *fs_info = btrfs_sb(old_dir->i_sb);
> +	struct btrfs_new_inode_args whiteout_args = {
> +		.dir = old_dir,
> +		.dentry = old_dentry,
> +	};
>   	struct btrfs_trans_handle *trans;
>   	unsigned int trans_num_items;
>   	struct btrfs_root *root = BTRFS_I(old_dir)->root;
> @@ -9359,6 +9324,18 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
>   	if (new_inode && S_ISREG(old_inode->i_mode) && new_inode->i_size)
>   		filemap_flush(old_inode->i_mapping);
>   
> +	if (flags & RENAME_WHITEOUT) {
> +		whiteout_args.inode = new_whiteout_inode(mnt_userns, old_dir);
> +		if (!whiteout_args.inode)
> +			return -ENOMEM;
> +		ret = btrfs_new_inode_prepare(&whiteout_args, &trans_num_items);
> +		if (ret)
> +			goto out_whiteout_inode;
> +	} else {
> +		/* 1 to update the old parent inode. */
> +		trans_num_items = 1;
> +	}
> +
>   	if (old_ino == BTRFS_FIRST_FREE_OBJECTID) {
>   		/* close the racy window with snapshot create/destroy ioctl */
>   		down_read(&fs_info->subvol_sem);
> @@ -9368,24 +9345,25 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
>   		 * 1 to add new root ref
>   		 * 1 to add new root backref
>   		 */
> -		trans_num_items = 4;
> +		trans_num_items += 4;
>   	} else {
>   		/*
>   		 * 1 to update inode
>   		 * 1 to remove old inode ref
>   		 * 1 to add new inode ref
>   		 */
> -		trans_num_items = 3;
> +		trans_num_items += 3;
>   	}
>   	/*
>   	 * 1 to remove old dir item
>   	 * 1 to remove old dir index
> -	 * 1 to update old parent inode
>   	 * 1 to add new dir item
>   	 * 1 to add new dir index
> -	 * 1 to update new parent inode (if it's not the same as the old parent)
>   	 */
> -	trans_num_items += 6;
> +	trans_num_items += 4;
> +	/*
> +	 * 1 to update new parent inode if it's not the same as the old parent
> +	 */
>   	if (new_dir != old_dir)
>   		trans_num_items++;
>   	if (new_inode) {
> @@ -9398,8 +9376,6 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
>   		 */
>   		trans_num_items += 5;
>   	}
> -	if (flags & RENAME_WHITEOUT)
> -		trans_num_items += 5;
>   	trans = btrfs_start_transaction(root, trans_num_items);
>   	if (IS_ERR(trans)) {
>   		ret = PTR_ERR(trans);
> @@ -9495,12 +9471,14 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
>   				   rename_ctx.index, new_dentry->d_parent);
>   
>   	if (flags & RENAME_WHITEOUT) {
> -		ret = btrfs_whiteout_for_rename(trans, root, mnt_userns,
> -						old_dir, old_dentry);
> -
> +		ret = btrfs_create_new_inode(trans, &whiteout_args);
>   		if (ret) {
>   			btrfs_abort_transaction(trans, ret);
>   			goto out_fail;
> +		} else {
> +			unlock_new_inode(whiteout_args.inode);
> +			iput(whiteout_args.inode);
> +			whiteout_args.inode = NULL;
>   		}
>   	}
>   out_fail:
> @@ -9509,7 +9487,11 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
>   out_notrans:
>   	if (old_ino == BTRFS_FIRST_FREE_OBJECTID)
>   		up_read(&fs_info->subvol_sem);
> -
> +	if (flags & RENAME_WHITEOUT)
> +		btrfs_new_inode_args_free(&whiteout_args);
> +out_whiteout_inode:
> +	if (flags & RENAME_WHITEOUT)
> +		iput(whiteout_args.inode);
>   	return ret;
>   }
>   
> @@ -9724,13 +9706,17 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
>   			 struct dentry *dentry, const char *symname)
>   {
>   	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
> +	struct inode *inode;
> +	struct btrfs_new_inode_args new_inode_args = {
> +		.dir = dir,
> +		.dentry = dentry,
> +	};
> +	unsigned int trans_num_items;
>   	struct btrfs_trans_handle *trans;
>   	struct btrfs_root *root = BTRFS_I(dir)->root;
>   	struct btrfs_path *path;
>   	struct btrfs_key key;
> -	struct inode *inode = NULL;
>   	int err;
> -	u64 index = 0;
>   	int name_len;
>   	int datasize;
>   	unsigned long ptr;
> @@ -9741,44 +9727,40 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
>   	if (name_len > BTRFS_MAX_INLINE_DATA_SIZE(fs_info))
>   		return -ENAMETOOLONG;
>   
> -	/*
> -	 * 2 items for inode item and ref
> -	 * 2 items for dir items
> -	 * 1 item for updating parent inode item
> -	 * 1 item for the inline extent item
> -	 * 1 item for xattr if selinux is on
> -	 */
> -	trans = btrfs_start_transaction(root, 7);
> -	if (IS_ERR(trans))
> -		return PTR_ERR(trans);
> +	inode = new_inode(dir->i_sb);
> +	if (!inode)
> +		return -ENOMEM;
> +	inode_init_owner(mnt_userns, inode, dir, S_IFLNK | S_IRWXUGO);
> +	inode->i_op = &btrfs_symlink_inode_operations;
> +	inode_nohighmem(inode);
> +	inode->i_mapping->a_ops = &btrfs_aops;
> +	btrfs_i_size_write(BTRFS_I(inode), name_len);
> +	inode_set_bytes(inode, name_len);
>   
> -	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
> -				dentry->d_name.name, dentry->d_name.len,
> -				S_IFLNK | S_IRWXUGO, &index);
> -	if (IS_ERR(inode)) {
> -		err = PTR_ERR(inode);
> -		inode = NULL;
> -		goto out_unlock;
> +	new_inode_args.inode = inode;
> +	err = btrfs_new_inode_prepare(&new_inode_args, &trans_num_items);
> +	if (err)
> +		goto out_inode;
> +	/* 1 additional item for the inline extent */
> +	trans_num_items++;
> +
> +	trans = btrfs_start_transaction(root, trans_num_items);
> +	if (IS_ERR(trans)) {
> +		err = PTR_ERR(trans);
> +		goto out_new_inode_args;
>   	}
>   
> -	/*
> -	* If the active LSM wants to access the inode during
> -	* d_instantiate it needs these. Smack checks to see
> -	* if the filesystem supports xattrs by looking at the
> -	* ops vector.
> -	*/
> -	inode->i_fop = &btrfs_file_operations;
> -	inode->i_op = &btrfs_file_inode_operations;
> -	inode->i_mapping->a_ops = &btrfs_aops;
> -
> -	err = btrfs_init_inode_security(trans, inode, dir, &dentry->d_name);
> +	err = btrfs_create_new_inode(trans, &new_inode_args);
>   	if (err)
> -		goto out_unlock;
> +		goto out;
>   
>   	path = btrfs_alloc_path();
>   	if (!path) {
>   		err = -ENOMEM;
> -		goto out_unlock;
> +		btrfs_abort_transaction(trans, err);
> +		discard_new_inode(inode);
> +		inode = NULL;
> +		goto out;
>   	}
>   	key.objectid = btrfs_ino(BTRFS_I(inode));
>   	key.offset = 0;
> @@ -9787,8 +9769,11 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
>   	err = btrfs_insert_empty_item(trans, root, path, &key,
>   				      datasize);
>   	if (err) {
> +		btrfs_abort_transaction(trans, err);
>   		btrfs_free_path(path);
> -		goto out_unlock;
> +		discard_new_inode(inode);
> +		inode = NULL;
> +		goto out;
>   	}
>   	leaf = path->nodes[0];
>   	ei = btrfs_item_ptr(leaf, path->slots[0],
> @@ -9806,32 +9791,16 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
>   	btrfs_mark_buffer_dirty(leaf);
>   	btrfs_free_path(path);
>   
> -	inode->i_op = &btrfs_symlink_inode_operations;
> -	inode_nohighmem(inode);
> -	inode_set_bytes(inode, name_len);
> -	btrfs_i_size_write(BTRFS_I(inode), name_len);
> -	err = btrfs_update_inode(trans, root, BTRFS_I(inode));
> -	/*
> -	 * Last step, add directory indexes for our symlink inode. This is the
> -	 * last step to avoid extra cleanup of these indexes if an error happens
> -	 * elsewhere above.
> -	 */
> -	if (!err)
> -		err = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
> -				     dentry->d_name.name, dentry->d_name.len, 0,
> -				     index);
> -	if (err)
> -		goto out_unlock;
> -
>   	d_instantiate_new(dentry, inode);
> -
> -out_unlock:
> +	err = 0;
> +out:
>   	btrfs_end_transaction(trans);
> -	if (err && inode) {
> -		inode_dec_link_count(inode);
> -		discard_new_inode(inode);
> -	}
>   	btrfs_btree_balance_dirty(fs_info);
> +out_new_inode_args:
> +	btrfs_new_inode_args_free(&new_inode_args);
> +out_inode:
> +	if (err)
> +		iput(inode);
>   	return err;
>   }
>   
> @@ -10085,59 +10054,61 @@ static int btrfs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
>   			 struct dentry *dentry, umode_t mode)
>   {
>   	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
> -	struct btrfs_trans_handle *trans;
>   	struct btrfs_root *root = BTRFS_I(dir)->root;
> -	struct inode *inode = NULL;
> -	u64 index;
> -	int ret = 0;
> -
> -	/*
> -	 * 5 units required for adding orphan entry
> -	 */
> -	trans = btrfs_start_transaction(root, 5);
> -	if (IS_ERR(trans))
> -		return PTR_ERR(trans);
> -
> -	inode = btrfs_new_inode(trans, root, mnt_userns, dir, NULL, 0,
> -			mode, &index);
> -	if (IS_ERR(inode)) {
> -		ret = PTR_ERR(inode);
> -		inode = NULL;
> -		goto out;
> -	}
> +	struct inode *inode;
> +	struct btrfs_new_inode_args new_inode_args = {
> +		.dir = dir,
> +		.dentry = dentry,
> +		.orphan = true,
> +	};
> +	unsigned int trans_num_items;
> +	struct btrfs_trans_handle *trans;
> +	int ret;
>   
> +	inode = new_inode(dir->i_sb);
> +	if (!inode)
> +		return -ENOMEM;
> +	inode_init_owner(mnt_userns, inode, dir, mode);
>   	inode->i_fop = &btrfs_file_operations;
>   	inode->i_op = &btrfs_file_inode_operations;
>   
>   	inode->i_mapping->a_ops = &btrfs_aops;
>   
> -	ret = btrfs_init_inode_security(trans, inode, dir, NULL);
> +	new_inode_args.inode = inode;
> +	ret = btrfs_new_inode_prepare(&new_inode_args, &trans_num_items);
>   	if (ret)
> -		goto out;
> +		goto out_inode;
>   
> -	ret = btrfs_update_inode(trans, root, BTRFS_I(inode));
> -	if (ret)
> -		goto out;
> -	ret = btrfs_orphan_add(trans, BTRFS_I(inode));
> -	if (ret)
> -		goto out;
> +	trans = btrfs_start_transaction(root, trans_num_items);
> +	if (IS_ERR(trans)) {
> +		ret = PTR_ERR(trans);
> +		goto out_new_inode_args;
> +	}
> +
> +	ret = btrfs_create_new_inode(trans, &new_inode_args);
>   
>   	/*
> -	 * We set number of links to 0 in btrfs_new_inode(), and here we set
> -	 * it to 1 because d_tmpfile() will issue a warning if the count is 0,
> -	 * through:
> +	 * We set number of links to 0 in btrfs_create_new_inode(), and here we
> +	 * set it to 1 because d_tmpfile() will issue a warning if the count is
> +	 * 0, through:
>   	 *
>   	 *    d_tmpfile() -> inode_dec_link_count() -> drop_nlink()
>   	 */
>   	set_nlink(inode, 1);
> -	d_tmpfile(dentry, inode);
> -	unlock_new_inode(inode);
> -	mark_inode_dirty(inode);
> -out:
> +
> +	if (!ret) {
> +		d_tmpfile(dentry, inode);
> +		unlock_new_inode(inode);
> +		mark_inode_dirty(inode);
> +	}
> +
>   	btrfs_end_transaction(trans);
> -	if (ret && inode)
> -		discard_new_inode(inode);
>   	btrfs_btree_balance_dirty(fs_info);
> +out_new_inode_args:
> +	btrfs_new_inode_args_free(&new_inode_args);
> +out_inode:
> +	if (ret)
> +		iput(inode);
>   	return ret;
>   }
>   
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 3cea5530ad83..4d217cff7da5 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -544,13 +544,38 @@ int __pure btrfs_is_empty_uuid(u8 *uuid)
>   	return 1;
>   }
>   
> +/*
> + * Calculate the number of transaction items to reserve for creating a subvolume
> + * or snapshot, not including the inode, directory entries, or parent directory.
> + */
> +static unsigned int create_subvol_num_items(struct btrfs_qgroup_inherit *inherit)
> +{
> +	/*
> +	 * 1 to add root block
> +	 * 1 to add root item
> +	 * 1 to add root ref
> +	 * 1 to add root backref
> +	 * 1 to add UUID item
> +	 * 1 to add qgroup info
> +	 * 1 to add qgroup limit
> +	 * (Ideally the last two would only be accounted if qgroups are enabled,
> +	 * but that can change between now and the time we would insert them)
> +	 */
> +	unsigned int num_items = 7;
> +
> +	if (inherit) {
> +		/* 2 to add qgroup relations for each inherited qgroup */
> +		num_items += 2 * inherit->num_qgroups;
> +	}
> +	return num_items;
> +}
> +
>   static noinline int create_subvol(struct user_namespace *mnt_userns,
>   				  struct inode *dir, struct dentry *dentry,
>   				  struct btrfs_qgroup_inherit *inherit)
>   {
> -	const char *name = dentry->d_name.name;
> -	int namelen = dentry->d_name.len;
>   	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
> +	unsigned int trans_num_items;
>   	struct btrfs_trans_handle *trans;
>   	struct btrfs_key key;
>   	struct btrfs_root_item *root_item;
> @@ -560,11 +585,14 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
>   	struct btrfs_root *new_root;
>   	struct btrfs_block_rsv block_rsv;
>   	struct timespec64 cur_time = current_time(dir);
> -	struct inode *inode;
> +	struct btrfs_new_inode_args new_inode_args = {
> +		.dir = dir,
> +		.dentry = dentry,
> +		.subvol = true,
> +	};
>   	int ret;
> -	dev_t anon_dev = 0;
> +	dev_t anon_dev;
>   	u64 objectid;
> -	u64 index = 0;
>   
>   	root_item = kzalloc(sizeof(*root_item), GFP_KERNEL);
>   	if (!root_item)
> @@ -572,11 +600,7 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
>   
>   	ret = btrfs_get_free_objectid(fs_info->tree_root, &objectid);
>   	if (ret)
> -		goto fail_free;
> -
> -	ret = get_anon_bdev(&anon_dev);
> -	if (ret < 0)
> -		goto fail_free;
> +		goto out_root_item;
>   
>   	/*
>   	 * Don't create subvolume whose level is not zero. Or qgroup will be
> @@ -584,36 +608,47 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
>   	 */
>   	if (btrfs_qgroup_level(objectid)) {
>   		ret = -ENOSPC;
> -		goto fail_free;
> +		goto out_root_item;
>   	}
>   
> -	btrfs_init_block_rsv(&block_rsv, BTRFS_BLOCK_RSV_TEMP);
> -	/*
> -	 * The same as the snapshot creation, please see the comment
> -	 * of create_snapshot().
> -	 */
> -	ret = btrfs_subvolume_reserve_metadata(root, &block_rsv, 8, false);
> +	ret = get_anon_bdev(&anon_dev);
> +	if (ret < 0)
> +		goto out_root_item;
> +
> +	new_inode_args.inode = btrfs_new_subvol_inode(mnt_userns, dir);
> +	if (!new_inode_args.inode) {
> +		ret = -ENOMEM;
> +		goto out_anon_dev;
> +	}
> +	ret = btrfs_new_inode_prepare(&new_inode_args, &trans_num_items);
>   	if (ret)
> -		goto fail_free;
> +		goto out_inode;
> +	trans_num_items += create_subvol_num_items(inherit);
> +
> +	btrfs_init_block_rsv(&block_rsv, BTRFS_BLOCK_RSV_TEMP);
> +	ret = btrfs_subvolume_reserve_metadata(root, &block_rsv,
> +					       trans_num_items, false);
> +	if (ret)
> +		goto out_new_inode_args;
>   
>   	trans = btrfs_start_transaction(root, 0);
>   	if (IS_ERR(trans)) {
>   		ret = PTR_ERR(trans);
>   		btrfs_subvolume_release_metadata(root, &block_rsv);
> -		goto fail_free;
> +		goto out_new_inode_args;
>   	}
>   	trans->block_rsv = &block_rsv;
>   	trans->bytes_reserved = block_rsv.size;
>   
>   	ret = btrfs_qgroup_inherit(trans, 0, objectid, inherit);
>   	if (ret)
> -		goto fail;
> +		goto out;
>   
>   	leaf = btrfs_alloc_tree_block(trans, root, 0, objectid, NULL, 0, 0, 0,
>   				      BTRFS_NESTING_NORMAL);
>   	if (IS_ERR(leaf)) {
>   		ret = PTR_ERR(leaf);
> -		goto fail;
> +		goto out;
>   	}
>   
>   	btrfs_mark_buffer_dirty(leaf);
> @@ -668,75 +703,47 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
>   		btrfs_tree_unlock(leaf);
>   		btrfs_free_tree_block(trans, objectid, leaf, 0, 1);
>   		free_extent_buffer(leaf);
> -		goto fail;
> +		goto out;
>   	}
>   
>   	free_extent_buffer(leaf);
>   	leaf = NULL;
>   
> -	key.offset = (u64)-1;
>   	new_root = btrfs_get_new_fs_root(fs_info, objectid, anon_dev);
>   	if (IS_ERR(new_root)) {
> -		free_anon_bdev(anon_dev);
>   		ret = PTR_ERR(new_root);
>   		btrfs_abort_transaction(trans, ret);
> -		goto fail;
> +		goto out;
>   	}
> -	/* Freeing will be done in btrfs_put_root() of new_root */
> +	/* anon_dev is owned by new_root now. */
>   	anon_dev = 0;
> +	BTRFS_I(new_inode_args.inode)->root = new_root;
> +	/* ... and new_root is owned by new_inode.inode now. */
>   
>   	ret = btrfs_record_root_in_trans(trans, new_root);
>   	if (ret) {
>   		btrfs_put_root(new_root);
>   		btrfs_abort_transaction(trans, ret);
> -		goto fail;
> -	}
> -
> -	ret = btrfs_create_subvol_root(trans, new_root, root, mnt_userns);
> -	btrfs_put_root(new_root);
> -	if (ret) {
> -		/* We potentially lose an unused inode item here */
> -		btrfs_abort_transaction(trans, ret);
> -		goto fail;
> -	}
> -
> -	/*
> -	 * insert the directory item
> -	 */
> -	ret = btrfs_set_inode_index(BTRFS_I(dir), &index);
> -	if (ret) {
> -		btrfs_abort_transaction(trans, ret);
> -		goto fail;
> -	}
> -
> -	ret = btrfs_insert_dir_item(trans, name, namelen, BTRFS_I(dir), &key,
> -				    BTRFS_FT_DIR, index);
> -	if (ret) {
> -		btrfs_abort_transaction(trans, ret);
> -		goto fail;
> -	}
> -
> -	btrfs_i_size_write(BTRFS_I(dir), dir->i_size + namelen * 2);
> -	ret = btrfs_update_inode(trans, root, BTRFS_I(dir));
> -	if (ret) {
> -		btrfs_abort_transaction(trans, ret);
> -		goto fail;
> -	}
> -
> -	ret = btrfs_add_root_ref(trans, objectid, root->root_key.objectid,
> -				 btrfs_ino(BTRFS_I(dir)), index, name, namelen);
> -	if (ret) {
> -		btrfs_abort_transaction(trans, ret);
> -		goto fail;
> +		goto out;
>   	}
>   
>   	ret = btrfs_uuid_tree_add(trans, root_item->uuid,
>   				  BTRFS_UUID_KEY_SUBVOL, objectid);
> -	if (ret)
> +	if (ret) {
>   		btrfs_abort_transaction(trans, ret);
> +		goto out;
> +	}
>   
> -fail:
> -	kfree(root_item);
> +	ret = btrfs_create_new_inode(trans, &new_inode_args);
> +	if (ret) {
> +		btrfs_abort_transaction(trans, ret);
> +		goto out;
> +	}
> +
> +	d_instantiate_new(dentry, new_inode_args.inode);
> +	new_inode_args.inode = NULL;
> +
> +out:
>   	trans->block_rsv = NULL;
>   	trans->bytes_reserved = 0;
>   	btrfs_subvolume_release_metadata(root, &block_rsv);
> @@ -745,18 +752,14 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
>   		btrfs_end_transaction(trans);
>   	else
>   		ret = btrfs_commit_transaction(trans);
> -
> -	if (!ret) {
> -		inode = btrfs_lookup_dentry(dir, dentry);
> -		if (IS_ERR(inode))
> -			return PTR_ERR(inode);
> -		d_instantiate(dentry, inode);
> -	}
> -	return ret;
> -
> -fail_free:
> +out_new_inode_args:
> +	btrfs_new_inode_args_free(&new_inode_args);
> +out_inode:
> +	iput(new_inode_args.inode);
> +out_anon_dev:
>   	if (anon_dev)
>   		free_anon_bdev(anon_dev);
> +out_root_item:
>   	kfree(root_item);
>   	return ret;
>   }
> @@ -769,6 +772,7 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
>   	struct inode *inode;
>   	struct btrfs_pending_snapshot *pending_snapshot;
>   	struct btrfs_trans_handle *trans;
> +	int trans_num_items;
>   	int ret;
>   
>   	/* We do not support snapshotting right now. */
> @@ -805,16 +809,14 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
>   	btrfs_init_block_rsv(&pending_snapshot->block_rsv,
>   			     BTRFS_BLOCK_RSV_TEMP);
>   	/*
> -	 * 1 - parent dir inode
> -	 * 2 - dir entries
> -	 * 1 - root item
> -	 * 2 - root ref/backref
> -	 * 1 - root of snapshot
> -	 * 1 - UUID item
> +	 * 1 to add dir item
> +	 * 1 to add dir index
> +	 * 1 to update parent inode item
>   	 */
> +	trans_num_items = create_subvol_num_items(inherit) + 3;
>   	ret = btrfs_subvolume_reserve_metadata(BTRFS_I(dir)->root,
> -					&pending_snapshot->block_rsv, 8,
> -					false);
> +					       &pending_snapshot->block_rsv,
> +					       trans_num_items, false);
>   	if (ret)
>   		goto free_pending;
>   
> diff --git a/fs/btrfs/props.c b/fs/btrfs/props.c
> index 1a6d2d5b4b33..f5565c296898 100644
> --- a/fs/btrfs/props.c
> +++ b/fs/btrfs/props.c
> @@ -334,9 +334,8 @@ static struct prop_handler prop_handlers[] = {
>   	},
>   };
>   
> -static int inherit_props(struct btrfs_trans_handle *trans,
> -			 struct inode *inode,
> -			 struct inode *parent)
> +int btrfs_inode_inherit_props(struct btrfs_trans_handle *trans,
> +			      struct inode *inode, struct inode *parent)
>   {
>   	struct btrfs_root *root = BTRFS_I(inode)->root;
>   	struct btrfs_fs_info *fs_info = root->fs_info;
> @@ -408,41 +407,6 @@ static int inherit_props(struct btrfs_trans_handle *trans,
>   	return 0;
>   }
>   
> -int btrfs_inode_inherit_props(struct btrfs_trans_handle *trans,
> -			      struct inode *inode,
> -			      struct inode *dir)
> -{
> -	if (!dir)
> -		return 0;
> -
> -	return inherit_props(trans, inode, dir);
> -}
> -
> -int btrfs_subvol_inherit_props(struct btrfs_trans_handle *trans,
> -			       struct btrfs_root *root,
> -			       struct btrfs_root *parent_root)
> -{
> -	struct super_block *sb = root->fs_info->sb;
> -	struct inode *parent_inode, *child_inode;
> -	int ret;
> -
> -	parent_inode = btrfs_iget(sb, BTRFS_FIRST_FREE_OBJECTID, parent_root);
> -	if (IS_ERR(parent_inode))
> -		return PTR_ERR(parent_inode);
> -
> -	child_inode = btrfs_iget(sb, BTRFS_FIRST_FREE_OBJECTID, root);
> -	if (IS_ERR(child_inode)) {
> -		iput(parent_inode);
> -		return PTR_ERR(child_inode);
> -	}
> -
> -	ret = inherit_props(trans, child_inode, parent_inode);
> -	iput(child_inode);
> -	iput(parent_inode);
> -
> -	return ret;
> -}
> -
>   void __init btrfs_props_init(void)
>   {
>   	int i;
> diff --git a/fs/btrfs/props.h b/fs/btrfs/props.h
> index 40b2c65b518c..1dcd5daa3b22 100644
> --- a/fs/btrfs/props.h
> +++ b/fs/btrfs/props.h
> @@ -21,8 +21,4 @@ int btrfs_inode_inherit_props(struct btrfs_trans_handle *trans,
>   			      struct inode *inode,
>   			      struct inode *dir);
>   
> -int btrfs_subvol_inherit_props(struct btrfs_trans_handle *trans,
> -			       struct btrfs_root *root,
> -			       struct btrfs_root *parent_root);
> -
>   #endif

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 00/12] btrfs: inode creation cleanups and fixes
  2022-03-03 23:18 [PATCH 00/12] btrfs: inode creation cleanups and fixes Omar Sandoval
                   ` (11 preceding siblings ...)
  2022-03-03 23:19 ` [PATCH 12/12] btrfs: rework inode creation to fix several issues Omar Sandoval
@ 2022-03-07 14:41 ` Sweet Tea Dorminy
  12 siblings, 0 replies; 18+ messages in thread
From: Sweet Tea Dorminy @ 2022-03-07 14:41 UTC (permalink / raw)
  To: Omar Sandoval, linux-btrfs; +Cc: kernel-team

For 1-11, please feel free to add

Reviewed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>

On 3/3/22 18:18, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
>
> This series contains several cleanups and fixes for our inode creation
> codepaths. The main motivation for this is preparation for fscrypt
> support (in particular, setting up the fscrypt context at inode creation
> time). But, it also reduces a lot of code duplication and fixes some
> minor bugs, so it's worth getting in ahead of time.
>
> Patch 12 is the main change, which unifies and simplifies all of our
> inode creation codepaths. Patches 1-11 are small cleanups that I was
> able to peel off of the big patch.
>
> Thanks!
>
> Omar Sandoval (12):
>    btrfs: reserve correct number of items for unlink and rmdir
>    btrfs: reserve correct number of items for rename
>    btrfs: get rid of btrfs_add_nondir()
>    btrfs: remove unnecessary btrfs_i_size_write(0) calls
>    btrfs: remove unnecessary inode_set_bytes(0) call
>    btrfs: remove unnecessary set_nlink() in btrfs_create_subvol_root()
>    btrfs: remove unused mnt_userns parameter from __btrfs_set_acl
>    btrfs: remove redundant name and name_len parameters to create_subvol
>    btrfs: don't pass parent objectid to btrfs_new_inode() explicitly
>    btrfs: move btrfs_get_free_objectid() call into btrfs_new_inode()
>    btrfs: set inode flags earlier in btrfs_new_inode()
>    btrfs: rework inode creation to fix several issues
>
>   fs/btrfs/acl.c   |  39 +-
>   fs/btrfs/ctree.h |  39 +-
>   fs/btrfs/inode.c | 944 +++++++++++++++++++++++------------------------
>   fs/btrfs/ioctl.c | 175 ++++-----
>   fs/btrfs/props.c |  40 +-
>   fs/btrfs/props.h |   4 -
>   6 files changed, 581 insertions(+), 660 deletions(-)
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 12/12] btrfs: rework inode creation to fix several issues
  2022-03-07 12:28   ` David Sterba
@ 2022-03-10  0:18     ` Omar Sandoval
  0 siblings, 0 replies; 18+ messages in thread
From: Omar Sandoval @ 2022-03-10  0:18 UTC (permalink / raw)
  To: dsterba, linux-btrfs, kernel-team

On Mon, Mar 07, 2022 at 01:28:03PM +0100, David Sterba wrote:
> On Thu, Mar 03, 2022 at 03:19:02PM -0800, Omar Sandoval wrote:
> > From: Omar Sandoval <osandov@fb.com>
> 
> >  fs/btrfs/acl.c   |  36 +--
> >  fs/btrfs/ctree.h |  39 ++-
> >  fs/btrfs/inode.c | 801 +++++++++++++++++++++++------------------------
> >  fs/btrfs/ioctl.c | 174 +++++-----
> >  fs/btrfs/props.c |  40 +--
> >  fs/btrfs/props.h |   4 -
> >  6 files changed, 508 insertions(+), 586 deletions(-)
> 
> Can this be split into more patches? All fine from 1 to 11 and now this,
> it's just too much code change and I don't want to take risk by yet
> another rewrite.

Yeah, at first I thought it'd be too hard to split up the last patch
further, but after you and Sweet Tea both asked me to, I managed to
split out another 4 patches. The end result is almost identical, but
that'll hopefully be easier to review. I'll send it out once xfstests
pass.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 12/12] btrfs: rework inode creation to fix several issues
  2022-03-07 13:25   ` Sweet Tea Dorminy
@ 2022-03-10  0:20     ` Omar Sandoval
  0 siblings, 0 replies; 18+ messages in thread
From: Omar Sandoval @ 2022-03-10  0:20 UTC (permalink / raw)
  To: Sweet Tea Dorminy; +Cc: linux-btrfs, kernel-team

On Mon, Mar 07, 2022 at 08:25:01AM -0500, Sweet Tea Dorminy wrote:
> I like the changes overall. However, I was hoping:
> 
> a) would it be possible to just refactor to use btrfs_new_inode_prepare()
> first and then fixup the non-refactoring parts?

Yup, I managed to split that part out.

> b) the naming is a bit confusing to me: I don't usually think of ..free() as
> the inverse action of ...prepare(). Also, ...free() feels weird to be taking
> a stack pointer. Perhaps _{init,uninit}() or _{prepare,destroy}() might be a
> clearer set of names?

I like the _{prepare,destroy}() naming, I'll go with that.

Thanks!

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2022-03-10  0:20 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-03 23:18 [PATCH 00/12] btrfs: inode creation cleanups and fixes Omar Sandoval
2022-03-03 23:18 ` [PATCH 01/12] btrfs: reserve correct number of items for unlink and rmdir Omar Sandoval
2022-03-03 23:18 ` [PATCH 02/12] btrfs: reserve correct number of items for rename Omar Sandoval
2022-03-03 23:18 ` [PATCH 03/12] btrfs: get rid of btrfs_add_nondir() Omar Sandoval
2022-03-03 23:18 ` [PATCH 04/12] btrfs: remove unnecessary btrfs_i_size_write(0) calls Omar Sandoval
2022-03-03 23:18 ` [PATCH 05/12] btrfs: remove unnecessary inode_set_bytes(0) call Omar Sandoval
2022-03-03 23:18 ` [PATCH 06/12] btrfs: remove unnecessary set_nlink() in btrfs_create_subvol_root() Omar Sandoval
2022-03-03 23:18 ` [PATCH 07/12] btrfs: remove unused mnt_userns parameter from __btrfs_set_acl Omar Sandoval
2022-03-03 23:18 ` [PATCH 08/12] btrfs: remove redundant name and name_len parameters to create_subvol Omar Sandoval
2022-03-03 23:18 ` [PATCH 09/12] btrfs: don't pass parent objectid to btrfs_new_inode() explicitly Omar Sandoval
2022-03-03 23:19 ` [PATCH 10/12] btrfs: move btrfs_get_free_objectid() call into btrfs_new_inode() Omar Sandoval
2022-03-03 23:19 ` [PATCH 11/12] btrfs: set inode flags earlier in btrfs_new_inode() Omar Sandoval
2022-03-03 23:19 ` [PATCH 12/12] btrfs: rework inode creation to fix several issues Omar Sandoval
2022-03-07 12:28   ` David Sterba
2022-03-10  0:18     ` Omar Sandoval
2022-03-07 13:25   ` Sweet Tea Dorminy
2022-03-10  0:20     ` Omar Sandoval
2022-03-07 14:41 ` [PATCH 00/12] btrfs: inode creation cleanups and fixes Sweet Tea Dorminy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.