All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/4] btrfs: inode creation cleanups and fixes
@ 2022-03-15  1:12 Omar Sandoval
  2022-03-15  1:12 ` [PATCH v4 1/4] btrfs: allocate inode outside of btrfs_new_inode() Omar Sandoval
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Omar Sandoval @ 2022-03-15  1:12 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

This series contains minor updates of the final four patches of my
previous series [1].

Based on misc-next with the previous version of "btrfs: allocate inode
outside of btrfs_new_inode()" removed.

Changes since v3 [2]:

- Fixed a struct btrfs_root leak [3] in patch 1 which was fixed later in
  the same series by patch 3 for the sake of git bisect.

Changes since v2:

- Mentioned reason for removal of btrfs_lookup_dentry() call in
  create_subvol() in commit message of patch 1.
- Made btrfs_init_inode_security() also take btrfs_new_inode_args in
  patch 2.

Thanks!

1: https://lore.kernel.org/linux-btrfs/cover.1646875648.git.osandov@fb.com/
2: https://lore.kernel.org/linux-btrfs/cover.1647288019.git.osandov@fb.com/
3: https://lore.kernel.org/linux-btrfs/CAL3q7H5ACv3ej1=7P2y7mA1vCJoAcHkCqro6_VBuAUxeaw25rw@mail.gmail.com/

Omar Sandoval (4):
  btrfs: allocate inode outside of btrfs_new_inode()
  btrfs: factor out common part of btrfs_{mknod,create,mkdir}()
  btrfs: reserve correct number of items for inode creation
  btrfs: move common inode creation code into btrfs_create_new_inode()

 fs/btrfs/acl.c   |  36 +--
 fs/btrfs/ctree.h |  37 ++-
 fs/btrfs/inode.c | 800 +++++++++++++++++++++++------------------------
 fs/btrfs/ioctl.c | 140 +++++----
 fs/btrfs/props.c |  40 +--
 fs/btrfs/props.h |   4 -
 6 files changed, 489 insertions(+), 568 deletions(-)

-- 
2.35.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v4 1/4] btrfs: allocate inode outside of btrfs_new_inode()
  2022-03-15  1:12 [PATCH v4 0/4] btrfs: inode creation cleanups and fixes Omar Sandoval
@ 2022-03-15  1:12 ` Omar Sandoval
  2022-03-15  1:12 ` [PATCH v4 2/4] btrfs: factor out common part of btrfs_{mknod,create,mkdir}() Omar Sandoval
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Omar Sandoval @ 2022-03-15  1:12 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

Instead of calling new_inode() and inode_init_owner() inside of
btrfs_new_inode(), do it in the callers. This allows us to pass in just
the inode instead of the mnt_userns and mode and removes the need for
memalloc_nofs_{save,restores}() since we do it before starting a
transaction. In create_subvol(), it also means we no longer have to look
up the inode again to instantiate it. This also paves the way for some
more cleanups in later patches.

This also removes the comments about Smack checking i_op, which are no
longer true since commit 5d6c31910bc0 ("xattr: Add
__vfs_{get,set,remove}xattr helpers"). Now it checks inode->i_opflags &
IOP_XATTR, which is set based on sb->s_xattr.

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/ctree.h |   5 +-
 fs/btrfs/inode.c | 287 +++++++++++++++++++++++++----------------------
 fs/btrfs/ioctl.c |  22 ++--
 3 files changed, 169 insertions(+), 145 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 4db17bd05a21..f39730420e8a 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3254,10 +3254,11 @@ int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, long nr,
 int btrfs_set_extent_delalloc(struct btrfs_inode *inode, u64 start, u64 end,
 			      unsigned int extra_bits,
 			      struct extent_state **cached_state);
+struct inode *btrfs_new_subvol_inode(struct user_namespace *mnt_userns,
+				     struct inode *dir);
 int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
-			     struct btrfs_root *new_root,
 			     struct btrfs_root *parent_root,
-			     struct user_namespace *mnt_userns);
+			     struct inode *inode);
  void btrfs_set_delalloc_extent(struct inode *inode, struct extent_state *state,
 			       unsigned *bits);
 void btrfs_clear_delalloc_extent(struct inode *inode,
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index c47bdada2440..d616a3a35e83 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6090,15 +6090,12 @@ static void btrfs_inherit_iflags(struct inode *inode, struct inode *dir)
 	btrfs_sync_inode_flags_to_i_flags(inode);
 }
 
-static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
-				     struct btrfs_root *root,
-				     struct user_namespace *mnt_userns,
-				     struct inode *dir,
-				     const char *name, int name_len,
-				     umode_t mode, u64 *index)
+static int btrfs_new_inode(struct btrfs_trans_handle *trans,
+			   struct btrfs_root *root, struct inode *inode,
+			   struct inode *dir, const char *name, int name_len,
+			   u64 *index)
 {
 	struct btrfs_fs_info *fs_info = root->fs_info;
-	struct inode *inode;
 	struct btrfs_inode_item *inode_item;
 	struct btrfs_key *location;
 	struct btrfs_path *path;
@@ -6108,20 +6105,11 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 	u32 sizes[2];
 	struct btrfs_item_batch batch;
 	unsigned long ptr;
-	unsigned int nofs_flag;
 	int ret;
 
 	path = btrfs_alloc_path();
 	if (!path)
-		return ERR_PTR(-ENOMEM);
-
-	nofs_flag = memalloc_nofs_save();
-	inode = new_inode(fs_info->sb);
-	memalloc_nofs_restore(nofs_flag);
-	if (!inode) {
-		btrfs_free_path(path);
-		return ERR_PTR(-ENOMEM);
-	}
+		return -ENOMEM;
 
 	/*
 	 * O_TMPFILE, set link count to 0, so that after this point,
@@ -6133,8 +6121,7 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 	ret = btrfs_get_free_objectid(root, &objectid);
 	if (ret) {
 		btrfs_free_path(path);
-		iput(inode);
-		return ERR_PTR(ret);
+		return ret;
 	}
 	inode->i_ino = objectid;
 
@@ -6144,8 +6131,7 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 		ret = btrfs_set_inode_index(BTRFS_I(dir), index);
 		if (ret) {
 			btrfs_free_path(path);
-			iput(inode);
-			return ERR_PTR(ret);
+			return ret;
 		}
 	} else if (dir) {
 		*index = 0;
@@ -6157,13 +6143,14 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 	 */
 	BTRFS_I(inode)->index_cnt = 2;
 	BTRFS_I(inode)->dir_index = *index;
-	BTRFS_I(inode)->root = btrfs_grab_root(root);
+	if (!BTRFS_I(inode)->root)
+		BTRFS_I(inode)->root = btrfs_grab_root(root);
 	BTRFS_I(inode)->generation = trans->transid;
 	inode->i_generation = BTRFS_I(inode)->generation;
 
 	btrfs_inherit_iflags(inode, dir);
 
-	if (S_ISREG(mode)) {
+	if (S_ISREG(inode->i_mode)) {
 		if (btrfs_test_opt(fs_info, NODATASUM))
 			BTRFS_I(inode)->flags |= BTRFS_INODE_NODATASUM;
 		if (btrfs_test_opt(fs_info, NODATACOW))
@@ -6208,10 +6195,8 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 	location->type = BTRFS_INODE_ITEM_KEY;
 
 	ret = btrfs_insert_inode_locked(inode);
-	if (ret < 0) {
-		iput(inode);
+	if (ret < 0)
 		goto fail;
-	}
 
 	batch.keys = &key[0];
 	batch.data_sizes = &sizes[0];
@@ -6221,8 +6206,6 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 	if (ret != 0)
 		goto fail_unlock;
 
-	inode_init_owner(mnt_userns, inode, dir, mode);
-
 	inode->i_mtime = current_time(inode);
 	inode->i_atime = inode->i_mtime;
 	inode->i_ctime = inode->i_mtime;
@@ -6259,15 +6242,20 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 			  "error inheriting props for ino %llu (root %llu): %d",
 			btrfs_ino(BTRFS_I(inode)), root->root_key.objectid, ret);
 
-	return inode;
+	return 0;
 
 fail_unlock:
+	/*
+	 * discard_new_inode() calls iput(), but the caller owns the reference
+	 * to the inode.
+	 */
+	ihold(inode);
 	discard_new_inode(inode);
 fail:
 	if (dir && name)
 		BTRFS_I(dir)->index_cnt--;
 	btrfs_free_path(path);
-	return ERR_PTR(ret);
+	return ret;
 }
 
 /*
@@ -6365,37 +6353,36 @@ static int btrfs_mknod(struct user_namespace *mnt_userns, struct inode *dir,
 	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
 	struct btrfs_trans_handle *trans;
 	struct btrfs_root *root = BTRFS_I(dir)->root;
-	struct inode *inode = NULL;
+	struct inode *inode;
 	int err;
 	u64 index = 0;
 
+	inode = new_inode(dir->i_sb);
+	if (!inode)
+		return -ENOMEM;
+	inode_init_owner(mnt_userns, inode, dir, mode);
+	inode->i_op = &btrfs_special_inode_operations;
+	init_special_inode(inode, inode->i_mode, rdev);
+
 	/*
 	 * 2 for inode item and ref
 	 * 2 for dir items
 	 * 1 for xattr if selinux is on
 	 */
 	trans = btrfs_start_transaction(root, 5);
-	if (IS_ERR(trans))
+	if (IS_ERR(trans)) {
+		iput(inode);
 		return PTR_ERR(trans);
+	}
 
-	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
-			dentry->d_name.name, dentry->d_name.len,
-			mode, &index);
-	if (IS_ERR(inode)) {
-		err = PTR_ERR(inode);
+	err = btrfs_new_inode(trans, root, inode, dir, dentry->d_name.name,
+			      dentry->d_name.len, &index);
+	if (err) {
+		iput(inode);
 		inode = NULL;
 		goto out_unlock;
 	}
 
-	/*
-	* If the active LSM wants to access the inode during
-	* d_instantiate it needs these. Smack checks to see
-	* if the filesystem supports xattrs by looking at the
-	* ops vector.
-	*/
-	inode->i_op = &btrfs_special_inode_operations;
-	init_special_inode(inode, inode->i_mode, rdev);
-
 	err = btrfs_init_inode_security(trans, inode, dir, &dentry->d_name);
 	if (err)
 		goto out_unlock;
@@ -6424,36 +6411,36 @@ static int btrfs_create(struct user_namespace *mnt_userns, struct inode *dir,
 	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
 	struct btrfs_trans_handle *trans;
 	struct btrfs_root *root = BTRFS_I(dir)->root;
-	struct inode *inode = NULL;
+	struct inode *inode;
 	int err;
 	u64 index = 0;
 
+	inode = new_inode(dir->i_sb);
+	if (!inode)
+		return -ENOMEM;
+	inode_init_owner(mnt_userns, inode, dir, mode);
+	inode->i_fop = &btrfs_file_operations;
+	inode->i_op = &btrfs_file_inode_operations;
+	inode->i_mapping->a_ops = &btrfs_aops;
+
 	/*
 	 * 2 for inode item and ref
 	 * 2 for dir items
 	 * 1 for xattr if selinux is on
 	 */
 	trans = btrfs_start_transaction(root, 5);
-	if (IS_ERR(trans))
+	if (IS_ERR(trans)) {
+		iput(inode);
 		return PTR_ERR(trans);
+	}
 
-	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
-			dentry->d_name.name, dentry->d_name.len,
-			mode, &index);
-	if (IS_ERR(inode)) {
-		err = PTR_ERR(inode);
+	err = btrfs_new_inode(trans, root, inode, dir, dentry->d_name.name,
+			      dentry->d_name.len, &index);
+	if (err) {
+		iput(inode);
 		inode = NULL;
 		goto out_unlock;
 	}
-	/*
-	* If the active LSM wants to access the inode during
-	* d_instantiate it needs these. Smack checks to see
-	* if the filesystem supports xattrs by looking at the
-	* ops vector.
-	*/
-	inode->i_fop = &btrfs_file_operations;
-	inode->i_op = &btrfs_file_inode_operations;
-	inode->i_mapping->a_ops = &btrfs_aops;
 
 	err = btrfs_init_inode_security(trans, inode, dir, &dentry->d_name);
 	if (err)
@@ -6562,34 +6549,38 @@ static int btrfs_mkdir(struct user_namespace *mnt_userns, struct inode *dir,
 		       struct dentry *dentry, umode_t mode)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
-	struct inode *inode = NULL;
+	struct inode *inode;
 	struct btrfs_trans_handle *trans;
 	struct btrfs_root *root = BTRFS_I(dir)->root;
-	int err = 0;
+	int err;
 	u64 index = 0;
 
+	inode = new_inode(dir->i_sb);
+	if (!inode)
+		return -ENOMEM;
+	inode_init_owner(mnt_userns, inode, dir, S_IFDIR | mode);
+	inode->i_op = &btrfs_dir_inode_operations;
+	inode->i_fop = &btrfs_dir_file_operations;
+
 	/*
 	 * 2 items for inode and ref
 	 * 2 items for dir items
 	 * 1 for xattr if selinux is on
 	 */
 	trans = btrfs_start_transaction(root, 5);
-	if (IS_ERR(trans))
+	if (IS_ERR(trans)) {
+		iput(inode);
 		return PTR_ERR(trans);
+	}
 
-	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
-			dentry->d_name.name, dentry->d_name.len,
-			S_IFDIR | mode, &index);
-	if (IS_ERR(inode)) {
-		err = PTR_ERR(inode);
+	err = btrfs_new_inode(trans, root, inode, dir, dentry->d_name.name,
+			      dentry->d_name.len, &index);
+	if (err) {
+		iput(inode);
 		inode = NULL;
 		goto out_fail;
 	}
 
-	/* these must be set before we unlock the inode */
-	inode->i_op = &btrfs_dir_inode_operations;
-	inode->i_fop = &btrfs_dir_file_operations;
-
 	err = btrfs_init_inode_security(trans, inode, dir, &dentry->d_name);
 	if (err)
 		goto out_fail;
@@ -8747,25 +8738,39 @@ static int btrfs_truncate(struct inode *inode, bool skip_writeback)
 	return ret;
 }
 
+struct inode *btrfs_new_subvol_inode(struct user_namespace *mnt_userns,
+				     struct inode *dir)
+{
+	struct inode *inode;
+
+	inode = new_inode(dir->i_sb);
+	if (inode) {
+		/*
+		 * Subvolumes don't inherit the sgid bit or the parent's gid if
+		 * the parent's sgid bit is set. This is probably a bug.
+		 */
+		inode_init_owner(mnt_userns, inode, NULL,
+				 S_IFDIR | (~current_umask() & S_IRWXUGO));
+		inode->i_op = &btrfs_dir_inode_operations;
+		inode->i_fop = &btrfs_dir_file_operations;
+	}
+	return inode;
+}
+
 /*
  * create a new subvolume directory/inode (helper for the ioctl).
  */
 int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
-			     struct btrfs_root *new_root,
 			     struct btrfs_root *parent_root,
-			     struct user_namespace *mnt_userns)
+			     struct inode *inode)
 {
-	struct inode *inode;
+	struct btrfs_root *new_root = BTRFS_I(inode)->root;
 	int err;
 	u64 index = 0;
 
-	inode = btrfs_new_inode(trans, new_root, mnt_userns, NULL, "..", 2,
-				S_IFDIR | (~current_umask() & S_IRWXUGO),
-				&index);
-	if (IS_ERR(inode))
-		return PTR_ERR(inode);
-	inode->i_op = &btrfs_dir_inode_operations;
-	inode->i_fop = &btrfs_dir_file_operations;
+	err = btrfs_new_inode(trans, new_root, inode, NULL, "..", 2, &index);
+	if (err)
+		return err;
 
 	unlock_new_inode(inode);
 
@@ -8776,8 +8781,6 @@ int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
 			  new_root->root_key.objectid, err);
 
 	err = btrfs_update_inode(trans, new_root, BTRFS_I(inode));
-
-	iput(inode);
 	return err;
 }
 
@@ -9254,31 +9257,36 @@ static int btrfs_rename_exchange(struct inode *old_dir,
 	return ret;
 }
 
+static struct inode *new_whiteout_inode(struct user_namespace *mnt_userns,
+					struct inode *dir)
+{
+	struct inode *inode;
+
+	inode = new_inode(dir->i_sb);
+	if (inode) {
+		inode_init_owner(mnt_userns, inode, dir,
+				 S_IFCHR | WHITEOUT_MODE);
+		inode->i_op = &btrfs_special_inode_operations;
+		init_special_inode(inode, inode->i_mode, WHITEOUT_DEV);
+	}
+	return inode;
+}
+
 static int btrfs_whiteout_for_rename(struct btrfs_trans_handle *trans,
 				     struct btrfs_root *root,
-				     struct user_namespace *mnt_userns,
-				     struct inode *dir,
+				     struct inode *inode, struct inode *dir,
 				     struct dentry *dentry)
 {
 	int ret;
-	struct inode *inode;
 	u64 index;
 
-	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
-				dentry->d_name.name,
-				dentry->d_name.len,
-				S_IFCHR | WHITEOUT_MODE,
-				&index);
-
-	if (IS_ERR(inode)) {
-		ret = PTR_ERR(inode);
+	ret = btrfs_new_inode(trans, root, inode, dir, dentry->d_name.name,
+			      dentry->d_name.len, &index);
+	if (ret) {
+		iput(inode);
 		return ret;
 	}
 
-	inode->i_op = &btrfs_special_inode_operations;
-	init_special_inode(inode, inode->i_mode,
-		WHITEOUT_DEV);
-
 	ret = btrfs_init_inode_security(trans, inode, dir,
 				&dentry->d_name);
 	if (ret)
@@ -9305,6 +9313,7 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
 			unsigned int flags)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(old_dir->i_sb);
+	struct inode *whiteout_inode;
 	struct btrfs_trans_handle *trans;
 	unsigned int trans_num_items;
 	struct btrfs_root *root = BTRFS_I(old_dir)->root;
@@ -9359,6 +9368,12 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
 	if (new_inode && S_ISREG(old_inode->i_mode) && new_inode->i_size)
 		filemap_flush(old_inode->i_mapping);
 
+	if (flags & RENAME_WHITEOUT) {
+		whiteout_inode = new_whiteout_inode(mnt_userns, old_dir);
+		if (!whiteout_inode)
+			return -ENOMEM;
+	}
+
 	if (old_ino == BTRFS_FIRST_FREE_OBJECTID) {
 		/* close the racy window with snapshot create/destroy ioctl */
 		down_read(&fs_info->subvol_sem);
@@ -9495,9 +9510,9 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
 				   rename_ctx.index, new_dentry->d_parent);
 
 	if (flags & RENAME_WHITEOUT) {
-		ret = btrfs_whiteout_for_rename(trans, root, mnt_userns,
+		ret = btrfs_whiteout_for_rename(trans, root, whiteout_inode,
 						old_dir, old_dentry);
-
+		whiteout_inode = NULL;
 		if (ret) {
 			btrfs_abort_transaction(trans, ret);
 			goto out_fail;
@@ -9509,7 +9524,8 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
 out_notrans:
 	if (old_ino == BTRFS_FIRST_FREE_OBJECTID)
 		up_read(&fs_info->subvol_sem);
-
+	if (flags & RENAME_WHITEOUT)
+		iput(whiteout_inode);
 	return ret;
 }
 
@@ -9728,7 +9744,7 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
 	struct btrfs_root *root = BTRFS_I(dir)->root;
 	struct btrfs_path *path;
 	struct btrfs_key key;
-	struct inode *inode = NULL;
+	struct inode *inode;
 	int err;
 	u64 index = 0;
 	int name_len;
@@ -9741,6 +9757,14 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
 	if (name_len > BTRFS_MAX_INLINE_DATA_SIZE(fs_info))
 		return -ENAMETOOLONG;
 
+	inode = new_inode(dir->i_sb);
+	if (!inode)
+		return -ENOMEM;
+	inode_init_owner(mnt_userns, inode, dir, S_IFLNK | S_IRWXUGO);
+	inode->i_op = &btrfs_symlink_inode_operations;
+	inode_nohighmem(inode);
+	inode->i_mapping->a_ops = &btrfs_aops;
+
 	/*
 	 * 2 items for inode item and ref
 	 * 2 items for dir items
@@ -9749,28 +9773,19 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
 	 * 1 item for xattr if selinux is on
 	 */
 	trans = btrfs_start_transaction(root, 7);
-	if (IS_ERR(trans))
+	if (IS_ERR(trans)) {
+		iput(inode);
 		return PTR_ERR(trans);
+	}
 
-	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
-				dentry->d_name.name, dentry->d_name.len,
-				S_IFLNK | S_IRWXUGO, &index);
-	if (IS_ERR(inode)) {
-		err = PTR_ERR(inode);
+	err = btrfs_new_inode(trans, root, inode, dir, dentry->d_name.name,
+			      dentry->d_name.len, &index);
+	if (err) {
+		iput(inode);
 		inode = NULL;
 		goto out_unlock;
 	}
 
-	/*
-	* If the active LSM wants to access the inode during
-	* d_instantiate it needs these. Smack checks to see
-	* if the filesystem supports xattrs by looking at the
-	* ops vector.
-	*/
-	inode->i_fop = &btrfs_file_operations;
-	inode->i_op = &btrfs_file_inode_operations;
-	inode->i_mapping->a_ops = &btrfs_aops;
-
 	err = btrfs_init_inode_security(trans, inode, dir, &dentry->d_name);
 	if (err)
 		goto out_unlock;
@@ -9806,8 +9821,6 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
 	btrfs_mark_buffer_dirty(leaf);
 	btrfs_free_path(path);
 
-	inode->i_op = &btrfs_symlink_inode_operations;
-	inode_nohighmem(inode);
 	inode_set_bytes(inode, name_len);
 	btrfs_i_size_write(BTRFS_I(inode), name_len);
 	err = btrfs_update_inode(trans, root, BTRFS_I(inode));
@@ -10087,30 +10100,34 @@ static int btrfs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
 	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
 	struct btrfs_trans_handle *trans;
 	struct btrfs_root *root = BTRFS_I(dir)->root;
-	struct inode *inode = NULL;
+	struct inode *inode;
 	u64 index;
-	int ret = 0;
+	int ret;
+
+	inode = new_inode(dir->i_sb);
+	if (!inode)
+		return -ENOMEM;
+	inode_init_owner(mnt_userns, inode, dir, mode);
+	inode->i_fop = &btrfs_file_operations;
+	inode->i_op = &btrfs_file_inode_operations;
+	inode->i_mapping->a_ops = &btrfs_aops;
 
 	/*
 	 * 5 units required for adding orphan entry
 	 */
 	trans = btrfs_start_transaction(root, 5);
-	if (IS_ERR(trans))
+	if (IS_ERR(trans)) {
+		iput(inode);
 		return PTR_ERR(trans);
+	}
 
-	inode = btrfs_new_inode(trans, root, mnt_userns, dir, NULL, 0,
-			mode, &index);
-	if (IS_ERR(inode)) {
-		ret = PTR_ERR(inode);
+	ret = btrfs_new_inode(trans, root, inode, dir, NULL, 0, &index);
+	if (ret) {
+		iput(inode);
 		inode = NULL;
 		goto out;
 	}
 
-	inode->i_fop = &btrfs_file_operations;
-	inode->i_op = &btrfs_file_inode_operations;
-
-	inode->i_mapping->a_ops = &btrfs_aops;
-
 	ret = btrfs_init_inode_security(trans, inode, dir, NULL);
 	if (ret)
 		goto out;
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 891352fd6d0f..60c907b14547 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -587,6 +587,12 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
 	if (ret < 0)
 		goto out_root_item;
 
+	inode = btrfs_new_subvol_inode(mnt_userns, dir);
+	if (!inode) {
+		ret = -ENOMEM;
+		goto out_anon_dev;
+	}
+
 	btrfs_init_block_rsv(&block_rsv, BTRFS_BLOCK_RSV_TEMP);
 	/*
 	 * The same as the snapshot creation, please see the comment
@@ -594,13 +600,13 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
 	 */
 	ret = btrfs_subvolume_reserve_metadata(root, &block_rsv, 8, false);
 	if (ret)
-		goto out_anon_dev;
+		goto out_inode;
 
 	trans = btrfs_start_transaction(root, 0);
 	if (IS_ERR(trans)) {
 		ret = PTR_ERR(trans);
 		btrfs_subvolume_release_metadata(root, &block_rsv);
-		goto out_anon_dev;
+		goto out_inode;
 	}
 	trans->block_rsv = &block_rsv;
 	trans->bytes_reserved = block_rsv.size;
@@ -683,16 +689,16 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
 	}
 	/* anon_dev is owned by new_root now. */
 	anon_dev = 0;
+	BTRFS_I(inode)->root = new_root;
+	/* ... and new_root is owned by inode now. */
 
 	ret = btrfs_record_root_in_trans(trans, new_root);
 	if (ret) {
-		btrfs_put_root(new_root);
 		btrfs_abort_transaction(trans, ret);
 		goto out;
 	}
 
-	ret = btrfs_create_subvol_root(trans, new_root, root, mnt_userns);
-	btrfs_put_root(new_root);
+	ret = btrfs_create_subvol_root(trans, root, inode);
 	if (ret) {
 		/* We potentially lose an unused inode item here */
 		btrfs_abort_transaction(trans, ret);
@@ -745,11 +751,11 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
 		ret = btrfs_commit_transaction(trans);
 
 	if (!ret) {
-		inode = btrfs_lookup_dentry(dir, dentry);
-		if (IS_ERR(inode))
-			return PTR_ERR(inode);
 		d_instantiate(dentry, inode);
+		inode = NULL;
 	}
+out_inode:
+	iput(inode);
 out_anon_dev:
 	if (anon_dev)
 		free_anon_bdev(anon_dev);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v4 2/4] btrfs: factor out common part of btrfs_{mknod,create,mkdir}()
  2022-03-15  1:12 [PATCH v4 0/4] btrfs: inode creation cleanups and fixes Omar Sandoval
  2022-03-15  1:12 ` [PATCH v4 1/4] btrfs: allocate inode outside of btrfs_new_inode() Omar Sandoval
@ 2022-03-15  1:12 ` Omar Sandoval
  2022-03-15  1:12 ` [PATCH v4 3/4] btrfs: reserve correct number of items for inode creation Omar Sandoval
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Omar Sandoval @ 2022-03-15  1:12 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

btrfs_{mknod,create,mkdir}() are now identical other than the inode
initialization and some inconsequential function call order differences.
Factor out the common code to reduce code duplication.

Reviewed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/inode.c | 152 ++++++++++-------------------------------------
 1 file changed, 33 insertions(+), 119 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index d616a3a35e83..4ed6157335c4 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6347,82 +6347,15 @@ int btrfs_add_link(struct btrfs_trans_handle *trans,
 	return ret;
 }
 
-static int btrfs_mknod(struct user_namespace *mnt_userns, struct inode *dir,
-		       struct dentry *dentry, umode_t mode, dev_t rdev)
+static int btrfs_create_common(struct inode *dir, struct dentry *dentry,
+			       struct inode *inode)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
-	struct btrfs_trans_handle *trans;
 	struct btrfs_root *root = BTRFS_I(dir)->root;
-	struct inode *inode;
+	struct btrfs_trans_handle *trans;
 	int err;
 	u64 index = 0;
 
-	inode = new_inode(dir->i_sb);
-	if (!inode)
-		return -ENOMEM;
-	inode_init_owner(mnt_userns, inode, dir, mode);
-	inode->i_op = &btrfs_special_inode_operations;
-	init_special_inode(inode, inode->i_mode, rdev);
-
-	/*
-	 * 2 for inode item and ref
-	 * 2 for dir items
-	 * 1 for xattr if selinux is on
-	 */
-	trans = btrfs_start_transaction(root, 5);
-	if (IS_ERR(trans)) {
-		iput(inode);
-		return PTR_ERR(trans);
-	}
-
-	err = btrfs_new_inode(trans, root, inode, dir, dentry->d_name.name,
-			      dentry->d_name.len, &index);
-	if (err) {
-		iput(inode);
-		inode = NULL;
-		goto out_unlock;
-	}
-
-	err = btrfs_init_inode_security(trans, inode, dir, &dentry->d_name);
-	if (err)
-		goto out_unlock;
-
-	err = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
-			     dentry->d_name.name, dentry->d_name.len, 0, index);
-	if (err)
-		goto out_unlock;
-
-	btrfs_update_inode(trans, root, BTRFS_I(inode));
-	d_instantiate_new(dentry, inode);
-
-out_unlock:
-	btrfs_end_transaction(trans);
-	btrfs_btree_balance_dirty(fs_info);
-	if (err && inode) {
-		inode_dec_link_count(inode);
-		discard_new_inode(inode);
-	}
-	return err;
-}
-
-static int btrfs_create(struct user_namespace *mnt_userns, struct inode *dir,
-			struct dentry *dentry, umode_t mode, bool excl)
-{
-	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
-	struct btrfs_trans_handle *trans;
-	struct btrfs_root *root = BTRFS_I(dir)->root;
-	struct inode *inode;
-	int err;
-	u64 index = 0;
-
-	inode = new_inode(dir->i_sb);
-	if (!inode)
-		return -ENOMEM;
-	inode_init_owner(mnt_userns, inode, dir, mode);
-	inode->i_fop = &btrfs_file_operations;
-	inode->i_op = &btrfs_file_inode_operations;
-	inode->i_mapping->a_ops = &btrfs_aops;
-
 	/*
 	 * 2 for inode item and ref
 	 * 2 for dir items
@@ -6467,6 +6400,35 @@ static int btrfs_create(struct user_namespace *mnt_userns, struct inode *dir,
 	return err;
 }
 
+static int btrfs_mknod(struct user_namespace *mnt_userns, struct inode *dir,
+		       struct dentry *dentry, umode_t mode, dev_t rdev)
+{
+	struct inode *inode;
+
+	inode = new_inode(dir->i_sb);
+	if (!inode)
+		return -ENOMEM;
+	inode_init_owner(mnt_userns, inode, dir, mode);
+	inode->i_op = &btrfs_special_inode_operations;
+	init_special_inode(inode, inode->i_mode, rdev);
+	return btrfs_create_common(dir, dentry, inode);
+}
+
+static int btrfs_create(struct user_namespace *mnt_userns, struct inode *dir,
+			struct dentry *dentry, umode_t mode, bool excl)
+{
+	struct inode *inode;
+
+	inode = new_inode(dir->i_sb);
+	if (!inode)
+		return -ENOMEM;
+	inode_init_owner(mnt_userns, inode, dir, mode);
+	inode->i_fop = &btrfs_file_operations;
+	inode->i_op = &btrfs_file_inode_operations;
+	inode->i_mapping->a_ops = &btrfs_aops;
+	return btrfs_create_common(dir, dentry, inode);
+}
+
 static int btrfs_link(struct dentry *old_dentry, struct inode *dir,
 		      struct dentry *dentry)
 {
@@ -6548,12 +6510,7 @@ static int btrfs_link(struct dentry *old_dentry, struct inode *dir,
 static int btrfs_mkdir(struct user_namespace *mnt_userns, struct inode *dir,
 		       struct dentry *dentry, umode_t mode)
 {
-	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
 	struct inode *inode;
-	struct btrfs_trans_handle *trans;
-	struct btrfs_root *root = BTRFS_I(dir)->root;
-	int err;
-	u64 index = 0;
 
 	inode = new_inode(dir->i_sb);
 	if (!inode)
@@ -6561,50 +6518,7 @@ static int btrfs_mkdir(struct user_namespace *mnt_userns, struct inode *dir,
 	inode_init_owner(mnt_userns, inode, dir, S_IFDIR | mode);
 	inode->i_op = &btrfs_dir_inode_operations;
 	inode->i_fop = &btrfs_dir_file_operations;
-
-	/*
-	 * 2 items for inode and ref
-	 * 2 items for dir items
-	 * 1 for xattr if selinux is on
-	 */
-	trans = btrfs_start_transaction(root, 5);
-	if (IS_ERR(trans)) {
-		iput(inode);
-		return PTR_ERR(trans);
-	}
-
-	err = btrfs_new_inode(trans, root, inode, dir, dentry->d_name.name,
-			      dentry->d_name.len, &index);
-	if (err) {
-		iput(inode);
-		inode = NULL;
-		goto out_fail;
-	}
-
-	err = btrfs_init_inode_security(trans, inode, dir, &dentry->d_name);
-	if (err)
-		goto out_fail;
-
-	err = btrfs_update_inode(trans, root, BTRFS_I(inode));
-	if (err)
-		goto out_fail;
-
-	err = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
-			dentry->d_name.name,
-			dentry->d_name.len, 0, index);
-	if (err)
-		goto out_fail;
-
-	d_instantiate_new(dentry, inode);
-
-out_fail:
-	btrfs_end_transaction(trans);
-	if (err && inode) {
-		inode_dec_link_count(inode);
-		discard_new_inode(inode);
-	}
-	btrfs_btree_balance_dirty(fs_info);
-	return err;
+	return btrfs_create_common(dir, dentry, inode);
 }
 
 static noinline int uncompress_inline(struct btrfs_path *path,
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v4 3/4] btrfs: reserve correct number of items for inode creation
  2022-03-15  1:12 [PATCH v4 0/4] btrfs: inode creation cleanups and fixes Omar Sandoval
  2022-03-15  1:12 ` [PATCH v4 1/4] btrfs: allocate inode outside of btrfs_new_inode() Omar Sandoval
  2022-03-15  1:12 ` [PATCH v4 2/4] btrfs: factor out common part of btrfs_{mknod,create,mkdir}() Omar Sandoval
@ 2022-03-15  1:12 ` Omar Sandoval
  2022-03-15  1:12 ` [PATCH v4 4/4] btrfs: move common inode creation code into btrfs_create_new_inode() Omar Sandoval
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Omar Sandoval @ 2022-03-15  1:12 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

The various inode creation code paths do not account for the compression
property, POSIX ACLs, or the parent inode item when starting a
transaction. Fix it by refactoring all of these code paths to use a new
function, btrfs_new_inode_prepare(), which computes the correct number
of items. To do so, it needs to know whether POSIX ACLs will be created,
so move the ACL creation into that function. To reduce the number of
arguments that need to be passed around for inode creation, define
struct btrfs_new_inode_args containing all of the relevant information.

btrfs_new_inode_prepare() will also be a good place to set up the
fscrypt context and encrypted filename in the future.

Reviewed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/acl.c   |  36 +------
 fs/btrfs/ctree.h |  34 +++++--
 fs/btrfs/inode.c | 251 ++++++++++++++++++++++++++++++++++-------------
 fs/btrfs/ioctl.c |  83 +++++++++++-----
 4 files changed, 270 insertions(+), 134 deletions(-)

diff --git a/fs/btrfs/acl.c b/fs/btrfs/acl.c
index a6909ec9bc38..548d6a5477b4 100644
--- a/fs/btrfs/acl.c
+++ b/fs/btrfs/acl.c
@@ -55,8 +55,8 @@ struct posix_acl *btrfs_get_acl(struct inode *inode, int type, bool rcu)
 	return acl;
 }
 
-static int __btrfs_set_acl(struct btrfs_trans_handle *trans,
-			   struct inode *inode, struct posix_acl *acl, int type)
+int __btrfs_set_acl(struct btrfs_trans_handle *trans, struct inode *inode,
+		    struct posix_acl *acl, int type)
 {
 	int ret, size = 0;
 	const char *name;
@@ -127,35 +127,3 @@ int btrfs_set_acl(struct user_namespace *mnt_userns, struct inode *inode,
 		inode->i_mode = old_mode;
 	return ret;
 }
-
-int btrfs_init_acl(struct btrfs_trans_handle *trans,
-		   struct inode *inode, struct inode *dir)
-{
-	struct posix_acl *default_acl, *acl;
-	int ret = 0;
-
-	/* this happens with subvols */
-	if (!dir)
-		return 0;
-
-	ret = posix_acl_create(dir, &inode->i_mode, &default_acl, &acl);
-	if (ret)
-		return ret;
-
-	if (default_acl) {
-		ret = __btrfs_set_acl(trans, inode, default_acl,
-				      ACL_TYPE_DEFAULT);
-		posix_acl_release(default_acl);
-	}
-
-	if (acl) {
-		if (!ret)
-			ret = __btrfs_set_acl(trans, inode, acl,
-					      ACL_TYPE_ACCESS);
-		posix_acl_release(acl);
-	}
-
-	if (!default_acl && !acl)
-		cache_no_acl(inode);
-	return ret;
-}
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index f39730420e8a..322c02610e9e 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3254,11 +3254,32 @@ int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, long nr,
 int btrfs_set_extent_delalloc(struct btrfs_inode *inode, u64 start, u64 end,
 			      unsigned int extra_bits,
 			      struct extent_state **cached_state);
+struct btrfs_new_inode_args {
+	/* Input */
+	struct inode *dir;
+	struct dentry *dentry;
+	struct inode *inode;
+	bool orphan;
+	bool subvol;
+
+	/*
+	 * Output from btrfs_new_inode_prepare(), input to
+	 * btrfs_create_new_inode().
+	 */
+	struct posix_acl *default_acl;
+	struct posix_acl *acl;
+};
+int btrfs_new_inode_prepare(struct btrfs_new_inode_args *args,
+			    unsigned int *trans_num_items);
+int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
+			   struct btrfs_new_inode_args *args,
+			   u64 *index);
+void btrfs_new_inode_args_destroy(struct btrfs_new_inode_args *args);
 struct inode *btrfs_new_subvol_inode(struct user_namespace *mnt_userns,
 				     struct inode *dir);
 int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
 			     struct btrfs_root *parent_root,
-			     struct inode *inode);
+			     struct btrfs_new_inode_args *args);
  void btrfs_set_delalloc_extent(struct inode *inode, struct extent_state *state,
 			       unsigned *bits);
 void btrfs_clear_delalloc_extent(struct inode *inode,
@@ -3816,15 +3837,16 @@ static inline int __btrfs_fs_compat_ro(struct btrfs_fs_info *fs_info, u64 flag)
 struct posix_acl *btrfs_get_acl(struct inode *inode, int type, bool rcu);
 int btrfs_set_acl(struct user_namespace *mnt_userns, struct inode *inode,
 		  struct posix_acl *acl, int type);
-int btrfs_init_acl(struct btrfs_trans_handle *trans,
-		   struct inode *inode, struct inode *dir);
+int __btrfs_set_acl(struct btrfs_trans_handle *trans, struct inode *inode,
+		    struct posix_acl *acl, int type);
 #else
 #define btrfs_get_acl NULL
 #define btrfs_set_acl NULL
-static inline int btrfs_init_acl(struct btrfs_trans_handle *trans,
-				 struct inode *inode, struct inode *dir)
+static inline int __btrfs_set_acl(struct btrfs_trans_handle *trans,
+				  struct inode *inode, struct posix_acl *acl,
+				  int type)
 {
-	return 0;
+	return -EOPNOTSUPP;
 }
 #endif
 
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 4ed6157335c4..3ce02378480f 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -222,15 +222,26 @@ static inline void btrfs_cleanup_ordered_extents(struct btrfs_inode *inode,
 static int btrfs_dirty_inode(struct inode *inode);
 
 static int btrfs_init_inode_security(struct btrfs_trans_handle *trans,
-				     struct inode *inode,  struct inode *dir,
-				     const struct qstr *qstr)
+				     struct btrfs_new_inode_args *args)
 {
 	int err;
 
-	err = btrfs_init_acl(trans, inode, dir);
-	if (!err)
-		err = btrfs_xattr_security_init(trans, inode, dir, qstr);
-	return err;
+	if (args->default_acl) {
+		err = __btrfs_set_acl(trans, args->inode, args->default_acl,
+				      ACL_TYPE_DEFAULT);
+		if (err)
+			return err;
+	}
+	if (args->acl) {
+		err = __btrfs_set_acl(trans, args->inode, args->acl,
+				      ACL_TYPE_ACCESS);
+		if (err)
+			return err;
+	}
+	if (!args->default_acl && !args->acl)
+		cache_no_acl(args->inode);
+	return btrfs_xattr_security_init(trans, args->inode, args->dir,
+					 &args->dentry->d_name);
 }
 
 /*
@@ -6059,6 +6070,49 @@ static int btrfs_insert_inode_locked(struct inode *inode)
 		   btrfs_find_actor, &args);
 }
 
+int btrfs_new_inode_prepare(struct btrfs_new_inode_args *args,
+			    unsigned int *trans_num_items)
+{
+	struct inode *dir = args->dir;
+	struct inode *inode = args->inode;
+	int ret;
+
+	ret = posix_acl_create(dir, &inode->i_mode, &args->default_acl,
+			       &args->acl);
+	if (ret)
+		return ret;
+
+	*trans_num_items = 1; /* 1 to add inode item */
+	if (BTRFS_I(dir)->prop_compress)
+		(*trans_num_items)++; /* 1 to add compression property */
+	if (args->default_acl)
+		(*trans_num_items)++; /* 1 to add default ACL xattr */
+	if (args->acl)
+		(*trans_num_items)++; /* 1 to add access ACL xattr */
+#ifdef CONFIG_SECURITY
+	if (dir->i_security)
+		(*trans_num_items)++; /* 1 to add LSM xattr */
+#endif
+	if (args->orphan) {
+		(*trans_num_items)++; /* 1 to add orphan item */
+	} else {
+		/*
+		 * 1 to add inode ref
+		 * 1 to add dir item
+		 * 1 to add dir index
+		 * 1 to update parent inode item
+		 */
+		*trans_num_items += 4;
+	}
+	return 0;
+}
+
+void btrfs_new_inode_args_destroy(struct btrfs_new_inode_args *args)
+{
+	posix_acl_release(args->acl);
+	posix_acl_release(args->default_acl);
+}
+
 /*
  * Inherit flags from the parent inode.
  *
@@ -6090,12 +6144,16 @@ static void btrfs_inherit_iflags(struct inode *inode, struct inode *dir)
 	btrfs_sync_inode_flags_to_i_flags(inode);
 }
 
-static int btrfs_new_inode(struct btrfs_trans_handle *trans,
-			   struct btrfs_root *root, struct inode *inode,
-			   struct inode *dir, const char *name, int name_len,
+int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
+			   struct btrfs_new_inode_args *args,
 			   u64 *index)
 {
-	struct btrfs_fs_info *fs_info = root->fs_info;
+	struct inode *dir = args->subvol ? NULL : args->dir;
+	struct inode *inode = args->inode;
+	const char *name;
+	int name_len;
+	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+	struct btrfs_root *root;
 	struct btrfs_inode_item *inode_item;
 	struct btrfs_key *location;
 	struct btrfs_path *path;
@@ -6107,6 +6165,17 @@ static int btrfs_new_inode(struct btrfs_trans_handle *trans,
 	unsigned long ptr;
 	int ret;
 
+	if (args->subvol) {
+		name = "..";
+		name_len = 2;
+	} else if (args->orphan) {
+		name = NULL;
+		name_len = 0;
+	} else {
+		name = args->dentry->d_name.name;
+		name_len = args->dentry->d_name.len;
+	}
+
 	path = btrfs_alloc_path();
 	if (!path)
 		return -ENOMEM;
@@ -6118,6 +6187,10 @@ static int btrfs_new_inode(struct btrfs_trans_handle *trans,
 	if (!name)
 		set_nlink(inode, 0);
 
+	if (!args->subvol)
+		BTRFS_I(inode)->root = btrfs_grab_root(BTRFS_I(dir)->root);
+	root = BTRFS_I(inode)->root;
+
 	ret = btrfs_get_free_objectid(root, &objectid);
 	if (ret) {
 		btrfs_free_path(path);
@@ -6143,8 +6216,6 @@ static int btrfs_new_inode(struct btrfs_trans_handle *trans,
 	 */
 	BTRFS_I(inode)->index_cnt = 2;
 	BTRFS_I(inode)->dir_index = *index;
-	if (!BTRFS_I(inode)->root)
-		BTRFS_I(inode)->root = btrfs_grab_root(root);
 	BTRFS_I(inode)->generation = trans->transid;
 	inode->i_generation = BTRFS_I(inode)->generation;
 
@@ -6352,30 +6423,37 @@ static int btrfs_create_common(struct inode *dir, struct dentry *dentry,
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
 	struct btrfs_root *root = BTRFS_I(dir)->root;
+	struct btrfs_new_inode_args new_inode_args = {
+		.dir = dir,
+		.dentry = dentry,
+		.inode = inode,
+	};
+	unsigned int trans_num_items;
 	struct btrfs_trans_handle *trans;
 	int err;
 	u64 index = 0;
 
-	/*
-	 * 2 for inode item and ref
-	 * 2 for dir items
-	 * 1 for xattr if selinux is on
-	 */
-	trans = btrfs_start_transaction(root, 5);
-	if (IS_ERR(trans)) {
+	err = btrfs_new_inode_prepare(&new_inode_args, &trans_num_items);
+	if (err) {
 		iput(inode);
-		return PTR_ERR(trans);
+		return err;
 	}
 
-	err = btrfs_new_inode(trans, root, inode, dir, dentry->d_name.name,
-			      dentry->d_name.len, &index);
+	trans = btrfs_start_transaction(root, trans_num_items);
+	if (IS_ERR(trans)) {
+		iput(inode);
+		err = PTR_ERR(trans);
+		goto out_new_inode_args;
+	}
+
+	err = btrfs_create_new_inode(trans, &new_inode_args, &index);
 	if (err) {
 		iput(inode);
 		inode = NULL;
 		goto out_unlock;
 	}
 
-	err = btrfs_init_inode_security(trans, inode, dir, &dentry->d_name);
+	err = btrfs_init_inode_security(trans, &new_inode_args);
 	if (err)
 		goto out_unlock;
 
@@ -6397,6 +6475,8 @@ static int btrfs_create_common(struct inode *dir, struct dentry *dentry,
 		discard_new_inode(inode);
 	}
 	btrfs_btree_balance_dirty(fs_info);
+out_new_inode_args:
+	btrfs_new_inode_args_destroy(&new_inode_args);
 	return err;
 }
 
@@ -8676,13 +8756,14 @@ struct inode *btrfs_new_subvol_inode(struct user_namespace *mnt_userns,
  */
 int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
 			     struct btrfs_root *parent_root,
-			     struct inode *inode)
+			     struct btrfs_new_inode_args *args)
 {
+	struct inode *inode = args->inode;
 	struct btrfs_root *new_root = BTRFS_I(inode)->root;
 	int err;
 	u64 index = 0;
 
-	err = btrfs_new_inode(trans, new_root, inode, NULL, "..", 2, &index);
+	err = btrfs_create_new_inode(trans, args, &index);
 	if (err)
 		return err;
 
@@ -9187,22 +9268,22 @@ static struct inode *new_whiteout_inode(struct user_namespace *mnt_userns,
 }
 
 static int btrfs_whiteout_for_rename(struct btrfs_trans_handle *trans,
-				     struct btrfs_root *root,
-				     struct inode *inode, struct inode *dir,
-				     struct dentry *dentry)
+				     struct btrfs_new_inode_args *args)
 {
+	struct inode *inode = args->inode;
+	struct inode *dir = args->dir;
+	struct btrfs_root *root = BTRFS_I(dir)->root;
+	struct dentry *dentry = args->dentry;
 	int ret;
 	u64 index;
 
-	ret = btrfs_new_inode(trans, root, inode, dir, dentry->d_name.name,
-			      dentry->d_name.len, &index);
+	ret = btrfs_create_new_inode(trans, args, &index);
 	if (ret) {
 		iput(inode);
 		return ret;
 	}
 
-	ret = btrfs_init_inode_security(trans, inode, dir,
-				&dentry->d_name);
+	ret = btrfs_init_inode_security(trans, args);
 	if (ret)
 		goto out;
 
@@ -9227,7 +9308,10 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
 			unsigned int flags)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(old_dir->i_sb);
-	struct inode *whiteout_inode;
+	struct btrfs_new_inode_args whiteout_args = {
+		.dir = old_dir,
+		.dentry = old_dentry,
+	};
 	struct btrfs_trans_handle *trans;
 	unsigned int trans_num_items;
 	struct btrfs_root *root = BTRFS_I(old_dir)->root;
@@ -9283,9 +9367,15 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
 		filemap_flush(old_inode->i_mapping);
 
 	if (flags & RENAME_WHITEOUT) {
-		whiteout_inode = new_whiteout_inode(mnt_userns, old_dir);
-		if (!whiteout_inode)
+		whiteout_args.inode = new_whiteout_inode(mnt_userns, old_dir);
+		if (!whiteout_args.inode)
 			return -ENOMEM;
+		ret = btrfs_new_inode_prepare(&whiteout_args, &trans_num_items);
+		if (ret)
+			goto out_whiteout_inode;
+	} else {
+		/* 1 to update the old parent inode. */
+		trans_num_items = 1;
 	}
 
 	if (old_ino == BTRFS_FIRST_FREE_OBJECTID) {
@@ -9297,24 +9387,25 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
 		 * 1 to add new root ref
 		 * 1 to add new root backref
 		 */
-		trans_num_items = 4;
+		trans_num_items += 4;
 	} else {
 		/*
 		 * 1 to update inode
 		 * 1 to remove old inode ref
 		 * 1 to add new inode ref
 		 */
-		trans_num_items = 3;
+		trans_num_items += 3;
 	}
 	/*
 	 * 1 to remove old dir item
 	 * 1 to remove old dir index
-	 * 1 to update old parent inode
 	 * 1 to add new dir item
 	 * 1 to add new dir index
-	 * 1 to update new parent inode (if it's not the same as the old parent)
 	 */
-	trans_num_items += 6;
+	trans_num_items += 4;
+	/*
+	 * 1 to update new parent inode if it's not the same as the old parent
+	 */
 	if (new_dir != old_dir)
 		trans_num_items++;
 	if (new_inode) {
@@ -9327,8 +9418,6 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
 		 */
 		trans_num_items += 5;
 	}
-	if (flags & RENAME_WHITEOUT)
-		trans_num_items += 5;
 	trans = btrfs_start_transaction(root, trans_num_items);
 	if (IS_ERR(trans)) {
 		ret = PTR_ERR(trans);
@@ -9424,9 +9513,8 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
 				   rename_ctx.index, new_dentry->d_parent);
 
 	if (flags & RENAME_WHITEOUT) {
-		ret = btrfs_whiteout_for_rename(trans, root, whiteout_inode,
-						old_dir, old_dentry);
-		whiteout_inode = NULL;
+		ret = btrfs_whiteout_for_rename(trans, &whiteout_args);
+		whiteout_args.inode = NULL;
 		if (ret) {
 			btrfs_abort_transaction(trans, ret);
 			goto out_fail;
@@ -9439,7 +9527,10 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
 	if (old_ino == BTRFS_FIRST_FREE_OBJECTID)
 		up_read(&fs_info->subvol_sem);
 	if (flags & RENAME_WHITEOUT)
-		iput(whiteout_inode);
+		btrfs_new_inode_args_destroy(&whiteout_args);
+out_whiteout_inode:
+	if (flags & RENAME_WHITEOUT)
+		iput(whiteout_args.inode);
 	return ret;
 }
 
@@ -9659,6 +9750,11 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
 	struct btrfs_path *path;
 	struct btrfs_key key;
 	struct inode *inode;
+	struct btrfs_new_inode_args new_inode_args = {
+		.dir = dir,
+		.dentry = dentry,
+	};
+	unsigned int trans_num_items;
 	int err;
 	u64 index = 0;
 	int name_len;
@@ -9679,28 +9775,30 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
 	inode_nohighmem(inode);
 	inode->i_mapping->a_ops = &btrfs_aops;
 
-	/*
-	 * 2 items for inode item and ref
-	 * 2 items for dir items
-	 * 1 item for updating parent inode item
-	 * 1 item for the inline extent item
-	 * 1 item for xattr if selinux is on
-	 */
-	trans = btrfs_start_transaction(root, 7);
+	new_inode_args.inode = inode;
+	err = btrfs_new_inode_prepare(&new_inode_args, &trans_num_items);
+	if (err) {
+		iput(inode);
+		return err;
+	}
+	/* 1 additional item for the inline extent */
+	trans_num_items++;
+
+	trans = btrfs_start_transaction(root, trans_num_items);
 	if (IS_ERR(trans)) {
 		iput(inode);
-		return PTR_ERR(trans);
+		err = PTR_ERR(trans);
+		goto out_new_inode_args;
 	}
 
-	err = btrfs_new_inode(trans, root, inode, dir, dentry->d_name.name,
-			      dentry->d_name.len, &index);
+	err = btrfs_create_new_inode(trans, &new_inode_args, &index);
 	if (err) {
 		iput(inode);
 		inode = NULL;
 		goto out_unlock;
 	}
 
-	err = btrfs_init_inode_security(trans, inode, dir, &dentry->d_name);
+	err = btrfs_init_inode_security(trans, &new_inode_args);
 	if (err)
 		goto out_unlock;
 
@@ -9759,6 +9857,8 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
 		discard_new_inode(inode);
 	}
 	btrfs_btree_balance_dirty(fs_info);
+out_new_inode_args:
+	btrfs_new_inode_args_destroy(&new_inode_args);
 	return err;
 }
 
@@ -10015,6 +10115,12 @@ static int btrfs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
 	struct btrfs_trans_handle *trans;
 	struct btrfs_root *root = BTRFS_I(dir)->root;
 	struct inode *inode;
+	struct btrfs_new_inode_args new_inode_args = {
+		.dir = dir,
+		.dentry = dentry,
+		.orphan = true,
+	};
+	unsigned int trans_num_items;
 	u64 index;
 	int ret;
 
@@ -10026,23 +10132,28 @@ static int btrfs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
 	inode->i_op = &btrfs_file_inode_operations;
 	inode->i_mapping->a_ops = &btrfs_aops;
 
-	/*
-	 * 5 units required for adding orphan entry
-	 */
-	trans = btrfs_start_transaction(root, 5);
-	if (IS_ERR(trans)) {
+	new_inode_args.inode = inode;
+	ret = btrfs_new_inode_prepare(&new_inode_args, &trans_num_items);
+	if (ret) {
 		iput(inode);
-		return PTR_ERR(trans);
+		return ret;
 	}
 
-	ret = btrfs_new_inode(trans, root, inode, dir, NULL, 0, &index);
+	trans = btrfs_start_transaction(root, trans_num_items);
+	if (IS_ERR(trans)) {
+		iput(inode);
+		ret = PTR_ERR(trans);
+		goto out_new_inode_args;
+	}
+
+	ret = btrfs_create_new_inode(trans, &new_inode_args, &index);
 	if (ret) {
 		iput(inode);
 		inode = NULL;
 		goto out;
 	}
 
-	ret = btrfs_init_inode_security(trans, inode, dir, NULL);
+	ret = btrfs_init_inode_security(trans, &new_inode_args);
 	if (ret)
 		goto out;
 
@@ -10054,9 +10165,9 @@ static int btrfs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
 		goto out;
 
 	/*
-	 * We set number of links to 0 in btrfs_new_inode(), and here we set
-	 * it to 1 because d_tmpfile() will issue a warning if the count is 0,
-	 * through:
+	 * We set number of links to 0 in btrfs_create_new_inode(), and here we
+	 * set it to 1 because d_tmpfile() will issue a warning if the count is
+	 * 0, through:
 	 *
 	 *    d_tmpfile() -> inode_dec_link_count() -> drop_nlink()
 	 */
@@ -10069,6 +10180,8 @@ static int btrfs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
 	if (ret && inode)
 		discard_new_inode(inode);
 	btrfs_btree_balance_dirty(fs_info);
+out_new_inode_args:
+	btrfs_new_inode_args_destroy(&new_inode_args);
 	return ret;
 }
 
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 60c907b14547..07a74bbe3d84 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -544,6 +544,32 @@ int __pure btrfs_is_empty_uuid(u8 *uuid)
 	return 1;
 }
 
+/*
+ * Calculate the number of transaction items to reserve for creating a subvolume
+ * or snapshot, not including the inode, directory entries, or parent directory.
+ */
+static unsigned int create_subvol_num_items(struct btrfs_qgroup_inherit *inherit)
+{
+	/*
+	 * 1 to add root block
+	 * 1 to add root item
+	 * 1 to add root ref
+	 * 1 to add root backref
+	 * 1 to add UUID item
+	 * 1 to add qgroup info
+	 * 1 to add qgroup limit
+	 * (Ideally the last two would only be accounted if qgroups are enabled,
+	 * but that can change between now and the time we would insert them)
+	 */
+	unsigned int num_items = 7;
+
+	if (inherit) {
+		/* 2 to add qgroup relations for each inherited qgroup */
+		num_items += 2 * inherit->num_qgroups;
+	}
+	return num_items;
+}
+
 static noinline int create_subvol(struct user_namespace *mnt_userns,
 				  struct inode *dir, struct dentry *dentry,
 				  struct btrfs_qgroup_inherit *inherit)
@@ -560,7 +586,12 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
 	struct btrfs_root *new_root;
 	struct btrfs_block_rsv block_rsv;
 	struct timespec64 cur_time = current_time(dir);
-	struct inode *inode;
+	struct btrfs_new_inode_args new_inode_args = {
+		.dir = dir,
+		.dentry = dentry,
+		.subvol = true,
+	};
+	unsigned int trans_num_items;
 	int ret;
 	dev_t anon_dev;
 	u64 objectid;
@@ -587,26 +618,27 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
 	if (ret < 0)
 		goto out_root_item;
 
-	inode = btrfs_new_subvol_inode(mnt_userns, dir);
-	if (!inode) {
+	new_inode_args.inode = btrfs_new_subvol_inode(mnt_userns, dir);
+	if (!new_inode_args.inode) {
 		ret = -ENOMEM;
 		goto out_anon_dev;
 	}
-
-	btrfs_init_block_rsv(&block_rsv, BTRFS_BLOCK_RSV_TEMP);
-	/*
-	 * The same as the snapshot creation, please see the comment
-	 * of create_snapshot().
-	 */
-	ret = btrfs_subvolume_reserve_metadata(root, &block_rsv, 8, false);
+	ret = btrfs_new_inode_prepare(&new_inode_args, &trans_num_items);
 	if (ret)
 		goto out_inode;
+	trans_num_items += create_subvol_num_items(inherit);
+
+	btrfs_init_block_rsv(&block_rsv, BTRFS_BLOCK_RSV_TEMP);
+	ret = btrfs_subvolume_reserve_metadata(root, &block_rsv,
+					       trans_num_items, false);
+	if (ret)
+		goto out_new_inode_args;
 
 	trans = btrfs_start_transaction(root, 0);
 	if (IS_ERR(trans)) {
 		ret = PTR_ERR(trans);
 		btrfs_subvolume_release_metadata(root, &block_rsv);
-		goto out_inode;
+		goto out_new_inode_args;
 	}
 	trans->block_rsv = &block_rsv;
 	trans->bytes_reserved = block_rsv.size;
@@ -689,8 +721,8 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
 	}
 	/* anon_dev is owned by new_root now. */
 	anon_dev = 0;
-	BTRFS_I(inode)->root = new_root;
-	/* ... and new_root is owned by inode now. */
+	BTRFS_I(new_inode_args.inode)->root = new_root;
+	/* ... and new_root is owned by new_inode_args.inode now. */
 
 	ret = btrfs_record_root_in_trans(trans, new_root);
 	if (ret) {
@@ -698,7 +730,7 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
 		goto out;
 	}
 
-	ret = btrfs_create_subvol_root(trans, root, inode);
+	ret = btrfs_create_subvol_root(trans, root, &new_inode_args);
 	if (ret) {
 		/* We potentially lose an unused inode item here */
 		btrfs_abort_transaction(trans, ret);
@@ -751,11 +783,13 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
 		ret = btrfs_commit_transaction(trans);
 
 	if (!ret) {
-		d_instantiate(dentry, inode);
-		inode = NULL;
+		d_instantiate(dentry, new_inode_args.inode);
+		new_inode_args.inode = NULL;
 	}
+out_new_inode_args:
+	btrfs_new_inode_args_destroy(&new_inode_args);
 out_inode:
-	iput(inode);
+	iput(new_inode_args.inode);
 out_anon_dev:
 	if (anon_dev)
 		free_anon_bdev(anon_dev);
@@ -771,6 +805,7 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
 	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
 	struct inode *inode;
 	struct btrfs_pending_snapshot *pending_snapshot;
+	unsigned int trans_num_items;
 	struct btrfs_trans_handle *trans;
 	int ret;
 
@@ -808,16 +843,14 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
 	btrfs_init_block_rsv(&pending_snapshot->block_rsv,
 			     BTRFS_BLOCK_RSV_TEMP);
 	/*
-	 * 1 - parent dir inode
-	 * 2 - dir entries
-	 * 1 - root item
-	 * 2 - root ref/backref
-	 * 1 - root of snapshot
-	 * 1 - UUID item
+	 * 1 to add dir item
+	 * 1 to add dir index
+	 * 1 to update parent inode item
 	 */
+	trans_num_items = create_subvol_num_items(inherit) + 3;
 	ret = btrfs_subvolume_reserve_metadata(BTRFS_I(dir)->root,
-					&pending_snapshot->block_rsv, 8,
-					false);
+					       &pending_snapshot->block_rsv,
+					       trans_num_items, false);
 	if (ret)
 		goto free_pending;
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v4 4/4] btrfs: move common inode creation code into btrfs_create_new_inode()
  2022-03-15  1:12 [PATCH v4 0/4] btrfs: inode creation cleanups and fixes Omar Sandoval
                   ` (2 preceding siblings ...)
  2022-03-15  1:12 ` [PATCH v4 3/4] btrfs: reserve correct number of items for inode creation Omar Sandoval
@ 2022-03-15  1:12 ` Omar Sandoval
  2022-04-06 17:18   ` Sweet Tea Dorminy
  2022-03-15 15:00 ` [PATCH v4 0/4] btrfs: inode creation cleanups and fixes Sweet Tea Dorminy
  2022-03-17 20:21 ` David Sterba
  5 siblings, 1 reply; 11+ messages in thread
From: Omar Sandoval @ 2022-03-15  1:12 UTC (permalink / raw)
  To: linux-btrfs; +Cc: kernel-team

From: Omar Sandoval <osandov@fb.com>

All of our inode creation code paths duplicate the calls to
btrfs_init_inode_security() and btrfs_add_link(). Subvolume creation
additionally duplicates property inheritance and the call to
btrfs_set_inode_index(). Fix this by moving the common code into
btrfs_create_new_inode(). This accomplishes a few things at once:

1. It reduces code duplication.
2. It allows us to set up the inode completely before inserting the
   inode item, removing calls to btrfs_update_inode().
3. It fixes a leak of an inode on disk in some error cases. For example,
   in btrfs_create(), if btrfs_new_inode() succeeds, then we have
   inserted an inode item and its inode ref. However, if something after
   that fails (e.g., btrfs_init_inode_security()), then we end the
   transaction and then decrement the link count on the inode. If the
   transaction is committed and the system crashes before the failed
   inode is deleted, then we leak that inode on disk. Instead, this
   refactoring aborts the transaction when we can't recover more
   gracefully.
4. It exposes various ways that subvolume creation diverges from mkdir
   in terms of inheriting flags, properties, permissions, and POSIX
   ACLs, a lot of which appears to be accidental. This patch explicitly
   does _not_ change the existing non-standard behavior, but it makes
   those differences more clear in the code and documents them so that
   we can discuss whether they should be changed.

Reviewed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 fs/btrfs/ctree.h |   6 +-
 fs/btrfs/inode.c | 404 +++++++++++++++++++----------------------------
 fs/btrfs/ioctl.c |  59 ++-----
 fs/btrfs/props.c |  40 +----
 fs/btrfs/props.h |   4 -
 5 files changed, 180 insertions(+), 333 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 322c02610e9e..3a348b26df57 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3272,14 +3272,10 @@ struct btrfs_new_inode_args {
 int btrfs_new_inode_prepare(struct btrfs_new_inode_args *args,
 			    unsigned int *trans_num_items);
 int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
-			   struct btrfs_new_inode_args *args,
-			   u64 *index);
+			   struct btrfs_new_inode_args *args);
 void btrfs_new_inode_args_destroy(struct btrfs_new_inode_args *args);
 struct inode *btrfs_new_subvol_inode(struct user_namespace *mnt_userns,
 				     struct inode *dir);
-int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
-			     struct btrfs_root *parent_root,
-			     struct btrfs_new_inode_args *args);
  void btrfs_set_delalloc_extent(struct inode *inode, struct extent_state *state,
 			       unsigned *bits);
 void btrfs_clear_delalloc_extent(struct inode *inode,
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 3ce02378480f..856e2d13e4f7 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6122,9 +6122,6 @@ static void btrfs_inherit_iflags(struct inode *inode, struct inode *dir)
 {
 	unsigned int flags;
 
-	if (!dir)
-		return;
-
 	flags = BTRFS_I(dir)->flags;
 
 	if (flags & BTRFS_INODE_NOCOMPRESS) {
@@ -6145,14 +6142,13 @@ static void btrfs_inherit_iflags(struct inode *inode, struct inode *dir)
 }
 
 int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
-			   struct btrfs_new_inode_args *args,
-			   u64 *index)
+			   struct btrfs_new_inode_args *args)
 {
-	struct inode *dir = args->subvol ? NULL : args->dir;
+	struct inode *dir = args->dir;
 	struct inode *inode = args->inode;
-	const char *name;
-	int name_len;
-	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+	const char *name = args->orphan ? NULL : args->dentry->d_name.name;
+	int name_len = args->orphan ? 0 : args->dentry->d_name.len;
+	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
 	struct btrfs_root *root;
 	struct btrfs_inode_item *inode_item;
 	struct btrfs_key *location;
@@ -6165,49 +6161,32 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
 	unsigned long ptr;
 	int ret;
 
-	if (args->subvol) {
-		name = "..";
-		name_len = 2;
-	} else if (args->orphan) {
-		name = NULL;
-		name_len = 0;
-	} else {
-		name = args->dentry->d_name.name;
-		name_len = args->dentry->d_name.len;
-	}
-
 	path = btrfs_alloc_path();
 	if (!path)
 		return -ENOMEM;
 
-	/*
-	 * O_TMPFILE, set link count to 0, so that after this point,
-	 * we fill in an inode item with the correct link count.
-	 */
-	if (!name)
-		set_nlink(inode, 0);
-
 	if (!args->subvol)
 		BTRFS_I(inode)->root = btrfs_grab_root(BTRFS_I(dir)->root);
 	root = BTRFS_I(inode)->root;
 
 	ret = btrfs_get_free_objectid(root, &objectid);
-	if (ret) {
-		btrfs_free_path(path);
-		return ret;
-	}
+	if (ret)
+		goto out;
 	inode->i_ino = objectid;
 
-	if (dir && name) {
+	if (args->orphan) {
+		/*
+		 * O_TMPFILE, set link count to 0, so that after this point, we
+		 * fill in an inode item with the correct link count.
+		 */
+		set_nlink(inode, 0);
+	} else {
 		trace_btrfs_inode_request(dir);
 
-		ret = btrfs_set_inode_index(BTRFS_I(dir), index);
-		if (ret) {
-			btrfs_free_path(path);
-			return ret;
-		}
-	} else if (dir) {
-		*index = 0;
+		ret = btrfs_set_inode_index(BTRFS_I(dir),
+					    &BTRFS_I(inode)->dir_index);
+		if (ret)
+			goto out;
 	}
 	/*
 	 * index_cnt is ignored for everything but a dir,
@@ -6215,11 +6194,16 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
 	 * number
 	 */
 	BTRFS_I(inode)->index_cnt = 2;
-	BTRFS_I(inode)->dir_index = *index;
 	BTRFS_I(inode)->generation = trans->transid;
 	inode->i_generation = BTRFS_I(inode)->generation;
 
-	btrfs_inherit_iflags(inode, dir);
+	/*
+	 * Subvolumes don't inherit flags from their parent directory.
+	 * Originally this was probably by accident, but we probably can't
+	 * change it now.
+	 */
+	if (!args->subvol)
+		btrfs_inherit_iflags(inode, dir);
 
 	if (S_ISREG(inode->i_mode)) {
 		if (btrfs_test_opt(fs_info, NODATASUM))
@@ -6229,6 +6213,55 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
 				BTRFS_INODE_NODATASUM;
 	}
 
+	location = &BTRFS_I(inode)->location;
+	location->objectid = objectid;
+	location->offset = 0;
+	location->type = BTRFS_INODE_ITEM_KEY;
+
+	ret = btrfs_insert_inode_locked(inode);
+	if (ret < 0) {
+		if (!args->orphan)
+			BTRFS_I(dir)->index_cnt--;
+		goto out;
+	}
+
+	if (args->subvol) {
+		struct inode *parent;
+
+		/*
+		 * Subvolumes inherit properties from their parent subvolume,
+		 * not the directory they were created in.
+		 */
+		parent = btrfs_iget(fs_info->sb, BTRFS_FIRST_FREE_OBJECTID,
+				    BTRFS_I(dir)->root);
+		if (IS_ERR(parent)) {
+			ret = PTR_ERR(parent);
+		} else {
+			ret = btrfs_inode_inherit_props(trans, inode, parent);
+			iput(parent);
+		}
+	} else {
+		ret = btrfs_inode_inherit_props(trans, inode, dir);
+	}
+	if (ret) {
+		btrfs_err(fs_info,
+			  "error inheriting props for ino %llu (root %llu): %d",
+			  btrfs_ino(BTRFS_I(inode)), root->root_key.objectid,
+			  ret);
+	}
+
+	/*
+	 * Subvolumes don't inherit ACLs or get passed to the LSM. This is
+	 * probably a bug.
+	 */
+	if (!args->subvol) {
+		ret = btrfs_init_inode_security(trans, args);
+		if (ret) {
+			btrfs_abort_transaction(trans, ret);
+			goto discard;
+		}
+	}
+
 	/*
 	 * We could have gotten an inode number from somebody who was fsynced
 	 * and then removed in this same transaction, so let's just set full
@@ -6243,7 +6276,7 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
 
 	sizes[0] = sizeof(struct btrfs_inode_item);
 
-	if (name) {
+	if (!args->orphan) {
 		/*
 		 * Start new inodes with an inode_ref. This is slightly more
 		 * efficient for small numbers of hard links since they will
@@ -6252,53 +6285,61 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
 		 */
 		key[1].objectid = objectid;
 		key[1].type = BTRFS_INODE_REF_KEY;
-		if (dir)
-			key[1].offset = btrfs_ino(BTRFS_I(dir));
-		else
+		if (args->subvol) {
 			key[1].offset = objectid;
-
-		sizes[1] = name_len + sizeof(*ref);
+			sizes[1] = 2 + sizeof(*ref);
+		} else {
+			key[1].offset = btrfs_ino(BTRFS_I(dir));
+			sizes[1] = name_len + sizeof(*ref);
+		}
 	}
 
-	location = &BTRFS_I(inode)->location;
-	location->objectid = objectid;
-	location->offset = 0;
-	location->type = BTRFS_INODE_ITEM_KEY;
-
-	ret = btrfs_insert_inode_locked(inode);
-	if (ret < 0)
-		goto fail;
-
 	batch.keys = &key[0];
 	batch.data_sizes = &sizes[0];
-	batch.total_data_size = sizes[0] + (name ? sizes[1] : 0);
-	batch.nr = name ? 2 : 1;
+	batch.total_data_size = sizes[0] + (args->orphan ? 0 : sizes[1]);
+	batch.nr = args->orphan ? 1 : 2;
 	ret = btrfs_insert_empty_items(trans, root, path, &batch);
-	if (ret != 0)
-		goto fail_unlock;
+	if (ret != 0) {
+		btrfs_abort_transaction(trans, ret);
+		goto discard;
+	}
 
 	inode->i_mtime = current_time(inode);
 	inode->i_atime = inode->i_mtime;
 	inode->i_ctime = inode->i_mtime;
 	BTRFS_I(inode)->i_otime = inode->i_mtime;
 
+	/*
+	 * We're going to fill the inode item now, so at this point the inode
+	 * must be fully initialized.
+	 */
+
 	inode_item = btrfs_item_ptr(path->nodes[0], path->slots[0],
 				  struct btrfs_inode_item);
 	memzero_extent_buffer(path->nodes[0], (unsigned long)inode_item,
 			     sizeof(*inode_item));
 	fill_inode_item(trans, path->nodes[0], inode_item, inode);
 
-	if (name) {
+	if (!args->orphan) {
 		ref = btrfs_item_ptr(path->nodes[0], path->slots[0] + 1,
 				     struct btrfs_inode_ref);
-		btrfs_set_inode_ref_name_len(path->nodes[0], ref, name_len);
-		btrfs_set_inode_ref_index(path->nodes[0], ref, *index);
 		ptr = (unsigned long)(ref + 1);
-		write_extent_buffer(path->nodes[0], name, ptr, name_len);
+		if (args->subvol) {
+			btrfs_set_inode_ref_name_len(path->nodes[0], ref, 2);
+			btrfs_set_inode_ref_index(path->nodes[0], ref, 0);
+			write_extent_buffer(path->nodes[0], "..", ptr, 2);
+		} else {
+			btrfs_set_inode_ref_name_len(path->nodes[0], ref,
+						     name_len);
+			btrfs_set_inode_ref_index(path->nodes[0], ref,
+						  BTRFS_I(inode)->dir_index);
+			write_extent_buffer(path->nodes[0], name, ptr,
+					    name_len);
+		}
 	}
 
 	btrfs_mark_buffer_dirty(path->nodes[0]);
-	btrfs_free_path(path);
+	btrfs_release_path(path);
 
 	inode_tree_add(inode);
 
@@ -6307,24 +6348,28 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
 
 	btrfs_update_root_times(trans, root);
 
-	ret = btrfs_inode_inherit_props(trans, inode, dir);
-	if (ret)
-		btrfs_err(fs_info,
-			  "error inheriting props for ino %llu (root %llu): %d",
-			btrfs_ino(BTRFS_I(inode)), root->root_key.objectid, ret);
+	if (args->orphan) {
+		ret = btrfs_orphan_add(trans, BTRFS_I(inode));
+	} else {
+		ret = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode), name,
+				     name_len, 0, BTRFS_I(inode)->dir_index);
+	}
+	if (ret) {
+		btrfs_abort_transaction(trans, ret);
+		goto discard;
+	}
 
-	return 0;
+	ret = 0;
+	goto out;
 
-fail_unlock:
+discard:
 	/*
 	 * discard_new_inode() calls iput(), but the caller owns the reference
 	 * to the inode.
 	 */
 	ihold(inode);
 	discard_new_inode(inode);
-fail:
-	if (dir && name)
-		BTRFS_I(dir)->index_cnt--;
+out:
 	btrfs_free_path(path);
 	return ret;
 }
@@ -6431,52 +6476,28 @@ static int btrfs_create_common(struct inode *dir, struct dentry *dentry,
 	unsigned int trans_num_items;
 	struct btrfs_trans_handle *trans;
 	int err;
-	u64 index = 0;
 
 	err = btrfs_new_inode_prepare(&new_inode_args, &trans_num_items);
-	if (err) {
-		iput(inode);
-		return err;
-	}
+	if (err)
+		goto out_inode;
 
 	trans = btrfs_start_transaction(root, trans_num_items);
 	if (IS_ERR(trans)) {
-		iput(inode);
 		err = PTR_ERR(trans);
 		goto out_new_inode_args;
 	}
 
-	err = btrfs_create_new_inode(trans, &new_inode_args, &index);
-	if (err) {
-		iput(inode);
-		inode = NULL;
-		goto out_unlock;
-	}
+	err = btrfs_create_new_inode(trans, &new_inode_args);
+	if (!err)
+		d_instantiate_new(dentry, inode);
 
-	err = btrfs_init_inode_security(trans, &new_inode_args);
-	if (err)
-		goto out_unlock;
-
-	err = btrfs_update_inode(trans, root, BTRFS_I(inode));
-	if (err)
-		goto out_unlock;
-
-	err = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
-			     dentry->d_name.name, dentry->d_name.len, 0, index);
-	if (err)
-		goto out_unlock;
-
-	d_instantiate_new(dentry, inode);
-
-out_unlock:
 	btrfs_end_transaction(trans);
-	if (err && inode) {
-		inode_dec_link_count(inode);
-		discard_new_inode(inode);
-	}
 	btrfs_btree_balance_dirty(fs_info);
 out_new_inode_args:
 	btrfs_new_inode_args_destroy(&new_inode_args);
+out_inode:
+	if (err)
+		iput(inode);
 	return err;
 }
 
@@ -8751,34 +8772,6 @@ struct inode *btrfs_new_subvol_inode(struct user_namespace *mnt_userns,
 	return inode;
 }
 
-/*
- * create a new subvolume directory/inode (helper for the ioctl).
- */
-int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
-			     struct btrfs_root *parent_root,
-			     struct btrfs_new_inode_args *args)
-{
-	struct inode *inode = args->inode;
-	struct btrfs_root *new_root = BTRFS_I(inode)->root;
-	int err;
-	u64 index = 0;
-
-	err = btrfs_create_new_inode(trans, args, &index);
-	if (err)
-		return err;
-
-	unlock_new_inode(inode);
-
-	err = btrfs_subvol_inherit_props(trans, new_root, parent_root);
-	if (err)
-		btrfs_err(new_root->fs_info,
-			  "error inheriting subvolume %llu properties: %d",
-			  new_root->root_key.objectid, err);
-
-	err = btrfs_update_inode(trans, new_root, BTRFS_I(inode));
-	return err;
-}
-
 struct inode *btrfs_alloc_inode(struct super_block *sb)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(sb);
@@ -9267,41 +9260,6 @@ static struct inode *new_whiteout_inode(struct user_namespace *mnt_userns,
 	return inode;
 }
 
-static int btrfs_whiteout_for_rename(struct btrfs_trans_handle *trans,
-				     struct btrfs_new_inode_args *args)
-{
-	struct inode *inode = args->inode;
-	struct inode *dir = args->dir;
-	struct btrfs_root *root = BTRFS_I(dir)->root;
-	struct dentry *dentry = args->dentry;
-	int ret;
-	u64 index;
-
-	ret = btrfs_create_new_inode(trans, args, &index);
-	if (ret) {
-		iput(inode);
-		return ret;
-	}
-
-	ret = btrfs_init_inode_security(trans, args);
-	if (ret)
-		goto out;
-
-	ret = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
-			     dentry->d_name.name, dentry->d_name.len, 0, index);
-	if (ret)
-		goto out;
-
-	ret = btrfs_update_inode(trans, root, BTRFS_I(inode));
-out:
-	unlock_new_inode(inode);
-	if (ret)
-		inode_dec_link_count(inode);
-	iput(inode);
-
-	return ret;
-}
-
 static int btrfs_rename(struct user_namespace *mnt_userns,
 			struct inode *old_dir, struct dentry *old_dentry,
 			struct inode *new_dir, struct dentry *new_dentry,
@@ -9513,11 +9471,14 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
 				   rename_ctx.index, new_dentry->d_parent);
 
 	if (flags & RENAME_WHITEOUT) {
-		ret = btrfs_whiteout_for_rename(trans, &whiteout_args);
-		whiteout_args.inode = NULL;
+		ret = btrfs_create_new_inode(trans, &whiteout_args);
 		if (ret) {
 			btrfs_abort_transaction(trans, ret);
 			goto out_fail;
+		} else {
+			unlock_new_inode(whiteout_args.inode);
+			iput(whiteout_args.inode);
+			whiteout_args.inode = NULL;
 		}
 	}
 out_fail:
@@ -9756,7 +9717,6 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
 	};
 	unsigned int trans_num_items;
 	int err;
-	u64 index = 0;
 	int name_len;
 	int datasize;
 	unsigned long ptr;
@@ -9774,38 +9734,33 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
 	inode->i_op = &btrfs_symlink_inode_operations;
 	inode_nohighmem(inode);
 	inode->i_mapping->a_ops = &btrfs_aops;
+	btrfs_i_size_write(BTRFS_I(inode), name_len);
+	inode_set_bytes(inode, name_len);
 
 	new_inode_args.inode = inode;
 	err = btrfs_new_inode_prepare(&new_inode_args, &trans_num_items);
-	if (err) {
-		iput(inode);
-		return err;
-	}
+	if (err)
+		goto out_inode;
 	/* 1 additional item for the inline extent */
 	trans_num_items++;
 
 	trans = btrfs_start_transaction(root, trans_num_items);
 	if (IS_ERR(trans)) {
-		iput(inode);
 		err = PTR_ERR(trans);
 		goto out_new_inode_args;
 	}
 
-	err = btrfs_create_new_inode(trans, &new_inode_args, &index);
-	if (err) {
-		iput(inode);
-		inode = NULL;
-		goto out_unlock;
-	}
-
-	err = btrfs_init_inode_security(trans, &new_inode_args);
+	err = btrfs_create_new_inode(trans, &new_inode_args);
 	if (err)
-		goto out_unlock;
+		goto out;
 
 	path = btrfs_alloc_path();
 	if (!path) {
 		err = -ENOMEM;
-		goto out_unlock;
+		btrfs_abort_transaction(trans, err);
+		discard_new_inode(inode);
+		inode = NULL;
+		goto out;
 	}
 	key.objectid = btrfs_ino(BTRFS_I(inode));
 	key.offset = 0;
@@ -9814,8 +9769,11 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
 	err = btrfs_insert_empty_item(trans, root, path, &key,
 				      datasize);
 	if (err) {
+		btrfs_abort_transaction(trans, err);
 		btrfs_free_path(path);
-		goto out_unlock;
+		discard_new_inode(inode);
+		inode = NULL;
+		goto out;
 	}
 	leaf = path->nodes[0];
 	ei = btrfs_item_ptr(leaf, path->slots[0],
@@ -9833,32 +9791,16 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
 	btrfs_mark_buffer_dirty(leaf);
 	btrfs_free_path(path);
 
-	inode_set_bytes(inode, name_len);
-	btrfs_i_size_write(BTRFS_I(inode), name_len);
-	err = btrfs_update_inode(trans, root, BTRFS_I(inode));
-	/*
-	 * Last step, add directory indexes for our symlink inode. This is the
-	 * last step to avoid extra cleanup of these indexes if an error happens
-	 * elsewhere above.
-	 */
-	if (!err)
-		err = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
-				     dentry->d_name.name, dentry->d_name.len, 0,
-				     index);
-	if (err)
-		goto out_unlock;
-
 	d_instantiate_new(dentry, inode);
-
-out_unlock:
+	err = 0;
+out:
 	btrfs_end_transaction(trans);
-	if (err && inode) {
-		inode_dec_link_count(inode);
-		discard_new_inode(inode);
-	}
 	btrfs_btree_balance_dirty(fs_info);
 out_new_inode_args:
 	btrfs_new_inode_args_destroy(&new_inode_args);
+out_inode:
+	if (err)
+		iput(inode);
 	return err;
 }
 
@@ -10121,7 +10063,6 @@ static int btrfs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
 		.orphan = true,
 	};
 	unsigned int trans_num_items;
-	u64 index;
 	int ret;
 
 	inode = new_inode(dir->i_sb);
@@ -10134,35 +10075,16 @@ static int btrfs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
 
 	new_inode_args.inode = inode;
 	ret = btrfs_new_inode_prepare(&new_inode_args, &trans_num_items);
-	if (ret) {
-		iput(inode);
-		return ret;
-	}
+	if (ret)
+		goto out_inode;
 
 	trans = btrfs_start_transaction(root, trans_num_items);
 	if (IS_ERR(trans)) {
-		iput(inode);
 		ret = PTR_ERR(trans);
 		goto out_new_inode_args;
 	}
 
-	ret = btrfs_create_new_inode(trans, &new_inode_args, &index);
-	if (ret) {
-		iput(inode);
-		inode = NULL;
-		goto out;
-	}
-
-	ret = btrfs_init_inode_security(trans, &new_inode_args);
-	if (ret)
-		goto out;
-
-	ret = btrfs_update_inode(trans, root, BTRFS_I(inode));
-	if (ret)
-		goto out;
-	ret = btrfs_orphan_add(trans, BTRFS_I(inode));
-	if (ret)
-		goto out;
+	ret = btrfs_create_new_inode(trans, &new_inode_args);
 
 	/*
 	 * We set number of links to 0 in btrfs_create_new_inode(), and here we
@@ -10172,16 +10094,20 @@ static int btrfs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
 	 *    d_tmpfile() -> inode_dec_link_count() -> drop_nlink()
 	 */
 	set_nlink(inode, 1);
-	d_tmpfile(dentry, inode);
-	unlock_new_inode(inode);
-	mark_inode_dirty(inode);
-out:
+
+	if (!ret) {
+		d_tmpfile(dentry, inode);
+		unlock_new_inode(inode);
+		mark_inode_dirty(inode);
+	}
+
 	btrfs_end_transaction(trans);
-	if (ret && inode)
-		discard_new_inode(inode);
 	btrfs_btree_balance_dirty(fs_info);
 out_new_inode_args:
 	btrfs_new_inode_args_destroy(&new_inode_args);
+out_inode:
+	if (ret)
+		iput(inode);
 	return ret;
 }
 
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 07a74bbe3d84..24b3e384aa8f 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -574,8 +574,6 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
 				  struct inode *dir, struct dentry *dentry,
 				  struct btrfs_qgroup_inherit *inherit)
 {
-	const char *name = dentry->d_name.name;
-	int namelen = dentry->d_name.len;
 	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
 	struct btrfs_trans_handle *trans;
 	struct btrfs_key key;
@@ -595,7 +593,6 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
 	int ret;
 	dev_t anon_dev;
 	u64 objectid;
-	u64 index = 0;
 
 	root_item = kzalloc(sizeof(*root_item), GFP_KERNEL);
 	if (!root_item)
@@ -712,7 +709,6 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
 	free_extent_buffer(leaf);
 	leaf = NULL;
 
-	key.offset = (u64)-1;
 	new_root = btrfs_get_new_fs_root(fs_info, objectid, anon_dev);
 	if (IS_ERR(new_root)) {
 		ret = PTR_ERR(new_root);
@@ -730,47 +726,21 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
 		goto out;
 	}
 
-	ret = btrfs_create_subvol_root(trans, root, &new_inode_args);
-	if (ret) {
-		/* We potentially lose an unused inode item here */
-		btrfs_abort_transaction(trans, ret);
-		goto out;
-	}
-
-	/*
-	 * insert the directory item
-	 */
-	ret = btrfs_set_inode_index(BTRFS_I(dir), &index);
-	if (ret) {
-		btrfs_abort_transaction(trans, ret);
-		goto out;
-	}
-
-	ret = btrfs_insert_dir_item(trans, name, namelen, BTRFS_I(dir), &key,
-				    BTRFS_FT_DIR, index);
-	if (ret) {
-		btrfs_abort_transaction(trans, ret);
-		goto out;
-	}
-
-	btrfs_i_size_write(BTRFS_I(dir), dir->i_size + namelen * 2);
-	ret = btrfs_update_inode(trans, root, BTRFS_I(dir));
-	if (ret) {
-		btrfs_abort_transaction(trans, ret);
-		goto out;
-	}
-
-	ret = btrfs_add_root_ref(trans, objectid, root->root_key.objectid,
-				 btrfs_ino(BTRFS_I(dir)), index, name, namelen);
-	if (ret) {
-		btrfs_abort_transaction(trans, ret);
-		goto out;
-	}
-
 	ret = btrfs_uuid_tree_add(trans, root_item->uuid,
 				  BTRFS_UUID_KEY_SUBVOL, objectid);
-	if (ret)
+	if (ret) {
 		btrfs_abort_transaction(trans, ret);
+		goto out;
+	}
+
+	ret = btrfs_create_new_inode(trans, &new_inode_args);
+	if (ret) {
+		btrfs_abort_transaction(trans, ret);
+		goto out;
+	}
+
+	d_instantiate_new(dentry, new_inode_args.inode);
+	new_inode_args.inode = NULL;
 
 out:
 	trans->block_rsv = NULL;
@@ -781,11 +751,6 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
 		btrfs_end_transaction(trans);
 	else
 		ret = btrfs_commit_transaction(trans);
-
-	if (!ret) {
-		d_instantiate(dentry, new_inode_args.inode);
-		new_inode_args.inode = NULL;
-	}
 out_new_inode_args:
 	btrfs_new_inode_args_destroy(&new_inode_args);
 out_inode:
diff --git a/fs/btrfs/props.c b/fs/btrfs/props.c
index 1a6d2d5b4b33..f5565c296898 100644
--- a/fs/btrfs/props.c
+++ b/fs/btrfs/props.c
@@ -334,9 +334,8 @@ static struct prop_handler prop_handlers[] = {
 	},
 };
 
-static int inherit_props(struct btrfs_trans_handle *trans,
-			 struct inode *inode,
-			 struct inode *parent)
+int btrfs_inode_inherit_props(struct btrfs_trans_handle *trans,
+			      struct inode *inode, struct inode *parent)
 {
 	struct btrfs_root *root = BTRFS_I(inode)->root;
 	struct btrfs_fs_info *fs_info = root->fs_info;
@@ -408,41 +407,6 @@ static int inherit_props(struct btrfs_trans_handle *trans,
 	return 0;
 }
 
-int btrfs_inode_inherit_props(struct btrfs_trans_handle *trans,
-			      struct inode *inode,
-			      struct inode *dir)
-{
-	if (!dir)
-		return 0;
-
-	return inherit_props(trans, inode, dir);
-}
-
-int btrfs_subvol_inherit_props(struct btrfs_trans_handle *trans,
-			       struct btrfs_root *root,
-			       struct btrfs_root *parent_root)
-{
-	struct super_block *sb = root->fs_info->sb;
-	struct inode *parent_inode, *child_inode;
-	int ret;
-
-	parent_inode = btrfs_iget(sb, BTRFS_FIRST_FREE_OBJECTID, parent_root);
-	if (IS_ERR(parent_inode))
-		return PTR_ERR(parent_inode);
-
-	child_inode = btrfs_iget(sb, BTRFS_FIRST_FREE_OBJECTID, root);
-	if (IS_ERR(child_inode)) {
-		iput(parent_inode);
-		return PTR_ERR(child_inode);
-	}
-
-	ret = inherit_props(trans, child_inode, parent_inode);
-	iput(child_inode);
-	iput(parent_inode);
-
-	return ret;
-}
-
 void __init btrfs_props_init(void)
 {
 	int i;
diff --git a/fs/btrfs/props.h b/fs/btrfs/props.h
index 40b2c65b518c..1dcd5daa3b22 100644
--- a/fs/btrfs/props.h
+++ b/fs/btrfs/props.h
@@ -21,8 +21,4 @@ int btrfs_inode_inherit_props(struct btrfs_trans_handle *trans,
 			      struct inode *inode,
 			      struct inode *dir);
 
-int btrfs_subvol_inherit_props(struct btrfs_trans_handle *trans,
-			       struct btrfs_root *root,
-			       struct btrfs_root *parent_root);
-
 #endif
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v4 0/4] btrfs: inode creation cleanups and fixes
  2022-03-15  1:12 [PATCH v4 0/4] btrfs: inode creation cleanups and fixes Omar Sandoval
                   ` (3 preceding siblings ...)
  2022-03-15  1:12 ` [PATCH v4 4/4] btrfs: move common inode creation code into btrfs_create_new_inode() Omar Sandoval
@ 2022-03-15 15:00 ` Sweet Tea Dorminy
  2022-03-17 20:21 ` David Sterba
  5 siblings, 0 replies; 11+ messages in thread
From: Sweet Tea Dorminy @ 2022-03-15 15:00 UTC (permalink / raw)
  To: Omar Sandoval, linux-btrfs; +Cc: kernel-team

All looks good to me still. Thanks!

On 3/14/22 21:12, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
>
> This series contains minor updates of the final four patches of my
> previous series [1].
>
> Based on misc-next with the previous version of "btrfs: allocate inode
> outside of btrfs_new_inode()" removed.
>
> Changes since v3 [2]:
>
> - Fixed a struct btrfs_root leak [3] in patch 1 which was fixed later in
>    the same series by patch 3 for the sake of git bisect.
>
> Changes since v2:
>
> - Mentioned reason for removal of btrfs_lookup_dentry() call in
>    create_subvol() in commit message of patch 1.
> - Made btrfs_init_inode_security() also take btrfs_new_inode_args in
>    patch 2.
>
> Thanks!
>
> 1: https://lore.kernel.org/linux-btrfs/cover.1646875648.git.osandov@fb.com/
> 2: https://lore.kernel.org/linux-btrfs/cover.1647288019.git.osandov@fb.com/
> 3: https://lore.kernel.org/linux-btrfs/CAL3q7H5ACv3ej1=7P2y7mA1vCJoAcHkCqro6_VBuAUxeaw25rw@mail.gmail.com/
>
> Omar Sandoval (4):
>    btrfs: allocate inode outside of btrfs_new_inode()
>    btrfs: factor out common part of btrfs_{mknod,create,mkdir}()
>    btrfs: reserve correct number of items for inode creation
>    btrfs: move common inode creation code into btrfs_create_new_inode()
>
>   fs/btrfs/acl.c   |  36 +--
>   fs/btrfs/ctree.h |  37 ++-
>   fs/btrfs/inode.c | 800 +++++++++++++++++++++++------------------------
>   fs/btrfs/ioctl.c | 140 +++++----
>   fs/btrfs/props.c |  40 +--
>   fs/btrfs/props.h |   4 -
>   6 files changed, 489 insertions(+), 568 deletions(-)
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v4 0/4] btrfs: inode creation cleanups and fixes
  2022-03-15  1:12 [PATCH v4 0/4] btrfs: inode creation cleanups and fixes Omar Sandoval
                   ` (4 preceding siblings ...)
  2022-03-15 15:00 ` [PATCH v4 0/4] btrfs: inode creation cleanups and fixes Sweet Tea Dorminy
@ 2022-03-17 20:21 ` David Sterba
  5 siblings, 0 replies; 11+ messages in thread
From: David Sterba @ 2022-03-17 20:21 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: linux-btrfs, kernel-team

On Mon, Mar 14, 2022 at 06:12:31PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
> 
> This series contains minor updates of the final four patches of my
> previous series [1].
> 
> Based on misc-next with the previous version of "btrfs: allocate inode
> outside of btrfs_new_inode()" removed.
> 
> Changes since v3 [2]:
> 
> - Fixed a struct btrfs_root leak [3] in patch 1 which was fixed later in
>   the same series by patch 3 for the sake of git bisect.
> 
> Changes since v2:
> 
> - Mentioned reason for removal of btrfs_lookup_dentry() call in
>   create_subvol() in commit message of patch 1.
> - Made btrfs_init_inode_security() also take btrfs_new_inode_args in
>   patch 2.
> 
> Thanks!
> 
> 1: https://lore.kernel.org/linux-btrfs/cover.1646875648.git.osandov@fb.com/
> 2: https://lore.kernel.org/linux-btrfs/cover.1647288019.git.osandov@fb.com/
> 3: https://lore.kernel.org/linux-btrfs/CAL3q7H5ACv3ej1=7P2y7mA1vCJoAcHkCqro6_VBuAUxeaw25rw@mail.gmail.com/
> 
> Omar Sandoval (4):
>   btrfs: allocate inode outside of btrfs_new_inode()
>   btrfs: factor out common part of btrfs_{mknod,create,mkdir}()
>   btrfs: reserve correct number of items for inode creation
>   btrfs: move common inode creation code into btrfs_create_new_inode()

Added to misc-next, thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v4 4/4] btrfs: move common inode creation code into btrfs_create_new_inode()
  2022-03-15  1:12 ` [PATCH v4 4/4] btrfs: move common inode creation code into btrfs_create_new_inode() Omar Sandoval
@ 2022-04-06 17:18   ` Sweet Tea Dorminy
  2022-04-06 18:31     ` Omar Sandoval
  0 siblings, 1 reply; 11+ messages in thread
From: Sweet Tea Dorminy @ 2022-04-06 17:18 UTC (permalink / raw)
  To: Omar Sandoval, linux-btrfs, Naohiro Aota; +Cc: kernel-team

As Naohiro Aota noted in slack, this change turns out to make 
generic/650 and others have write-time corruption. It appears to be from 
moving btrfs_init_inode_security() (which inserts xattrs for the inode) 
to before inserting the new inode into the tree; if the xattrs for an 
acl or for selinux get written out before the inode they belong to, the 
write-time corruption checker complains that an xattr item appears 
before its requisite inode.

I'm not sure if it reintroduces the leak, but moving 
btrfs_init_inode_security() to after the inode is fully inserted into 
the tree seems to fix the issue, something like:

--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6238,18 +6238,6 @@ int btrfs_create_new_inode(struct 
btrfs_trans_handle *trans,
                           ret);
         }

-       /*
-        * Subvolumes don't inherit ACLs or get passed to the LSM. This is
-        * probably a bug.
-        */
-       if (!args->subvol) {
-               ret = btrfs_init_inode_security(trans, args);
-               if (ret) {
-                       btrfs_abort_transaction(trans, ret);
-                       goto discard;
-               }
-       }
-
         /*
          * We could have gotten an inode number from somebody who was 
fsynced
          * and then removed in this same transaction, so let's just set 
full
@@ -6327,6 +6315,18 @@ int btrfs_create_new_inode(struct 
btrfs_trans_handle *trans,
         btrfs_mark_buffer_dirty(path->nodes[0]);
         btrfs_release_path(path);

+       /*
+        * Subvolumes don't inherit ACLs or get passed to the LSM. This is
+        * probably a bug.
+        */
+       if (!args->subvol) {
+               ret = btrfs_init_inode_security(trans, args);
+               if (ret) {
+                       btrfs_abort_transaction(trans, ret);
+                       goto discard;
+               }
+       }
+
         inode_tree_add(inode);

         trace_btrfs_inode_new(inode);
-- 


I don't know what the right process to follow here is. Naohiro, might 
you be able to test with this incremental patch and confirm that it 
fixes? Omar, does this have the inode leak problem you were trying to fix?

Sweet Tea

On 3/14/22 21:12, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
> 
> All of our inode creation code paths duplicate the calls to
> btrfs_init_inode_security() and btrfs_add_link(). Subvolume creation
> additionally duplicates property inheritance and the call to
> btrfs_set_inode_index(). Fix this by moving the common code into
> btrfs_create_new_inode(). This accomplishes a few things at once:
> 
> 1. It reduces code duplication.
> 2. It allows us to set up the inode completely before inserting the
>     inode item, removing calls to btrfs_update_inode().
> 3. It fixes a leak of an inode on disk in some error cases. For example,
>     in btrfs_create(), if btrfs_new_inode() succeeds, then we have
>     inserted an inode item and its inode ref. However, if something after
>     that fails (e.g., btrfs_init_inode_security()), then we end the
>     transaction and then decrement the link count on the inode. If the
>     transaction is committed and the system crashes before the failed
>     inode is deleted, then we leak that inode on disk. Instead, this
>     refactoring aborts the transaction when we can't recover more
>     gracefully.
> 4. It exposes various ways that subvolume creation diverges from mkdir
>     in terms of inheriting flags, properties, permissions, and POSIX
>     ACLs, a lot of which appears to be accidental. This patch explicitly
>     does _not_ change the existing non-standard behavior, but it makes
>     those differences more clear in the code and documents them so that
>     we can discuss whether they should be changed.
> 
> Reviewed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
> Signed-off-by: Omar Sandoval <osandov@fb.com>
> ---
>   fs/btrfs/ctree.h |   6 +-
>   fs/btrfs/inode.c | 404 +++++++++++++++++++----------------------------
>   fs/btrfs/ioctl.c |  59 ++-----
>   fs/btrfs/props.c |  40 +----
>   fs/btrfs/props.h |   4 -
>   5 files changed, 180 insertions(+), 333 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 322c02610e9e..3a348b26df57 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -3272,14 +3272,10 @@ struct btrfs_new_inode_args {
>   int btrfs_new_inode_prepare(struct btrfs_new_inode_args *args,
>   			    unsigned int *trans_num_items);
>   int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
> -			   struct btrfs_new_inode_args *args,
> -			   u64 *index);
> +			   struct btrfs_new_inode_args *args);
>   void btrfs_new_inode_args_destroy(struct btrfs_new_inode_args *args);
>   struct inode *btrfs_new_subvol_inode(struct user_namespace *mnt_userns,
>   				     struct inode *dir);
> -int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
> -			     struct btrfs_root *parent_root,
> -			     struct btrfs_new_inode_args *args);
>    void btrfs_set_delalloc_extent(struct inode *inode, struct extent_state *state,
>   			       unsigned *bits);
>   void btrfs_clear_delalloc_extent(struct inode *inode,
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 3ce02378480f..856e2d13e4f7 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -6122,9 +6122,6 @@ static void btrfs_inherit_iflags(struct inode *inode, struct inode *dir)
>   {
>   	unsigned int flags;
>   
> -	if (!dir)
> -		return;
> -
>   	flags = BTRFS_I(dir)->flags;
>   
>   	if (flags & BTRFS_INODE_NOCOMPRESS) {
> @@ -6145,14 +6142,13 @@ static void btrfs_inherit_iflags(struct inode *inode, struct inode *dir)
>   }
>   
>   int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
> -			   struct btrfs_new_inode_args *args,
> -			   u64 *index)
> +			   struct btrfs_new_inode_args *args)
>   {
> -	struct inode *dir = args->subvol ? NULL : args->dir;
> +	struct inode *dir = args->dir;
>   	struct inode *inode = args->inode;
> -	const char *name;
> -	int name_len;
> -	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
> +	const char *name = args->orphan ? NULL : args->dentry->d_name.name;
> +	int name_len = args->orphan ? 0 : args->dentry->d_name.len;
> +	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
>   	struct btrfs_root *root;
>   	struct btrfs_inode_item *inode_item;
>   	struct btrfs_key *location;
> @@ -6165,49 +6161,32 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
>   	unsigned long ptr;
>   	int ret;
>   
> -	if (args->subvol) {
> -		name = "..";
> -		name_len = 2;
> -	} else if (args->orphan) {
> -		name = NULL;
> -		name_len = 0;
> -	} else {
> -		name = args->dentry->d_name.name;
> -		name_len = args->dentry->d_name.len;
> -	}
> -
>   	path = btrfs_alloc_path();
>   	if (!path)
>   		return -ENOMEM;
>   
> -	/*
> -	 * O_TMPFILE, set link count to 0, so that after this point,
> -	 * we fill in an inode item with the correct link count.
> -	 */
> -	if (!name)
> -		set_nlink(inode, 0);
> -
>   	if (!args->subvol)
>   		BTRFS_I(inode)->root = btrfs_grab_root(BTRFS_I(dir)->root);
>   	root = BTRFS_I(inode)->root;
>   
>   	ret = btrfs_get_free_objectid(root, &objectid);
> -	if (ret) {
> -		btrfs_free_path(path);
> -		return ret;
> -	}
> +	if (ret)
> +		goto out;
>   	inode->i_ino = objectid;
>   
> -	if (dir && name) {
> +	if (args->orphan) {
> +		/*
> +		 * O_TMPFILE, set link count to 0, so that after this point, we
> +		 * fill in an inode item with the correct link count.
> +		 */
> +		set_nlink(inode, 0);
> +	} else {
>   		trace_btrfs_inode_request(dir);
>   
> -		ret = btrfs_set_inode_index(BTRFS_I(dir), index);
> -		if (ret) {
> -			btrfs_free_path(path);
> -			return ret;
> -		}
> -	} else if (dir) {
> -		*index = 0;
> +		ret = btrfs_set_inode_index(BTRFS_I(dir),
> +					    &BTRFS_I(inode)->dir_index);
> +		if (ret)
> +			goto out;
>   	}
>   	/*
>   	 * index_cnt is ignored for everything but a dir,
> @@ -6215,11 +6194,16 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
>   	 * number
>   	 */
>   	BTRFS_I(inode)->index_cnt = 2;
> -	BTRFS_I(inode)->dir_index = *index;
>   	BTRFS_I(inode)->generation = trans->transid;
>   	inode->i_generation = BTRFS_I(inode)->generation;
>   
> -	btrfs_inherit_iflags(inode, dir);
> +	/*
> +	 * Subvolumes don't inherit flags from their parent directory.
> +	 * Originally this was probably by accident, but we probably can't
> +	 * change it now.
> +	 */
> +	if (!args->subvol)
> +		btrfs_inherit_iflags(inode, dir);
>   
>   	if (S_ISREG(inode->i_mode)) {
>   		if (btrfs_test_opt(fs_info, NODATASUM))
> @@ -6229,6 +6213,55 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
>   				BTRFS_INODE_NODATASUM;
>   	}
>   
> +	location = &BTRFS_I(inode)->location;
> +	location->objectid = objectid;
> +	location->offset = 0;
> +	location->type = BTRFS_INODE_ITEM_KEY;
> +
> +	ret = btrfs_insert_inode_locked(inode);
> +	if (ret < 0) {
> +		if (!args->orphan)
> +			BTRFS_I(dir)->index_cnt--;
> +		goto out;
> +	}
> +
> +	if (args->subvol) {
> +		struct inode *parent;
> +
> +		/*
> +		 * Subvolumes inherit properties from their parent subvolume,
> +		 * not the directory they were created in.
> +		 */
> +		parent = btrfs_iget(fs_info->sb, BTRFS_FIRST_FREE_OBJECTID,
> +				    BTRFS_I(dir)->root);
> +		if (IS_ERR(parent)) {
> +			ret = PTR_ERR(parent);
> +		} else {
> +			ret = btrfs_inode_inherit_props(trans, inode, parent);
> +			iput(parent);
> +		}
> +	} else {
> +		ret = btrfs_inode_inherit_props(trans, inode, dir);
> +	}
> +	if (ret) {
> +		btrfs_err(fs_info,
> +			  "error inheriting props for ino %llu (root %llu): %d",
> +			  btrfs_ino(BTRFS_I(inode)), root->root_key.objectid,
> +			  ret);
> +	}
> +
> +	/*
> +	 * Subvolumes don't inherit ACLs or get passed to the LSM. This is
> +	 * probably a bug.
> +	 */
> +	if (!args->subvol) {
> +		ret = btrfs_init_inode_security(trans, args);
> +		if (ret) {
> +			btrfs_abort_transaction(trans, ret);
> +			goto discard;
> +		}
> +	}
> +
>   	/*
>   	 * We could have gotten an inode number from somebody who was fsynced
>   	 * and then removed in this same transaction, so let's just set full
> @@ -6243,7 +6276,7 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
>   
>   	sizes[0] = sizeof(struct btrfs_inode_item);
>   
> -	if (name) {
> +	if (!args->orphan) {
>   		/*
>   		 * Start new inodes with an inode_ref. This is slightly more
>   		 * efficient for small numbers of hard links since they will
> @@ -6252,53 +6285,61 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
>   		 */
>   		key[1].objectid = objectid;
>   		key[1].type = BTRFS_INODE_REF_KEY;
> -		if (dir)
> -			key[1].offset = btrfs_ino(BTRFS_I(dir));
> -		else
> +		if (args->subvol) {
>   			key[1].offset = objectid;
> -
> -		sizes[1] = name_len + sizeof(*ref);
> +			sizes[1] = 2 + sizeof(*ref);
> +		} else {
> +			key[1].offset = btrfs_ino(BTRFS_I(dir));
> +			sizes[1] = name_len + sizeof(*ref);
> +		}
>   	}
>   
> -	location = &BTRFS_I(inode)->location;
> -	location->objectid = objectid;
> -	location->offset = 0;
> -	location->type = BTRFS_INODE_ITEM_KEY;
> -
> -	ret = btrfs_insert_inode_locked(inode);
> -	if (ret < 0)
> -		goto fail;
> -
>   	batch.keys = &key[0];
>   	batch.data_sizes = &sizes[0];
> -	batch.total_data_size = sizes[0] + (name ? sizes[1] : 0);
> -	batch.nr = name ? 2 : 1;
> +	batch.total_data_size = sizes[0] + (args->orphan ? 0 : sizes[1]);
> +	batch.nr = args->orphan ? 1 : 2;
>   	ret = btrfs_insert_empty_items(trans, root, path, &batch);
> -	if (ret != 0)
> -		goto fail_unlock;
> +	if (ret != 0) {
> +		btrfs_abort_transaction(trans, ret);
> +		goto discard;
> +	}
>   
>   	inode->i_mtime = current_time(inode);
>   	inode->i_atime = inode->i_mtime;
>   	inode->i_ctime = inode->i_mtime;
>   	BTRFS_I(inode)->i_otime = inode->i_mtime;
>   
> +	/*
> +	 * We're going to fill the inode item now, so at this point the inode
> +	 * must be fully initialized.
> +	 */
> +
>   	inode_item = btrfs_item_ptr(path->nodes[0], path->slots[0],
>   				  struct btrfs_inode_item);
>   	memzero_extent_buffer(path->nodes[0], (unsigned long)inode_item,
>   			     sizeof(*inode_item));
>   	fill_inode_item(trans, path->nodes[0], inode_item, inode);
>   
> -	if (name) {
> +	if (!args->orphan) {
>   		ref = btrfs_item_ptr(path->nodes[0], path->slots[0] + 1,
>   				     struct btrfs_inode_ref);
> -		btrfs_set_inode_ref_name_len(path->nodes[0], ref, name_len);
> -		btrfs_set_inode_ref_index(path->nodes[0], ref, *index);
>   		ptr = (unsigned long)(ref + 1);
> -		write_extent_buffer(path->nodes[0], name, ptr, name_len);
> +		if (args->subvol) {
> +			btrfs_set_inode_ref_name_len(path->nodes[0], ref, 2);
> +			btrfs_set_inode_ref_index(path->nodes[0], ref, 0);
> +			write_extent_buffer(path->nodes[0], "..", ptr, 2);
> +		} else {
> +			btrfs_set_inode_ref_name_len(path->nodes[0], ref,
> +						     name_len);
> +			btrfs_set_inode_ref_index(path->nodes[0], ref,
> +						  BTRFS_I(inode)->dir_index);
> +			write_extent_buffer(path->nodes[0], name, ptr,
> +					    name_len);
> +		}
>   	}
>   
>   	btrfs_mark_buffer_dirty(path->nodes[0]);
> -	btrfs_free_path(path);
> +	btrfs_release_path(path);
>   
>   	inode_tree_add(inode);
>   
> @@ -6307,24 +6348,28 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans,
>   
>   	btrfs_update_root_times(trans, root);
>   
> -	ret = btrfs_inode_inherit_props(trans, inode, dir);
> -	if (ret)
> -		btrfs_err(fs_info,
> -			  "error inheriting props for ino %llu (root %llu): %d",
> -			btrfs_ino(BTRFS_I(inode)), root->root_key.objectid, ret);
> +	if (args->orphan) {
> +		ret = btrfs_orphan_add(trans, BTRFS_I(inode));
> +	} else {
> +		ret = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode), name,
> +				     name_len, 0, BTRFS_I(inode)->dir_index);
> +	}
> +	if (ret) {
> +		btrfs_abort_transaction(trans, ret);
> +		goto discard;
> +	}
>   
> -	return 0;
> +	ret = 0;
> +	goto out;
>   
> -fail_unlock:
> +discard:
>   	/*
>   	 * discard_new_inode() calls iput(), but the caller owns the reference
>   	 * to the inode.
>   	 */
>   	ihold(inode);
>   	discard_new_inode(inode);
> -fail:
> -	if (dir && name)
> -		BTRFS_I(dir)->index_cnt--;
> +out:
>   	btrfs_free_path(path);
>   	return ret;
>   }
> @@ -6431,52 +6476,28 @@ static int btrfs_create_common(struct inode *dir, struct dentry *dentry,
>   	unsigned int trans_num_items;
>   	struct btrfs_trans_handle *trans;
>   	int err;
> -	u64 index = 0;
>   
>   	err = btrfs_new_inode_prepare(&new_inode_args, &trans_num_items);
> -	if (err) {
> -		iput(inode);
> -		return err;
> -	}
> +	if (err)
> +		goto out_inode;
>   
>   	trans = btrfs_start_transaction(root, trans_num_items);
>   	if (IS_ERR(trans)) {
> -		iput(inode);
>   		err = PTR_ERR(trans);
>   		goto out_new_inode_args;
>   	}
>   
> -	err = btrfs_create_new_inode(trans, &new_inode_args, &index);
> -	if (err) {
> -		iput(inode);
> -		inode = NULL;
> -		goto out_unlock;
> -	}
> +	err = btrfs_create_new_inode(trans, &new_inode_args);
> +	if (!err)
> +		d_instantiate_new(dentry, inode);
>   
> -	err = btrfs_init_inode_security(trans, &new_inode_args);
> -	if (err)
> -		goto out_unlock;
> -
> -	err = btrfs_update_inode(trans, root, BTRFS_I(inode));
> -	if (err)
> -		goto out_unlock;
> -
> -	err = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
> -			     dentry->d_name.name, dentry->d_name.len, 0, index);
> -	if (err)
> -		goto out_unlock;
> -
> -	d_instantiate_new(dentry, inode);
> -
> -out_unlock:
>   	btrfs_end_transaction(trans);
> -	if (err && inode) {
> -		inode_dec_link_count(inode);
> -		discard_new_inode(inode);
> -	}
>   	btrfs_btree_balance_dirty(fs_info);
>   out_new_inode_args:
>   	btrfs_new_inode_args_destroy(&new_inode_args);
> +out_inode:
> +	if (err)
> +		iput(inode);
>   	return err;
>   }
>   
> @@ -8751,34 +8772,6 @@ struct inode *btrfs_new_subvol_inode(struct user_namespace *mnt_userns,
>   	return inode;
>   }
>   
> -/*
> - * create a new subvolume directory/inode (helper for the ioctl).
> - */
> -int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
> -			     struct btrfs_root *parent_root,
> -			     struct btrfs_new_inode_args *args)
> -{
> -	struct inode *inode = args->inode;
> -	struct btrfs_root *new_root = BTRFS_I(inode)->root;
> -	int err;
> -	u64 index = 0;
> -
> -	err = btrfs_create_new_inode(trans, args, &index);
> -	if (err)
> -		return err;
> -
> -	unlock_new_inode(inode);
> -
> -	err = btrfs_subvol_inherit_props(trans, new_root, parent_root);
> -	if (err)
> -		btrfs_err(new_root->fs_info,
> -			  "error inheriting subvolume %llu properties: %d",
> -			  new_root->root_key.objectid, err);
> -
> -	err = btrfs_update_inode(trans, new_root, BTRFS_I(inode));
> -	return err;
> -}
> -
>   struct inode *btrfs_alloc_inode(struct super_block *sb)
>   {
>   	struct btrfs_fs_info *fs_info = btrfs_sb(sb);
> @@ -9267,41 +9260,6 @@ static struct inode *new_whiteout_inode(struct user_namespace *mnt_userns,
>   	return inode;
>   }
>   
> -static int btrfs_whiteout_for_rename(struct btrfs_trans_handle *trans,
> -				     struct btrfs_new_inode_args *args)
> -{
> -	struct inode *inode = args->inode;
> -	struct inode *dir = args->dir;
> -	struct btrfs_root *root = BTRFS_I(dir)->root;
> -	struct dentry *dentry = args->dentry;
> -	int ret;
> -	u64 index;
> -
> -	ret = btrfs_create_new_inode(trans, args, &index);
> -	if (ret) {
> -		iput(inode);
> -		return ret;
> -	}
> -
> -	ret = btrfs_init_inode_security(trans, args);
> -	if (ret)
> -		goto out;
> -
> -	ret = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
> -			     dentry->d_name.name, dentry->d_name.len, 0, index);
> -	if (ret)
> -		goto out;
> -
> -	ret = btrfs_update_inode(trans, root, BTRFS_I(inode));
> -out:
> -	unlock_new_inode(inode);
> -	if (ret)
> -		inode_dec_link_count(inode);
> -	iput(inode);
> -
> -	return ret;
> -}
> -
>   static int btrfs_rename(struct user_namespace *mnt_userns,
>   			struct inode *old_dir, struct dentry *old_dentry,
>   			struct inode *new_dir, struct dentry *new_dentry,
> @@ -9513,11 +9471,14 @@ static int btrfs_rename(struct user_namespace *mnt_userns,
>   				   rename_ctx.index, new_dentry->d_parent);
>   
>   	if (flags & RENAME_WHITEOUT) {
> -		ret = btrfs_whiteout_for_rename(trans, &whiteout_args);
> -		whiteout_args.inode = NULL;
> +		ret = btrfs_create_new_inode(trans, &whiteout_args);
>   		if (ret) {
>   			btrfs_abort_transaction(trans, ret);
>   			goto out_fail;
> +		} else {
> +			unlock_new_inode(whiteout_args.inode);
> +			iput(whiteout_args.inode);
> +			whiteout_args.inode = NULL;
>   		}
>   	}
>   out_fail:
> @@ -9756,7 +9717,6 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
>   	};
>   	unsigned int trans_num_items;
>   	int err;
> -	u64 index = 0;
>   	int name_len;
>   	int datasize;
>   	unsigned long ptr;
> @@ -9774,38 +9734,33 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
>   	inode->i_op = &btrfs_symlink_inode_operations;
>   	inode_nohighmem(inode);
>   	inode->i_mapping->a_ops = &btrfs_aops;
> +	btrfs_i_size_write(BTRFS_I(inode), name_len);
> +	inode_set_bytes(inode, name_len);
>   
>   	new_inode_args.inode = inode;
>   	err = btrfs_new_inode_prepare(&new_inode_args, &trans_num_items);
> -	if (err) {
> -		iput(inode);
> -		return err;
> -	}
> +	if (err)
> +		goto out_inode;
>   	/* 1 additional item for the inline extent */
>   	trans_num_items++;
>   
>   	trans = btrfs_start_transaction(root, trans_num_items);
>   	if (IS_ERR(trans)) {
> -		iput(inode);
>   		err = PTR_ERR(trans);
>   		goto out_new_inode_args;
>   	}
>   
> -	err = btrfs_create_new_inode(trans, &new_inode_args, &index);
> -	if (err) {
> -		iput(inode);
> -		inode = NULL;
> -		goto out_unlock;
> -	}
> -
> -	err = btrfs_init_inode_security(trans, &new_inode_args);
> +	err = btrfs_create_new_inode(trans, &new_inode_args);
>   	if (err)
> -		goto out_unlock;
> +		goto out;
>   
>   	path = btrfs_alloc_path();
>   	if (!path) {
>   		err = -ENOMEM;
> -		goto out_unlock;
> +		btrfs_abort_transaction(trans, err);
> +		discard_new_inode(inode);
> +		inode = NULL;
> +		goto out;
>   	}
>   	key.objectid = btrfs_ino(BTRFS_I(inode));
>   	key.offset = 0;
> @@ -9814,8 +9769,11 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
>   	err = btrfs_insert_empty_item(trans, root, path, &key,
>   				      datasize);
>   	if (err) {
> +		btrfs_abort_transaction(trans, err);
>   		btrfs_free_path(path);
> -		goto out_unlock;
> +		discard_new_inode(inode);
> +		inode = NULL;
> +		goto out;
>   	}
>   	leaf = path->nodes[0];
>   	ei = btrfs_item_ptr(leaf, path->slots[0],
> @@ -9833,32 +9791,16 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
>   	btrfs_mark_buffer_dirty(leaf);
>   	btrfs_free_path(path);
>   
> -	inode_set_bytes(inode, name_len);
> -	btrfs_i_size_write(BTRFS_I(inode), name_len);
> -	err = btrfs_update_inode(trans, root, BTRFS_I(inode));
> -	/*
> -	 * Last step, add directory indexes for our symlink inode. This is the
> -	 * last step to avoid extra cleanup of these indexes if an error happens
> -	 * elsewhere above.
> -	 */
> -	if (!err)
> -		err = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
> -				     dentry->d_name.name, dentry->d_name.len, 0,
> -				     index);
> -	if (err)
> -		goto out_unlock;
> -
>   	d_instantiate_new(dentry, inode);
> -
> -out_unlock:
> +	err = 0;
> +out:
>   	btrfs_end_transaction(trans);
> -	if (err && inode) {
> -		inode_dec_link_count(inode);
> -		discard_new_inode(inode);
> -	}
>   	btrfs_btree_balance_dirty(fs_info);
>   out_new_inode_args:
>   	btrfs_new_inode_args_destroy(&new_inode_args);
> +out_inode:
> +	if (err)
> +		iput(inode);
>   	return err;
>   }
>   
> @@ -10121,7 +10063,6 @@ static int btrfs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
>   		.orphan = true,
>   	};
>   	unsigned int trans_num_items;
> -	u64 index;
>   	int ret;
>   
>   	inode = new_inode(dir->i_sb);
> @@ -10134,35 +10075,16 @@ static int btrfs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
>   
>   	new_inode_args.inode = inode;
>   	ret = btrfs_new_inode_prepare(&new_inode_args, &trans_num_items);
> -	if (ret) {
> -		iput(inode);
> -		return ret;
> -	}
> +	if (ret)
> +		goto out_inode;
>   
>   	trans = btrfs_start_transaction(root, trans_num_items);
>   	if (IS_ERR(trans)) {
> -		iput(inode);
>   		ret = PTR_ERR(trans);
>   		goto out_new_inode_args;
>   	}
>   
> -	ret = btrfs_create_new_inode(trans, &new_inode_args, &index);
> -	if (ret) {
> -		iput(inode);
> -		inode = NULL;
> -		goto out;
> -	}
> -
> -	ret = btrfs_init_inode_security(trans, &new_inode_args);
> -	if (ret)
> -		goto out;
> -
> -	ret = btrfs_update_inode(trans, root, BTRFS_I(inode));
> -	if (ret)
> -		goto out;
> -	ret = btrfs_orphan_add(trans, BTRFS_I(inode));
> -	if (ret)
> -		goto out;
> +	ret = btrfs_create_new_inode(trans, &new_inode_args);
>   
>   	/*
>   	 * We set number of links to 0 in btrfs_create_new_inode(), and here we
> @@ -10172,16 +10094,20 @@ static int btrfs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
>   	 *    d_tmpfile() -> inode_dec_link_count() -> drop_nlink()
>   	 */
>   	set_nlink(inode, 1);
> -	d_tmpfile(dentry, inode);
> -	unlock_new_inode(inode);
> -	mark_inode_dirty(inode);
> -out:
> +
> +	if (!ret) {
> +		d_tmpfile(dentry, inode);
> +		unlock_new_inode(inode);
> +		mark_inode_dirty(inode);
> +	}
> +
>   	btrfs_end_transaction(trans);
> -	if (ret && inode)
> -		discard_new_inode(inode);
>   	btrfs_btree_balance_dirty(fs_info);
>   out_new_inode_args:
>   	btrfs_new_inode_args_destroy(&new_inode_args);
> +out_inode:
> +	if (ret)
> +		iput(inode);
>   	return ret;
>   }
>   
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 07a74bbe3d84..24b3e384aa8f 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -574,8 +574,6 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
>   				  struct inode *dir, struct dentry *dentry,
>   				  struct btrfs_qgroup_inherit *inherit)
>   {
> -	const char *name = dentry->d_name.name;
> -	int namelen = dentry->d_name.len;
>   	struct btrfs_fs_info *fs_info = btrfs_sb(dir->i_sb);
>   	struct btrfs_trans_handle *trans;
>   	struct btrfs_key key;
> @@ -595,7 +593,6 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
>   	int ret;
>   	dev_t anon_dev;
>   	u64 objectid;
> -	u64 index = 0;
>   
>   	root_item = kzalloc(sizeof(*root_item), GFP_KERNEL);
>   	if (!root_item)
> @@ -712,7 +709,6 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
>   	free_extent_buffer(leaf);
>   	leaf = NULL;
>   
> -	key.offset = (u64)-1;
>   	new_root = btrfs_get_new_fs_root(fs_info, objectid, anon_dev);
>   	if (IS_ERR(new_root)) {
>   		ret = PTR_ERR(new_root);
> @@ -730,47 +726,21 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
>   		goto out;
>   	}
>   
> -	ret = btrfs_create_subvol_root(trans, root, &new_inode_args);
> -	if (ret) {
> -		/* We potentially lose an unused inode item here */
> -		btrfs_abort_transaction(trans, ret);
> -		goto out;
> -	}
> -
> -	/*
> -	 * insert the directory item
> -	 */
> -	ret = btrfs_set_inode_index(BTRFS_I(dir), &index);
> -	if (ret) {
> -		btrfs_abort_transaction(trans, ret);
> -		goto out;
> -	}
> -
> -	ret = btrfs_insert_dir_item(trans, name, namelen, BTRFS_I(dir), &key,
> -				    BTRFS_FT_DIR, index);
> -	if (ret) {
> -		btrfs_abort_transaction(trans, ret);
> -		goto out;
> -	}
> -
> -	btrfs_i_size_write(BTRFS_I(dir), dir->i_size + namelen * 2);
> -	ret = btrfs_update_inode(trans, root, BTRFS_I(dir));
> -	if (ret) {
> -		btrfs_abort_transaction(trans, ret);
> -		goto out;
> -	}
> -
> -	ret = btrfs_add_root_ref(trans, objectid, root->root_key.objectid,
> -				 btrfs_ino(BTRFS_I(dir)), index, name, namelen);
> -	if (ret) {
> -		btrfs_abort_transaction(trans, ret);
> -		goto out;
> -	}
> -
>   	ret = btrfs_uuid_tree_add(trans, root_item->uuid,
>   				  BTRFS_UUID_KEY_SUBVOL, objectid);
> -	if (ret)
> +	if (ret) {
>   		btrfs_abort_transaction(trans, ret);
> +		goto out;
> +	}
> +
> +	ret = btrfs_create_new_inode(trans, &new_inode_args);
> +	if (ret) {
> +		btrfs_abort_transaction(trans, ret);
> +		goto out;
> +	}
> +
> +	d_instantiate_new(dentry, new_inode_args.inode);
> +	new_inode_args.inode = NULL;
>   
>   out:
>   	trans->block_rsv = NULL;
> @@ -781,11 +751,6 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
>   		btrfs_end_transaction(trans);
>   	else
>   		ret = btrfs_commit_transaction(trans);
> -
> -	if (!ret) {
> -		d_instantiate(dentry, new_inode_args.inode);
> -		new_inode_args.inode = NULL;
> -	}
>   out_new_inode_args:
>   	btrfs_new_inode_args_destroy(&new_inode_args);
>   out_inode:
> diff --git a/fs/btrfs/props.c b/fs/btrfs/props.c
> index 1a6d2d5b4b33..f5565c296898 100644
> --- a/fs/btrfs/props.c
> +++ b/fs/btrfs/props.c
> @@ -334,9 +334,8 @@ static struct prop_handler prop_handlers[] = {
>   	},
>   };
>   
> -static int inherit_props(struct btrfs_trans_handle *trans,
> -			 struct inode *inode,
> -			 struct inode *parent)
> +int btrfs_inode_inherit_props(struct btrfs_trans_handle *trans,
> +			      struct inode *inode, struct inode *parent)
>   {
>   	struct btrfs_root *root = BTRFS_I(inode)->root;
>   	struct btrfs_fs_info *fs_info = root->fs_info;
> @@ -408,41 +407,6 @@ static int inherit_props(struct btrfs_trans_handle *trans,
>   	return 0;
>   }
>   
> -int btrfs_inode_inherit_props(struct btrfs_trans_handle *trans,
> -			      struct inode *inode,
> -			      struct inode *dir)
> -{
> -	if (!dir)
> -		return 0;
> -
> -	return inherit_props(trans, inode, dir);
> -}
> -
> -int btrfs_subvol_inherit_props(struct btrfs_trans_handle *trans,
> -			       struct btrfs_root *root,
> -			       struct btrfs_root *parent_root)
> -{
> -	struct super_block *sb = root->fs_info->sb;
> -	struct inode *parent_inode, *child_inode;
> -	int ret;
> -
> -	parent_inode = btrfs_iget(sb, BTRFS_FIRST_FREE_OBJECTID, parent_root);
> -	if (IS_ERR(parent_inode))
> -		return PTR_ERR(parent_inode);
> -
> -	child_inode = btrfs_iget(sb, BTRFS_FIRST_FREE_OBJECTID, root);
> -	if (IS_ERR(child_inode)) {
> -		iput(parent_inode);
> -		return PTR_ERR(child_inode);
> -	}
> -
> -	ret = inherit_props(trans, child_inode, parent_inode);
> -	iput(child_inode);
> -	iput(parent_inode);
> -
> -	return ret;
> -}
> -
>   void __init btrfs_props_init(void)
>   {
>   	int i;
> diff --git a/fs/btrfs/props.h b/fs/btrfs/props.h
> index 40b2c65b518c..1dcd5daa3b22 100644
> --- a/fs/btrfs/props.h
> +++ b/fs/btrfs/props.h
> @@ -21,8 +21,4 @@ int btrfs_inode_inherit_props(struct btrfs_trans_handle *trans,
>   			      struct inode *inode,
>   			      struct inode *dir);
>   
> -int btrfs_subvol_inherit_props(struct btrfs_trans_handle *trans,
> -			       struct btrfs_root *root,
> -			       struct btrfs_root *parent_root);
> -
>   #endif

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v4 4/4] btrfs: move common inode creation code into btrfs_create_new_inode()
  2022-04-06 17:18   ` Sweet Tea Dorminy
@ 2022-04-06 18:31     ` Omar Sandoval
  2022-04-06 20:18       ` Omar Sandoval
  0 siblings, 1 reply; 11+ messages in thread
From: Omar Sandoval @ 2022-04-06 18:31 UTC (permalink / raw)
  To: Sweet Tea Dorminy; +Cc: linux-btrfs, Naohiro Aota, kernel-team

On Wed, Apr 06, 2022 at 01:18:52PM -0400, Sweet Tea Dorminy wrote:
> As Naohiro Aota noted in slack, this change turns out to make generic/650
> and others have write-time corruption. It appears to be from moving
> btrfs_init_inode_security() (which inserts xattrs for the inode) to before
> inserting the new inode into the tree; if the xattrs for an acl or for
> selinux get written out before the inode they belong to, the write-time
> corruption checker complains that an xattr item appears before its requisite
> inode.

This may be the write-time tree checker being overzealous. Both the
xattr insertions and the inode insertion happen within one transaction,
so we won't end up with the xattr but no inode actually being committed.
I don't see anything that would legitimately be upset if it came across
this intermediate state in the meantime.

> I'm not sure if it reintroduces the leak, but moving
> btrfs_init_inode_security() to after the inode is fully inserted into the
> tree seems to fix the issue, something like:
> 
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -6238,18 +6238,6 @@ int btrfs_create_new_inode(struct btrfs_trans_handle
> *trans,
>                           ret);
>         }
> 
> -       /*
> -        * Subvolumes don't inherit ACLs or get passed to the LSM. This is
> -        * probably a bug.
> -        */
> -       if (!args->subvol) {
> -               ret = btrfs_init_inode_security(trans, args);
> -               if (ret) {
> -                       btrfs_abort_transaction(trans, ret);
> -                       goto discard;
> -               }
> -       }
> -
>         /*
>          * We could have gotten an inode number from somebody who was
> fsynced
>          * and then removed in this same transaction, so let's just set full
> @@ -6327,6 +6315,18 @@ int btrfs_create_new_inode(struct btrfs_trans_handle
> *trans,
>         btrfs_mark_buffer_dirty(path->nodes[0]);
>         btrfs_release_path(path);
> 
> +       /*
> +        * Subvolumes don't inherit ACLs or get passed to the LSM. This is
> +        * probably a bug.
> +        */
> +       if (!args->subvol) {
> +               ret = btrfs_init_inode_security(trans, args);
> +               if (ret) {
> +                       btrfs_abort_transaction(trans, ret);
> +                       goto discard;
> +               }
> +       }
> +
>         inode_tree_add(inode);
> 
>         trace_btrfs_inode_new(inode);
> -- 
> 
> 
> I don't know what the right process to follow here is. Naohiro, might you be
> able to test with this incremental patch and confirm that it fixes? Omar,
> does this have the inode leak problem you were trying to fix?

I _think_ this fix is fine, though. The only thing I'm concerned about
is if btrfs_init_inode_security() changes the inode, then we won't write
out those changes to disk. But I don't see any code paths in
btrfs_init_inode_security() that change the inode.

I'd say that if this fixes Naohiro's issue, we should go with it.

> Sweet Tea

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v4 4/4] btrfs: move common inode creation code into btrfs_create_new_inode()
  2022-04-06 18:31     ` Omar Sandoval
@ 2022-04-06 20:18       ` Omar Sandoval
  2022-04-08 16:05         ` David Sterba
  0 siblings, 1 reply; 11+ messages in thread
From: Omar Sandoval @ 2022-04-06 20:18 UTC (permalink / raw)
  To: Sweet Tea Dorminy; +Cc: linux-btrfs, Naohiro Aota, kernel-team

On Wed, Apr 06, 2022 at 11:31:22AM -0700, Omar Sandoval wrote:
> On Wed, Apr 06, 2022 at 01:18:52PM -0400, Sweet Tea Dorminy wrote:
> > As Naohiro Aota noted in slack, this change turns out to make generic/650
> > and others have write-time corruption. It appears to be from moving
> > btrfs_init_inode_security() (which inserts xattrs for the inode) to before
> > inserting the new inode into the tree; if the xattrs for an acl or for
> > selinux get written out before the inode they belong to, the write-time
> > corruption checker complains that an xattr item appears before its requisite
> > inode.
> 
> This may be the write-time tree checker being overzealous. Both the
> xattr insertions and the inode insertion happen within one transaction,
> so we won't end up with the xattr but no inode actually being committed.
> I don't see anything that would legitimately be upset if it came across
> this intermediate state in the meantime.
> 
> > I'm not sure if it reintroduces the leak, but moving
> > btrfs_init_inode_security() to after the inode is fully inserted into the
> > tree seems to fix the issue, something like:
> > 
> > --- a/fs/btrfs/inode.c
> > +++ b/fs/btrfs/inode.c
> > @@ -6238,18 +6238,6 @@ int btrfs_create_new_inode(struct btrfs_trans_handle
> > *trans,
> >                           ret);
> >         }
> > 
> > -       /*
> > -        * Subvolumes don't inherit ACLs or get passed to the LSM. This is
> > -        * probably a bug.
> > -        */
> > -       if (!args->subvol) {
> > -               ret = btrfs_init_inode_security(trans, args);
> > -               if (ret) {
> > -                       btrfs_abort_transaction(trans, ret);
> > -                       goto discard;
> > -               }
> > -       }
> > -
> >         /*
> >          * We could have gotten an inode number from somebody who was
> > fsynced
> >          * and then removed in this same transaction, so let's just set full
> > @@ -6327,6 +6315,18 @@ int btrfs_create_new_inode(struct btrfs_trans_handle
> > *trans,
> >         btrfs_mark_buffer_dirty(path->nodes[0]);
> >         btrfs_release_path(path);
> > 
> > +       /*
> > +        * Subvolumes don't inherit ACLs or get passed to the LSM. This is
> > +        * probably a bug.
> > +        */
> > +       if (!args->subvol) {
> > +               ret = btrfs_init_inode_security(trans, args);
> > +               if (ret) {
> > +                       btrfs_abort_transaction(trans, ret);
> > +                       goto discard;
> > +               }
> > +       }
> > +
> >         inode_tree_add(inode);
> > 
> >         trace_btrfs_inode_new(inode);
> > -- 
> > 
> > 
> > I don't know what the right process to follow here is. Naohiro, might you be
> > able to test with this incremental patch and confirm that it fixes? Omar,
> > does this have the inode leak problem you were trying to fix?
> 
> I _think_ this fix is fine, though. The only thing I'm concerned about
> is if btrfs_init_inode_security() changes the inode, then we won't write
> out those changes to disk. But I don't see any code paths in
> btrfs_init_inode_security() that change the inode.
> 
> I'd say that if this fixes Naohiro's issue, we should go with it.

As I just discussed with Sweet Tea offline, property inheritance also
needs to move after inode creation for the same reason.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v4 4/4] btrfs: move common inode creation code into btrfs_create_new_inode()
  2022-04-06 20:18       ` Omar Sandoval
@ 2022-04-08 16:05         ` David Sterba
  0 siblings, 0 replies; 11+ messages in thread
From: David Sterba @ 2022-04-08 16:05 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: Sweet Tea Dorminy, linux-btrfs, Naohiro Aota, kernel-team

On Wed, Apr 06, 2022 at 01:18:30PM -0700, Omar Sandoval wrote:
> > I _think_ this fix is fine, though. The only thing I'm concerned about
> > is if btrfs_init_inode_security() changes the inode, then we won't write
> > out those changes to disk. But I don't see any code paths in
> > btrfs_init_inode_security() that change the inode.
> > 
> > I'd say that if this fixes Naohiro's issue, we should go with it.
> 
> As I just discussed with Sweet Tea offline, property inheritance also
> needs to move after inode creation for the same reason.

Can you please resend the full updated patch, based on the one that's in
misc-next once it's tested? Thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-04-08 16:10 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-15  1:12 [PATCH v4 0/4] btrfs: inode creation cleanups and fixes Omar Sandoval
2022-03-15  1:12 ` [PATCH v4 1/4] btrfs: allocate inode outside of btrfs_new_inode() Omar Sandoval
2022-03-15  1:12 ` [PATCH v4 2/4] btrfs: factor out common part of btrfs_{mknod,create,mkdir}() Omar Sandoval
2022-03-15  1:12 ` [PATCH v4 3/4] btrfs: reserve correct number of items for inode creation Omar Sandoval
2022-03-15  1:12 ` [PATCH v4 4/4] btrfs: move common inode creation code into btrfs_create_new_inode() Omar Sandoval
2022-04-06 17:18   ` Sweet Tea Dorminy
2022-04-06 18:31     ` Omar Sandoval
2022-04-06 20:18       ` Omar Sandoval
2022-04-08 16:05         ` David Sterba
2022-03-15 15:00 ` [PATCH v4 0/4] btrfs: inode creation cleanups and fixes Sweet Tea Dorminy
2022-03-17 20:21 ` David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.