Linux-BTRFS Archive on lore.kernel.org
 help / color / Atom feed
From: Nikolay Borisov <nborisov@suse.com>
To: Josef Bacik <josef@toxicpanda.com>,
	linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH 1/2] btrfs: reserve space for inheriting properties
Date: Fri, 8 Feb 2019 08:30:08 +0200
Message-ID: <12a76243-ae38-974b-957f-c50955d53934@suse.com> (raw)
In-Reply-To: <20190207165426.15866-2-josef@toxicpanda.com>



On 7.02.19 г. 18:54 ч., Josef Bacik wrote:
> We've been seeing errors on our build servers related to failing to
> inherit inode properties.  This is because we do not pre-reserve space
> for them, instead trying to reserve space with NO_FLUSH at inheritance
> time.  NO_FLUSH can transiently fail, but we'll still complain.  It's

Put one or two sentences describing the implications of
BTRFS_RESERVE_NO_FLUSH. I.e that it NO_FLUSH won't try to flush current
metadata reservation which could result in the said transient failure.
Just to give a bit more context for someone who might be reading the
commit message in the future.

> just an extra credit, so simply add that to the places that call
> btrfs_new_inode and call it good enough.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/inode.c | 78 ++++++++++++++++++++++----------------------------------
>  fs/btrfs/ioctl.c | 27 ++++++++++++--------
>  2 files changed, 46 insertions(+), 59 deletions(-)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 6126de9b8b9c..0da4a9d6d9fe 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -59,6 +59,14 @@ struct btrfs_dio_data {
>  	int overwrite;
>  };
>  
> +/*
> + * 2 for inode item and ref
> + * 2 for dir items
> + * 1 for xattr if selinux is on
> + * 1 for inherited properties
> + */
> +#define BTRFS_NEW_INODE_ITEMS 6

Rather than having scattered defines I'd much prefer if we have an enum
and over time add all distinct reservations to that enum and change
btrfs_start_transaction's interface to take this enum. It will be much
more descriptive than having scattered defines.

> +
>  static const struct inode_operations btrfs_dir_inode_operations;
>  static const struct inode_operations btrfs_symlink_inode_operations;
>  static const struct inode_operations btrfs_dir_ro_inode_operations;
> @@ -6479,12 +6487,7 @@ static int btrfs_mknod(struct inode *dir, struct dentry *dentry,
>  	u64 objectid;
>  	u64 index = 0;
>  
> -	/*
> -	 * 2 for inode item and ref
> -	 * 2 for dir items
> -	 * 1 for xattr if selinux is on
> -	 */
> -	trans = btrfs_start_transaction(root, 5);
> +	trans = btrfs_start_transaction(root, BTRFS_NEW_INODE_ITEMS);
>  	if (IS_ERR(trans))
>  		return PTR_ERR(trans);
>  
> @@ -6543,12 +6546,7 @@ static int btrfs_create(struct inode *dir, struct dentry *dentry,
>  	u64 objectid;
>  	u64 index = 0;
>  
> -	/*
> -	 * 2 for inode item and ref
> -	 * 2 for dir items
> -	 * 1 for xattr if selinux is on
> -	 */
> -	trans = btrfs_start_transaction(root, 5);
> +	trans = btrfs_start_transaction(root, BTRFS_NEW_INODE_ITEMS);
>  	if (IS_ERR(trans))
>  		return PTR_ERR(trans);
>  
> @@ -6695,12 +6693,7 @@ static int btrfs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
>  	u64 objectid = 0;
>  	u64 index = 0;
>  
> -	/*
> -	 * 2 items for inode and ref
> -	 * 2 items for dir items
> -	 * 1 for xattr if selinux is on
> -	 */
> -	trans = btrfs_start_transaction(root, 5);
> +	trans = btrfs_start_transaction(root, BTRFS_NEW_INODE_ITEMS);
>  	if (IS_ERR(trans))
>  		return PTR_ERR(trans);
>  
> @@ -9428,14 +9421,11 @@ static int btrfs_rename_exchange(struct inode *old_dir,
>  		down_read(&fs_info->subvol_sem);
>  
>  	/*
> -	 * We want to reserve the absolute worst case amount of items.  So if
> -	 * both inodes are subvols and we need to unlink them then that would
> -	 * require 4 item modifications, but if they are both normal inodes it
> -	 * would require 5 item modifications, so we'll assume their normal
> -	 * inodes.  So 5 * 2 is 10, plus 2 for the new links, so 12 total items
> -	 * should cover the worst case number of items we'll modify.
> +	 * The same math from btrfs_rename applies here, except we need an extra
> +	 * 2 items for the new links.
>  	 */
> -	trans = btrfs_start_transaction(root, 12);
> +	trans = btrfs_start_transaction(root,
> +					(BTRFS_NEW_INODE_ITEMS << 1) + 2);
>  	if (IS_ERR(trans)) {
>  		ret = PTR_ERR(trans);
>  		goto out_notrans;
> @@ -9768,19 +9758,19 @@ static int btrfs_rename(struct inode *old_dir, struct dentry *old_dentry,
>  	if (old_ino == BTRFS_FIRST_FREE_OBJECTID)
>  		down_read(&fs_info->subvol_sem);
>  	/*
> -	 * We want to reserve the absolute worst case amount of items.  So if
> -	 * both inodes are subvols and we need to unlink them then that would
> -	 * require 4 item modifications, but if they are both normal inodes it
> -	 * would require 5 item modifications, so we'll assume they are normal
> -	 * inodes.  So 5 * 2 is 10, plus 1 for the new link, so 11 total items
> -	 * should cover the worst case number of items we'll modify.
> -	 * If our rename has the whiteout flag, we need more 5 units for the
> -	 * new inode (1 inode item, 1 inode ref, 2 dir items and 1 xattr item
> -	 * when selinux is enabled).
> +	 * We want to reserve the absolute worst case amount of items.  Subvol
> +	 * inodes don't have an inode item to worry about and don't have a
> +	 * selinux attr, so we use the BTRFS_NEW_INODE_ITEMS counter for how
> +	 * much it costs per inode to modify.  Worse case we'll have to mess
> +	 * with 2 inodes, so 2 x BTRFS_NEW_INODE_ITEMS, and then we need an
> +	 * extra reservation for the new link.
> +	 *
> +	 * If our rename has the whiteout flag we need a full new inode which
> +	 * means another set of BTRFS_NEW_INODE_ITEMS.
>  	 */
> -	trans_num_items = 11;
> +	trans_num_items = (BTRFS_NEW_INODE_ITEMS << 1) + 1;
>  	if (flags & RENAME_WHITEOUT)
> -		trans_num_items += 5;
> +		trans_num_items += BTRFS_NEW_INODE_ITEMS;
>  	trans = btrfs_start_transaction(root, trans_num_items);
>  	if (IS_ERR(trans)) {
>  		ret = PTR_ERR(trans);
> @@ -10149,14 +10139,8 @@ static int btrfs_symlink(struct inode *dir, struct dentry *dentry,
>  	if (name_len > BTRFS_MAX_INLINE_DATA_SIZE(fs_info))
>  		return -ENAMETOOLONG;
>  
> -	/*
> -	 * 2 items for inode item and ref
> -	 * 2 items for dir items
> -	 * 1 item for updating parent inode item
> -	 * 1 item for the inline extent item
> -	 * 1 item for xattr if selinux is on
> -	 */
> -	trans = btrfs_start_transaction(root, 7);
> +	/* 1 item for the inline extent item */
> +	trans = btrfs_start_transaction(root, BTRFS_NEW_INODE_ITEMS + 1);
>  	if (IS_ERR(trans))
>  		return PTR_ERR(trans);
>  
> @@ -10427,10 +10411,8 @@ static int btrfs_tmpfile(struct inode *dir, struct dentry *dentry, umode_t mode)
>  	u64 index;
>  	int ret = 0;
>  
> -	/*
> -	 * 5 units required for adding orphan entry
> -	 */
> -	trans = btrfs_start_transaction(root, 5);
> +	/* 1 unit required for adding orphan entry */
> +	trans = btrfs_start_transaction(root, BTRFS_NEW_INODE_ITEMS + 1);
>  	if (IS_ERR(trans))
>  		return PTR_ERR(trans);
>  
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index f38a659c918c..21f8ab2d8570 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -83,6 +83,17 @@ struct btrfs_ioctl_send_args_32 {
>  			       struct btrfs_ioctl_send_args_32)
>  #endif
>  
> +/*
> + * 1 - parent dir inode
> + * 2 - dir entries
> + * 1 - root item
> + * 2 - root ref/backref
> + * 1 - root of snapshot
> + * 1 - UUID item
> + * 1 - properties
> + */
> +#define BTRFS_NEW_ROOT_ITEMS 9

Even if you choose to have defines currently, I insist on them being
grouped in one place e.g. ctree.h

> +
>  static int btrfs_clone(struct inode *src, struct inode *inode,
>  		       u64 off, u64 olen, u64 olen_aligned, u64 destoff,
>  		       int no_time_update);
> @@ -596,7 +607,8 @@ static noinline int create_subvol(struct inode *dir,
>  	 * The same as the snapshot creation, please see the comment
>  	 * of create_snapshot().
>  	 */
> -	ret = btrfs_subvolume_reserve_metadata(root, &block_rsv, 8, false);
> +	ret = btrfs_subvolume_reserve_metadata(root, &block_rsv,
> +					       BTRFS_NEW_ROOT_ITEMS, false);
>  	if (ret)
>  		goto fail_free;
>  
> @@ -804,17 +816,10 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
>  
>  	btrfs_init_block_rsv(&pending_snapshot->block_rsv,
>  			     BTRFS_BLOCK_RSV_TEMP);
> -	/*
> -	 * 1 - parent dir inode
> -	 * 2 - dir entries
> -	 * 1 - root item
> -	 * 2 - root ref/backref
> -	 * 1 - root of snapshot
> -	 * 1 - UUID item
> -	 */
> +
>  	ret = btrfs_subvolume_reserve_metadata(BTRFS_I(dir)->root,
> -					&pending_snapshot->block_rsv, 8,
> -					false);
> +					&pending_snapshot->block_rsv,
> +					BTRFS_NEW_ROOT_ITEMS, false);
>  	if (ret)
>  		goto dec_and_free;
>  
> 

  reply index

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-07 16:54 [PATCH 0/2] Reserve more space for property inheritance Josef Bacik
2019-02-07 16:54 ` [PATCH 1/2] btrfs: reserve space for inheriting properties Josef Bacik
2019-02-08  6:30   ` Nikolay Borisov [this message]
2019-03-22 19:11     ` David Sterba
2019-02-13 12:37   ` Filipe Manana
2019-02-07 16:54 ` [PATCH 2/2] btrfs: use the existing credit for our first prop Josef Bacik
2019-02-13 12:38   ` Filipe Manana
2019-04-26 14:29   ` David Sterba

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=12a76243-ae38-974b-957f-c50955d53934@suse.com \
    --to=nborisov@suse.com \
    --cc=josef@toxicpanda.com \
    --cc=kernel-team@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-BTRFS Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-btrfs/0 linux-btrfs/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-btrfs linux-btrfs/ https://lore.kernel.org/linux-btrfs \
		linux-btrfs@vger.kernel.org
	public-inbox-index linux-btrfs

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-btrfs


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git