All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nikolay Borisov <nborisov@suse.com>
To: Josef Bacik <josef@toxicpanda.com>,
	linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH 1/5] btrfs: Improve global reserve stealing logic
Date: Mon, 9 Mar 2020 22:48:39 +0200	[thread overview]
Message-ID: <d2955ecf-4ed6-5931-65d3-eddde228b816@suse.com> (raw)
In-Reply-To: <20200309202322.12327-2-josef@toxicpanda.com>



On 9.03.20 г. 22:23 ч., Josef Bacik wrote:
> For unlink transactions and block group removal
> btrfs_start_transaction_fallback_global_rsv will first try to start
> an ordinary transaction and if it fails it will fall back to reserving
> the required amount by stealing from the global reserve. This is sound
> in theory but current code doesn't perform any locking or throttling so
> if there are multiple concurrent unlink() callers they can deplete
> the global reservation which will result in ENOSPC.

Your introduction of the problem and proposed solution are better worded
in the introduction letter so I'd rather see some of that text copied
here, I guess David can also help.

> 
> Fix this behavior by introducing BTRFS_RESERVE_FLUSH_ALL_STEAL. It's
> used to mark unlink reservation. The flushing machinery is modified to
> steal from global reservation when it sees such reservation being on the
> brink of failure (in maybe_fail_all_tickets).>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/block-group.c |  2 +-
>  fs/btrfs/ctree.h       |  1 +
>  fs/btrfs/inode.c       |  2 +-
>  fs/btrfs/space-info.c  | 38 +++++++++++++++++++++++++++++++++++++-
>  fs/btrfs/space-info.h  |  1 +
>  fs/btrfs/transaction.c | 42 +++++-------------------------------------
>  fs/btrfs/transaction.h |  3 +--
>  7 files changed, 47 insertions(+), 42 deletions(-)
> 
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index 60e9bb136f34..faa04093b6b5 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -1171,7 +1171,7 @@ struct btrfs_trans_handle *btrfs_start_trans_remove_block_group(
>  	free_extent_map(em);
>  
>  	return btrfs_start_transaction_fallback_global_rsv(fs_info->extent_root,
> -							   num_items, 1);
> +							   num_items);
>  }
>  
>  /*
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 2ccb2a090782..782c63f213e9 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -2528,6 +2528,7 @@ enum btrfs_reserve_flush_enum {
>  	BTRFS_RESERVE_FLUSH_DATA,
>  	BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE,
>  	BTRFS_RESERVE_FLUSH_ALL,
> +	BTRFS_RESERVE_FLUSH_ALL_STEAL,
>  };
>  
>  enum btrfs_flush_state {
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index b8dabffac767..4e3b115ef1d7 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -3617,7 +3617,7 @@ static struct btrfs_trans_handle *__unlink_start_trans(struct inode *dir)
>  	 * 1 for the inode ref
>  	 * 1 for the inode
>  	 */
> -	return btrfs_start_transaction_fallback_global_rsv(root, 5, 5);
> +	return btrfs_start_transaction_fallback_global_rsv(root, 5);
>  }
>  
>  static int btrfs_unlink(struct inode *dir, struct dentry *dentry)
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index 26e1c492b9b5..9c9a4933f72b 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -810,6 +810,35 @@ static inline int need_do_async_reclaim(struct btrfs_fs_info *fs_info,
>  		!test_bit(BTRFS_FS_STATE_REMOUNTING, &fs_info->fs_state));
>  }
>  
> +static bool steal_from_global_rsv(struct btrfs_fs_info *fs_info,
> +				  struct btrfs_space_info *space_info,
> +				  struct reserve_ticket *ticket)
> +{
> +	struct btrfs_block_rsv *global_rsv = &fs_info->global_block_rsv;
> +	u64 min_bytes;
> +
> +	if (global_rsv->space_info != space_info)
> +		return false;
> +
> +	spin_lock(&global_rsv->lock);
> +	min_bytes = div_factor(global_rsv->size, 1);

This is a subtle change that is not documented but it should: The old
code ensured we'd steel if at least 50% of the block reservation were
left after stealing, whereas now the code only leaves 10%.

This essentially allows to use up more of the global reservation. I
remember we discussed the fact that 10% on a 512m global reserve is
around 50mb which is "enough". I think this ought to be mentioned.
> +	if (global_rsv->reserved < min_bytes + ticket->bytes) {
> +		spin_unlock(&global_rsv->lock);
> +		return false;
> +	}
> +	global_rsv->reserved -= ticket->bytes;
> +	ticket->bytes = 0;
> +	trace_printk("Satisfied ticket from global rsv\n");
> +	list_del_init(&ticket->list);
> +	wake_up(&ticket->wait);
> +	space_info->tickets_id++;
> +	if (global_rsv->reserved < global_rsv->size)
> +		global_rsv->full = 0;
> +	spin_unlock(&global_rsv->lock);
> +
> +	return true;
> +}
> +

<snip>

  reply	other threads:[~2020-03-09 20:48 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-09 20:23 [PATCH 0/5] Deal with a few ENOSPC corner cases Josef Bacik
2020-03-09 20:23 ` [PATCH 1/5] btrfs: Improve global reserve stealing logic Josef Bacik
2020-03-09 20:48   ` Nikolay Borisov [this message]
2020-03-10 14:27   ` Nikolay Borisov
2020-03-09 20:23 ` [PATCH 2/5] btrfs: Account for trans_block_rsv in may_commit_transaction Josef Bacik
2020-03-09 20:44   ` Nikolay Borisov
2020-03-09 20:23 ` [PATCH 3/5] btrfs: only take normal tickets into account " Josef Bacik
2020-03-09 20:51   ` Nikolay Borisov
2020-03-09 23:13   ` Nikolay Borisov
2020-03-10 10:27   ` Nikolay Borisov
2020-03-09 20:23 ` [PATCH 4/5] btrfs: only check priority tickets for priority flushing Josef Bacik
2020-03-10 10:30   ` Nikolay Borisov
2020-03-09 20:23 ` [PATCH 5/5] btrfs: run btrfs_try_granting_tickets if a priority ticket fails Josef Bacik
2020-03-10 10:32   ` Nikolay Borisov
2020-03-13 19:54     ` Josef Bacik
2020-03-10 17:28 ` [PATCH 0/5] Deal with a few ENOSPC corner cases Nikolay Borisov
2020-03-11  1:45   ` David Sterba
2020-03-13 12:37     ` Nikolay Borisov
2020-03-13 19:58 [PATCH 0/5][v2] " Josef Bacik
2020-03-13 19:58 ` [PATCH 1/5] btrfs: Improve global reserve stealing logic Josef Bacik
2020-03-17 12:46   ` Nikolay Borisov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d2955ecf-4ed6-5931-65d3-eddde228b816@suse.com \
    --to=nborisov@suse.com \
    --cc=josef@toxicpanda.com \
    --cc=kernel-team@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.