All of lore.kernel.org
 help / color / mirror / Atom feed
From: Qu Wenruo <wqu@suse.com>
To: dsterba@suse.cz, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2 2/2] btrfs: relocation: review the call sites which can be interruped by signal
Date: Thu, 9 Jul 2020 18:15:09 +0800	[thread overview]
Message-ID: <76a28b23-f1f6-1dbe-05f1-f642747564f9@suse.com> (raw)
In-Reply-To: <20200709095413.GF28832@twin.jikos.cz>



On 2020/7/9 下午5:54, David Sterba wrote:
> On Thu, Jul 09, 2020 at 04:33:33PM +0800, Qu Wenruo wrote:
>> Since most metadata reservation calls can return -EINTR when get
>> interruped by fatal signal, we need to review the all the metadata
>> reservation call sites.
>>
>> In relocation code, the metadata reservation happens in the following
>> sites:
>> - btrfs_block_rsv_refill() in merge_reloc_root()
>>   merge_reloc_root() is a pretty critial section, we don't want get
>>   interrupted by signal, so change the flush status to
>>   BTRFS_RESERVE_FLUSH_LIMIT, so it won't get interrupted by signal.
>>   Since such change can be ENPSPC-prone, also shrink the amount of
>>   metadata to reserve a little to avoid deadly ENOSPC there.
>>
>> - btrfs_block_rsv_refill() in reserve_metadata_space()
>>   It calls with BTRFS_RESERVE_FLUSH_LIMIT, which won't get interrupred
>>   by signal.
> 
> This semantics of BTRFS_RESERVE_FLUSH_LIMIT regarding signals should be
> documented, right now there's a comment but says something about avoidig
> deadlocks.
> 
>> - btrfs_block_rsv_refill() in prepare_to_relocate()
>> - btrfs_block_rsv_add() in prepare_to_relocate()
>> - btrfs_block_rsv_refill() in relocate_block_group()
>> - btrfs_delalloc_reserve_metadata() in relocate_file_extent_cluster()
>> - btrfs_start_transaction() in relocate_block_group()
>> - btrfs_start_transaction() in create_reloc_inode()
>>   Can be interruped by fatal signal and we can handle it easily.
>>   For these call sites, just catch the -EINTR value in btrfs_balance()
>>   and count them as canceled.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>  fs/btrfs/relocation.c | 13 +++++++++++--
>>  fs/btrfs/volumes.c    |  2 +-
>>  2 files changed, 12 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
>> index 2b869fb2e62c..23914edd4710 100644
>> --- a/fs/btrfs/relocation.c
>> +++ b/fs/btrfs/relocation.c
>> @@ -1686,12 +1686,21 @@ static noinline_for_stack int merge_reloc_root(struct reloc_control *rc,
>>  		btrfs_unlock_up_safe(path, 0);
>>  	}
>>  
>> -	min_reserved = fs_info->nodesize * (BTRFS_MAX_LEVEL - 1) * 2;
>> +	/*
>> +	 * In merge_reloc_root(), we modify the upper level pointer to swap
>> +	 * the tree blocks between reloc tree and subvolume tree.
>> +	 * Thus for tree block COW, we COW at most from level 1 to root level
>> +	 * for each tree.
>> +	 *
>> +	 * Thus the needed metadata space is at most root_level * nodesize,
>> +	 * and * 2 since we have two trees to COW.
>> +	 */
>> +	min_reserved = fs_info->nodesize * btrfs_root_level(root_item) * 2;
>>  	memset(&next_key, 0, sizeof(next_key));
>>  
>>  	while (1) {
>>  		ret = btrfs_block_rsv_refill(root, rc->block_rsv, min_reserved,
>> -					     BTRFS_RESERVE_FLUSH_ALL);
>> +					     BTRFS_RESERVE_FLUSH_LIMIT);
>>  		if (ret) {
>>  			err = ret;
>>  			goto out;
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index aabc6c922e04..d60df30bdc47 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -4135,7 +4135,7 @@ int btrfs_balance(struct btrfs_fs_info *fs_info,
>>  	mutex_lock(&fs_info->balance_mutex);
>>  	if (ret == -ECANCELED && atomic_read(&fs_info->balance_pause_req))
>>  		btrfs_info(fs_info, "balance: paused");
>> -	else if (ret == -ECANCELED && atomic_read(&fs_info->balance_cancel_req))
>> +	else if (ret == -ECANCELED  || ret == -EINTR)
> 
> Why do you remove atomic_read(&fs_info->balance_cancel_req) ?

Because now btrfs_should_cancel_balance() can return ECANCELED without
balance_cancel_req increased due to pending fatal signal.

> 
>>  		btrfs_info(fs_info, "balance: canceled");
> 
> I'm not sure if it would be useful to print the reason, like
> 
> - 'canceled: user request'
> - 'canceled: interrupted'

To me, if user interrupt the balance progress, it's obvious they want to
cancel it.
Thus no need to distinguish btrfs balance cancel and signal cancel.

Thanks,
Qu

> 
>>  	else
>>  		btrfs_info(fs_info, "balance: ended with status: %d", ret);
>> -- 
>> 2.27.0


  reply	other threads:[~2020-07-09 10:15 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-09  8:33 [PATCH v2 1/2] btrfs: avoid possible signal interruption for btrfs_drop_snapshot() on relocation tree Qu Wenruo
2020-07-09  8:33 ` [PATCH v2 2/2] btrfs: relocation: review the call sites which can be interruped by signal Qu Wenruo
2020-07-09  9:54   ` David Sterba
2020-07-09 10:15     ` Qu Wenruo [this message]
2020-07-09 10:25       ` David Sterba
2020-07-09 10:46     ` Qu Wenruo
2020-07-09 11:05 ` [PATCH v2 1/2] btrfs: avoid possible signal interruption for btrfs_drop_snapshot() on relocation tree David Sterba
2020-07-09 12:11   ` Qu Wenruo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=76a28b23-f1f6-1dbe-05f1-f642747564f9@suse.com \
    --to=wqu@suse.com \
    --cc=dsterba@suse.cz \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.