linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] btrfs-progs: Fix delayed ref leakage
@ 2019-07-08  7:33 Qu Wenruo
  2019-07-08  7:33 ` [PATCH v2 1/2] btrfs-progs: Exhaust delayed refs and dirty block groups to prevent delayed refs lost Qu Wenruo
  2019-07-08  7:33 ` [PATCH v2 2/2] btrfs-progs: Avoid unnecessary block group item COW if the content hasn't changed Qu Wenruo
  0 siblings, 2 replies; 10+ messages in thread
From: Qu Wenruo @ 2019-07-08  7:33 UTC (permalink / raw)
  To: linux-btrfs

We have several bug reports of btrfs-convert failure in
btrfs_run_delayed_refs().

The bug turns out to be a incorrect backport of delayed ref handling in
transaction commit.

In kernel we have a loop to exhause both delayed refs and dirty blocks,
as btrfs_write_dirty_block_groups() can generate new delayed refs, while
btrfs_run_delayed_refs() can re-dirty block groups.

This feedback could cause missing delayed refs, and bite later transaction
just like the btrfs-convert report.

Furthermore, such lost delayed refs causes memory leakage and reserved
space leakage, the later one can be detected by commit c31edf610cbe
("btrfs-progs: Fix false ENOSPC alert by tracking used space
correctly").

The first patch will fix the long existing bug.

The second patch will try to reduce the feedback between
btrfs_write_dirty_block_groups() and btrfs_run_delayed_refs().

Changelog:
v2:
- Use a loop to exhaust both delayed refs and dirty block groups
  Thanks Nikolay for pointing out the feedback between them.

- Add a new patch to reduce above feedback 

- Handle the error from btrfs_write_dirty_block_groups()

Qu Wenruo (2):
  btrfs-progs: Exhaust delayed refs and dirty block groups to prevent
    delayed refs lost
  btrfs-progs: Avoid unnecessary block group item COW if the content
    hasn't changed

 extent-tree.c | 26 +++++++++++++++++++++++++-
 transaction.c | 27 +++++++++++++++++++++------
 2 files changed, 46 insertions(+), 7 deletions(-)

-- 
2.22.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v2 1/2] btrfs-progs: Exhaust delayed refs and dirty block groups to prevent delayed refs lost
  2019-07-08  7:33 [PATCH v2 0/2] btrfs-progs: Fix delayed ref leakage Qu Wenruo
@ 2019-07-08  7:33 ` Qu Wenruo
  2019-07-08 10:44   ` Nikolay Borisov
  2019-07-22 12:59   ` David Sterba
  2019-07-08  7:33 ` [PATCH v2 2/2] btrfs-progs: Avoid unnecessary block group item COW if the content hasn't changed Qu Wenruo
  1 sibling, 2 replies; 10+ messages in thread
From: Qu Wenruo @ 2019-07-08  7:33 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Klemens Schölhorn

[BUG]
Btrfs-progs sometimes fails to find certain extent backref when
committing transaction.
The most reliable way to reproduce it is fsck-test/013 on 64K page sized
system:
  [...]
  adding new data backref on 315859712 root 287 owner 292 offset 0 found 1
  btrfs unable to find ref byte nr 31850496 parent 0 root 2  owner 0 offset 0
  Failed to find [30867456, 168, 65536]

Also there are some github bug reports related to this problem.

[CAUSE]
Commit 909357e86799 ("btrfs-progs: Wire up delayed refs") introduced
delayed refs in btrfs-progs.

However in that commit, delayed refs are not run at correct timing.
That commit calls btrfs_run_delayed_refs() before
btrfs_write_dirty_block_groups(), which needs to update
BLOCK_GROUP_ITEMs in extent tree, thus could cause new delayed refs.

This means each time we commit a transaction, we may screw up the extent
tree by dropping some pending delayed refs, like:

Transaction 711:
btrfs_commit_transaction()
|- btrfs_run_delayed_refs()
|  Now all delayed refs are written to extent tree
|
|- btrfs_write_dirty_block_groups()
|  Needs to update extent tree root
|  ADD_DELAYED_REF to 315859712.
|  Delayed refs are attached to current trans handle.
|
|- __commit_transaction()
|- write_ctree_super()
|- btrfs_finish_extent_commit()
|- kfree(trans)
   Now delayed ref for 315859712 are lost

Transaction 712:
Tree block 315859712 get dropped
btrfs_commit_transaction()
|- btrfs_run_delayed_refs()
   |- run_one_delayed_ref()
      |- __free_extent()
         As previous ADD_DELAYED_REF to 315859712 is lost, extent tree
         doesn't has any backref for 315859712, causing the bug

In fact, commit c31edf610cbe ("btrfs-progs: Fix false ENOSPC alert by
tracking used space correctly") detects the tree block leakage, but in
the reproducer we have too many noise, thus nobody notices the leakage
warning.

[FIX]
We can't just move btrfs_run_delayed_refs() after
btrfs_write_dirty_block_groups(), as during btrfs_run_delayed_refs(), we
can re-dirty block groups.
Thus we need to exhaust both delayed refs and dirty blocks.

This patch will call btrfs_write_dirty_block_groups() and
btrfs_run_delayed_refs() in a loop until both delayed refs and dirty
blocks are exhausted. Much like what we do in commit_cowonly_roots() in
kernel.

Also, to prevent such problem from happening again (and not to debug
such problem again), add extra check on delayed refs before freeing the
trans handle.

Reported-by: Klemens Schölhorn <klemens@schoelhorn.eu>
Issue: 187
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 transaction.c | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/transaction.c b/transaction.c
index 551fb24e674d..3b0a428db76e 100644
--- a/transaction.c
+++ b/transaction.c
@@ -193,17 +193,32 @@ commit_tree:
 	ret = commit_tree_roots(trans, fs_info);
 	if (ret < 0)
 		goto error;
+
 	/*
-	 * Ensure that all committed roots are properly accounted in the
-	 * extent tree
+	 * btrfs_write_dirty_block_groups() can cause CoW thus new delayed
+	 * tree refs, while run such delayed tree refs can dirty block groups
+	 * again, we need to exhause both dirty blocks and delayed refs
 	 */
-	ret = btrfs_run_delayed_refs(trans, -1);
-	if (ret < 0)
-		goto error;
-	btrfs_write_dirty_block_groups(trans);
+	while (!RB_EMPTY_ROOT(&trans->delayed_refs.href_root) ||
+		test_range_bit(&fs_info->block_group_cache, 0, (u64)-1,
+			       BLOCK_GROUP_DIRTY, 0))
+	{
+		ret = btrfs_write_dirty_block_groups(trans);
+		if (ret < 0)
+			goto error;
+		ret = btrfs_run_delayed_refs(trans, -1);
+		if (ret < 0)
+			goto error;
+	}
 	__commit_transaction(trans, root);
 	if (ret < 0)
 		goto error;
+
+	/* There should be no pending delayed refs now */
+	if (!RB_EMPTY_ROOT(&trans->delayed_refs.href_root)) {
+		error("Uncommitted delayed refs detected");
+		goto error;
+	}
 	ret = write_ctree_super(trans);
 	btrfs_finish_extent_commit(trans);
 	kfree(trans);
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2 2/2] btrfs-progs: Avoid unnecessary block group item COW if the content hasn't changed
  2019-07-08  7:33 [PATCH v2 0/2] btrfs-progs: Fix delayed ref leakage Qu Wenruo
  2019-07-08  7:33 ` [PATCH v2 1/2] btrfs-progs: Exhaust delayed refs and dirty block groups to prevent delayed refs lost Qu Wenruo
@ 2019-07-08  7:33 ` Qu Wenruo
  2019-07-08 10:43   ` Nikolay Borisov
  1 sibling, 1 reply; 10+ messages in thread
From: Qu Wenruo @ 2019-07-08  7:33 UTC (permalink / raw)
  To: linux-btrfs

In write_one_cache_group() we always do COW to update BLOCK_GROUP_ITEM.
However under a lot of cases, the cache->item is not changed at all.

E.g:
Transaction 123, block group [1M, 1M + 16M)

tree block 1M + 0 get freed
tree block 1M + 16K get allocated.

Transaction 123 get committed.

In this case, used space of block group [1M, 1M + 16M) doesn't changed
at all, thus we don't need to do COW to update block group item.

This patch will make write_one_cache_group() to do a read-only search
first, then do COW if we really need to update block group item.

This should reduce the btrfs_write_dirty_block_groups() and
btrfs_run_delayed_refs() loop introduced in previous commit.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 extent-tree.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/extent-tree.c b/extent-tree.c
index 932af2c644bd..24d3a1ab3f25 100644
--- a/extent-tree.c
+++ b/extent-tree.c
@@ -1533,10 +1533,34 @@ static int write_one_cache_group(struct btrfs_trans_handle *trans,
 	unsigned long bi;
 	struct extent_buffer *leaf;
 
+	/* Do a read only check to see if we need to update BLOCK_GROUP_ITEM */
+	ret = btrfs_search_slot(NULL, extent_root, &cache->key, path, 0, 0);
+	if (ret < 0)
+		goto fail;
+	if (ret > 0) {
+		ret = -ENOENT;
+		error("failed to find block group %llu in extent tree",
+			cache->key.objectid);
+		goto fail;
+	}
+	leaf = path->nodes[0];
+	bi = btrfs_item_ptr_offset(leaf, path->slots[0]);
+	ret = memcmp_extent_buffer(leaf, &cache->item, bi, sizeof(cache->item));
+	btrfs_release_path(path);
+	/* No need to update */
+	if (ret == 0)
+		return ret;
+
+	/* Do the COW to update BLOCK_GROUP_ITEM */
 	ret = btrfs_search_slot(trans, extent_root, &cache->key, path, 0, 1);
 	if (ret < 0)
 		goto fail;
-	BUG_ON(ret);
+	if (ret > 0) {
+		ret = -ENOENT;
+		error("failed to find block group %llu in extent tree",
+			cache->key.objectid);
+		goto fail;
+	}
 
 	leaf = path->nodes[0];
 	bi = btrfs_item_ptr_offset(leaf, path->slots[0]);
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] btrfs-progs: Avoid unnecessary block group item COW if the content hasn't changed
  2019-07-08  7:33 ` [PATCH v2 2/2] btrfs-progs: Avoid unnecessary block group item COW if the content hasn't changed Qu Wenruo
@ 2019-07-08 10:43   ` Nikolay Borisov
  2019-07-08 12:50     ` Qu Wenruo
  0 siblings, 1 reply; 10+ messages in thread
From: Nikolay Borisov @ 2019-07-08 10:43 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



On 8.07.19 г. 10:33 ч., Qu Wenruo wrote:
> In write_one_cache_group() we always do COW to update BLOCK_GROUP_ITEM.
> However under a lot of cases, the cache->item is not changed at all.
> 
> E.g:
> Transaction 123, block group [1M, 1M + 16M)
> 
> tree block 1M + 0 get freed
> tree block 1M + 16K get allocated.
> 
> Transaction 123 get committed.
> 
> In this case, used space of block group [1M, 1M + 16M) doesn't changed
> at all, thus we don't need to do COW to update block group item.
> 
> This patch will make write_one_cache_group() to do a read-only search
> first, then do COW if we really need to update block group item.
> 
> This should reduce the btrfs_write_dirty_block_groups() and
> btrfs_run_delayed_refs() loop introduced in previous commit.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>

I'm not sure how effective this is going to be and isn't this premature
optimization, have you done any measurements?


> ---
>  extent-tree.c | 26 +++++++++++++++++++++++++-
>  1 file changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/extent-tree.c b/extent-tree.c
> index 932af2c644bd..24d3a1ab3f25 100644
> --- a/extent-tree.c
> +++ b/extent-tree.c
> @@ -1533,10 +1533,34 @@ static int write_one_cache_group(struct btrfs_trans_handle *trans,
>  	unsigned long bi;
>  	struct extent_buffer *leaf;
>  
> +	/* Do a read only check to see if we need to update BLOCK_GROUP_ITEM */
> +	ret = btrfs_search_slot(NULL, extent_root, &cache->key, path, 0, 0);
> +	if (ret < 0)
> +		goto fail;
> +	if (ret > 0) {
> +		ret = -ENOENT;
> +		error("failed to find block group %llu in extent tree",
> +			cache->key.objectid);
> +		goto fail;
> +	}
> +	leaf = path->nodes[0];
> +	bi = btrfs_item_ptr_offset(leaf, path->slots[0]);
> +	ret = memcmp_extent_buffer(leaf, &cache->item, bi, sizeof(cache->item));
> +	btrfs_release_path(path);
> +	/* No need to update */
> +	if (ret == 0)
> +		return ret;
> +
> +	/* Do the COW to update BLOCK_GROUP_ITEM */
>  	ret = btrfs_search_slot(trans, extent_root, &cache->key, path, 0, 1);
>  	if (ret < 0)
>  		goto fail;
> -	BUG_ON(ret);
> +	if (ret > 0) {
> +		ret = -ENOENT;
> +		error("failed to find block group %llu in extent tree",
> +			cache->key.objectid);
> +		goto fail;
> +	}
>  
>  	leaf = path->nodes[0];
>  	bi = btrfs_item_ptr_offset(leaf, path->slots[0]);
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 1/2] btrfs-progs: Exhaust delayed refs and dirty block groups to prevent delayed refs lost
  2019-07-08  7:33 ` [PATCH v2 1/2] btrfs-progs: Exhaust delayed refs and dirty block groups to prevent delayed refs lost Qu Wenruo
@ 2019-07-08 10:44   ` Nikolay Borisov
  2019-07-22 12:59   ` David Sterba
  1 sibling, 0 replies; 10+ messages in thread
From: Nikolay Borisov @ 2019-07-08 10:44 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs; +Cc: Klemens Schölhorn



On 8.07.19 г. 10:33 ч., Qu Wenruo wrote:
> [BUG]
> Btrfs-progs sometimes fails to find certain extent backref when
> committing transaction.
> The most reliable way to reproduce it is fsck-test/013 on 64K page sized
> system:
>   [...]
>   adding new data backref on 315859712 root 287 owner 292 offset 0 found 1
>   btrfs unable to find ref byte nr 31850496 parent 0 root 2  owner 0 offset 0
>   Failed to find [30867456, 168, 65536]
> 
> Also there are some github bug reports related to this problem.
> 
> [CAUSE]
> Commit 909357e86799 ("btrfs-progs: Wire up delayed refs") introduced
> delayed refs in btrfs-progs.
> 
> However in that commit, delayed refs are not run at correct timing.
> That commit calls btrfs_run_delayed_refs() before
> btrfs_write_dirty_block_groups(), which needs to update
> BLOCK_GROUP_ITEMs in extent tree, thus could cause new delayed refs.
> 
> This means each time we commit a transaction, we may screw up the extent
> tree by dropping some pending delayed refs, like:
> 
> Transaction 711:
> btrfs_commit_transaction()
> |- btrfs_run_delayed_refs()
> |  Now all delayed refs are written to extent tree
> |
> |- btrfs_write_dirty_block_groups()
> |  Needs to update extent tree root
> |  ADD_DELAYED_REF to 315859712.
> |  Delayed refs are attached to current trans handle.
> |
> |- __commit_transaction()
> |- write_ctree_super()
> |- btrfs_finish_extent_commit()
> |- kfree(trans)
>    Now delayed ref for 315859712 are lost
> 
> Transaction 712:
> Tree block 315859712 get dropped
> btrfs_commit_transaction()
> |- btrfs_run_delayed_refs()
>    |- run_one_delayed_ref()
>       |- __free_extent()
>          As previous ADD_DELAYED_REF to 315859712 is lost, extent tree
>          doesn't has any backref for 315859712, causing the bug
> 
> In fact, commit c31edf610cbe ("btrfs-progs: Fix false ENOSPC alert by
> tracking used space correctly") detects the tree block leakage, but in
> the reproducer we have too many noise, thus nobody notices the leakage
> warning.
> 
> [FIX]
> We can't just move btrfs_run_delayed_refs() after
> btrfs_write_dirty_block_groups(), as during btrfs_run_delayed_refs(), we
> can re-dirty block groups.
> Thus we need to exhaust both delayed refs and dirty blocks.
> 
> This patch will call btrfs_write_dirty_block_groups() and
> btrfs_run_delayed_refs() in a loop until both delayed refs and dirty
> blocks are exhausted. Much like what we do in commit_cowonly_roots() in
> kernel.
> 
> Also, to prevent such problem from happening again (and not to debug
> such problem again), add extra check on delayed refs before freeing the
> trans handle.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

> 
> Reported-by: Klemens Schölhorn <klemens@schoelhorn.eu>
> Issue: 187
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  transaction.c | 27 +++++++++++++++++++++------
>  1 file changed, 21 insertions(+), 6 deletions(-)
> 
> diff --git a/transaction.c b/transaction.c
> index 551fb24e674d..3b0a428db76e 100644
> --- a/transaction.c
> +++ b/transaction.c
> @@ -193,17 +193,32 @@ commit_tree:
>  	ret = commit_tree_roots(trans, fs_info);
>  	if (ret < 0)
>  		goto error;
> +
>  	/*
> -	 * Ensure that all committed roots are properly accounted in the
> -	 * extent tree
> +	 * btrfs_write_dirty_block_groups() can cause CoW thus new delayed
> +	 * tree refs, while run such delayed tree refs can dirty block groups
> +	 * again, we need to exhause both dirty blocks and delayed refs
>  	 */
> -	ret = btrfs_run_delayed_refs(trans, -1);
> -	if (ret < 0)
> -		goto error;
> -	btrfs_write_dirty_block_groups(trans);
> +	while (!RB_EMPTY_ROOT(&trans->delayed_refs.href_root) ||
> +		test_range_bit(&fs_info->block_group_cache, 0, (u64)-1,
> +			       BLOCK_GROUP_DIRTY, 0))
> +	{
> +		ret = btrfs_write_dirty_block_groups(trans);
> +		if (ret < 0)
> +			goto error;
> +		ret = btrfs_run_delayed_refs(trans, -1);
> +		if (ret < 0)
> +			goto error;
> +	}
>  	__commit_transaction(trans, root);
>  	if (ret < 0)
>  		goto error;
> +
> +	/* There should be no pending delayed refs now */
> +	if (!RB_EMPTY_ROOT(&trans->delayed_refs.href_root)) {
> +		error("Uncommitted delayed refs detected");
> +		goto error;
> +	}
>  	ret = write_ctree_super(trans);
>  	btrfs_finish_extent_commit(trans);
>  	kfree(trans);
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] btrfs-progs: Avoid unnecessary block group item COW if the content hasn't changed
  2019-07-08 10:43   ` Nikolay Borisov
@ 2019-07-08 12:50     ` Qu Wenruo
  2019-07-08 13:07       ` Nikolay Borisov
  0 siblings, 1 reply; 10+ messages in thread
From: Qu Wenruo @ 2019-07-08 12:50 UTC (permalink / raw)
  To: Nikolay Borisov, Qu Wenruo, linux-btrfs



On 2019/7/8 下午6:43, Nikolay Borisov wrote:
>
>
> On 8.07.19 г. 10:33 ч., Qu Wenruo wrote:
>> In write_one_cache_group() we always do COW to update BLOCK_GROUP_ITEM.
>> However under a lot of cases, the cache->item is not changed at all.
>>
>> E.g:
>> Transaction 123, block group [1M, 1M + 16M)
>>
>> tree block 1M + 0 get freed
>> tree block 1M + 16K get allocated.
>>
>> Transaction 123 get committed.
>>
>> In this case, used space of block group [1M, 1M + 16M) doesn't changed
>> at all, thus we don't need to do COW to update block group item.
>>
>> This patch will make write_one_cache_group() to do a read-only search
>> first, then do COW if we really need to update block group item.
>>
>> This should reduce the btrfs_write_dirty_block_groups() and
>> btrfs_run_delayed_refs() loop introduced in previous commit.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>
> I'm not sure how effective this is going to be

The effectiveness is indeed low.

For btrfs/013 test case, 64K page size, it reduces total number of
delayed refs by less than 2% (5/300+)

And similar result for total number of dirty block groups.

> and isn't this premature
> optimization, have you done any measurements?

For the optimization part, I'd say it should be pretty safe.
It just really skips unnecessary CoW.

The only downside to me is the extra tree search, thus killing the
"optimization" part.

Thanks,
Qu
>
>
>> ---
>>  extent-tree.c | 26 +++++++++++++++++++++++++-
>>  1 file changed, 25 insertions(+), 1 deletion(-)
>>
>> diff --git a/extent-tree.c b/extent-tree.c
>> index 932af2c644bd..24d3a1ab3f25 100644
>> --- a/extent-tree.c
>> +++ b/extent-tree.c
>> @@ -1533,10 +1533,34 @@ static int write_one_cache_group(struct btrfs_trans_handle *trans,
>>  	unsigned long bi;
>>  	struct extent_buffer *leaf;
>>
>> +	/* Do a read only check to see if we need to update BLOCK_GROUP_ITEM */
>> +	ret = btrfs_search_slot(NULL, extent_root, &cache->key, path, 0, 0);
>> +	if (ret < 0)
>> +		goto fail;
>> +	if (ret > 0) {
>> +		ret = -ENOENT;
>> +		error("failed to find block group %llu in extent tree",
>> +			cache->key.objectid);
>> +		goto fail;
>> +	}
>> +	leaf = path->nodes[0];
>> +	bi = btrfs_item_ptr_offset(leaf, path->slots[0]);
>> +	ret = memcmp_extent_buffer(leaf, &cache->item, bi, sizeof(cache->item));
>> +	btrfs_release_path(path);
>> +	/* No need to update */
>> +	if (ret == 0)
>> +		return ret;
>> +
>> +	/* Do the COW to update BLOCK_GROUP_ITEM */
>>  	ret = btrfs_search_slot(trans, extent_root, &cache->key, path, 0, 1);
>>  	if (ret < 0)
>>  		goto fail;
>> -	BUG_ON(ret);
>> +	if (ret > 0) {
>> +		ret = -ENOENT;
>> +		error("failed to find block group %llu in extent tree",
>> +			cache->key.objectid);
>> +		goto fail;
>> +	}
>>
>>  	leaf = path->nodes[0];
>>  	bi = btrfs_item_ptr_offset(leaf, path->slots[0]);
>>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] btrfs-progs: Avoid unnecessary block group item COW if the content hasn't changed
  2019-07-08 12:50     ` Qu Wenruo
@ 2019-07-08 13:07       ` Nikolay Borisov
  2019-07-08 13:30         ` Qu Wenruo
  0 siblings, 1 reply; 10+ messages in thread
From: Nikolay Borisov @ 2019-07-08 13:07 UTC (permalink / raw)
  To: Qu Wenruo, WenRuo Qu, linux-btrfs



On 8.07.19 г. 15:50 ч., Qu Wenruo wrote:
> 
> 
> On 2019/7/8 下午6:43, Nikolay Borisov wrote:
>>
>>
>> On 8.07.19 г. 10:33 ч., Qu Wenruo wrote:
>>> In write_one_cache_group() we always do COW to update BLOCK_GROUP_ITEM.
>>> However under a lot of cases, the cache->item is not changed at all.
>>>
>>> E.g:
>>> Transaction 123, block group [1M, 1M + 16M)
>>>
>>> tree block 1M + 0 get freed
>>> tree block 1M + 16K get allocated.
>>>
>>> Transaction 123 get committed.
>>>
>>> In this case, used space of block group [1M, 1M + 16M) doesn't changed
>>> at all, thus we don't need to do COW to update block group item.
>>>
>>> This patch will make write_one_cache_group() to do a read-only search
>>> first, then do COW if we really need to update block group item.
>>>
>>> This should reduce the btrfs_write_dirty_block_groups() and
>>> btrfs_run_delayed_refs() loop introduced in previous commit.
>>>
>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>
>> I'm not sure how effective this is going to be
> 
> The effectiveness is indeed low.
> 
> For btrfs/013 test case, 64K page size, it reduces total number of
> delayed refs by less than 2% (5/300+)
> 
> And similar result for total number of dirty block groups.
> 
>> and isn't this premature
>> optimization, have you done any measurements?
> 
> For the optimization part, I'd say it should be pretty safe.
> It just really skips unnecessary CoW.
> 
> The only downside to me is the extra tree search, thus killing the
> "optimization" part.
> 

If that's the case then I'd rather see the 2nd patch dropped. It adds
more code for no gain.

<snip>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] btrfs-progs: Avoid unnecessary block group item COW if the content hasn't changed
  2019-07-08 13:07       ` Nikolay Borisov
@ 2019-07-08 13:30         ` Qu Wenruo
  2019-07-22 12:59           ` David Sterba
  0 siblings, 1 reply; 10+ messages in thread
From: Qu Wenruo @ 2019-07-08 13:30 UTC (permalink / raw)
  To: Nikolay Borisov, WenRuo Qu, linux-btrfs



On 2019/7/8 下午9:07, Nikolay Borisov wrote:
>
>
> On 8.07.19 г. 15:50 ч., Qu Wenruo wrote:
>>
>>
>> On 2019/7/8 下午6:43, Nikolay Borisov wrote:
>>>
>>>
>>> On 8.07.19 г. 10:33 ч., Qu Wenruo wrote:
>>>> In write_one_cache_group() we always do COW to update BLOCK_GROUP_ITEM.
>>>> However under a lot of cases, the cache->item is not changed at all.
>>>>
>>>> E.g:
>>>> Transaction 123, block group [1M, 1M + 16M)
>>>>
>>>> tree block 1M + 0 get freed
>>>> tree block 1M + 16K get allocated.
>>>>
>>>> Transaction 123 get committed.
>>>>
>>>> In this case, used space of block group [1M, 1M + 16M) doesn't changed
>>>> at all, thus we don't need to do COW to update block group item.
>>>>
>>>> This patch will make write_one_cache_group() to do a read-only search
>>>> first, then do COW if we really need to update block group item.
>>>>
>>>> This should reduce the btrfs_write_dirty_block_groups() and
>>>> btrfs_run_delayed_refs() loop introduced in previous commit.
>>>>
>>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>>
>>> I'm not sure how effective this is going to be
>>
>> The effectiveness is indeed low.
>>
>> For btrfs/013 test case, 64K page size, it reduces total number of
>> delayed refs by less than 2% (5/300+)
>>
>> And similar result for total number of dirty block groups.
>>
>>> and isn't this premature
>>> optimization, have you done any measurements?
>>
>> For the optimization part, I'd say it should be pretty safe.
>> It just really skips unnecessary CoW.
>>
>> The only downside to me is the extra tree search, thus killing the
>> "optimization" part.
>>
>
> If that's the case then I'd rather see the 2nd patch dropped. It adds
> more code for no gain.

Makes sense. I'm OK to drop it.

Thanks,
Qu
>
> <snip>
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 1/2] btrfs-progs: Exhaust delayed refs and dirty block groups to prevent delayed refs lost
  2019-07-08  7:33 ` [PATCH v2 1/2] btrfs-progs: Exhaust delayed refs and dirty block groups to prevent delayed refs lost Qu Wenruo
  2019-07-08 10:44   ` Nikolay Borisov
@ 2019-07-22 12:59   ` David Sterba
  1 sibling, 0 replies; 10+ messages in thread
From: David Sterba @ 2019-07-22 12:59 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, Klemens Schölhorn

On Mon, Jul 08, 2019 at 03:33:51PM +0800, Qu Wenruo wrote:
[...]
> Reported-by: Klemens Schölhorn <klemens@schoelhorn.eu>
> Issue: 187
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Applied, thanks.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] btrfs-progs: Avoid unnecessary block group item COW if the content hasn't changed
  2019-07-08 13:30         ` Qu Wenruo
@ 2019-07-22 12:59           ` David Sterba
  0 siblings, 0 replies; 10+ messages in thread
From: David Sterba @ 2019-07-22 12:59 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Nikolay Borisov, WenRuo Qu, linux-btrfs

On Mon, Jul 08, 2019 at 09:30:04PM +0800, Qu Wenruo wrote:
> On 2019/7/8 下午9:07, Nikolay Borisov wrote:
> > On 8.07.19 г. 15:50 ч., Qu Wenruo wrote:
> >> On 2019/7/8 下午6:43, Nikolay Borisov wrote:
> >>> On 8.07.19 г. 10:33 ч., Qu Wenruo wrote:
> > If that's the case then I'd rather see the 2nd patch dropped. It adds
> > more code for no gain.
> 
> Makes sense. I'm OK to drop it.

Ok, dropped.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-07-22 12:59 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-08  7:33 [PATCH v2 0/2] btrfs-progs: Fix delayed ref leakage Qu Wenruo
2019-07-08  7:33 ` [PATCH v2 1/2] btrfs-progs: Exhaust delayed refs and dirty block groups to prevent delayed refs lost Qu Wenruo
2019-07-08 10:44   ` Nikolay Borisov
2019-07-22 12:59   ` David Sterba
2019-07-08  7:33 ` [PATCH v2 2/2] btrfs-progs: Avoid unnecessary block group item COW if the content hasn't changed Qu Wenruo
2019-07-08 10:43   ` Nikolay Borisov
2019-07-08 12:50     ` Qu Wenruo
2019-07-08 13:07       ` Nikolay Borisov
2019-07-08 13:30         ` Qu Wenruo
2019-07-22 12:59           ` David Sterba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).