All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] btrfs: update refs for any root except tree log roots
@ 2021-10-01 17:57 Josef Bacik
  2021-10-02  1:11 ` Qu Wenruo
  2021-10-06 12:27 ` David Sterba
  0 siblings, 2 replies; 4+ messages in thread
From: Josef Bacik @ 2021-10-01 17:57 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: stable

I hit a stuck relocation on btrfs/061 during my overnight testing.  This
turned out to be because we had left over extent entries in our extent
root for a data reloc inode that no longer existed.  This happened
because in btrfs_drop_extents() we only update refs if we have SHAREABLE
set or we are the tree_root.  This regression was introduced by
aeb935a45581 ("btrfs: don't set SHAREABLE flag for data reloc tree")
where we stopped setting SHAREABLE for the data reloc tree.

The problem here is we actually do want to update extent references for
data extents in the data reloc tree, in fact we only don't want to
update extent references if the file extents are in the log tree.
Update this check to only skip updating references in the case of the
log tree.

This is relatively rare, because you have to be running scrub at the
same time, which is what btrfs/061 does.  The data reloc inode has its
extents pre-allcated, and then we copy the extent into the pre-allocated
chunks.  We theoretically should never be calling btrfs_drop_extents()
on a data reloc inode.  The exception of course is with scrub, if our
pre-allocated extent falls inside of the block group we are scrubbing,
then the block group will be marked read only and we will be forced to
cow that extent.  This means we will call btrfs_drop_extents() on that
range when we cow that file extent.

This isn't really problematic if we do this, the data reloc inode
requires that our extent lengths match exactly with the extent we are
copying, thankfully we validate the extent is correct with
get_new_location(), so if we happen to cow only part of the extent we
won't link it in when we do the relocation, so we are safe from any
other shenanigans that arise because of this interaction with scrub.

cc: stable@vger.kernel.org
Fixes: aeb935a45581 ("btrfs: don't set SHAREABLE flag for data reloc tree")
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/file.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 04e29b40a38e..b7d3559efcf7 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -734,8 +734,7 @@ int btrfs_drop_extents(struct btrfs_trans_handle *trans,
 	if (args->start >= inode->disk_i_size && !args->replace_extent)
 		modify_tree = 0;
 
-	update_refs = (test_bit(BTRFS_ROOT_SHAREABLE, &root->state) ||
-		       root == fs_info->tree_root);
+	update_refs = root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID;
 	while (1) {
 		recow = 0;
 		ret = btrfs_lookup_file_extent(trans, root, path, ino,
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] btrfs: update refs for any root except tree log roots
  2021-10-01 17:57 [PATCH] btrfs: update refs for any root except tree log roots Josef Bacik
@ 2021-10-02  1:11 ` Qu Wenruo
  2021-10-06 12:27   ` David Sterba
  2021-10-06 12:27 ` David Sterba
  1 sibling, 1 reply; 4+ messages in thread
From: Qu Wenruo @ 2021-10-02  1:11 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team; +Cc: stable



On 2021/10/2 01:57, Josef Bacik wrote:
> I hit a stuck relocation on btrfs/061 during my overnight testing.  This
> turned out to be because we had left over extent entries in our extent
> root for a data reloc inode that no longer existed.  This happened
> because in btrfs_drop_extents() we only update refs if we have SHAREABLE
> set or we are the tree_root.  This regression was introduced by
> aeb935a45581 ("btrfs: don't set SHAREABLE flag for data reloc tree")
> where we stopped setting SHAREABLE for the data reloc tree.
>
> The problem here is we actually do want to update extent references for
> data extents in the data reloc tree, in fact we only don't want to
> update extent references if the file extents are in the log tree.
> Update this check to only skip updating references in the case of the
> log tree.
>
> This is relatively rare, because you have to be running scrub at the
> same time, which is what btrfs/061 does.  The data reloc inode has its
> extents pre-allcated, and then we copy the extent into the pre-allocated
> chunks.  We theoretically should never be calling btrfs_drop_extents()
> on a data reloc inode.  The exception of course is with scrub, if our
> pre-allocated extent falls inside of the block group we are scrubbing,
> then the block group will be marked read only and we will be forced to
> cow that extent.  This means we will call btrfs_drop_extents() on that
> range when we cow that file extent.

Oh my god, I forgot the corner case here!

>
> This isn't really problematic if we do this, the data reloc inode
> requires that our extent lengths match exactly with the extent we are
> copying, thankfully we validate the extent is correct with
> get_new_location(), so if we happen to cow only part of the extent we
> won't link it in when we do the relocation, so we are safe from any
> other shenanigans that arise because of this interaction with scrub.

But this makes me wonder, can we just leave scrub and balance exclusive?
There are already quite some limitations, like balance and send.

Adding balance and scrub to be exclusive to each other shouldn't cause
too much hassle, and can remove these checks.

>
> cc: stable@vger.kernel.org
> Fixes: aeb935a45581 ("btrfs: don't set SHAREABLE flag for data reloc tree")
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks for the cause analyse and fix!
Qu

> ---
>   fs/btrfs/file.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 04e29b40a38e..b7d3559efcf7 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -734,8 +734,7 @@ int btrfs_drop_extents(struct btrfs_trans_handle *trans,
>   	if (args->start >= inode->disk_i_size && !args->replace_extent)
>   		modify_tree = 0;
>
> -	update_refs = (test_bit(BTRFS_ROOT_SHAREABLE, &root->state) ||
> -		       root == fs_info->tree_root);
> +	update_refs = root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID;
>   	while (1) {
>   		recow = 0;
>   		ret = btrfs_lookup_file_extent(trans, root, path, ino,
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] btrfs: update refs for any root except tree log roots
  2021-10-02  1:11 ` Qu Wenruo
@ 2021-10-06 12:27   ` David Sterba
  0 siblings, 0 replies; 4+ messages in thread
From: David Sterba @ 2021-10-06 12:27 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Josef Bacik, linux-btrfs, kernel-team, stable

On Sat, Oct 02, 2021 at 09:11:19AM +0800, Qu Wenruo wrote:
> But this makes me wonder, can we just leave scrub and balance exclusive?
> There are already quite some limitations, like balance and send.
> 
> Adding balance and scrub to be exclusive to each other shouldn't cause
> too much hassle, and can remove these checks.

This was suggested in the past in the btrfsmaintenance project, but
rather for performance reasons. Some people run scrub and balance and
let it finish when the time comes, so no perf concerns. As scrub is
read-only the strict exclusion is not necessary as with send where it
can lead to bugs.

I'd like to keep it working as it is now, unless we find a stronger
reason to make it exclusive. As eg. zoned now uses relocation on the
background we'd have either scrub failing or bg reclaim not working.
It's a trade off, there will be always some problematic case. Not
allowing to do eg. a quick balance filter while scrub is running could
be pretty annoying.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] btrfs: update refs for any root except tree log roots
  2021-10-01 17:57 [PATCH] btrfs: update refs for any root except tree log roots Josef Bacik
  2021-10-02  1:11 ` Qu Wenruo
@ 2021-10-06 12:27 ` David Sterba
  1 sibling, 0 replies; 4+ messages in thread
From: David Sterba @ 2021-10-06 12:27 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team, stable

On Fri, Oct 01, 2021 at 01:57:18PM -0400, Josef Bacik wrote:
> I hit a stuck relocation on btrfs/061 during my overnight testing.  This
> turned out to be because we had left over extent entries in our extent
> root for a data reloc inode that no longer existed.  This happened
> because in btrfs_drop_extents() we only update refs if we have SHAREABLE
> set or we are the tree_root.  This regression was introduced by
> aeb935a45581 ("btrfs: don't set SHAREABLE flag for data reloc tree")
> where we stopped setting SHAREABLE for the data reloc tree.
> 
> The problem here is we actually do want to update extent references for
> data extents in the data reloc tree, in fact we only don't want to
> update extent references if the file extents are in the log tree.
> Update this check to only skip updating references in the case of the
> log tree.
> 
> This is relatively rare, because you have to be running scrub at the
> same time, which is what btrfs/061 does.  The data reloc inode has its
> extents pre-allcated, and then we copy the extent into the pre-allocated
> chunks.  We theoretically should never be calling btrfs_drop_extents()
> on a data reloc inode.  The exception of course is with scrub, if our
> pre-allocated extent falls inside of the block group we are scrubbing,
> then the block group will be marked read only and we will be forced to
> cow that extent.  This means we will call btrfs_drop_extents() on that
> range when we cow that file extent.
> 
> This isn't really problematic if we do this, the data reloc inode
> requires that our extent lengths match exactly with the extent we are
> copying, thankfully we validate the extent is correct with
> get_new_location(), so if we happen to cow only part of the extent we
> won't link it in when we do the relocation, so we are safe from any
> other shenanigans that arise because of this interaction with scrub.
> 
> cc: stable@vger.kernel.org
> Fixes: aeb935a45581 ("btrfs: don't set SHAREABLE flag for data reloc tree")
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Added to misc-next, thanks.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-10-06 12:27 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-01 17:57 [PATCH] btrfs: update refs for any root except tree log roots Josef Bacik
2021-10-02  1:11 ` Qu Wenruo
2021-10-06 12:27   ` David Sterba
2021-10-06 12:27 ` David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.